smcproject/Malwhisper-v1-small

Automatic Speech Recognition


kurianbenoy committed on Jan 16, 2024

Commit 41ee953 · verified · 1 Parent(s): e2a480f

Update README.md

Files changed (1)

README.md +26 -3


@@ -4,16 +4,39 @@ datasets:
 - thennal/IMaSC
 language:
 - ml
 ---
-On Evaluating the model:
-In Mozilla CommonVoice dataset:
 WER - 24.83
 CER - 12.84
-In SMC dataset:
 WER - 27.28
 CER - 14.64

 - thennal/IMaSC
 language:
 - ml
+model-index:
+- name: Malwhisper-v1-small - Kurian Benoy
+  results:
+  - task:
+      type: automatic-speech-recognition
+      name: Automatic Speech Recognition
+    dataset:
+      name: Common Voice 11.0
+      type: mozilla-foundation/common_voice_11_0
+      config: ml
+      split: test
+      args: ml
+    metrics:
+    - type: wer
+      value: 24.83
+      name: WER
+library_name: transformers
 ---
+## kurianbenoy/Malwhisper-v1-small
+This model is a version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) fine-tuned on the [IMaSC dataset](https://www.kaggle.com/datasets/thennal/imasc).
+IMaSC is a Malayalam text and speech corpus made available by ICFOSS for the purpose of developing speech technology for Malayalam, particularly text-to-speech. The corpus contains 34,473 text-audio pairs of Malayalam sentences spoken by 8 speakers, totaling approximately 50 hours of audio.
+Evaluation results of the fine-tuned model on the following datasets:
+**On the Mozilla Common Voice 11.0 dataset (Malayalam subset):**
 WER - 24.83
 CER - 12.84
+**On the SMC Malayalam Speech Corpus dataset:**
 WER - 27.28
 CER - 14.64
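
The WER and CER figures reported in the card are edit-distance-based error rates. The sketch below shows how such scores can be computed for a single reference/hypothesis pair in plain Python. It assumes the card's values are percentages, and it is not the evaluation script used for this model: the function names are hypothetical, no text normalization is applied, and a real evaluation would typically aggregate errors over the whole test set rather than score one utterance.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis needs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a percentage (assumed convention, whitespace tokens)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a percentage (assumed convention)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

Calling `wer(reference_text, model_output)` and `cer(reference_text, model_output)` on a transcript pair gives per-utterance scores. Because both rates are normalized by the reference length, they can exceed 100 when the hypothesis contains many insertions.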