smcproject/Malwhisper-v1-small

Automatic Speech Recognition

kurianbenoy committed on Mar 17, 2024

Commit d67d5c5 (verified) · 1 parent: b8b7fc1

Update README.md

Files changed (1)

README.md +10 -0

@@ -27,8 +27,18 @@ library_name: transformers
 This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small), trained on the [IMaSC dataset](https://www.kaggle.com/datasets/thennal/imasc).
+## About Dataset
 IMaSC is a Malayalam text and speech corpus made available by ICFOSS for the purpose of developing speech technology for Malayalam, particularly text-to-speech. The corpus contains 34,473 text-audio pairs of Malayalam sentences spoken by 8 speakers, totalling approximately 50 hours of audio.
+## Training
+- GPUs used: T4 - 16 GB
+- Training Time: 14 hours
+## Evaluation
 The fine-tuned model was evaluated on the following dataset:
 **Mozilla Common Voice 11.0 dataset (Malayalam subset):**