fitlemon/whisper-small-uz-en-ru-lang-id

Audio Classification

Generated from Trainer


fitlemon committed on Mar 7, 2024

Commit 4740922 · verified · 1 Parent(s): 9030bb4

Update README.md

Files changed (1)

README.md +25 -4


@@ -9,6 +9,13 @@ metrics:
 model-index:
 - name: whisper-small-uz-en-ru-lang-id
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -16,7 +23,7 @@ should probably proofread and complete it, then remove this comment. -->
 # whisper-small-uz-en-ru-lang-id
-This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.2065
 - Accuracy: 0.9747
@@ -31,10 +38,24 @@ More information needed
 More information needed
 ## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters
@@ -65,4 +86,4 @@ The following hyperparameters were used during training:
 - Transformers 4.38.2
 - Pytorch 2.2.1+cu121
 - Datasets 2.17.1
-- Tokenizers 0.15.2

 model-index:
 - name: whisper-small-uz-en-ru-lang-id
   results: []
+datasets:
+- mozilla-foundation/common_voice_16_1
+language:
+- uz
+- en
+- ru
+pipeline_tag: audio-classification
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # whisper-small-uz-en-ru-lang-id
+This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Uzbek (uz), English (en), and Russian (ru) subsets of the `mozilla-foundation/common_voice_16_1` dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.2065
 - Accuracy: 0.9747
 More information needed
 ## Training and evaluation data
+```python
+import os
+
+from datasets import concatenate_datasets, load_dataset
+
+# `env` is not defined in the original snippet; assumed here to read an environment variable
+env = os.environ.get
+
+# streaming datasets for each lang-id
+common_voice_train_uz = load_dataset("mozilla-foundation/common_voice_16_1", "uz", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
+common_voice_train_ru = load_dataset("mozilla-foundation/common_voice_16_1", "ru", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
+common_voice_train_en = load_dataset("mozilla-foundation/common_voice_16_1", "en", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
+common_voice_valid_uz = load_dataset("mozilla-foundation/common_voice_16_1", "uz", split='validation', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
+common_voice_valid_ru = load_dataset("mozilla-foundation/common_voice_16_1", "ru", split='validation', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
+common_voice_valid_en = load_dataset("mozilla-foundation/common_voice_16_1", "en", split='validation', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
+# code to shuffle and to take limited size of data
+...
+# concatenate 3 datasets (`common_voice` is assumed to be a dict-like container defined earlier)
+common_voice['train'] = concatenate_datasets([common_voice_train_uz, common_voice_train_ru, common_voice_train_en])
+```
+## Training procedure
+The model was trained with the `Trainer` class from `transformers`.
 ### Training hyperparameters
 - Transformers 4.38.2
 - Pytorch 2.2.1+cu121
 - Datasets 2.17.1
+- Tokenizers 0.15.2