harshit345/xlsr_wav2vec_english

Automatic Speech Recognition

xlsr-fine-tuning-week

harshit345 committed on Dec 11, 2021

Commit 0ab6992 · 1 Parent(s): e6cc099

Update README.md

Files changed (1)

README.md +7 -13


@@ -35,9 +35,8 @@ model-index:
 Fine-tuned [facebook/wav2vec2-large](https://huggingface.co/facebook/wav2vec2-large) on English using the [Common Voice](https://huggingface.co/datasets/common_voice).
 When using this model, make sure that your speech input is sampled at 16kHz.
-This model has been fine-tuned thanks to the GPU credits generously given by the [OVHcloud](https://www.ovhcloud.com/en/public-cloud/ai-training/) :)
-The script used for training can be found here: https://github.com/jonatasgrosman/wav2vec2-sprint
 ## Usage
@@ -174,18 +173,13 @@ print(f"CER: {cer.compute(predictions=predictions, references=references, chunk_
 **Test Result**:
-In the table below I report the Word Error Rate (WER) and the Character Error Rate (CER) of the model. I ran the evaluation script described above on other models as well (on 2021-06-17). Note that the table below may show different results from those already reported, this may have been caused due to some specificity of the other evaluation scripts used.
 | Model | WER | CER |
 | ------------- | ------------- | ------------- |
-| jonatasgrosman/wav2vec2-large-xlsr-53-english | **18.98%** | **8.29%** |
-| jonatasgrosman/wav2vec2-large-english | 21.53% | 9.66% |
 | facebook/wav2vec2-large-960h-lv60-self | 22.03% | 10.39% |
-| facebook/wav2vec2-large-960h-lv60 | 23.97% | 11.14% |
-| boris/xlsr-en-punctuation | 29.10% | 10.75% |
-| facebook/wav2vec2-large-960h | 32.79% | 16.03% |
-| facebook/wav2vec2-base-960h | 39.86% | 19.89% |
-| facebook/wav2vec2-base-100h | 51.06% | 25.06% |
-| elgeish/wav2vec2-large-lv60-timit-asr | 59.96% | 34.28% |
-| facebook/wav2vec2-base-10k-voxpopuli-ft-en | 66.41% | 36.76% |
-| elgeish/wav2vec2-base-timit-asr | 68.78% | 36.81% |
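
For readers of this diff: the WER and CER columns in the table are both edit-distance ratios, counted over words and over characters respectively. The sketch below is a minimal standard-library illustration of that definition. It is not the evaluation script the README refers to, and the helper names `edit_distance` and `error_rate` are hypothetical, introduced only for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items yields an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(references, predictions, unit="word"):
    """Corpus-level error rate: total edits / total reference units.

    unit="word" gives WER, unit="char" gives CER.
    This is an illustrative helper, not a library API.
    """
    total_edits = 0
    total_units = 0
    for ref, hyp in zip(references, predictions):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_units += len(ref_units)
    if total_units == 0:
        raise ValueError("references contain no units to score")
    return total_edits / total_units


if __name__ == "__main__":
    refs = ["the cat sat"]
    hyps = ["the cat sit"]
    print(f"WER: {error_rate(refs, hyps, unit='word') * 100:.2f}%")
    print(f"CER: {error_rate(refs, hyps, unit='char') * 100:.2f}%")
```

Because both rates are normalized by the reference length, differences in text normalization (casing, punctuation) between evaluation scripts can shift the reported numbers, which is consistent with the README's note that results may differ from those reported elsewhere.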

 Fine-tuned [facebook/wav2vec2-large](https://huggingface.co/facebook/wav2vec2-large) on English using the [Common Voice](https://huggingface.co/datasets/common_voice).
 When using this model, make sure that your speech input is sampled at 16kHz.
 ## Usage
 **Test Result**:
+In the table below I report the Word Error Rate (WER) and the Character Error Rate (CER) of the model. I ran the evaluation script described above on other models as well. Note that the table below may show different results from those already reported, this may have been caused due to some specificity of the other evaluation scripts used.
 | Model | WER | CER |
 | ------------- | ------------- | ------------- |
+| hkatyal345/wav2vec2-large-xlsr-53-english | **18.98%** | **8.29%** |
+| hkatyal345/wav2vec2-large-xlsr-hindi | 20.01% | 9.66% |
+| hkatyal345/wav2vec2-large-english | 22.00% | 9.66% |
 | facebook/wav2vec2-large-960h-lv60-self | 22.03% | 10.39% |
+| facebook/wav2vec2-base-100h-lv60 | 24.97% | 11.14% |
+|