Update README.md
README.md
*asr-wav2vec2-commonvoice-15-fr* is an Automatic Speech Recognition model fine-tuned on the CommonVoice 15.0 French set, with *LeBenchmark/wav2vec2-FR-7K-large* as the pretrained wav2vec2 model.

The fine-tuned model achieves the following performance:

| Release    | Valid WER | Test WER | GPUs        | Epochs |
|:----------:|:---------:|:--------:|:-----------:|:------:|
| 2023-09-08 | 9.14      | 11.21    | 4xV100 32GB | 30     |
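Valid and Test WER above are word error rates: the word-level edit distance between reference and hypothesis, as a percentage of the number of reference words. A dependency-free sketch of the metric (illustrative only; this is not the SpeechBrain scoring code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate, in percent. Assumes a non-empty reference."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(substitution, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return 100 * d[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("le chat dort", "le chien dort")` is one substitution over three reference words, roughly 33.3.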
## Model Details

- **License:** Apache-2.0
- **Finetuned from model:** LeBenchmark/wav2vec2-FR-7K-large
## How to transcribe a file with the model

### Install and import SpeechBrain

```bash
pip install speechbrain
```

```python
from speechbrain.inference.ASR import EncoderASR
```
### Pipeline

```python
def transcribe(audio, model):
    return model.transcribe_file(audio).lower()


def save_transcript(transcript, audio, output_file):
    with open(output_file, "w", encoding="utf-8") as file:
        file.write(f"{audio}\t{transcript}\n")


def main(model_source, audio):
    # model_source: Hugging Face repo id or local path of this model
    model = EncoderASR.from_hparams(source=model_source, savedir="tmp/")
    transcript = transcribe(audio, model)
    save_transcript(transcript, audio, "out.txt")
```
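`save_transcript` above writes one `audio<TAB>transcript` line per file. A stdlib-only sketch of reading such an output file back into a dict (the `load_transcripts` name is illustrative, not part of the model card):

```python
def load_transcripts(path: str) -> dict[str, str]:
    """Parse lines of the form `<audio path>\t<transcript>`."""
    transcripts = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Split only on the first tab so transcripts may contain tabs.
            audio, transcript = line.rstrip("\n").split("\t", 1)
            transcripts[audio] = transcript
    return transcripts
```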
## Training Details
The `common_voice_prepare.py` script handles the preprocessing of the dataset.

#### Training Hyperparameters

Refer to the `hyperparams.yaml` file for the hyperparameter values.

#### Training time

With 4xV100 32GB GPUs, training took ~81 hours.
#### Software
[SpeechBrain](https://speechbrain.github.io/):

```bibtex
@misc{SB2021,
  author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
  title = {SpeechBrain},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/speechbrain/speechbrain}},
}
```
## Citation