The NeMo toolkit [3] was used for training the models for over several hundred epochs. These models were trained with this [example script](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/asr_transducer/speech_to_text_rnnt_bpe.py) and this [base config](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/conformer/conformer_transducer_bpe.yaml).

The sentence-piece tokenizers [2] for these models were built using the text transcripts of the train set with this [script](https://github.com/NVIDIA/NeMo/blob/main/scripts/tokenizers/process_asr_text_tokenizer.py).

## Datasets

All the models in this collection are trained on a composite dataset (NeMo ASRSET) comprising over a thousand hours of French speech:

Further, since portions of the training set contain text from both pre- and post-1990 orthographic reform, the regularity of punctuation may vary between the two styles.

For downstream tasks requiring more consistency, fine-tuning or downstream processing may be required. If exact orthography is not necessary, then using a secondary model is advised.
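To illustrate the idea behind the sentence-piece subword tokenizers mentioned above, here is a toy byte-pair-encoding sketch in plain Python. This is not the actual SentencePiece implementation or the linked NeMo script; the corpus, merge count, and helper names are illustrative stand-ins, and the "▁" word-start marker simply mimics SentencePiece's convention.

```python
# Toy BPE sketch (illustrative only; real models use Google SentencePiece [2]
# trained on the train-set transcripts via NeMo's process_asr_text_tokenizer.py).
from collections import Counter

def train_bpe(corpus, num_merges):
    # Start from character-level symbols; "▁" marks word starts, as in SentencePiece.
    words = Counter()
    for line in corpus:
        for w in line.split():
            words[tuple("▁" + w)] += 1
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for sym, freq in words.items():
            for a, b in zip(sym, sym[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge to every word.
        new_words = Counter()
        for sym, freq in words.items():
            out, i = [], 0
            while i < len(sym):
                if i + 1 < len(sym) and (sym[i], sym[i + 1]) == best:
                    out.append(sym[i] + sym[i + 1]); i += 2
                else:
                    out.append(sym[i]); i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges

def encode(text, merges):
    # Segment new text by replaying the learned merges in order.
    pieces = []
    for w in text.split():
        sym = list("▁" + w)
        for a, b in merges:
            out, i = [], 0
            while i < len(sym):
                if i + 1 < len(sym) and sym[i] == a and sym[i + 1] == b:
                    out.append(a + b); i += 2
                else:
                    out.append(sym[i]); i += 1
            sym = out
        pieces.extend(sym)
    return pieces

corpus = ["bonjour tout le monde", "bonjour le monde entier"]
merges = train_bpe(corpus, num_merges=10)
print(encode("bonjour le monde", merges))
```

Concatenating the emitted pieces and mapping "▁" back to spaces recovers the original text, which is the lossless-segmentation property the real tokenizers rely on.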
## References
- [1] [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)
- [2] [Google Sentencepiece Tokenizer](https://github.com/google/sentencepiece)
- [3] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)