Commit a58c2f2 by pyf98 · verified · 1 Parent(s): 920e145

Update README.md

Files changed (1)
  1. README.md +38 -1
README.md CHANGED
@@ -155,7 +155,44 @@ segments = aligner(speech, text)
  print(segments)
  ```
 
- ## Citations
+
+ ### OWSM series
+
+ #### Encoder-decoder OWSM
+
+ | Name | Size | Hugging Face Repo |
+ | :--- | ---: | :---------------- |
+ | OWSM v3.1 base | 101M | https://huggingface.co/espnet/owsm_v3.1_ebf_base |
+ | OWSM v3.1 small | 367M | https://huggingface.co/espnet/owsm_v3.1_ebf_small |
+ | OWSM v3.1 medium | 1.02B | https://huggingface.co/espnet/owsm_v3.1_ebf |
+ | OWSM v3.2 small | 367M | https://huggingface.co/espnet/owsm_v3.2 |
+ | OWSM v4 base | 102M | https://huggingface.co/espnet/owsm_v4_base_102M |
+ | OWSM v4 small | 370M | https://huggingface.co/espnet/owsm_v4_small_370M |
+ | OWSM v4 medium | 1.02B | https://huggingface.co/espnet/owsm_v4_medium_1B |
+
+
+ #### CTC-based OWSM
+
+ | Name | Size | Hugging Face Repo |
+ | :--- | ---: | :---------------- |
+ | OWSM-CTC v3.1 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v3.1_1B |
+ | OWSM-CTC v3.2 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v3.2_ft_1B |
+ | OWSM-CTC v4 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v4_1B |
+
+
+
+ ### Citations
+
+ #### OWSM v4
+
+ ```BibTex
+ @inproceedings{owsm-v4,
+ title={{OWSM} v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning},
+ author={Yifan Peng and Shakeel Muhammad and Yui Sudo and William Chen and Jinchuan Tian and Chyi-Jiunn Lin and Shinji Watanabe},
+ booktitle={Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH) (accepted)},
+ year={2025},
+ }
+ ```
 
  #### OWSM-CTC
 