Update README.md
README.md
CHANGED
@@ -155,7 +155,44 @@ segments = aligner(speech, text)
 print(segments)
 ```
 
-
+
+### OWSM series
+
+#### Encoder-decoder OWSM
+
+| Name | Size | Hugging Face Repo |
+| :--- | ---: | :---------------- |
+| OWSM v3.1 base | 101M | https://huggingface.co/espnet/owsm_v3.1_ebf_base |
+| OWSM v3.1 small | 367M | https://huggingface.co/espnet/owsm_v3.1_ebf_small |
+| OWSM v3.1 medium | 1.02B | https://huggingface.co/espnet/owsm_v3.1_ebf |
+| OWSM v3.2 small | 367M | https://huggingface.co/espnet/owsm_v3.2 |
+| OWSM v4 base | 102M | https://huggingface.co/espnet/owsm_v4_base_102M |
+| OWSM v4 small | 370M | https://huggingface.co/espnet/owsm_v4_small_370M |
+| OWSM v4 medium | 1.02B | https://huggingface.co/espnet/owsm_v4_medium_1B |
+
+
+#### CTC-based OWSM
+
+| Name | Size | Hugging Face Repo |
+| :--- | ---: | :---------------- |
+| OWSM-CTC v3.1 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v3.1_1B |
+| OWSM-CTC v3.2 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v3.2_ft_1B |
+| OWSM-CTC v4 medium | 1.01B | https://huggingface.co/espnet/owsm_ctc_v4_1B |
+
+
+
+### Citations
+
+#### OWSM v4
+
+```BibTex
+@inproceedings{owsm-v4,
+title={{OWSM} v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning},
+author={Yifan Peng and Shakeel Muhammad and Yui Sudo and William Chen and Jinchuan Tian and Chyi-Jiunn Lin and Shinji Watanabe},
+booktitle={Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH) (accepted)},
+year={2025},
+}
+```
 
 #### OWSM-CTC
 