Safetensors · wav2vec2 · mms
Coco-18 committed · Commit 5597385 · verified · 1 Parent(s): 12cf57b

Upload 14 files

README.md ADDED
@@ -0,0 +1,1500 @@
1
+ ---
2
+ tags:
3
+ - mms
4
+ language:
5
+ - ab
6
+ - af
7
+ - ak
8
+ - am
9
+ - ar
10
+ - as
11
+ - av
12
+ - ay
13
+ - az
14
+ - ba
15
+ - bm
16
+ - be
17
+ - bn
18
+ - bi
19
+ - bo
20
+ - sh
21
+ - br
22
+ - bg
23
+ - ca
24
+ - cs
25
+ - ce
26
+ - cv
27
+ - ku
28
+ - cy
29
+ - da
30
+ - de
31
+ - dv
32
+ - dz
33
+ - el
34
+ - en
35
+ - eo
36
+ - et
37
+ - eu
38
+ - ee
39
+ - fo
40
+ - fa
41
+ - fj
42
+ - fi
43
+ - fr
44
+ - fy
45
+ - ff
46
+ - ga
47
+ - gl
48
+ - gn
49
+ - gu
50
+ - zh
51
+ - ht
52
+ - ha
53
+ - he
54
+ - hi
55
+ - sh
56
+ - hu
57
+ - hy
58
+ - ig
59
+ - ia
60
+ - ms
61
+ - is
62
+ - it
63
+ - jv
64
+ - ja
65
+ - kn
66
+ - ka
67
+ - kk
68
+ - kr
69
+ - km
70
+ - ki
71
+ - rw
72
+ - ky
73
+ - ko
74
+ - kv
75
+ - lo
76
+ - la
77
+ - lv
78
+ - ln
79
+ - lt
80
+ - lb
81
+ - lg
82
+ - mh
83
+ - ml
84
+ - mr
85
+ - ms
86
+ - mk
87
+ - mg
88
+ - mt
89
+ - mn
90
+ - mi
91
+ - my
92
+ - zh
93
+ - nl
94
+ - 'no'
95
+ - 'no'
96
+ - ne
97
+ - ny
98
+ - oc
99
+ - om
100
+ - or
101
+ - os
102
+ - pa
103
+ - pl
104
+ - pt
105
+ - ms
106
+ - ps
107
+ - qu
108
+ - qu
109
+ - qu
110
+ - qu
111
+ - qu
112
+ - qu
113
+ - qu
114
+ - qu
115
+ - qu
116
+ - qu
117
+ - qu
118
+ - qu
119
+ - qu
120
+ - qu
121
+ - qu
122
+ - qu
123
+ - qu
124
+ - qu
125
+ - qu
126
+ - qu
127
+ - qu
128
+ - qu
129
+ - ro
130
+ - rn
131
+ - ru
132
+ - sg
133
+ - sk
134
+ - sl
135
+ - sm
136
+ - sn
137
+ - sd
138
+ - so
139
+ - es
140
+ - sq
141
+ - su
142
+ - sv
143
+ - sw
144
+ - ta
145
+ - tt
146
+ - te
147
+ - tg
148
+ - tl
149
+ - th
150
+ - ti
151
+ - ts
152
+ - tr
153
+ - uk
154
+ - ms
155
+ - vi
156
+ - wo
157
+ - xh
158
+ - ms
159
+ - yo
160
+ - ms
161
+ - zu
162
+ - za
163
+ license: cc-by-nc-4.0
164
+ datasets:
165
+ - google/fleurs
166
+ metrics:
167
+ - wer
168
+ ---
169
+
170
+ # Massively Multilingual Speech (MMS) - Finetuned ASR - ALL
171
+
172
+ This checkpoint is a model fine-tuned for multilingual ASR and part of Facebook's [Massively Multilingual Speech project](https://research.facebook.com/publications/scaling-speech-technology-to-1000-languages/).
173
+ This checkpoint is based on the [Wav2Vec2 architecture](https://huggingface.co/docs/transformers/model_doc/wav2vec2) and makes use of adapter models to transcribe 1000+ languages.
174
+ The checkpoint consists of **1 billion parameters** and has been fine-tuned from [facebook/mms-1b](https://huggingface.co/facebook/mms-1b) on 1162 languages.
175
+
176
+ ## Table Of Contents
177
+
178
+ - [Example](#example)
179
+ - [Supported Languages](#supported-languages)
180
+ - [Model details](#model-details)
181
+ - [Additional links](#additional-links)
182
+
183
+ ## Example
184
+
185
+ This MMS checkpoint can be used with [Transformers](https://github.com/huggingface/transformers) to transcribe audio of 1162 different
186
+ languages. Let's look at a simple example.
187
+
188
+ First, we install transformers and some other libraries:
+ ```
+ pip install torch accelerate torchaudio datasets
+ pip install --upgrade transformers
+ ```
193
+
194
+ **Note**: In order to use MMS you need to have at least `transformers >= 4.30` installed. If the `4.30` version
195
+ is not yet available [on PyPI](https://pypi.org/project/transformers/) make sure to install `transformers` from
196
+ source:
197
+ ```
198
+ pip install git+https://github.com/huggingface/transformers.git
199
+ ```
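
The version requirement in the note above can be checked before loading the model. The following is a minimal sketch, assuming that only the major and minor components matter; the helper name `meets_minimum` is hypothetical, and pre-release suffixes such as `.dev0` are ignored rather than ordered the way PEP 440 would order them.

```python
import re


def meets_minimum(version: str, minimum: tuple = (4, 30)) -> bool:
    """Return True if the major.minor part of `version` is at least `minimum`.

    Simplification (assumption): suffixes such as ".dev0" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor >= minimum


# Hypothetical version strings for illustration
print(meets_minimum("4.29.2"))
print(meets_minimum("4.30.0.dev0"))
```

In practice the string would typically come from `transformers.__version__`.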
200
+
201
+ Next, we load a couple of audio samples via `datasets`. Make sure that the audio data is resampled to 16,000 Hz (16 kHz).
202
+
+ ```py
+ from datasets import load_dataset, Audio
+
+ # English: stream the test split and resample the audio column to 16 kHz
+ stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="test", streaming=True)
+ stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000))
+ en_sample = next(iter(stream_data))["audio"]["array"]
+
+ # French: same steps with the "fr" configuration
+ stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "fr", split="test", streaming=True)
+ stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000))
+ fr_sample = next(iter(stream_data))["audio"]["array"]
+ ```
216
+
217
+ Next, we load the model and processor:
+
+ ```py
+ from transformers import Wav2Vec2ForCTC, AutoProcessor
+ import torch
+
+ model_id = "facebook/mms-1b-all"
+
+ processor = AutoProcessor.from_pretrained(model_id)
+ model = Wav2Vec2ForCTC.from_pretrained(model_id)
+ ```
228
+
229
+ Now we process the audio data, pass the processed audio data to the model, and transcribe the model output, just like we usually do for Wav2Vec2 models such as [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h):
+
+ ```py
+ inputs = processor(en_sample, sampling_rate=16_000, return_tensors="pt")
+
+ # Inference only, so gradients are not needed
+ with torch.no_grad():
+     outputs = model(**inputs).logits
+
+ # Greedy decoding: most likely token per frame, then convert the ids to text
+ ids = torch.argmax(outputs, dim=-1)[0]
+ transcription = processor.decode(ids)
+ # 'joe keton disapproved of films and buster also had reservations about the media'
+ ```
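
The `argmax` followed by `processor.decode` in the block above amounts to greedy CTC decoding. As a rough illustration of what happens to the per-frame ids, here is a pure-Python sketch on a toy vocabulary. The function name, the toy vocabulary, and the frame ids are made up for this example; the blank id of 0 mirrors `pad_token_id` in this commit's `config.json`, and `|` is assumed to be the word delimiter.

```python
from itertools import groupby


def greedy_ctc_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse repeated frame ids, drop blanks, and join the tokens into text."""
    # Step 1: merge runs of identical consecutive ids into a single id
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove the CTC blank token
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: turn the word delimiter into spaces
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Toy vocabulary and per-frame argmax ids (hypothetical values)
toy_vocab = {0: "<pad>", 1: "|", 2: "h", 3: "i", 4: "o"}
frames = [2, 2, 0, 3, 3, 1, 1, 0, 4, 4, 0, 4]
print(greedy_ctc_decode(frames, toy_vocab))
```

Note that the blank between the last two `o` frames is what allows a doubled letter to survive the collapsing step.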
241
+
242
+ We can now keep the same model in memory and simply switch out the language adapters by calling the convenient `load_adapter()` function for the model and `set_target_lang()` for the tokenizer. We pass the target language as an input - "fra" for French.
+
+ ```py
+ # Switch both the tokenizer vocabulary and the adapter weights to French
+ processor.tokenizer.set_target_lang("fra")
+ model.load_adapter("fra")
+
+ inputs = processor(fr_sample, sampling_rate=16_000, return_tensors="pt")
+
+ with torch.no_grad():
+     outputs = model(**inputs).logits
+
+ ids = torch.argmax(outputs, dim=-1)[0]
+ transcription = processor.decode(ids)
+ # "ce dernier est volé tout au long de l'histoire romaine"
+ ```
257
+
258
+ In the same way, the language can be switched out for any other supported language. To list the supported language codes, have a look at:
+ ```py
+ processor.tokenizer.vocab.keys()
+ ```
262
+
263
+ For more details, please have a look at [the official docs](https://huggingface.co/docs/transformers/main/en/model_doc/mms).
264
+
265
+ ## Supported Languages
266
+
267
+ This model supports 1162 languages. Click the following to toggle all supported languages of this checkpoint in [ISO 639-3 code](https://en.wikipedia.org/wiki/ISO_639-3).
+ You can find more details about the languages and their ISO 639-3 codes in the [MMS Language Coverage Overview](https://dl.fbaipublicfiles.com/mms/misc/language_coverage_mms.html).
269
+ <details>
270
+ <summary>Click to toggle</summary>
271
+
272
+ - abi
273
+ - abk
274
+ - abp
275
+ - aca
276
+ - acd
277
+ - ace
278
+ - acf
279
+ - ach
280
+ - acn
281
+ - acr
282
+ - acu
283
+ - ade
284
+ - adh
285
+ - adj
286
+ - adx
287
+ - aeu
288
+ - afr
289
+ - agd
290
+ - agg
291
+ - agn
292
+ - agr
293
+ - agu
294
+ - agx
295
+ - aha
296
+ - ahk
297
+ - aia
298
+ - aka
299
+ - akb
300
+ - ake
301
+ - akp
302
+ - alj
303
+ - alp
304
+ - alt
305
+ - alz
306
+ - ame
307
+ - amf
308
+ - amh
309
+ - ami
310
+ - amk
311
+ - ann
312
+ - any
313
+ - aoz
314
+ - apb
315
+ - apr
316
+ - ara
317
+ - arl
318
+ - asa
319
+ - asg
320
+ - asm
321
+ - ast
322
+ - ata
323
+ - atb
324
+ - atg
325
+ - ati
326
+ - atq
327
+ - ava
328
+ - avn
329
+ - avu
330
+ - awa
331
+ - awb
332
+ - ayo
333
+ - ayr
334
+ - ayz
335
+ - azb
336
+ - azg
337
+ - azj-script_cyrillic
338
+ - azj-script_latin
339
+ - azz
340
+ - bak
341
+ - bam
342
+ - ban
343
+ - bao
344
+ - bas
345
+ - bav
346
+ - bba
347
+ - bbb
348
+ - bbc
349
+ - bbo
350
+ - bcc-script_arabic
351
+ - bcc-script_latin
352
+ - bcl
353
+ - bcw
354
+ - bdg
355
+ - bdh
356
+ - bdq
357
+ - bdu
358
+ - bdv
359
+ - beh
360
+ - bel
361
+ - bem
362
+ - ben
363
+ - bep
364
+ - bex
365
+ - bfa
366
+ - bfo
367
+ - bfy
368
+ - bfz
369
+ - bgc
370
+ - bgq
371
+ - bgr
372
+ - bgt
373
+ - bgw
374
+ - bha
375
+ - bht
376
+ - bhz
377
+ - bib
378
+ - bim
379
+ - bis
380
+ - biv
381
+ - bjr
382
+ - bjv
383
+ - bjw
384
+ - bjz
385
+ - bkd
386
+ - bkv
387
+ - blh
388
+ - blt
389
+ - blx
390
+ - blz
391
+ - bmq
392
+ - bmr
393
+ - bmu
394
+ - bmv
395
+ - bng
396
+ - bno
397
+ - bnp
398
+ - boa
399
+ - bod
400
+ - boj
401
+ - bom
402
+ - bor
403
+ - bos
404
+ - bov
405
+ - box
406
+ - bpr
407
+ - bps
408
+ - bqc
409
+ - bqi
410
+ - bqj
411
+ - bqp
412
+ - bre
413
+ - bru
414
+ - bsc
415
+ - bsq
416
+ - bss
417
+ - btd
418
+ - bts
419
+ - btt
420
+ - btx
421
+ - bud
422
+ - bul
423
+ - bus
424
+ - bvc
425
+ - bvz
426
+ - bwq
427
+ - bwu
428
+ - byr
429
+ - bzh
430
+ - bzi
431
+ - bzj
432
+ - caa
433
+ - cab
434
+ - cac-dialect_sanmateoixtatan
435
+ - cac-dialect_sansebastiancoatan
436
+ - cak-dialect_central
437
+ - cak-dialect_santamariadejesus
438
+ - cak-dialect_santodomingoxenacoj
439
+ - cak-dialect_southcentral
440
+ - cak-dialect_western
441
+ - cak-dialect_yepocapa
442
+ - cap
443
+ - car
444
+ - cas
445
+ - cat
446
+ - cax
447
+ - cbc
448
+ - cbi
449
+ - cbr
450
+ - cbs
451
+ - cbt
452
+ - cbu
453
+ - cbv
454
+ - cce
455
+ - cco
456
+ - cdj
457
+ - ceb
458
+ - ceg
459
+ - cek
460
+ - ces
461
+ - cfm
462
+ - cgc
463
+ - che
464
+ - chf
465
+ - chv
466
+ - chz
467
+ - cjo
468
+ - cjp
469
+ - cjs
470
+ - ckb
471
+ - cko
472
+ - ckt
473
+ - cla
474
+ - cle
475
+ - cly
476
+ - cme
477
+ - cmn-script_simplified
478
+ - cmo-script_khmer
479
+ - cmo-script_latin
480
+ - cmr
481
+ - cnh
482
+ - cni
483
+ - cnl
484
+ - cnt
485
+ - coe
486
+ - cof
487
+ - cok
488
+ - con
489
+ - cot
490
+ - cou
491
+ - cpa
492
+ - cpb
493
+ - cpu
494
+ - crh
495
+ - crk-script_latin
496
+ - crk-script_syllabics
497
+ - crn
498
+ - crq
499
+ - crs
500
+ - crt
501
+ - csk
502
+ - cso
503
+ - ctd
504
+ - ctg
505
+ - cto
506
+ - ctu
507
+ - cuc
508
+ - cui
509
+ - cuk
510
+ - cul
511
+ - cwa
512
+ - cwe
513
+ - cwt
514
+ - cya
515
+ - cym
516
+ - daa
517
+ - dah
518
+ - dan
519
+ - dar
520
+ - dbj
521
+ - dbq
522
+ - ddn
523
+ - ded
524
+ - des
525
+ - deu
526
+ - dga
527
+ - dgi
528
+ - dgk
529
+ - dgo
530
+ - dgr
531
+ - dhi
532
+ - did
533
+ - dig
534
+ - dik
535
+ - dip
536
+ - div
537
+ - djk
538
+ - dnj-dialect_blowowest
539
+ - dnj-dialect_gweetaawueast
540
+ - dnt
541
+ - dnw
542
+ - dop
543
+ - dos
544
+ - dsh
545
+ - dso
546
+ - dtp
547
+ - dts
548
+ - dug
549
+ - dwr
550
+ - dyi
551
+ - dyo
552
+ - dyu
553
+ - dzo
554
+ - eip
555
+ - eka
556
+ - ell
557
+ - emp
558
+ - enb
559
+ - eng
560
+ - enx
561
+ - epo
562
+ - ese
563
+ - ess
564
+ - est
565
+ - eus
566
+ - evn
567
+ - ewe
568
+ - eza
569
+ - fal
570
+ - fao
571
+ - far
572
+ - fas
573
+ - fij
574
+ - fin
575
+ - flr
576
+ - fmu
577
+ - fon
578
+ - fra
579
+ - frd
580
+ - fry
581
+ - ful
582
+ - gag-script_cyrillic
583
+ - gag-script_latin
584
+ - gai
585
+ - gam
586
+ - gau
587
+ - gbi
588
+ - gbk
589
+ - gbm
590
+ - gbo
591
+ - gde
592
+ - geb
593
+ - gej
594
+ - gil
595
+ - gjn
596
+ - gkn
597
+ - gld
598
+ - gle
599
+ - glg
600
+ - glk
601
+ - gmv
602
+ - gna
603
+ - gnd
604
+ - gng
605
+ - gof-script_latin
606
+ - gog
607
+ - gor
608
+ - gqr
609
+ - grc
610
+ - gri
611
+ - grn
612
+ - grt
613
+ - gso
614
+ - gub
615
+ - guc
616
+ - gud
617
+ - guh
618
+ - guj
619
+ - guk
620
+ - gum
621
+ - guo
622
+ - guq
623
+ - guu
624
+ - gux
625
+ - gvc
626
+ - gvl
627
+ - gwi
628
+ - gwr
629
+ - gym
630
+ - gyr
631
+ - had
632
+ - hag
633
+ - hak
634
+ - hap
635
+ - hat
636
+ - hau
637
+ - hay
638
+ - heb
639
+ - heh
640
+ - hif
641
+ - hig
642
+ - hil
643
+ - hin
644
+ - hlb
645
+ - hlt
646
+ - hne
647
+ - hnn
648
+ - hns
649
+ - hoc
650
+ - hoy
651
+ - hrv
652
+ - hsb
653
+ - hto
654
+ - hub
655
+ - hui
656
+ - hun
657
+ - hus-dialect_centralveracruz
658
+ - hus-dialect_westernpotosino
659
+ - huu
660
+ - huv
661
+ - hvn
662
+ - hwc
663
+ - hye
664
+ - hyw
665
+ - iba
666
+ - ibo
667
+ - icr
668
+ - idd
669
+ - ifa
670
+ - ifb
671
+ - ife
672
+ - ifk
673
+ - ifu
674
+ - ify
675
+ - ign
676
+ - ikk
677
+ - ilb
678
+ - ilo
679
+ - imo
680
+ - ina
681
+ - inb
682
+ - ind
683
+ - iou
684
+ - ipi
685
+ - iqw
686
+ - iri
687
+ - irk
688
+ - isl
689
+ - ita
690
+ - itl
691
+ - itv
692
+ - ixl-dialect_sangasparchajul
693
+ - ixl-dialect_sanjuancotzal
694
+ - ixl-dialect_santamarianebaj
695
+ - izr
696
+ - izz
697
+ - jac
698
+ - jam
699
+ - jav
700
+ - jbu
701
+ - jen
702
+ - jic
703
+ - jiv
704
+ - jmc
705
+ - jmd
706
+ - jpn
707
+ - jun
708
+ - juy
709
+ - jvn
710
+ - kaa
711
+ - kab
712
+ - kac
713
+ - kak
714
+ - kam
715
+ - kan
716
+ - kao
717
+ - kaq
718
+ - kat
719
+ - kay
720
+ - kaz
721
+ - kbo
722
+ - kbp
723
+ - kbq
724
+ - kbr
725
+ - kby
726
+ - kca
727
+ - kcg
728
+ - kdc
729
+ - kde
730
+ - kdh
731
+ - kdi
732
+ - kdj
733
+ - kdl
734
+ - kdn
735
+ - kdt
736
+ - kea
737
+ - kek
738
+ - ken
739
+ - keo
740
+ - ker
741
+ - key
742
+ - kez
743
+ - kfb
744
+ - kff-script_telugu
745
+ - kfw
746
+ - kfx
747
+ - khg
748
+ - khm
749
+ - khq
750
+ - kia
751
+ - kij
752
+ - kik
753
+ - kin
754
+ - kir
755
+ - kjb
756
+ - kje
757
+ - kjg
758
+ - kjh
759
+ - kki
760
+ - kkj
761
+ - kle
762
+ - klu
763
+ - klv
764
+ - klw
765
+ - kma
766
+ - kmd
767
+ - kml
768
+ - kmr-script_arabic
769
+ - kmr-script_cyrillic
770
+ - kmr-script_latin
771
+ - kmu
772
+ - knb
773
+ - kne
774
+ - knf
775
+ - knj
776
+ - knk
777
+ - kno
778
+ - kog
779
+ - kor
780
+ - kpq
781
+ - kps
782
+ - kpv
783
+ - kpy
784
+ - kpz
785
+ - kqe
786
+ - kqp
787
+ - kqr
788
+ - kqy
789
+ - krc
790
+ - kri
791
+ - krj
792
+ - krl
793
+ - krr
794
+ - krs
795
+ - kru
796
+ - ksb
797
+ - ksr
798
+ - kss
799
+ - ktb
800
+ - ktj
801
+ - kub
802
+ - kue
803
+ - kum
804
+ - kus
805
+ - kvn
806
+ - kvw
807
+ - kwd
808
+ - kwf
809
+ - kwi
810
+ - kxc
811
+ - kxf
812
+ - kxm
813
+ - kxv
814
+ - kyb
815
+ - kyc
816
+ - kyf
817
+ - kyg
818
+ - kyo
819
+ - kyq
820
+ - kyu
821
+ - kyz
822
+ - kzf
823
+ - lac
824
+ - laj
825
+ - lam
826
+ - lao
827
+ - las
828
+ - lat
829
+ - lav
830
+ - law
831
+ - lbj
832
+ - lbw
833
+ - lcp
834
+ - lee
835
+ - lef
836
+ - lem
837
+ - lew
838
+ - lex
839
+ - lgg
840
+ - lgl
841
+ - lhu
842
+ - lia
843
+ - lid
844
+ - lif
845
+ - lin
846
+ - lip
847
+ - lis
848
+ - lit
849
+ - lje
850
+ - ljp
851
+ - llg
852
+ - lln
853
+ - lme
854
+ - lnd
855
+ - lns
856
+ - lob
857
+ - lok
858
+ - lom
859
+ - lon
860
+ - loq
861
+ - lsi
862
+ - lsm
863
+ - ltz
864
+ - luc
865
+ - lug
866
+ - luo
867
+ - lwo
868
+ - lww
869
+ - lzz
870
+ - maa-dialect_sanantonio
871
+ - maa-dialect_sanjeronimo
872
+ - mad
873
+ - mag
874
+ - mah
875
+ - mai
876
+ - maj
877
+ - mak
878
+ - mal
879
+ - mam-dialect_central
880
+ - mam-dialect_northern
881
+ - mam-dialect_southern
882
+ - mam-dialect_western
883
+ - maq
884
+ - mar
885
+ - maw
886
+ - maz
887
+ - mbb
888
+ - mbc
889
+ - mbh
890
+ - mbj
891
+ - mbt
892
+ - mbu
893
+ - mbz
894
+ - mca
895
+ - mcb
896
+ - mcd
897
+ - mco
898
+ - mcp
899
+ - mcq
900
+ - mcu
901
+ - mda
902
+ - mdf
903
+ - mdv
904
+ - mdy
905
+ - med
906
+ - mee
907
+ - mej
908
+ - men
909
+ - meq
910
+ - met
911
+ - mev
912
+ - mfe
913
+ - mfh
914
+ - mfi
915
+ - mfk
916
+ - mfq
917
+ - mfy
918
+ - mfz
919
+ - mgd
920
+ - mge
921
+ - mgh
922
+ - mgo
923
+ - mhi
924
+ - mhr
925
+ - mhu
926
+ - mhx
927
+ - mhy
928
+ - mib
929
+ - mie
930
+ - mif
931
+ - mih
932
+ - mil
933
+ - mim
934
+ - min
935
+ - mio
936
+ - mip
937
+ - miq
938
+ - mit
939
+ - miy
940
+ - miz
941
+ - mjl
942
+ - mjv
943
+ - mkd
944
+ - mkl
945
+ - mkn
946
+ - mlg
947
+ - mlt
948
+ - mmg
949
+ - mnb
950
+ - mnf
951
+ - mnk
952
+ - mnw
953
+ - mnx
954
+ - moa
955
+ - mog
956
+ - mon
957
+ - mop
958
+ - mor
959
+ - mos
960
+ - mox
961
+ - moz
962
+ - mpg
963
+ - mpm
964
+ - mpp
965
+ - mpx
966
+ - mqb
967
+ - mqf
968
+ - mqj
969
+ - mqn
970
+ - mri
971
+ - mrw
972
+ - msy
973
+ - mtd
974
+ - mtj
975
+ - mto
976
+ - muh
977
+ - mup
978
+ - mur
979
+ - muv
980
+ - muy
981
+ - mvp
982
+ - mwq
983
+ - mwv
984
+ - mxb
985
+ - mxq
986
+ - mxt
987
+ - mxv
988
+ - mya
989
+ - myb
990
+ - myk
991
+ - myl
992
+ - myv
993
+ - myx
994
+ - myy
995
+ - mza
996
+ - mzi
997
+ - mzj
998
+ - mzk
999
+ - mzm
1000
+ - mzw
1001
+ - nab
1002
+ - nag
1003
+ - nan
1004
+ - nas
1005
+ - naw
1006
+ - nca
1007
+ - nch
1008
+ - ncj
1009
+ - ncl
1010
+ - ncu
1011
+ - ndj
1012
+ - ndp
1013
+ - ndv
1014
+ - ndy
1015
+ - ndz
1016
+ - neb
1017
+ - new
1018
+ - nfa
1019
+ - nfr
1020
+ - nga
1021
+ - ngl
1022
+ - ngp
1023
+ - ngu
1024
+ - nhe
1025
+ - nhi
1026
+ - nhu
1027
+ - nhw
1028
+ - nhx
1029
+ - nhy
1030
+ - nia
1031
+ - nij
1032
+ - nim
1033
+ - nin
1034
+ - nko
1035
+ - nlc
1036
+ - nld
1037
+ - nlg
1038
+ - nlk
1039
+ - nmz
1040
+ - nnb
1041
+ - nno
1042
+ - nnq
1043
+ - nnw
1044
+ - noa
1045
+ - nob
1046
+ - nod
1047
+ - nog
1048
+ - not
1049
+ - npi
1050
+ - npl
1051
+ - npy
1052
+ - nso
1053
+ - nst
1054
+ - nsu
1055
+ - ntm
1056
+ - ntr
1057
+ - nuj
1058
+ - nus
1059
+ - nuz
1060
+ - nwb
1061
+ - nxq
1062
+ - nya
1063
+ - nyf
1064
+ - nyn
1065
+ - nyo
1066
+ - nyy
1067
+ - nzi
1068
+ - obo
1069
+ - oci
1070
+ - ojb-script_latin
1071
+ - ojb-script_syllabics
1072
+ - oku
1073
+ - old
1074
+ - omw
1075
+ - onb
1076
+ - ood
1077
+ - orm
1078
+ - ory
1079
+ - oss
1080
+ - ote
1081
+ - otq
1082
+ - ozm
1083
+ - pab
1084
+ - pad
1085
+ - pag
1086
+ - pam
1087
+ - pan
1088
+ - pao
1089
+ - pap
1090
+ - pau
1091
+ - pbb
1092
+ - pbc
1093
+ - pbi
1094
+ - pce
1095
+ - pcm
1096
+ - peg
1097
+ - pez
1098
+ - pib
1099
+ - pil
1100
+ - pir
1101
+ - pis
1102
+ - pjt
1103
+ - pkb
1104
+ - pls
1105
+ - plw
1106
+ - pmf
1107
+ - pny
1108
+ - poh-dialect_eastern
1109
+ - poh-dialect_western
1110
+ - poi
1111
+ - pol
1112
+ - por
1113
+ - poy
1114
+ - ppk
1115
+ - pps
1116
+ - prf
1117
+ - prk
1118
+ - prt
1119
+ - pse
1120
+ - pss
1121
+ - ptu
1122
+ - pui
1123
+ - pus
1124
+ - pwg
1125
+ - pww
1126
+ - pxm
1127
+ - qub
1128
+ - quc-dialect_central
1129
+ - quc-dialect_east
1130
+ - quc-dialect_north
1131
+ - quf
1132
+ - quh
1133
+ - qul
1134
+ - quw
1135
+ - quy
1136
+ - quz
1137
+ - qvc
1138
+ - qve
1139
+ - qvh
1140
+ - qvm
1141
+ - qvn
1142
+ - qvo
1143
+ - qvs
1144
+ - qvw
1145
+ - qvz
1146
+ - qwh
1147
+ - qxh
1148
+ - qxl
1149
+ - qxn
1150
+ - qxo
1151
+ - qxr
1152
+ - rah
1153
+ - rai
1154
+ - rap
1155
+ - rav
1156
+ - raw
1157
+ - rej
1158
+ - rel
1159
+ - rgu
1160
+ - rhg
1161
+ - rif-script_arabic
1162
+ - rif-script_latin
1163
+ - ril
1164
+ - rim
1165
+ - rjs
1166
+ - rkt
1167
+ - rmc-script_cyrillic
1168
+ - rmc-script_latin
1169
+ - rmo
1170
+ - rmy-script_cyrillic
1171
+ - rmy-script_latin
1172
+ - rng
1173
+ - rnl
1174
+ - roh-dialect_sursilv
1175
+ - roh-dialect_vallader
1176
+ - rol
1177
+ - ron
1178
+ - rop
1179
+ - rro
1180
+ - rub
1181
+ - ruf
1182
+ - rug
1183
+ - run
1184
+ - rus
1185
+ - sab
1186
+ - sag
1187
+ - sah
1188
+ - saj
1189
+ - saq
1190
+ - sas
1191
+ - sat
1192
+ - sba
1193
+ - sbd
1194
+ - sbl
1195
+ - sbp
1196
+ - sch
1197
+ - sck
1198
+ - sda
1199
+ - sea
1200
+ - seh
1201
+ - ses
1202
+ - sey
1203
+ - sgb
1204
+ - sgj
1205
+ - sgw
1206
+ - shi
1207
+ - shk
1208
+ - shn
1209
+ - sho
1210
+ - shp
1211
+ - sid
1212
+ - sig
1213
+ - sil
1214
+ - sja
1215
+ - sjm
1216
+ - sld
1217
+ - slk
1218
+ - slu
1219
+ - slv
1220
+ - sml
1221
+ - smo
1222
+ - sna
1223
+ - snd
1224
+ - sne
1225
+ - snn
1226
+ - snp
1227
+ - snw
1228
+ - som
1229
+ - soy
1230
+ - spa
1231
+ - spp
1232
+ - spy
1233
+ - sqi
1234
+ - sri
1235
+ - srm
1236
+ - srn
1237
+ - srp-script_cyrillic
1238
+ - srp-script_latin
1239
+ - srx
1240
+ - stn
1241
+ - stp
1242
+ - suc
1243
+ - suk
1244
+ - sun
1245
+ - sur
1246
+ - sus
1247
+ - suv
1248
+ - suz
1249
+ - swe
1250
+ - swh
1251
+ - sxb
1252
+ - sxn
1253
+ - sya
1254
+ - syl
1255
+ - sza
1256
+ - tac
1257
+ - taj
1258
+ - tam
1259
+ - tao
1260
+ - tap
1261
+ - taq
1262
+ - tat
1263
+ - tav
1264
+ - tbc
1265
+ - tbg
1266
+ - tbk
1267
+ - tbl
1268
+ - tby
1269
+ - tbz
1270
+ - tca
1271
+ - tcc
1272
+ - tcs
1273
+ - tcz
1274
+ - tdj
1275
+ - ted
1276
+ - tee
1277
+ - tel
1278
+ - tem
1279
+ - teo
1280
+ - ter
1281
+ - tes
1282
+ - tew
1283
+ - tex
1284
+ - tfr
1285
+ - tgj
1286
+ - tgk
1287
+ - tgl
1288
+ - tgo
1289
+ - tgp
1290
+ - tha
1291
+ - thk
1292
+ - thl
1293
+ - tih
1294
+ - tik
1295
+ - tir
1296
+ - tkr
1297
+ - tlb
1298
+ - tlj
1299
+ - tly
1300
+ - tmc
1301
+ - tmf
1302
+ - tna
1303
+ - tng
1304
+ - tnk
1305
+ - tnn
1306
+ - tnp
1307
+ - tnr
1308
+ - tnt
1309
+ - tob
1310
+ - toc
1311
+ - toh
1312
+ - tom
1313
+ - tos
1314
+ - tpi
1315
+ - tpm
1316
+ - tpp
1317
+ - tpt
1318
+ - trc
1319
+ - tri
1320
+ - trn
1321
+ - trs
1322
+ - tso
1323
+ - tsz
1324
+ - ttc
1325
+ - tte
1326
+ - ttq-script_tifinagh
1327
+ - tue
1328
+ - tuf
1329
+ - tuk-script_arabic
1330
+ - tuk-script_latin
1331
+ - tuo
1332
+ - tur
1333
+ - tvw
1334
+ - twb
1335
+ - twe
1336
+ - twu
1337
+ - txa
1338
+ - txq
1339
+ - txu
1340
+ - tye
1341
+ - tzh-dialect_bachajon
1342
+ - tzh-dialect_tenejapa
1343
+ - tzj-dialect_eastern
1344
+ - tzj-dialect_western
1345
+ - tzo-dialect_chamula
1346
+ - tzo-dialect_chenalho
1347
+ - ubl
1348
+ - ubu
1349
+ - udm
1350
+ - udu
1351
+ - uig-script_arabic
1352
+ - uig-script_cyrillic
1353
+ - ukr
1354
+ - umb
1355
+ - unr
1356
+ - upv
1357
+ - ura
1358
+ - urb
1359
+ - urd-script_arabic
1360
+ - urd-script_devanagari
1361
+ - urd-script_latin
1362
+ - urk
1363
+ - urt
1364
+ - ury
1365
+ - usp
1366
+ - uzb-script_cyrillic
1367
+ - uzb-script_latin
1368
+ - vag
1369
+ - vid
1370
+ - vie
1371
+ - vif
1372
+ - vmw
1373
+ - vmy
1374
+ - vot
1375
+ - vun
1376
+ - vut
1377
+ - wal-script_ethiopic
1378
+ - wal-script_latin
1379
+ - wap
1380
+ - war
1381
+ - waw
1382
+ - way
1383
+ - wba
1384
+ - wlo
1385
+ - wlx
1386
+ - wmw
1387
+ - wob
1388
+ - wol
1389
+ - wsg
1390
+ - wwa
1391
+ - xal
1392
+ - xdy
1393
+ - xed
1394
+ - xer
1395
+ - xho
1396
+ - xmm
1397
+ - xnj
1398
+ - xnr
1399
+ - xog
1400
+ - xon
1401
+ - xrb
1402
+ - xsb
1403
+ - xsm
1404
+ - xsr
1405
+ - xsu
1406
+ - xta
1407
+ - xtd
1408
+ - xte
1409
+ - xtm
1410
+ - xtn
1411
+ - xua
1412
+ - xuo
1413
+ - yaa
1414
+ - yad
1415
+ - yal
1416
+ - yam
1417
+ - yao
1418
+ - yas
1419
+ - yat
1420
+ - yaz
1421
+ - yba
1422
+ - ybb
1423
+ - ycl
1424
+ - ycn
1425
+ - yea
1426
+ - yka
1427
+ - yli
1428
+ - yor
1429
+ - yre
1430
+ - yua
1431
+ - yue-script_traditional
1432
+ - yuz
1433
+ - yva
1434
+ - zaa
1435
+ - zab
1436
+ - zac
1437
+ - zad
1438
+ - zae
1439
+ - zai
1440
+ - zam
1441
+ - zao
1442
+ - zaq
1443
+ - zar
1444
+ - zas
1445
+ - zav
1446
+ - zaw
1447
+ - zca
1448
+ - zga
1449
+ - zim
1450
+ - ziw
1451
+ - zlm
1452
+ - zmz
1453
+ - zne
1454
+ - zos
1455
+ - zpc
1456
+ - zpg
1457
+ - zpi
1458
+ - zpl
1459
+ - zpm
1460
+ - zpo
1461
+ - zpt
1462
+ - zpu
1463
+ - zpz
1464
+ - ztq
1465
+ - zty
1466
+ - zul
1467
+ - zyb
1468
+ - zyp
1469
+ - zza
1470
+
1471
+ </details>
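
Some entries in the list above carry a `-script_…` or `-dialect_…` suffix after the three-letter ISO 639-3 code (for example `azj-script_cyrillic` or `cak-dialect_central`). A small sketch of how such identifiers could be split into their parts follows; the `LangCode` type and the `split_lang_code` helper are hypothetical names introduced for illustration and are not part of Transformers.

```python
from typing import NamedTuple, Optional


class LangCode(NamedTuple):
    iso: str                     # three-letter ISO 639-3 code
    variant_kind: Optional[str]  # "script", "dialect", or None
    variant: Optional[str]       # e.g. "cyrillic", "central", or None


def split_lang_code(code: str) -> LangCode:
    """Split an identifier such as 'azj-script_cyrillic' into its parts."""
    iso, sep, rest = code.partition("-")
    if not sep:
        # Plain ISO code without any suffix
        return LangCode(iso, None, None)
    kind, _, variant = rest.partition("_")
    return LangCode(iso, kind, variant)


for code in ["fra", "azj-script_cyrillic", "cak-dialect_central"]:
    print(split_lang_code(code))
```

The full identifier, including the suffix, is what the list uses to name a target language, so the split form is only useful for grouping or display.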
1472
+
1473
+ ## Model details
1474
+
1475
+ - **Developed by:** Vineel Pratap et al.
1476
+ - **Model type:** Multi-Lingual Automatic Speech Recognition model
1477
+ - **Language(s):** 1000+ languages, see [supported languages](#supported-languages)
1478
+ - **License:** CC-BY-NC 4.0 license
1479
+ - **Num parameters**: 1 billion
1480
+ - **Audio sampling rate**: 16,000 Hz
1481
+ - **Cite as:**
1482
+
1483
+ @article{pratap2023mms,
1484
+ title={Scaling Speech Technology to 1,000+ Languages},
1485
+ author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
1486
+ journal={arXiv},
1487
+ year={2023}
1488
+ }
1489
+
1490
+ ## Additional Links
1491
+
1492
+ - [Blog post](https://ai.facebook.com/blog/multilingual-model-speech-recognition/)
1493
+ - [Transformers documentation](https://huggingface.co/docs/transformers/main/en/model_doc/mms)
1494
+ - [Paper](https://arxiv.org/abs/2305.13516)
1495
+ - [GitHub Repository](https://github.com/facebookresearch/fairseq/tree/main/examples/mms#asr)
1496
+ - [Other **MMS** checkpoints](https://huggingface.co/models?other=mms)
1497
+ - MMS base checkpoints:
1498
+ - [facebook/mms-1b](https://huggingface.co/facebook/mms-1b)
1499
+ - [facebook/mms-300m](https://huggingface.co/facebook/mms-300m)
1500
+ - [Official Space](https://huggingface.co/spaces/facebook/MMS)
adapter.eng.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4347384ac33c808dc7821bca6e4baaf1408b03b585e7caf81267b9d0ef8cca29
+ size 9428832
adapter.pam.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e2ed25ee9d7303a216491d49bce44f4c9a72b141d8c49a02051ef3230dd8f606
+ size 8824152
adapter.tgl.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7d6fd2d28c774d78a88f1e702d67930ac07002a3bfcabe8f1a952a9bc4e35655
+ size 9059872
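
The three adapter files above are stored as Git LFS pointers: each pointer records the spec version, a SHA-256 object id, and the size in bytes of the real file. A minimal sketch of reading such a pointer, assuming the three-line `key value` layout shown above (the function name `parse_lfs_pointer` and the returned dictionary keys are hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>"
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents of adapter.eng.safetensors from this commit
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4347384ac33c808dc7821bca6e4baaf1408b03b585e7caf81267b9d0ef8cca29
size 9428832
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algorithm"], info["size_bytes"])
```

Each adapter is therefore roughly 9 MB, which is small next to the 1 billion parameter base model.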
config.json ADDED
@@ -0,0 +1,108 @@
+ {
+   "_name_or_path": "./",
+   "activation_dropout": 0.05,
+   "adapter_attn_dim": 16,
+   "adapter_kernel_size": 3,
+   "adapter_stride": 2,
+   "add_adapter": false,
+   "apply_spec_augment": true,
+   "architectures": [
+     "Wav2Vec2ForCTC"
+   ],
+   "attention_dropout": 0.05,
+   "bos_token_id": 1,
+   "classifier_proj_size": 256,
+   "codevector_dim": 1024,
+   "contrastive_logits_temperature": 0.1,
+   "conv_bias": true,
+   "conv_dim": [
+     512,
+     512,
+     512,
+     512,
+     512,
+     512,
+     512
+   ],
+   "conv_kernel": [
+     10,
+     3,
+     3,
+     3,
+     3,
+     2,
+     2
+   ],
+   "conv_stride": [
+     5,
+     2,
+     2,
+     2,
+     2,
+     2,
+     2
+   ],
+   "ctc_loss_reduction": "mean",
+   "ctc_zero_infinity": false,
+   "diversity_loss_weight": 0.1,
+   "do_stable_layer_norm": true,
+   "eos_token_id": 2,
+   "feat_extract_activation": "gelu",
+   "feat_extract_dropout": 0.0,
+   "feat_extract_norm": "layer",
+   "feat_proj_dropout": 0.05,
+   "feat_quantizer_dropout": 0.0,
+   "final_dropout": 0.05,
+   "hidden_act": "gelu",
+   "hidden_dropout": 0.05,
+   "hidden_size": 1280,
+   "initializer_range": 0.02,
+   "intermediate_size": 5120,
+   "layer_norm_eps": 1e-05,
+   "layerdrop": 0.05,
+   "mask_feature_length": 10,
+   "mask_feature_min_masks": 0,
+   "mask_feature_prob": 0.0,
+   "mask_time_length": 10,
+   "mask_time_min_masks": 2,
+   "mask_time_prob": 0.05,
+   "model_type": "wav2vec2",
+   "num_adapter_layers": 3,
+   "num_attention_heads": 16,
+   "num_codevector_groups": 2,
+   "num_codevectors_per_group": 320,
+   "num_conv_pos_embedding_groups": 16,
+   "num_conv_pos_embeddings": 128,
+   "num_feat_extract_layers": 7,
+   "num_hidden_layers": 48,
+   "num_negatives": 100,
+   "output_hidden_size": 1280,
+   "pad_token_id": 0,
+   "proj_codevector_dim": 1024,
+   "tdnn_dilation": [
+     1,
+     2,
+     3,
+     1,
+     1
+   ],
+   "tdnn_dim": [
+     512,
+     512,
+     512,
+     512,
+     1500
+   ],
+   "tdnn_kernel": [
+     5,
+     3,
+     3,
+     1,
+     1
+   ],
+   "torch_dtype": "float32",
+   "transformers_version": "4.30.0.dev0",
+   "use_weighted_layer_sum": false,
+   "vocab_size": 154,
+   "xvector_output_dim": 512
+ }
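
The `conv_kernel` and `conv_stride` arrays in the config above determine how many output frames the feature encoder produces per second of audio. The sketch below works this out, assuming the seven convolutions are applied without padding (as in the standard Wav2Vec2 feature encoder) and that the input is sampled at 16,000 Hz as stated in the README; the function name `num_output_frames` is a hypothetical helper.

```python
import math

# Values copied from config.json in this commit
CONV_KERNEL = [10, 3, 3, 3, 3, 2, 2]
CONV_STRIDE = [5, 2, 2, 2, 2, 2, 2]
SAMPLING_RATE_HZ = 16_000


def num_output_frames(num_samples, kernels=CONV_KERNEL, strides=CONV_STRIDE):
    """Output length of a stack of unpadded 1-D convolutions."""
    length = num_samples
    for kernel, stride in zip(kernels, strides):
        # Standard formula for a convolution without padding or dilation
        length = (length - kernel) // stride + 1
    return length


total_stride = math.prod(CONV_STRIDE)                  # samples between frames
frame_shift_ms = 1000 * total_stride / SAMPLING_RATE_HZ
print(num_output_frames(SAMPLING_RATE_HZ), total_stride, frame_shift_ms)
```

Under these assumptions, one second of audio yields 49 frames spaced 20 ms apart, and each frame produces one row of the logits that are passed to `argmax` in the README example.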
create_adapters.py ADDED
@@ -0,0 +1,18 @@
+ #!/usr/bin/env python3
+ import torch
+ from safetensors.torch import save_file as safe_save_file
+ from transformers.models.wav2vec2.convert_wav2vec2_original_pytorch_checkpoint_to_pytorch import load_wav2vec2_layer
+
+ langs = ["abi", "abk", "abp", "aca", "acd", "ace", "acf", "ach", "acn", "acr", "acu", "ade", "adh", "adj", "adx", "aeu", "afr", "agd", "agg", "agn", "agr", "agu", "agx", "aha", "ahk", "aia", "aka", "akb", "ake", "akp", "alj", "alp", "alt", "alz", "ame", "amf", "amh", "ami", "amk", "ann", "any", "aoz", "apb", "apr", "ara", "arl", "asa", "asg", "asm", "ast", "ata", "atb", "atg", "ati", "atq", "ava", "avn", "avu", "awa", "awb", "ayo", "ayr", "ayz", "azb", "azg", "azj-script_cyrillic", "azj-script_latin", "azz", "bak", "bam", "ban", "bao", "bas", "bav", "bba", "bbb", "bbc", "bbo", "bcc-script_arabic", "bcc-script_latin", "bcl", "bcw", "bdg", "bdh", "bdq", "bdu", "bdv", "beh", "bel", "bem", "ben", "bep", "bex", "bfa", "bfo", "bfy", "bfz", "bgc", "bgq", "bgr", "bgt", "bgw", "bha", "bht", "bhz", "bib", "bim", "bis", "biv", "bjr", "bjv", "bjw", "bjz", "bkd", "bkv", "blh", "blt", "blx", "blz", "bmq", "bmr", "bmu", "bmv", "bng", "bno", "bnp", "boa", "bod", "boj", "bom", "bor", "bos", "bov", "box", "bpr", "bps", "bqc", "bqi", "bqj", "bqp", "bre", "bru", "bsc", "bsq", "bss", "btd", "bts", "btt", "btx", "bud", "bul", "bus", "bvc", "bvz", "bwq", "bwu", "byr", "bzh", "bzi", "bzj", "caa", "cab", "cac-dialect_sanmateoixtatan", "cac-dialect_sansebastiancoatan", "cak-dialect_central", "cak-dialect_santamariadejesus", "cak-dialect_santodomingoxenacoj", "cak-dialect_southcentral", "cak-dialect_western", "cak-dialect_yepocapa", "cap", "car", "cas", "cat", "cax", "cbc", "cbi", "cbr", "cbs", "cbt", "cbu", "cbv", "cce", "cco", "cdj", "ceb", "ceg", "cek", "ces", "cfm", "cgc", "che", "chf", "chv", "chz", "cjo", "cjp", "cjs", "ckb", "cko", "ckt", "cla", "cle", "cly", "cme", "cmn-script_simplified", "cmo-script_khmer", "cmo-script_latin", "cmr", "cnh", "cni", "cnl", "cnt", "coe", "cof", "cok", "con", "cot", "cou", "cpa", "cpb", "cpu", "crh", "crk-script_latin", "crk-script_syllabics", "crn", "crq", "crs", "crt", "csk", "cso", "ctd", "ctg", "cto", "ctu", "cuc", "cui", "cuk", "cul", "cwa", 
"cwe", "cwt", "cya", "cym", "daa", "dah", "dan", "dar", "dbj", "dbq", "ddn", "ded", "des", "deu", "dga", "dgi", "dgk", "dgo", "dgr", "dhi", "did", "dig", "dik", "dip", "div", "djk", "dnj-dialect_blowowest", "dnj-dialect_gweetaawueast", "dnt", "dnw", "dop", "dos", "dsh", "dso", "dtp", "dts", "dug", "dwr", "dyi", "dyo", "dyu", "dzo", "eip", "eka", "ell", "emp", "enb", "eng", "enx", "epo", "ese", "ess", "est", "eus", "evn", "ewe", "eza", "fal", "fao", "far", "fas", "fij", "fin", "flr", "fmu", "fon", "fra", "frd", "fry", "ful", "gag-script_cyrillic", "gag-script_latin", "gai", "gam", "gau", "gbi", "gbk", "gbm", "gbo", "gde", "geb", "gej", "gil", "gjn", "gkn", "gld", "gle", "glg", "glk", "gmv", "gna", "gnd", "gng", "gof-script_latin", "gog", "gor", "gqr", "grc", "gri", "grn", "grt", "gso", "gub", "guc", "gud", "guh", "guj", "guk", "gum", "guo", "guq", "guu", "gux", "gvc", "gvl", "gwi", "gwr", "gym", "gyr", "had", "hag", "hak", "hap", "hat", "hau", "hay", "heb", "heh", "hif", "hig", "hil", "hin", "hlb", "hlt", "hne", "hnn", "hns", "hoc", "hoy", "hrv", "hsb", "hto", "hub", "hui", "hun", "hus-dialect_centralveracruz", "hus-dialect_westernpotosino", "huu", "huv", "hvn", "hwc", "hye", "hyw", "iba", "ibo", "icr", "idd", "ifa", "ifb", "ife", "ifk", "ifu", "ify", "ign", "ikk", "ilb", "ilo", "imo", "ina", "inb", "ind", "iou", "ipi", "iqw", "iri", "irk", "isl", "ita", "itl", "itv", "ixl-dialect_sangasparchajul", "ixl-dialect_sanjuancotzal", "ixl-dialect_santamarianebaj", "izr", "izz", "jac", "jam", "jav", "jbu", "jen", "jic", "jiv", "jmc", "jmd", "jpn", "jun", "juy", "jvn", "kaa", "kab", "kac", "kak", "kam", "kan", "kao", "kaq", "kat", "kay", "kaz", "kbo", "kbp", "kbq", "kbr", "kby", "kca", "kcg", "kdc", "kde", "kdh", "kdi", "kdj", "kdl", "kdn", "kdt", "kea", "kek", "ken", "keo", "ker", "key", "kez", "kfb", "kff-script_telugu", "kfw", "kfx", "khg", "khm", "khq", "kia", "kij", "kik", "kin", "kir", "kjb", "kje", "kjg", "kjh", "kki", "kkj", "kle", "klu", "klv", "klw", "kma", "kmd", 
"kml", "kmr-script_arabic", "kmr-script_cyrillic", "kmr-script_latin", "kmu", "knb", "kne", "knf", "knj", "knk", "kno", "kog", "kor", "kpq", "kps", "kpv", "kpy", "kpz", "kqe", "kqp", "kqr", "kqy", "krc", "kri", "krj", "krl", "krr", "krs", "kru", "ksb", "ksr", "kss", "ktb", "ktj", "kub", "kue", "kum", "kus", "kvn", "kvw", "kwd", "kwf", "kwi", "kxc", "kxf", "kxm", "kxv", "kyb", "kyc", "kyf", "kyg", "kyo", "kyq", "kyu", "kyz", "kzf", "lac", "laj", "lam", "lao", "las", "lat", "lav", "law", "lbj", "lbw", "lcp", "lee", "lef", "lem", "lew", "lex", "lgg", "lgl", "lhu", "lia", "lid", "lif", "lin", "lip", "lis", "lit", "lje", "ljp", "llg", "lln", "lme", "lnd", "lns", "lob", "lok", "lom", "lon", "loq", "lsi", "lsm", "ltz", "luc", "lug", "luo", "lwo", "lww", "lzz", "maa-dialect_sanantonio", "maa-dialect_sanjeronimo", "mad", "mag", "mah", "mai", "maj", "mak", "mal", "mam-dialect_central", "mam-dialect_northern", "mam-dialect_southern", "mam-dialect_western", "maq", "mar", "maw", "maz", "mbb", "mbc", "mbh", "mbj", "mbt", "mbu", "mbz", "mca", "mcb", "mcd", "mco", "mcp", "mcq", "mcu", "mda", "mdf", "mdv", "mdy", "med", "mee", "mej", "men", "meq", "met", "mev", "mfe", "mfh", "mfi", "mfk", "mfq", "mfy", "mfz", "mgd", "mge", "mgh", "mgo", "mhi", "mhr", "mhu", "mhx", "mhy", "mib", "mie", "mif", "mih", "mil", "mim", "min", "mio", "mip", "miq", "mit", "miy", "miz", "mjl", "mjv", "mkd", "mkl", "mkn", "mlg", "mlt", "mmg", "mnb", "mnf", "mnk", "mnw", "mnx", "moa", "mog", "mon", "mop", "mor", "mos", "mox", "moz", "mpg", "mpm", "mpp", "mpx", "mqb", "mqf", "mqj", "mqn", "mri", "mrw", "msy", "mtd", "mtj", "mto", "muh", "mup", "mur", "muv", "muy", "mvp", "mwq", "mwv", "mxb", "mxq", "mxt", "mxv", "mya", "myb", "myk", "myl", "myv", "myx", "myy", "mza", "mzi", "mzj", "mzk", "mzm", "mzw", "nab", "nag", "nan", "nas", "naw", "nca", "nch", "ncj", "ncl", "ncu", "ndj", "ndp", "ndv", "ndy", "ndz", "neb", "new", "nfa", "nfr", "nga", "ngl", "ngp", "ngu", "nhe", "nhi", "nhu", "nhw", "nhx", "nhy", "nia", 
"nij", "nim", "nin", "nko", "nlc", "nld", "nlg", "nlk", "nmz", "nnb", "nno", "nnq", "nnw", "noa", "nob", "nod", "nog", "not", "npi", "npl", "npy", "nso", "nst", "nsu", "ntm", "ntr", "nuj", "nus", "nuz", "nwb", "nxq", "nya", "nyf", "nyn", "nyo", "nyy", "nzi", "obo", "oci", "ojb-script_latin", "ojb-script_syllabics", "oku", "old", "omw", "onb", "ood", "orm", "ory", "oss", "ote", "otq", "ozm", "pab", "pad", "pag", "pam", "pan", "pao", "pap", "pau", "pbb", "pbc", "pbi", "pce", "pcm", "peg", "pez", "pib", "pil", "pir", "pis", "pjt", "pkb", "pls", "plw", "pmf", "pny", "poh-dialect_eastern", "poh-dialect_western", "poi", "pol", "por", "poy", "ppk", "pps", "prf", "prk", "prt", "pse", "pss", "ptu", "pui", "pus", "pwg", "pww", "pxm", "qub", "quc-dialect_central", "quc-dialect_east", "quc-dialect_north", "quf", "quh", "qul", "quw", "quy", "quz", "qvc", "qve", "qvh", "qvm", "qvn", "qvo", "qvs", "qvw", "qvz", "qwh", "qxh", "qxl", "qxn", "qxo", "qxr", "rah", "rai", "rap", "rav", "raw", "rej", "rel", "rgu", "rhg", "rif-script_arabic", "rif-script_latin", "ril", "rim", "rjs", "rkt", "rmc-script_cyrillic", "rmc-script_latin", "rmo", "rmy-script_cyrillic", "rmy-script_latin", "rng", "rnl", "roh-dialect_sursilv", "roh-dialect_vallader", "rol", "ron", "rop", "rro", "rub", "ruf", "rug", "run", "rus", "sab", "sag", "sah", "saj", "saq", "sas", "sat", "sba", "sbd", "sbl", "sbp", "sch", "sck", "sda", "sea", "seh", "ses", "sey", "sgb", "sgj", "sgw", "shi", "shk", "shn", "sho", "shp", "sid", "sig", "sil", "sja", "sjm", "sld", "slk", "slu", "slv", "sml", "smo", "sna", "snd", "sne", "snn", "snp", "snw", "som", "soy", "spa", "spp", "spy", "sqi", "sri", "srm", "srn", "srp-script_cyrillic", "srp-script_latin", "srx", "stn", "stp", "suc", "suk", "sun", "sur", "sus", "suv", "suz", "swe", "swh", "sxb", "sxn", "sya", "syl", "sza", "tac", "taj", "tam", "tao", "tap", "taq", "tat", "tav", "tbc", "tbg", "tbk", "tbl", "tby", "tbz", "tca", "tcc", "tcs", "tcz", "tdj", "ted", "tee", "tel", "tem", "teo", 
"ter", "tes", "tew", "tex", "tfr", "tgj", "tgk", "tgl", "tgo", "tgp", "tha", "thk", "thl", "tih", "tik", "tir", "tkr", "tlb", "tlj", "tly", "tmc", "tmf", "tna", "tng", "tnk", "tnn", "tnp", "tnr", "tnt", "tob", "toc", "toh", "tom", "tos", "tpi", "tpm", "tpp", "tpt", "trc", "tri", "trn", "trs", "tso", "tsz", "ttc", "tte", "ttq-script_tifinagh", "tue", "tuf", "tuk-script_arabic", "tuk-script_latin", "tuo", "tur", "tvw", "twb", "twe", "twu", "txa", "txq", "txu", "tye", "tzh-dialect_bachajon", "tzh-dialect_tenejapa", "tzj-dialect_eastern", "tzj-dialect_western", "tzo-dialect_chamula", "tzo-dialect_chenalho", "ubl", "ubu", "udm", "udu", "uig-script_arabic", "uig-script_cyrillic", "ukr", "umb", "unr", "upv", "ura", "urb", "urd-script_arabic", "urd-script_devanagari", "urd-script_latin", "urk", "urt", "ury", "usp", "uzb-script_cyrillic", "uzb-script_latin", "vag", "vid", "vie", "vif", "vmw", "vmy", "vot", "vun", "vut", "wal-script_ethiopic", "wal-script_latin", "wap", "war", "waw", "way", "wba", "wlo", "wlx", "wmw", "wob", "wol", "wsg", "wwa", "xal", "xdy", "xed", "xer", "xho", "xmm", "xnj", "xnr", "xog", "xon", "xrb", "xsb", "xsm", "xsr", "xsu", "xta", "xtd", "xte", "xtm", "xtn", "xua", "xuo", "yaa", "yad", "yal", "yam", "yao", "yas", "yat", "yaz", "yba", "ybb", "ycl", "ycn", "yea", "yka", "yli", "yor", "yre", "yua", "yue-script_traditional", "yuz", "yva", "zaa", "zab", "zac", "zad", "zae", "zai", "zam", "zao", "zaq", "zar", "zas", "zav", "zaw", "zca", "zga", "zim", "ziw", "zlm", "zmz", "zne", "zos", "zpc", "zpg", "zpi", "zpl", "zpm", "zpo", "zpt", "zpu", "zpz", "ztq", "zty", "zul", "zyb", "zyp", "zza"]
7
+
8
+ sd = torch.load("../mms1b_all.pt")
9
+
10
+ for lang in langs:
11
+     hf_dict = {}
12
+     fsq_adapters = sd["adapter"][lang]["model"]
13
+
14
+     for k, v in fsq_adapters.items():
15
+         renamed_adapters = load_wav2vec2_layer(k, v, hf_dict=hf_dict)
16
+
17
+     torch.save(hf_dict, f"./adapter.{lang}.bin")
18
+     safe_save_file(hf_dict, f"./adapter.{lang}.safetensors", metadata={"format": "pt"})
create_vocab.py ADDED
@@ -0,0 +1,37 @@
1
+ #!/usr/bin/env python3
2
+ import os
3
+ import json
4
+ folder_path = "./vocabs"
5
+
6
+ all_dict = {}
7
+
8
+ def parse_file(filename):
9
+     dictionary = {
10
+         "</s>": 2,
11
+         "<pad>": 0,
12
+         "<s>": 1,
13
+         "<unk>": 3,
14
+     }
15
+     value = 4  # first free id after the four special tokens
16
+
17
+     with open(filename, 'r', encoding='utf-8') as file:
18
+         for line in file:
19
+             line = line.strip().split()
20
+             if line:
21
+                 key = line[0]  # the token is the first whitespace-separated field
22
+                 dictionary[key] = value
23
+                 value += 1
24
+
25
+     return dictionary
26
+
27
+ for filename in os.listdir(folder_path):
28
+     filepath = os.path.join(folder_path, filename)
29
+     lang = filename.split(".")[0]
30
+     if os.path.isfile(filepath):
31
+         all_dict[lang] = parse_file(filepath)
32
+
33
+
34
+ output_path = "vocab.json"  # Replace "vocab.json" with the desired output file path
35
+
36
+ with open(output_path, 'w', encoding='utf-8') as output_file:
37
+     json.dump(all_dict, output_file, indent=4, sort_keys=True)
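The id assignment performed by `parse_file` can be illustrated without touching the file system. The following is a minimal sketch, assuming each dictionary line holds a token followed by other whitespace-separated fields; the sample lines and the names `build_vocab` and `SPECIAL_TOKENS` are hypothetical and not part of the commit.

```python
import json

# Same four special tokens and ids as in create_vocab.py.
SPECIAL_TOKENS = {"<pad>": 0, "<s>": 1, "</s>": 2, "<unk>": 3}


def build_vocab(lines):
    """Assign consecutive ids to the first field of every non-empty line."""
    vocab = dict(SPECIAL_TOKENS)
    next_id = len(SPECIAL_TOKENS)  # ids 0..3 are taken by the special tokens
    for line in lines:
        fields = line.strip().split()
        if fields:  # skip blank lines
            vocab[fields[0]] = next_id
            next_id += 1
    return vocab


# Hypothetical dictionary content: a token, then a count that is ignored.
sample_lines = ["a 100\n", "b 50\n", "\n", "| 10\n"]
vocab = build_vocab(sample_lines)
print(json.dumps(vocab, indent=4, sort_keys=True))
```

As in the script, a token that appears on two lines keeps a single entry but takes the id from its last occurrence, and the id counter still advances for both lines.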
download_vocabs.sh ADDED
@@ -0,0 +1,10 @@
1
+ #!/bin/bash
2
+ langs=(abi abk abp aca acd ace acf ach acn acr acu ade adh adj adx aeu afr agd agg agn agr agu agx aha ahk aia aka akb ake akp alj alp alt alz ame amf amh ami amk ann any aoz apb apr ara arl asa asg asm ast ata atb atg ati atq ava avn avu awa awb ayo ayr ayz azb azg azj-script_cyrillic azj-script_latin azz bak bam ban bao bas bav bba bbb bbc bbo bcc-script_arabic bcc-script_latin bcl bcw bdg bdh bdq bdu bdv beh bel bem ben bep bex bfa bfo bfy bfz bgc bgq bgr bgt bgw bha bht bhz bib bim bis biv bjr bjv bjw bjz bkd bkv blh blt blx blz bmq bmr bmu bmv bng bno bnp boa bod boj bom bor bos bov box bpr bps bqc bqi bqj bqp bre bru bsc bsq bss btd bts btt btx bud bul bus bvc bvz bwq bwu byr bzh bzi bzj caa cab cac-dialect_sanmateoixtatan cac-dialect_sansebastiancoatan cak-dialect_central cak-dialect_santamariadejesus cak-dialect_santodomingoxenacoj cak-dialect_southcentral cak-dialect_western cak-dialect_yepocapa cap car cas cat cax cbc cbi cbr cbs cbt cbu cbv cce cco cdj ceb ceg cek ces cfm cgc che chf chv chz cjo cjp cjs ckb cko ckt cla cle cly cme cmn-script_simplified cmo-script_khmer cmo-script_latin cmr cnh cni cnl cnt coe cof cok con cot cou cpa cpb cpu crh crk-script_latin crk-script_syllabics crn crq crs crt csk cso ctd ctg cto ctu cuc cui cuk cul cwa cwe cwt cya cym daa dah dan dar dbj dbq ddn ded des deu dga dgi dgk dgo dgr dhi did dig dik dip div djk dnj-dialect_blowowest dnj-dialect_gweetaawueast dnt dnw dop dos dsh dso dtp dts dug dwr dyi dyo dyu dzo eip eka ell emp enb eng enx epo ese ess est eus evn ewe eza fal fao far fas fij fin flr fmu fon fra frd fry ful gag-script_cyrillic gag-script_latin gai gam gau gbi gbk gbm gbo gde geb gej gil gjn gkn gld gle glg glk gmv gna gnd gng gof-script_latin gog gor gqr grc gri grn grt gso gub guc gud guh guj guk gum guo guq guu gux gvc gvl gwi gwr gym gyr had hag hak hap hat hau hay heb heh hif hig hil hin hlb hlt hne hnn hns hoc hoy hrv hsb hto hub hui hun hus-dialect_centralveracruz hus-dialect_westernpotosino huu huv 
hvn hwc hye hyw iba ibo icr idd ifa ifb ife ifk ifu ify ign ikk ilb ilo imo ina inb ind iou ipi iqw iri irk isl ita itl itv ixl-dialect_sangasparchajul ixl-dialect_sanjuancotzal ixl-dialect_santamarianebaj izr izz jac jam jav jbu jen jic jiv jmc jmd jpn jun juy jvn kaa kab kac kak kam kan kao kaq kat kay kaz kbo kbp kbq kbr kby kca kcg kdc kde kdh kdi kdj kdl kdn kdt kea kek ken keo ker key kez kfb kff-script_telugu kfw kfx khg khm khq kia kij kik kin kir kjb kje kjg kjh kki kkj kle klu klv klw kma kmd kml kmr-script_arabic kmr-script_cyrillic kmr-script_latin kmu knb kne knf knj knk kno kog kor kpq kps kpv kpy kpz kqe kqp kqr kqy krc kri krj krl krr krs kru ksb ksr kss ktb ktj kub kue kum kus kvn kvw kwd kwf kwi kxc kxf kxm kxv kyb kyc kyf kyg kyo kyq kyu kyz kzf lac laj lam lao las lat lav law lbj lbw lcp lee lef lem lew lex lgg lgl lhu lia lid lif lin lip lis lit lje ljp llg lln lme lnd lns lob lok lom lon loq lsi lsm ltz luc lug luo lwo lww lzz maa-dialect_sanantonio maa-dialect_sanjeronimo mad mag mah mai maj mak mal mam-dialect_central mam-dialect_northern mam-dialect_southern mam-dialect_western maq mar maw maz mbb mbc mbh mbj mbt mbu mbz mca mcb mcd mco mcp mcq mcu mda mdf mdv mdy med mee mej men meq met mev mfe mfh mfi mfk mfq mfy mfz mgd mge mgh mgo mhi mhr mhu mhx mhy mib mie mif mih mil mim min mio mip miq mit miy miz mjl mjv mkd mkl mkn mlg mlt mmg mnb mnf mnk mnw mnx moa mog mon mop mor mos mox moz mpg mpm mpp mpx mqb mqf mqj mqn mri mrw msy mtd mtj mto muh mup mur muv muy mvp mwq mwv mxb mxq mxt mxv mya myb myk myl myv myx myy mza mzi mzj mzk mzm mzw nab nag nan nas naw nca nch ncj ncl ncu ndj ndp ndv ndy ndz neb new nfa nfr nga ngl ngp ngu nhe nhi nhu nhw nhx nhy nia nij nim nin nko nlc nld nlg nlk nmz nnb nno nnq nnw noa nob nod nog not npi npl npy nso nst nsu ntm ntr nuj nus nuz nwb nxq nya nyf nyn nyo nyy nzi obo oci ojb-script_latin ojb-script_syllabics oku old omw onb ood orm ory oss ote otq ozm pab pad pag pam pan pao pap pau pbb pbc pbi pce 
pcm peg pez pib pil pir pis pjt pkb pls plw pmf pny poh-dialect_eastern poh-dialect_western poi pol por poy ppk pps prf prk prt pse pss ptu pui pus pwg pww pxm qub quc-dialect_central quc-dialect_east quc-dialect_north quf quh qul quw quy quz qvc qve qvh qvm qvn qvo qvs qvw qvz qwh qxh qxl qxn qxo qxr rah rai rap rav raw rej rel rgu rhg rif-script_arabic rif-script_latin ril rim rjs rkt rmc-script_cyrillic rmc-script_latin rmo rmy-script_cyrillic rmy-script_latin rng rnl roh-dialect_sursilv roh-dialect_vallader rol ron rop rro rub ruf rug run rus sab sag sah saj saq sas sat sba sbd sbl sbp sch sck sda sea seh ses sey sgb sgj sgw shi shk shn sho shp sid sig sil sja sjm sld slk slu slv sml smo sna snd sne snn snp snw som soy spa spp spy sqi sri srm srn srp-script_cyrillic srp-script_latin srx stn stp suc suk sun sur sus suv suz swe swh sxb sxn sya syl sza tac taj tam tao tap taq tat tav tbc tbg tbk tbl tby tbz tca tcc tcs tcz tdj ted tee tel tem teo ter tes tew tex tfr tgj tgk tgl tgo tgp tha thk thl tih tik tir tkr tlb tlj tly tmc tmf tna tng tnk tnn tnp tnr tnt tob toc toh tom tos tpi tpm tpp tpt trc tri trn trs tso tsz ttc tte ttq-script_tifinagh tue tuf tuk-script_arabic tuk-script_latin tuo tur tvw twb twe twu txa txq txu tye tzh-dialect_bachajon tzh-dialect_tenejapa tzj-dialect_eastern tzj-dialect_western tzo-dialect_chamula tzo-dialect_chenalho ubl ubu udm udu uig-script_arabic uig-script_cyrillic ukr umb unr upv ura urb urd-script_arabic urd-script_devanagari urd-script_latin urk urt ury usp uzb-script_cyrillic uzb-script_latin vag vid vie vif vmw vmy vot vun vut wal-script_ethiopic wal-script_latin wap war waw way wba wlo wlx wmw wob wol wsg wwa xal xdy xed xer xho xmm xnj xnr xog xon xrb xsb xsm xsr xsu xta xtd xte xtm xtn xua xuo yaa yad yal yam yao yas yat yaz yba ybb ycl ycn yea yka yli yor yre yua yue-script_traditional yuz yva zaa zab zac zad zae zai zam zao zaq zar zas zav zaw zca zga zim ziw zlm zmz zne zos zpc zpg zpi zpl zpm zpo zpt zpu zpz ztq zty 
zul zyb zyp zza)
3
+ link_format="https://dl.fbaipublicfiles.com/mms/asr/dict/mms1b_all/"
4
+ mkdir -p vocabs
5
+ cd vocabs || exit 1
6
+ for lang in "${langs[@]}"; do
7
+     link="${link_format}${lang}.txt"
8
+     echo "wget ${link}"
9
+     wget "${link}"
10
+ done
gitattributes ADDED
@@ -0,0 +1,34 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tflite filter=lfs diff=lfs merge=lfs -text
29
+ *.tgz filter=lfs diff=lfs merge=lfs -text
30
+ *.wasm filter=lfs diff=lfs merge=lfs -text
31
+ *.xz filter=lfs diff=lfs merge=lfs -text
32
+ *.zip filter=lfs diff=lfs merge=lfs -text
33
+ *.zst filter=lfs diff=lfs merge=lfs -text
34
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0f1d95ce43d27e03d5d8dd56c697c805460f967c793dd2cbec2e8e8012deda98
3
+ size 3859521128
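The three lines above are a Git LFS pointer, not the weights themselves. As a rough sketch of how such a pointer can be read, the snippet below parses the `key value` lines and converts the size to GiB; the function name `parse_lfs_pointer` is hypothetical and the parsing is deliberately simplistic.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:0f1d95ce43d27e03d5d8dd56c697c805460f967c793dd2cbec2e8e8012deda98
size 3859521128
"""

pointer = parse_lfs_pointer(pointer_text)
size_gib = pointer["size"] / 2**30
print(f"{pointer['hash_algo']} digest, {size_gib:.2f} GiB")
```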
preprocessor_config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+   "do_normalize": true,
3
+   "feature_extractor_type": "Wav2Vec2FeatureExtractor",
4
+   "feature_size": 1,
5
+   "padding_side": "right",
6
+   "padding_value": 0,
7
+   "processor_class": "Wav2Vec2Processor",
8
+   "return_attention_mask": true,
9
+   "sampling_rate": 16000
10
+ }
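To make the effect of `do_normalize`, `padding_side`, `padding_value` and `return_attention_mask` concrete, here is a conceptual sketch in plain Python. It assumes normalization means zero mean and unit variance per utterance and that padding is appended on the right; it is not the library's implementation, and the helper names and the `eps` value are illustrative choices.

```python
import math


def zero_mean_unit_var(samples, eps=1e-7):
    """Scale one utterance to zero mean and (approximately) unit variance."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    scale = math.sqrt(var + eps)  # eps guards against division by zero on silence
    return [(x - mean) / scale for x in samples]


def pad_right(batch, padding_value=0.0):
    """Right-pad every sequence to the longest one and build an attention mask."""
    longest = max(len(seq) for seq in batch)
    padded = [seq + [padding_value] * (longest - len(seq)) for seq in batch]
    mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in batch]
    return padded, mask


# Two hypothetical utterances of different lengths (16 kHz samples in practice).
batch = [
    zero_mean_unit_var([0.1, 0.3, -0.2, 0.4]),
    zero_mean_unit_var([0.5, -0.5]),
]
padded, attention_mask = pad_right(batch)
print(attention_mask)
```

The mask marks real samples with 1 and padding with 0, so padded positions can be ignored downstream.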
special_tokens_map.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+   "bos_token": "<s>",
3
+   "eos_token": "</s>",
4
+   "pad_token": "<pad>",
5
+   "unk_token": "<unk>"
6
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+   "bos_token": "<s>",
3
+   "clean_up_tokenization_spaces": true,
4
+   "do_lower_case": false,
5
+   "eos_token": "</s>",
6
+   "model_max_length": 1000000000000000019884624838656,
7
+   "pad_token": "<pad>",
8
+   "processor_class": "Wav2Vec2Processor",
9
+   "replace_word_delimiter_char": " ",
10
+   "target_lang": "eng",
11
+   "tokenizer_class": "Wav2Vec2CTCTokenizer",
12
+   "unk_token": "<unk>",
13
+   "word_delimiter_token": "|"
14
+ }
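The roles of `pad_token` and `word_delimiter_token` are easiest to see in a greedy CTC decode. The sketch below is a simplified illustration, assuming the pad token doubles as the CTC blank and that `|` marks word boundaries; the toy vocabulary, the frame ids and the function name `ctc_greedy_decode` are hypothetical, and the real tokenizer does more than this.

```python
from itertools import groupby


def ctc_greedy_decode(token_ids, id_to_token, pad_token="<pad>", word_delimiter_token="|"):
    """Collapse repeated ids, drop blanks, and turn word delimiters into spaces."""
    # Step 1: merge consecutive duplicates (one emitted symbol per run).
    collapsed = [token_id for token_id, _ in groupby(token_ids)]
    # Step 2: map ids to tokens and drop the blank (assumed to be the pad token).
    tokens = [id_to_token[i] for i in collapsed]
    tokens = [t for t in tokens if t != pad_token]
    # Step 3: replace the word delimiter with a space.
    chars = [" " if t == word_delimiter_token else t for t in tokens]
    return "".join(chars).strip()


# Toy vocabulary in the same layout as one language entry of vocab.json.
toy_vocab = {"<pad>": 0, "<s>": 1, "</s>": 2, "<unk>": 3, "|": 4, "h": 5, "i": 6, "o": 7}
id_to_token = {i: t for t, i in toy_vocab.items()}

# Hypothetical per-frame argmax ids from a CTC head.
frame_ids = [5, 5, 0, 6, 6, 4, 4, 0, 7, 0, 7, 7]
text = ctc_greedy_decode(frame_ids, id_to_token)
print(text)  # hi oo
```

The blank between the two runs of `o` is what lets a doubled letter survive the collapse step.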
vocab.json ADDED
The diff for this file is too large to render. See raw diff