README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-language:
+language: pa-IN
 datasets:
 - common_voice
 metrics:
@@ -11,23 +11,23 @@ tags:
 - xlsr-fine-tuning-week
 license: apache-2.0
 model-index:
-- name:
+- name: danurahul/wav2vec2-large-xlsr-pa-IN
   results:
   - task:
       name: Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: Common Voice
+      name: Common Voice pa-IN
       type: common_voice
-      args:
+      args: pa-IN
     metrics:
     - name: Test WER
       type: wer
-      value:
+      value: wer_result_on_test #TODO (IMPORTANT): replace {wer_result_on_test} with the WER error rate you achieved on the common_voice test set. It should be in the format XX.XX (don't add the % sign here). **Please** remember to fill out this value after you evaluated your model, so that your model appears on the leaderboard. If you fill out this model card before evaluating your model, please remember to edit the model card afterward to fill in your value
 ---
 
-# Wav2Vec2-Large-XLSR-53-
-Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on
+# Wav2Vec2-Large-XLSR-53-Punjabi
+Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Punjabi using the [Common Voice](https://huggingface.co/datasets/common_voice) dataset.
 When using this model, make sure that your speech input is sampled at 16kHz.
 
 ## Usage
@@ -40,10 +40,10 @@ import torchaudio
 from datasets import load_dataset
 from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
 
-test_dataset = load_dataset("common_voice", "
+test_dataset = load_dataset("common_voice", "pa-IN", split="test[:2%]")
 
-processor = Wav2Vec2Processor.from_pretrained("
-model = Wav2Vec2ForCTC.from_pretrained("
+processor = Wav2Vec2Processor.from_pretrained("danurahul/wav2vec2-large-xlsr-pa-IN")
+model = Wav2Vec2ForCTC.from_pretrained("danurahul/wav2vec2-large-xlsr-pa-IN")
 
 resampler = torchaudio.transforms.Resample(48_000, 16_000)
 
@@ -69,7 +69,7 @@ print("Reference:", test_dataset["sentence"][:2])
 
 ## Evaluation
 
-The model can be evaluated as follows on the
+The model can be evaluated as follows on the Punjabi test data of Common Voice.
 
 
 ```python
@@ -79,13 +79,13 @@ from datasets import load_dataset, load_metric
 from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
 import re
 
-test_dataset = load_dataset("common_voice", "
+test_dataset = load_dataset("common_voice", "pa-IN", split="test")
 
 wer = load_metric("wer")
 
-processor = Wav2Vec2Processor.from_pretrained("
+processor = Wav2Vec2Processor.from_pretrained("danurahul/wav2vec2-large-xlsr-pa-IN")
 
-model = Wav2Vec2ForCTC.from_pretrained("
+model = Wav2Vec2ForCTC.from_pretrained("danurahul/wav2vec2-large-xlsr-pa-IN")
 
 model.to("cuda")
 
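
The `value:` field added above expects a word error rate in XX.XX format, as computed by `load_metric("wer")` in the evaluation script. WER is the word-level edit distance between hypothesis and reference, divided by the number of reference words; a minimal pure-Python sketch of that computation (an illustration, not the actual `datasets` implementation) looks like this:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count.

    Assumes a non-empty, whitespace-tokenized reference.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # DP table: d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words:
print(round(100 * wer("the cat sat on the mat", "the cat sit on mat"), 2))  # 33.33
```

Multiplying by 100 and rounding to two decimals gives the XX.XX number the model card's leaderboard field asks for.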