## Data used for training

The model was trained on around 60 hours of Macedonian speech.

In training the model, we used the following data sources:
1. Digital Archive for Ethnological and Anthropological Resources (DAEAR) at the Institute of Ethnology and Anthropology, PMF, UKIM.
2. Audio version of the international journal "EthnoAnthropoZoom" at the Institute of Ethnology and Anthropology, PMF, UKIM.
5. Macedonian version of the Mozilla Common Voice (version 18).

## Model description

This model is a fine-tuned version of the Whisper large-v3 model. During fine-tuning, the encoder was kept frozen and only the decoder was optimized.

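The encoder-frozen setup described above can be sketched in plain PyTorch. This is a minimal illustrative stand-in (a tiny two-layer model, not the actual Whisper architecture or this repo's training code): freeze the encoder's parameters and hand only the decoder's to the optimizer.

```python
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    """Illustrative stand-in for an encoder-decoder model like Whisper."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)  # stands in for the Whisper encoder
        self.decoder = nn.Linear(8, 8)  # stands in for the Whisper decoder

model = TinySeq2Seq()

# Freeze the encoder: its weights receive no gradients during fine-tuning.
for p in model.encoder.parameters():
    p.requires_grad = False

# The optimizer only sees parameters that still require gradients (the decoder).
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

The same effect is achieved in a real fine-tuning run regardless of toolkit: any parameter with `requires_grad = False` is skipped by backpropagation, so only the decoder is updated.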
## Usage

The model is developed using the [SpeechBrain](https://speechbrain.github.io) toolkit. To use it, first install SpeechBrain:
```
pip install speechbrain
```
SpeechBrain relies on the Transformers library, so you need to install it as well:
```
pip install transformers
```

An external module, `custom_interface.py`, provides the predictor class for this HF repository (passed to the loader via `pymodule_file`). We use the `foreign_class` function from `speechbrain.inference.interfaces`, which allows you to load a custom model.

```python
import torch
from speechbrain.inference.interfaces import foreign_class

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
asr_classifier = foreign_class(
    source="Macedonian-ASR/whisper-large-v3-macedonian-asr",
    pymodule_file="custom_interface.py",
    classname="ASR",  # predictor class defined in custom_interface.py
)
asr_classifier = asr_classifier.to(device)
predictions = asr_classifier.classify_file("audio_file.wav", device)
print(predictions)
```

## Training

To fine-tune this model, run:
```
python train.py hyperparams.yaml
```

The `train.py` file contains the functions necessary for training the model, and `hyperparams.yaml` contains the hyperparameters. For more details about training the model, refer to the [SpeechBrain](https://speechbrain.github.io) documentation.
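For orientation, SpeechBrain hyperparams files follow the HyperPyYAML convention, in which `train.py` loads the file with `load_hyperpyyaml` and builds objects from it. A hypothetical excerpt (illustrative only; these keys and values are not taken from this repo's actual `hyperparams.yaml`) might look like:

```yaml
# Hypothetical excerpt, illustrative only -- not this repo's actual hyperparams.yaml
seed: 1234
output_folder: !ref results/whisper-mk/<seed>  # !ref resolves other keys in the file
whisper_hub: openai/whisper-large-v3           # base checkpoint to fine-tune
lr: 0.0001
number_of_epochs: 10
freeze_encoder: True                           # decoder-only fine-tuning, as described above
```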