Update README.md
README.md
CHANGED
@@ -17,8 +17,12 @@ This model predicts prosodic units on speech.
 For each 20ms frame the model predicts 1 or 0, indicating whether there is a prosodic unit in
 this frame or not.

-
@@ -31,7 +35,7 @@ this frame or not.
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

-- **Developed by:** Peter Rupnik, Nikola
 - **Funded by:** MEZZANINE project
 - **Model type:** Wav2Vec2Bert for Audio Frame Classification
 - **Language(s) (NLP):** Trained and tested on Slovenian, ATM unclear if usable cross-lingually
@@ -259,7 +263,6 @@ final_intervals = merge_events(ds["prosodic_units"], ds["chunk_centroid_s"])
 print(final_intervals)
 # Outputs: [[3.14, 4.96], [5.6, 8.4], [8.62, 9.32], [10.12, 10.7], [11.72, 13.1],....
 ```
-## Bias, Risks, and Limitations

 ## Training Details
@@ -17,8 +17,12 @@ This model predicts prosodic units on speech.
 For each 20ms frame the model predicts 1 or 0, indicating whether there is a prosodic unit in
 this frame or not.

+This frame-level output can be grouped into events with the frames_to_intervals function provided in the
+code snippets below.

+The model is known to be unreliable if the audio starts or ends within a prosodic unit. This can be partly
+mitigated by 1) using the largest chunks that fit on your machine and 2) using overlapping chunks
+and combining their results.
@@ -31,7 +35,7 @@ this frame or not.
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

+- **Developed by:** Peter Rupnik, Nikola Ljubešić, Darinka Verdonik, Simona Majhenič
 - **Funded by:** MEZZANINE project
 - **Model type:** Wav2Vec2Bert for Audio Frame Classification
 - **Language(s) (NLP):** Trained and tested on Slovenian, ATM unclear if usable cross-lingually
@@ -259,7 +263,6 @@ final_intervals = merge_events(ds["prosodic_units"], ds["chunk_centroid_s"])
 print(final_intervals)
 # Outputs: [[3.14, 4.96], [5.6, 8.4], [8.62, 9.32], [10.12, 10.7], [11.72, 13.1],....
 ```

 ## Training Details
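The text added in this commit describes two steps: grouping per-frame 0/1 predictions into events, and combining results from overlapping chunks. The sketch below illustrates one possible reading of both. It is not the README's own `frames_to_intervals` or `merge_events`, which this diff does not show. The names `group_frames` and `union_intervals` are hypothetical, and taking a plain union of intervals is only an assumed way to combine overlapping chunks. The 20 ms frame length comes from the model card.

```python
from typing import List, Sequence, Tuple

FRAME_S = 0.02  # 20 ms per frame, as stated in the model card

Interval = Tuple[float, float]


def group_frames(
    frames: Sequence[int], offset_s: float = 0.0, frame_s: float = FRAME_S
) -> List[Interval]:
    """Turn runs of consecutive 1-frames into (start_s, end_s) intervals.

    offset_s is the start time of the chunk within the full recording
    (hypothetical parameter, used when audio is processed in chunks).
    """
    intervals: List[Interval] = []
    start = None
    for i, label in enumerate(frames):
        if label == 1 and start is None:
            start = i  # a run of 1-frames begins here
        elif label != 1 and start is not None:
            intervals.append(
                (round(offset_s + start * frame_s, 3), round(offset_s + i * frame_s, 3))
            )
            start = None
    if start is not None:  # the run extends to the end of the chunk
        intervals.append(
            (
                round(offset_s + start * frame_s, 3),
                round(offset_s + len(frames) * frame_s, 3),
            )
        )
    return intervals


def union_intervals(intervals: Sequence[Interval]) -> List[Interval]:
    """Merge intervals that overlap or touch (assumed combination strategy)."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Two overlapping chunks: the second starts 0.1 s into the recording.
chunk_a = group_frames([0, 1, 1, 1, 0, 0, 1, 1])
chunk_b = group_frames([1, 1, 1, 0], offset_s=0.1)
combined = union_intervals(chunk_a + chunk_b)
```

Because the model is reported to be unreliable near chunk boundaries, a plain union can carry boundary errors into the result. Discarding predictions near each chunk edge before merging may work better, but this diff does not say which strategy the authors intend.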