Create README.md
README.md
ADDED
@@ -0,0 +1,21 @@
---
language:
- en
tags:
- Voice
---

## Model Description
The model uses the Wav2Vec2 architecture, trained on the SUPERB dataset for the keyword spotting task, and
was fine-tuned to identify dental click utterances (https://en.wikipedia.org/wiki/Dental_click) in speech.
It was trained for 10 epochs on a limited quantity of speech (~1.5 hours) from a single speaker.
Thus the model has not been checked for generalizability to other speakers or languages, which would
likely require further training data.

The model was evaluated on a held-out test set of 20% of the available data and scored 97% accuracy.

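For readers unfamiliar with hold-out evaluation, the sketch below shows one way a 20% test split and an accuracy score could be computed. It is only an illustration: the helper names `holdout_split` and `accuracy` are hypothetical, and the random shuffle with a fixed seed is an assumption, since the card does not say how the split was actually drawn.

```python
import random


def holdout_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split off test_fraction as a held-out test set.

    The random shuffle and the seed are assumptions for illustration only.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # First n_test shuffled items form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, labels):
    """Return the fraction of predictions that equal the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

Because all the data comes from one speaker, a score computed this way measures performance on that speaker only.
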
## Uses
The model can be used via the transformers library or via the Hugging Face hosted Inference API to the
right. I would caution against the 'Record from browser' option, as the model may erroneously identify the
user's mouse click as a speech utterance. Audio files for upload should be 1 second long, in WAV format
with 16-bit signed integer PCM encoding.
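
As a rough sketch of the requirements above, the snippet below checks a file against the stated format (16-bit PCM WAV, about 1 second) using Python's standard `wave` module, then runs it through the transformers `audio-classification` pipeline. The helper names, the 0.05 s duration tolerance, and the `model_id` argument are assumptions for illustration; substitute this repository's model ID. Decoding a file path in the pipeline may also require `ffmpeg` to be installed.

```python
import wave

EXPECTED_SAMPLE_WIDTH = 2   # bytes per sample; 16-bit WAV PCM is signed
EXPECTED_SECONDS = 1.0
TOLERANCE_SECONDS = 0.05    # hypothetical slack, not taken from the model card


def check_wav(path):
    """Return a list of format problems; an empty list means the file looks fine.

    wave.open raises wave.Error for files that are not PCM WAV.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()
        if width != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {8 * width} bit, expected 16 bit")
        duration = wav.getnframes() / wav.getframerate()
        if abs(duration - EXPECTED_SECONDS) > TOLERANCE_SECONDS:
            problems.append(f"duration is {duration:.2f} s, expected about 1 s")
    return problems


def classify(path, model_id):
    """Classify one audio file; model_id is a placeholder for this repo's ID."""
    # Imported lazily so check_wav works without transformers installed.
    from transformers import pipeline

    problems = check_wav(path)
    if problems:
        raise ValueError("; ".join(problems))
    classifier = pipeline("audio-classification", model=model_id)
    # Returns a list of {"score": ..., "label": ...} dicts.
    return classifier(path)
```

Validating the file first gives a clear error message for clips of the wrong length or bit depth instead of a silent misclassification.
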