hexgrad/Kokoro-82M

hexgrad committed 18 days ago

Commit aab4d9f · verified · 1 Parent(s): aa89b69

Upload README.md

Files changed (1)

README.md +4 -0

@@ -37,6 +37,10 @@ Voices are listed in [VOICES.md](https://huggingface.co/hexgrad/Kokoro-82M/blob/
 Support for non-English languages may be absent or thin due to weak G2P and/or lack of training data. Some languages are only represented by a small handful or even just one voice (French).
+Most voices perform best in a "goldilocks range" of, say, 50-150 tokens out of a possible ~500. Performance may degrade at the extremes:
+- **Weakness** on short utterances (especially fewer than 10-20 tokens). The root cause could be a lack of short-utterance training data and/or the model architecture. One possible inference-time mitigation is to bundle shorter utterances together.
+- **Rushing** on long utterances (over 400 tokens). To mitigate this, you can chunk the text into shorter utterances or adjust the `speed` parameter.
 ### Usage
 The following can be run in a single cell on [Google Colab](https://colab.research.google.com/).
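
The bundling and chunking mitigations described in the added lines could be sketched roughly as follows. This is a minimal illustration, not part of the Kokoro codebase: `split_sentences`, `word_count`, and `pack_utterances` are hypothetical helpers, and the whitespace word count is only a stand-in for the model's real token count, which would need to come from the actual tokenizer or G2P step.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_count(text: str) -> int:
    # ASSUMPTION: whitespace words as a crude proxy for model tokens.
    return len(text.split())


def pack_utterances(
    sentences: List[str],
    max_tokens: int = 150,
    count: Callable[[str], int] = word_count,
) -> List[str]:
    """Greedily bundle consecutive sentences into chunks of at most max_tokens.

    Short sentences get merged with their neighbours; a single sentence that
    already exceeds max_tokens is kept whole as its own chunk (not split).
    """
    chunks: List[str] = []
    current: List[str] = []
    current_n = 0
    for sentence in sentences:
        n = count(sentence)
        # Flush the running chunk if adding this sentence would overflow it.
        if current and current_n + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_n = [], 0
        current.append(sentence)
        current_n += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk would then be synthesized separately. The 150-token default only mirrors the upper end of the range mentioned above and is a starting point for experimentation, not a tuned value.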