metadata
license: cc-by-4.0
extra_gated_prompt: >-
  You agree to abide by the terms of the CC-BY 4.0 license and provide accurate
  information about your intended use of this model.
extra_gated_fields:
  Full Name: text
  Organization (if applicable): text
  Country: country
  IP Location: ip_location
  Intended Use Case:
    type: select
    options:
      - Research
      - Education
      - Accessibility (e.g., assistive technology)
      - Creative Projects (e.g., audiobooks, podcasts)
      - Commercial
      - label: Other (please specify)
        value: other
  Please describe how you intend to use this model: text
  Do you plan to use this model for commercial purposes?:
    type: select
    options:
      - 'Yes'
      - 'No'
  I agree to comply with the terms of the CC-BY license when using this model: checkbox
  I confirm that the information provided above is complete and accurate: checkbox

⚠️ WORK IN PROGRESS: This model is still in early training stages. Current checkpoints produce low-quality, garbled speech. Updated checkpoints will be released as training progresses.

OpenF5-TTS

A commercial-friendly version of F5-TTS, retrained from scratch on permissively licensed data.

Trained on the Emilia-YODAS (CC-BY) dataset using the F5-TTS Small configuration.

GitHub Repository + Details

Usage

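Inference requires three files from this repository: the model config, the checkpoint, and the vocabulary file. Install F5-TTS, download the files, and pass them to the inference CLI: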

pip install f5-tts
huggingface-cli download mrfakename/OpenF5-Intermediate --local-dir openf5
f5-tts_infer-cli -mc openf5/model_config.yaml -p openf5/model_last.pt -v openf5/vocab.txt
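
If the inference step is driven from a script, it can help to confirm that all three files are present before launching the CLI. The following is a minimal sketch, assuming the files sit in one directory under the names used in the commands above. The helper name `build_infer_command` is hypothetical and not part of F5-TTS; it only checks that the files exist and assembles the same `-mc`, `-p`, and `-v` arguments.

```python
from pathlib import Path

# CLI flags and file names copied from the commands above.
REQUIRED_FILES = {
    "-mc": "model_config.yaml",  # model configuration
    "-p": "model_last.pt",       # checkpoint
    "-v": "vocab.txt",           # vocabulary
}


def build_infer_command(model_dir):
    """Return the f5-tts_infer-cli argument list for a downloaded model directory.

    Hypothetical helper for illustration: raises FileNotFoundError if any of
    the three required files is missing from model_dir.
    """
    model_dir = Path(model_dir)
    missing = [
        name for name in REQUIRED_FILES.values()
        if not (model_dir / name).is_file()
    ]
    if missing:
        raise FileNotFoundError(f"missing in {model_dir}: {', '.join(missing)}")

    cmd = ["f5-tts_infer-cli"]
    for flag, name in REQUIRED_FILES.items():
        cmd += [flag, str(model_dir / name)]
    return cmd
```

Passing the returned list to `subprocess.run(build_infer_command("openf5"), check=True)` would launch the same command as above. Failing early on a missing file tends to give a clearer error than a failed model load.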

Training Progress

The audio samples below were generated at successive training steps, newest first. All samples use the same text:

I don't really care what you call me. I've been a silent spectator, watching species evolve, empires rise and fall. But always remember, I am mighty and enduring.

Reference audio:

~600K steps (Current Checkpoint)

~550K steps

~500K steps

~450K steps

~400K steps

~350K steps

~300K steps

~250K steps

~200K steps

~150K steps

(Starting to hear traces of the expected text!)

~100K steps

~75K steps

~50K steps

~25K steps

Support

Discussions are disabled until training is complete. For issues, please open a GitHub Issue in the repository.

License

  • Model: CC-BY 4.0 - Free for commercial use
  • Scripts: MIT License

Note: No restrictions are placed on usage of the outputs of the model. While attribution is appreciated, it is not required for outputs of the model.

THE MODEL IS PROVIDED “AS IS” UNDER ITS OPEN LICENSE. THE AUTHORS AND CONTRIBUTORS DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. USERS ARE SOLELY RESPONSIBLE FOR ENSURING COMPLIANCE WITH APPLICABLE COPYRIGHT LAWS, INCLUDING THE USE OF INPUT DATA AND GENERATED OUTPUTS. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES, INCLUDING BUT NOT LIMITED TO DAMAGES RESULTING FROM LOSS OF USE, DATA, OR PROFITS, OR ANY CLAIMS RELATED TO THE MODEL’S OUTPUTS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE, OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.