Sync from GitHub repo
This Space is synced from the GitHub repo: https://github.com/SWivid/F5-TTS. Please submit contributions to the Space there.
src/f5_tts/infer/SHARED.md
CHANGED
@@ -22,6 +22,8 @@
 - [F5-TTS Base @ fi @ AsmoKoskinen](#f5-tts-base--fi--asmokoskinen)
 - [French](#french)
 - [F5-TTS Base @ fr @ RASPIAUDIO](#f5-tts-base--fr--raspiaudio)
+- [German](#german)
+- [F5-TTS Base @ de @ hvoss-techfak](#f5-tts-base--de--hvoss-techfak)
 - [Hindi](#hindi)
 - [F5-TTS Small @ hi @ SPRINGLab](#f5-tts-small--hi--springlab)
 - [Italian](#italian)
@@ -33,8 +35,6 @@
 - [F5-TTS Base @ ru @ HotDro4illa](#f5-tts-base--ru--hotdro4illa)
 - [Spanish](#spanish)
 - [F5-TTS Base @ es @ jpgallegoar](#f5-tts-base--es--jpgallegoar)
-- [German](#german)
-- [F5-TTS Base @ de @ hvoss-techfak](#f5-tts-base--de--hvoss-techfak)


 ## Multilingual
@@ -99,6 +99,16 @@ Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "
 - [Discussion about this training can be found here](https://github.com/SWivid/F5-TTS/issues/434).


+## German
+
+#### F5-TTS Base @ de @ hvoss-techfak
+|Model|🤗Hugging Face|Data (Hours)|Model License|
+|:---:|:------------:|:-----------:|:-------------:|
+|F5-TTS Base|[ckpt & vocab](https://huggingface.co/hvoss-techfak/F5-TTS-German)|[Mozilla Common Voice 19.0](https://commonvoice.mozilla.org/en/datasets) & 800 hours Crowdsourced |cc-by-nc-4.0|
+
+- Finetuned by [@hvoss-techfak](https://github.com/hvoss-techfak)
+
+
 ## Hindi

 #### F5-TTS Small @ hi @ SPRINGLab
@@ -175,12 +185,3 @@ Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "
 |F5-TTS Base|[ckpt & vocab](https://huggingface.co/jpgallegoar/F5-Spanish)|[Voxpopuli](https://huggingface.co/datasets/facebook/voxpopuli) & Crowdsourced & TEDx, 218 hours|cc0-1.0|

 - @jpgallegoar [GitHub repo](https://github.com/jpgallegoar/Spanish-F5), Jupyter Notebook and Gradio usage for Spanish model.
-
-## German
-
-#### F5-TTS Base @ de @ hvoss-techfak
-|Model|🤗Hugging Face|Data (Hours)|Model License|
-|:---:|:------------:|:-----------:|:-------------:|
-|F5-TTS Base|[ckpt & vocab](https://huggingface.co/hvoss-techfak/F5-TTS-German)|[Mozilla Common Voice 19.0](https://commonvoice.mozilla.org/en/datasets) & 800 hours Crowdsourced |cc-by-nc-4.0|
-
-- Finetuned by [@hvoss-techfak](https://github.com/hvoss-techfak)
src/f5_tts/train/finetune_gradio.py
CHANGED
@@ -434,7 +434,7 @@ def start_training(
 fp16 = ""

 cmd = (
-    f
+    f'accelerate launch {fp16} "{file_train}" --exp_name {exp_name}'
     f" --learning_rate {learning_rate}"
     f" --batch_size_per_gpu {batch_size_per_gpu}"
     f" --batch_size_type {batch_size_type}"
@@ -453,7 +453,7 @@ def start_training(
 cmd += " --finetune"

 if file_checkpoint_train != "":
-    cmd += f
+    cmd += f' --pretrain "{file_checkpoint_train}"'

 if tokenizer_file != "":
     cmd += f" --tokenizer_path {tokenizer_file}"
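The two changed lines in `start_training` wrap `file_train` and `file_checkpoint_train` in double quotes. The commit does not state a reason, but a plausible one is that a path containing spaces would otherwise be split into several arguments. The sketch below illustrates that effect under stated assumptions: `build_cmd` is a hypothetical, simplified stand-in for the real command builder (it is not the project's function), the paths are made up, and `shlex.split` is used only to emulate POSIX-style shell tokenization.

```python
import shlex


def build_cmd(file_train: str, exp_name: str, checkpoint: str = "", quote: bool = True) -> str:
    """Hypothetical, simplified stand-in for the command string built in start_training."""
    q = '"' if quote else ""  # toggle the double quotes that the diff introduces
    cmd = f"accelerate launch {q}{file_train}{q} --exp_name {exp_name}"
    if checkpoint != "":
        cmd += f" --pretrain {q}{checkpoint}{q}"
    return cmd


# Illustrative paths containing spaces (assumptions, not taken from the repository).
train_script = "/data/my project/finetune_cli.py"
ckpt = "/data/my ckpts/model.pt"

# Tokenize each command string the way a POSIX shell would.
quoted_args = shlex.split(build_cmd(train_script, "demo", ckpt, quote=True))
unquoted_args = shlex.split(build_cmd(train_script, "demo", ckpt, quote=False))

print(quoted_args)    # each path stays a single argument
print(unquoted_args)  # each path is broken at the space
```

Double quotes of this kind do not protect paths that themselves contain `"` or shell metacharacters such as `$`. If that matters, `shlex.quote` or passing an argument list instead of a single string are common alternatives, though neither is what this diff does.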