mrfakename committed · Commit c90e4fc (verified) · 1 Parent(s): 1583e1c

Sync from GitHub repo

This Space is synced from the GitHub repo: https://github.com/SWivid/F5-TTS. Please submit contributions to the Space there.

src/f5_tts/infer/README.md CHANGED
@@ -9,12 +9,12 @@ Currently support **30s for a single** generation, which is the **total length**
  To avoid possible inference failures, make sure you have seen through the following instructions.

  - Use reference audio <12s and leave proper silence space (e.g. 1s) at the end. Otherwise there is a risk of truncating in the middle of word, leading to suboptimal generation.
- - **Uppercased** letters (best with form like K.F.C.) will be uttered letter by letter, and lowercased letters used for common words.
- - Add some spaces (blank: " ") or punctuations (e.g. "," ".") to explicitly introduce some **pauses**.
+ - <ins>Uppercased letters</ins> (best with form like K.F.C.) will be uttered letter by letter, and lowercased letters used for common words.
+ - Add some spaces (blank: " ") or punctuations (e.g. "," ".") <ins>to explicitly introduce some pauses</ins>.
  - If English punctuation marks the end of a sentence, make sure there is a space " " after it. Otherwise not regarded as when chunk.
- - Preprocess **numbers** to Chinese letters if you want to have them read in Chinese, otherwise in English.
- - If the generation output is blank (pure silence), check for **ffmpeg** installation.
- - Try turn off **use_ema** if using an early-stage finetuned checkpoint (which goes just few updates).
+ - <ins>Preprocess numbers</ins> to Chinese letters if you want to have them read in Chinese, otherwise in English.
+ - If the generation output is blank (pure silence), <ins>check for ffmpeg installation</ins>.
+ - Try <ins>turn off `use_ema` if using an early-stage</ins> finetuned checkpoint (which goes just few updates).


  ## Gradio App
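Review note on the punctuation bullet in the hunk above: the README asks for a space after sentence-ending English punctuation so that the text is split into chunks correctly. A minimal sketch of a pre-processing step that could enforce this is shown below. The helper name `ensure_space_after_sentence_end` and the regular expression are assumptions for illustration only and are not part of F5-TTS; the heuristic only acts when the punctuation follows a lowercase letter or digit and is immediately followed by an uppercase letter, so spelled-out forms such as K.F.C. are left untouched.

```python
import re

# Hypothetical helper, not part of F5-TTS: inserts a missing space after
# '.', '!' or '?' when the mark follows a lowercase letter or digit and is
# directly followed by an uppercase letter (a likely sentence boundary).
_MISSING_SPACE = re.compile(r"(?<=[a-z0-9][.!?])(?=[A-Z])")


def ensure_space_after_sentence_end(text: str) -> str:
    """Return text with a space added at likely sentence boundaries."""
    return _MISSING_SPACE.sub(" ", text)


if __name__ == "__main__":
    print(ensure_space_after_sentence_end("Hello world.This is fine!Really?Yes."))
    # Letter-by-letter forms are not split, because each '.' follows an
    # uppercase letter rather than a lowercase one.
    print(ensure_space_after_sentence_end("K.F.C. is tasty"))
```

This is a heuristic rather than a sentence splitter: a sentence that ends in an uppercase abbreviation and runs straight into the next one would still need manual fixing.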
src/f5_tts/train/README.md CHANGED
@@ -51,10 +51,10 @@ Discussion board for Finetuning [#57](https://github.com/SWivid/F5-TTS/discussio

  Gradio UI training/finetuning with `src/f5_tts/train/finetune_gradio.py` see [#143](https://github.com/SWivid/F5-TTS/discussions/143).

- The **`use_ema = True` might be harmful for early-stage finetuned checkpoints** (which goes just few updates, thus ema weights still dominated by pretrained ones), try turn it off (`load_model(..., use_ema=False)`) and see if offer better results.
-
  If use tensorboard as logger, install it first with `pip install tensorboard`.

+ <ins>The `use_ema = True` might be harmful for early-stage finetuned checkpoints</ins> (which goes just few updates, thus ema weights still dominated by pretrained ones), try turn it off with finetune gradio option or `load_model(..., use_ema=False)`, see if offer better results.
+
  ### 3. W&B Logging

  The `wandb/` dir will be created under path you run training/finetuning scripts.
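Review note on the `use_ema` paragraph moved in the hunk above: the claim that EMA weights are "still dominated by pretrained ones" after only a few updates can be checked with a little arithmetic. The sketch below assumes the textbook update `ema = decay * ema + (1 - decay) * weight` with no warm-up schedule; the decay value `0.999` and the function names are hypothetical and are not taken from the F5-TTS code, whose actual EMA settings are not stated in this diff.

```python
def pretrained_fraction(decay: float, num_updates: int) -> float:
    """Share of the EMA weights still contributed by the starting
    (pretrained) weights after num_updates plain EMA steps."""
    return decay ** num_updates


def simulate_ema(pretrained: float, finetuned: float,
                 decay: float, num_updates: int) -> float:
    """Run the plain EMA update on a single scalar weight, holding the
    finetuned weight constant for simplicity."""
    ema = pretrained
    for _ in range(num_updates):
        ema = decay * ema + (1.0 - decay) * finetuned
    return ema


if __name__ == "__main__":
    decay = 0.999  # assumed value, for illustration only
    for steps in (100, 1_000, 10_000):
        share = pretrained_fraction(decay, steps)
        print(f"{steps:>6} updates: pretrained share = {share:.3f}")
```

Under these assumptions, roughly 90% of the EMA weights still come from the pretrained model after 100 updates, which is consistent with the README's suggestion to try `use_ema=False` for checkpoints that have seen only a few updates.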