Respair committed on
Commit
06c2080
·
verified ·
1 Parent(s): eebd102

Update README.md

  1. README.md +22 -0
README.md CHANGED
@@ -17,3 +17,25 @@ The default setting is 44.1khz - 128 Mel bin. if you want to change it to 24khz,
  Huge thanks to [Johnathan Duering](https://github.com/duerig) for his help. I mostly implemented this based on his [StyleTTS2 fork](https://github.com/duerig/StyleTTS2/tree/main).
 
  **This is highly experimental; I have not run a full training session. I have only verified that the loss goes down and the eval samples sound reasonable over ~10K steps of minimal training.**
+
+
+ ## Pre-requisites
+ 1. Python >= 3.10
+ 2. Clone this repository:
+ ```bash
+ git clone https://github.com/Respaired/HiFormer_Vocoder
+ cd HiFormer_Vocoder/Ringformer
+ ```
+ 3. Install Python requirements:
+ ```bash
+ pip install -r requirements.txt
+ ```
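
The Python requirement in step 1 can be checked before installing anything. This is a minimal sketch; the helper name `meets_minimum` is hypothetical and not part of this repository.

```python
import sys

# Hypothetical helper, not part of this repository.
MINIMUM_PYTHON = (3, 10)  # taken from the "Python >= 3.10" prerequisite


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, ...) version satisfies the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Warn early instead of failing later during `pip install`.
if not meets_minimum(sys.version_info):
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} is too old; "
        f"need >= {MINIMUM_PYTHON[0]}.{MINIMUM_PYTHON[1]}"
    )
```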
+
+ ## Training
+ ```bash
+ CUDA_VISIBLE_DEVICES=0,1 accelerate launch train.py --config config_v1.json --[args]
+ ```
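
The launch command above can also be assembled from a script, for example to choose the GPUs programmatically. This is a sketch only: `build_env` and `build_launch_command` are hypothetical helpers, and any further training flags (the `--[args]` placeholder) are left out because they are not listed here. `CUDA_VISIBLE_DEVICES` restricts which GPUs the launched process can see.

```python
import os
import shlex


def build_env(gpu_ids, base_env=None):
    """Copy the environment and expose only the given GPU indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def build_launch_command(config="config_v1.json"):
    """Return the argv list for the training command shown above."""
    return ["accelerate", "launch", "train.py", "--config", config]


env = build_env([0, 1])
cmd = build_launch_command()

# Print the equivalent shell command without starting a training run.
print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} {shlex.join(cmd)}")

# To actually start training (assumes `accelerate` is installed):
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```

Passing `env` to `subprocess.run` rather than mutating `os.environ` keeps the GPU selection local to the training process.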
+ For F0 model training, please refer to [yl4579/PitchExtractor](https://github.com/yl4579/PitchExtractor). This repo includes an F0 model pre-trained on a mixture of multilingual data for the previously mentioned configuration. To quote the HiFTNet author: "Still, you may want to train your own F0 model for the best performance, particularly for noisy or non-speech data, as we found that F0 estimation accuracy is essential for the vocoder performance."
+
+ ## Inference
+ Please refer to the notebook [inference.ipynb](https://github.com/Respaired/HiFormer_Vocoder/blob/main/RingFormer/inference.ipynb) for details.