mah92 committed on
Commit 4526d1a (verified) · 1 Parent(s): 51fb765

Update README.md

Files changed (1)
  1. README.md +2 -82
README.md CHANGED
@@ -20,89 +20,9 @@ You can test this model [here](https://huggingface.co/spaces/k2-fsa/text-to-spee
20
 
21
  Enjoy!
22
 
23
- ## Usage with the Sherpa-onnx repo
24
 
25
- Remember to add metadata to onnx file as in:
26
- https://github.com/k2-fsa/icefall/blob/master/egs/ljspeech/TTS/matcha/export_onnx.py#L174
27
-
28
- ## Usage with the Matcha-TTS repo
29
- 1) In matcha/text/cleaners.py, phonemizer.backend.EspeakBackend part:
30
- ```
31
- language="fa",
32
- ```
33
-
34
- 2) pip install piper-phonemize
35
-
36
- 3) In cleaners.py:
37
-
38
- add below persian_cleaners_piper:
39
- ```
40
- import piper_phonemize
41
- def persian_cleaners_piper(text):
42
-     """Pipeline for Persian text, including abbreviation expansion. + punctuation + stress"""
43
-     #text = convert_to_ascii(text)
44
-     text = lowercase(text)
45
-     text = expand_abbreviations(text)
46
-     phonemes = "".join(piper_phonemize.phonemize_espeak(text=text, voice="fa")[0])
47
-     phonemes = collapse_whitespace(phonemes)
48
-
49
-     # Remove unwanted symbols (e.g., '1')
50
-     unwanted_symbols = {'1', '-'} # Add any other unwanted symbols here
51
-     filtered_phonemes = "".join([char for char in phonemes if char not in unwanted_symbols])
52
-
53
-     return filtered_phonemes
54
- ```
55
-
56
- 4) In matcha/text/cleaners.py change this line to:
57
- ```
58
- intersperse(text_to_sequence(text, ["persian_cleaners_piper"])[0], 0),
59
- ```
60
-
61
- 5) Also set cleaner in configs/data/custom.yaml:
62
- cleaners: [persian_cleaners_piper]
63
-
64
- 6) replace symbols.py by:
65
- ```
66
- def read_tokens():
67
-     tokens = []
68
-     with open("/home/oem/Basir/TTS/Matcha/Matcha-TTS/configs/tokens/tokens_sherpa_with_fa.txt", "r", encoding="utf-8") as f:
69
-         for line in f:
70
-             # Remove the newline character at the end
71
-             line = line.rstrip("\n")
72
-             # Split into token and number, preserving whitespace
73
-             if " " in line:
74
-                 token = line[:line.index(" ")] # Extract everything before the first space
75
-                 if len(token) == 0: # White-space
76
-                     token = ' '
77
-             else:
78
-                 token = line # If there's no space, the entire line is the token
79
-             tokens.append(token)
80
-     return tokens
81
-
82
- symbols = read_tokens()
83
- ```
84
- 7) For possible errors, change save_figure_to_numpy to:
85
- ```
86
- import numpy as np
87
- import matplotlib.pyplot as plt
88
- from PIL import Image
89
- import io
90
-
91
- def save_figure_to_numpy(fig):
92
-     buf = io.BytesIO()
93
-     fig.savefig(buf, format='png', bbox_inches='tight', pad_inches=0)
94
-     buf.seek(0)
95
-     img = Image.open(buf)
96
-     data = np.array(img)
97
-     buf.close()
98
-
99
-     return data
100
- ```
101
-
102
- 8) After exporting to onnx, add sherpa metadata if you want to use the model with sherpa
103
- ```
104
- python3 ./add_sherpa_metadata_to_matcha.py
105
- ```
106
 
107
  ## Training results
108
  ![Training Results](khadijah-22050.png)
 
20
 
21
  Enjoy!
22
 
23
+ ## Training method
24
 
25
+ see: [how_to_train_matcha_tts](https://huggingface.co/mah92/how_to_train_matcha_tts)
26
 
27
  ## Training results
28
  ![Training Results](khadijah-22050.png)