sulaimank committed on
Commit
ed26bd3
·
verified ·
1 Parent(s): d0ded5c

Create README.md

Files changed (1)
  1. README.md +69 -0
README.md ADDED
@@ -0,0 +1,69 @@
1
+ ---
2
+ datasets:
3
+ - mozilla-foundation/common_voice_17_0
4
+ language:
5
+ - lg
6
+ base_model:
7
+ - speechbrain/tts-tacotron2-ljspeech
8
+ pipeline_tag: text-to-speech
9
+ metrics:
10
+ - mos
11
+ ---
12
+
13
+ <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
14
+ <br/><br/>
15
+
16
+
17
+ # Text-to-Speech (TTS) with Tacotron2 trained on Luganda CommonVoice
18
+
19
+ This repository provides all the necessary tools for Text-to-Speech (TTS) with SpeechBrain.
20
+
21
+ The pre-trained model takes a short text as input and produces a spectrogram as output. The final waveform is obtained by running a vocoder (e.g., HiFi-GAN) on the generated spectrogram.
22
+
23
+
24
+ ## Install SpeechBrain
25
+
26
+ ```
27
+ pip install speechbrain
28
+ ```
29
+
30
+ We encourage you to read our tutorials and learn more about
31
+ [SpeechBrain](https://speechbrain.github.io).
32
+
33
+ ### Perform Text-to-Speech (TTS)
34
+
35
+ ```python
36
+ import torchaudio
37
+ from speechbrain.inference.TTS import Tacotron2
38
+ from speechbrain.inference.vocoders import HIFIGAN
39
+
40
+ # Initialize TTS (Tacotron2) and vocoder (HiFi-GAN)
41
+ tacotron2 = Tacotron2.from_hparams(source="sulaimank/tacotron2-cv-females", savedir="tmpdir_tts")
42
+ hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
43
+
44
+ # Running the TTS
45
+ mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb")
46
+
47
+ # Running Vocoder (spectrogram-to-waveform)
48
+ waveforms = hifi_gan.decode_batch(mel_output)
49
+
50
+ # Save the waveform
51
+ torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 22050)
52
+ ```
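
As a rough sketch of how the spectrogram length relates to the saved audio: a vocoder emits a fixed number of samples per spectrogram frame, so the frame count returned as `mel_length` gives an estimate of the clip duration. The helper below is hypothetical (it is not part of SpeechBrain), and the hop length of 256 samples per frame is an assumption based on common LJSpeech-style settings; check the hyperparameters of the models you actually load.

```python
# Hypothetical helper for illustration only; not a SpeechBrain API.

SAMPLE_RATE = 22050  # sample rate passed to torchaudio.save in the example above
HOP_LENGTH = 256     # ASSUMPTION: audio samples per spectrogram frame


def estimate_duration_seconds(num_frames, hop_length=HOP_LENGTH, sample_rate=SAMPLE_RATE):
    """Approximate the duration of audio generated from `num_frames` frames."""
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if hop_length <= 0 or sample_rate <= 0:
        raise ValueError("hop_length and sample_rate must be positive")
    # Each frame expands to `hop_length` samples; divide by the rate to get seconds.
    return num_frames * hop_length / sample_rate
```

Under these assumptions, 441 frames correspond to about 5.12 seconds of audio. If `mel_length` holds the frame count of a single utterance, it could be passed in as `estimate_duration_seconds(int(mel_length[0]))`.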
53
+
54
+ To generate several sentences in a single batch, pass a list of texts to `encode_batch`:
55
+
56
+ ```python
57
+ from speechbrain.inference.TTS import Tacotron2
58
+ tacotron2 = Tacotron2.from_hparams(source="sulaimank/tacotron2-cv-females", savedir="tmpdir_tts")
59
+ items = [
60
+     "A quick brown fox jumped over the lazy dog",
61
+     "How much wood would a woodchuck chuck?",
62
+     "Never odd or even"
63
+ ]
64
+ mel_outputs, mel_lengths, alignments = tacotron2.encode_batch(items)
65
+ ```
66
+
67
+ ### Limitations
68
+
69
+ The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.