k-m-irfan committed on
Commit
15ae496
·
1 Parent(s): f23edb4

Update README.md

Files changed (1)
  1. README.md +125 -82
README.md CHANGED
@@ -1,86 +1,10 @@
1
- Skip to content
2
- k-m-irfan
3
- /
4
- Fastspeech2_HS_Flask_API
5
-
6
- Type / to search
7
-
8
- Code
9
- Issues
10
- Pull requests
11
- Actions
12
- Projects
13
- Wiki
14
- Security
15
- Insights
16
- Settings
17
- Editing README.md in Fastspeech2_HS_Flask_API
18
- BreadcrumbsFastspeech2_HS_Flask_API
19
- /
20
- README.md
21
- in
22
- main
23
-
24
- Edit
25
-
26
- Preview
27
- Indent mode
28
-
29
- Spaces
30
- Indent size
31
-
32
- 4
33
- Line wrap mode
34
-
35
- Soft wrap
36
- Editing README.md file contents
37
- Selection deleted
38
- 1
39
- 2
40
- 3
41
- 4
42
- 5
43
- 6
44
- 7
45
- 8
46
- 9
47
- 10
48
- 11
49
- 12
50
- 13
51
- 14
52
- 15
53
- 16
54
- 17
55
- 18
56
- 19
57
- 20
58
- 21
59
- 22
60
- 23
61
- 24
62
- 25
63
- 26
64
- 27
65
- 28
66
- 29
67
- 30
68
- 31
69
- 32
70
- 33
71
- 34
72
- 35
73
- 36
74
- 37
75
- 38
76
- 39
77
- 40
78
- 41
79
  ---
80
  Model Type: Text to Speech
81
  Supported Languages: Assamese, Bengali, Bodo, Gujarati, Hindi, Kannada, Malayalam, Manipuri, Marathi, Odia, Punjabi, Rajasthani, Tamil, Telugu, Urdu
82
  ---
 
83
  <img src="https://api.visitorbadge.io/api/visitors?path=https://github.com/k-m-irfan/Fastspeech2_HS_Flask_API&label=VISITORS&countColor=%234285f4" align="right"></br>
 
84
  ***Demo: [IITM-TTS Demo](https://iitm-tts.onrender.com) | This may take approximately 30 seconds to load the first time and will go idle after 15 minutes of inactivity.***
85
 
86
  # Fastspeech2_HS_Flask_API
@@ -117,7 +41,126 @@ git clone https://huggingface.co/k-m-irfan/Fastspeech2_HS_Flask_API
117
  ```
118
 
119
  Alternatively, you can download the models from the original repository [Fastspeech2_HS](https://github.com/smtiitm/Fastspeech2_HS)
120
- Use Control + Shift + m to toggle the tab key moving focus. Alternatively, use esc then tab to move to the next interactive element on the page.
121
- No file chosen
122
- Attach files by dragging & dropping, selecting or pasting them.
123
- Editing Fastspeech2_HS_Flask_API/README.md at main · k-m-irfan/Fastspeech2_HS_Flask_API
 
1
  ---
2
  Model Type: Text to Speech
3
  Supported Languages: Assamese, Bengali, Bodo, Gujarati, Hindi, Kannada, Malayalam, Manipuri, Marathi, Odia, Punjabi, Rajasthani, Tamil, Telugu, Urdu
4
  ---
5
+
6
  <img src="https://api.visitorbadge.io/api/visitors?path=https://github.com/k-m-irfan/Fastspeech2_HS_Flask_API&label=VISITORS&countColor=%234285f4" align="right"></br>
7
+
8
  ***Demo: [IITM-TTS Demo](https://iitm-tts.onrender.com) | This may take approximately 30 seconds to load the first time and will go idle after 15 minutes of inactivity.***
9
 
10
  # Fastspeech2_HS_Flask_API
 
41
  ```
42
 
43
  Alternatively, you can download the models from the original repository [Fastspeech2_HS](https://github.com/smtiitm/Fastspeech2_HS)
44
+ and organize the folder structure as shown below. Skip this step if you have already cloned the repository from Hugging Face.
45
+
46
+ ```bash
47
+ models
48
+ ├── hindi
49
+ │   ├── female
50
+ │   └── male
51
+ ├── tamil
52
+ │   ├── female
53
+ │   └── male
54
+ .
55
+ .
56
+ .
57
+ └── marathi
58
+     ├── female
59
+     └── male
60
+ ```
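
The layout above can be checked before the server is started. The following is a minimal sketch, assuming the `models` directory sits in the current working directory and that each language folder is expected to hold `female` and `male` subfolders, as in the tree. The function name `find_missing_voices` is hypothetical and not part of this repository. Some gaps may be legitimate (the sample client notes that the Bodo male voice is yet to be added), so treat the result as a report rather than an error.

```python
from pathlib import Path

# Assumption: voice folder names are taken from the tree shown above.
EXPECTED_VOICES = ("female", "male")


def find_missing_voices(models_root="models"):
    """Return a sorted list of 'language/voice' folders absent under models_root."""
    root = Path(models_root)
    if not root.is_dir():
        raise FileNotFoundError(f"models directory not found: {root}")
    missing = []
    # Visit language folders in name order, ignoring hidden directories.
    lang_dirs = sorted(
        p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")
    )
    for lang_dir in lang_dirs:
        for voice in EXPECTED_VOICES:
            if not (lang_dir / voice).is_dir():
                missing.append(f"{lang_dir.name}/{voice}")
    return missing


if __name__ == "__main__":
    try:
        for entry in find_missing_voices():
            print("missing:", entry)
    except FileNotFoundError as err:
        print(err)
```

An empty result means every language folder found under `models` contains both voice folders.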
61
+
62
+ ### Installation:
63
+
64
+ Create a virtual environment and activate it:
65
+ ```bash
66
+ python3 -m venv tts-hs-hifigan
67
+ source tts-hs-hifigan/bin/activate
68
+ ```
69
+
70
+ Install the required dependencies by running:
71
+ ```bash
72
+ pip install -r requirements.txt
73
+ ```
74
+
75
+ ### Run Flask server:
76
+ Ensure the server application is running correctly before proceeding. Use the following commands and check for any errors:
77
+ ```bash
78
+ python3 flask_app.py
79
+ # OR
80
+ gunicorn -w 2 -b 0.0.0.0:5000 flask_app:app --timeout 600
81
+ ```
82
+
83
+ If the application is running without any issues, proceed to start the server using the following command:
84
+ ```bash
85
+ bash start.sh
86
+ ```
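
Once the server has been started, a client may want to wait until it accepts connections before sending requests. The following is a minimal sketch, assuming the server listens on `localhost:5000` as in the gunicorn command above. The helper `wait_for_server` is hypothetical. It only checks that the TCP port is open, not that the models have finished loading.

```python
import socket
import time


def wait_for_server(host="localhost", port=5000, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, else False after timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: pause briefly, then retry.
            time.sleep(interval)
    return False
```

For example, `wait_for_server(timeout=30.0)` returns `False` if nothing is listening on port 5000 after roughly 30 seconds.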
87
+
88
+ ### API
89
+ ```python
90
+ """
91
+ This is sample client code that sends text to the server and receives speech
92
+ for the given text.
93
+
94
+ Supported languages:
95
+
96
+ Assamese, Bengali, Bodo, Gujarati, Hindi, Kannada, Malayalam, Manipuri,
97
+ Marathi, Odia, Punjabi, Rajasthani, Tamil, Telugu, Urdu
98
+
99
+ """
100
+ import requests
101
+ import json
102
+ import base64
103
+
104
+ # endpoint
105
+ url = "http://localhost:5000/tts"
106
+
107
+ lang = 'hindi'
108
+ gender = 'female'
109
+ text = "सुप्रभात, आप कैसे हैं?" # hindi
110
+ # text = "സുപ്രഭാതം, സുഖമാ?" # malayalam
111
+ # text = "সুপ্ৰভাত, তুমি কেনে?" # manipuri
112
+ # text = "सुप्रभात, तुम्ही कसे आहात?" # marathi
113
+ # text = "ಶುಭೋದಯ, ನೀವು ಹೇಗಿದ್ದೀರಿ?" # kannada
114
+ # text = "बसु म्विथ्बो, बरि दिबाबो?" # bodo male yet to be added <---
115
+ # text = "Good morning, how are you?" # english
116
+ # text = "সুপ্ৰভাত, আপুনি কেমন আছে?" # assamese
117
+ # text = "காலை வணக்கம், நீங்கள் எப்படி இருக்கின்றீர்கள்?" # tamil
118
+ # text = "ସୁପ୍ରଭାତ, ଆପଣ କେମିତି ଅଛନ୍ତି?" # odia
119
+ # text = "सुप्रभात, आप कैसे छो?" # rajasthani
120
+ # text = "శుభోదయం, మీరు ఎలా ఉన్నారు?" # telugu
121
+ # text = "সুপ্রভাত, আপনি কেমন আছেন?" # bengali
122
+ # text = "સુપ્રભાત, તમે કેમ છો?" # gujarati
123
+
124
+ payload = json.dumps(
125
+     {
126
+         "input": text,
127
+         "gender": gender,
128
+         "lang": lang,
129
+         "alpha": 1  # to control speed
130
+     })
131
+
132
+ headers = {'Content-Type': 'application/json'}
133
+ response = requests.request("POST", url, headers=headers, data=payload).json()
134
+
135
+ # save the received encoded audio
136
+ audio = response['audio']
137
+ file_name = "tts.wav"
138
+ wav_file = open(file_name,'wb')
139
+ decode_string = base64.b64decode(audio)
140
+ wav_file.write(decode_string)
141
+ wav_file.close()
142
+ ```
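
The request and response handling in the sample can also be factored into small helpers that validate input before sending and fail clearly when the response carries no audio. This is a minimal sketch, not the repository's own client. The names `build_payload`, `save_audio`, and `SUPPORTED_LANGUAGES` are hypothetical. The lowercase language names are an assumption based on the `'hindi'` value in the sample, and the server may accept values this set does not list.

```python
import base64
import json

# Assumption: lowercase forms of the languages listed in this README.
SUPPORTED_LANGUAGES = {
    "assamese", "bengali", "bodo", "gujarati", "hindi", "kannada",
    "malayalam", "manipuri", "marathi", "odia", "punjabi", "rajasthani",
    "tamil", "telugu", "urdu",
}


def build_payload(text, lang, gender, alpha=1):
    """Build the JSON request body using the same keys as the sample above."""
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    if gender not in ("female", "male"):
        raise ValueError(f"unsupported gender: {gender!r}")
    if not text.strip():
        raise ValueError("text must not be empty")
    return json.dumps(
        {"input": text, "gender": gender, "lang": lang, "alpha": alpha}
    )


def save_audio(response_json, file_name="tts.wav"):
    """Decode the base64 'audio' field and write it to file_name; return byte count."""
    if "audio" not in response_json:
        raise KeyError("response has no 'audio' field")
    audio_bytes = base64.b64decode(response_json["audio"])
    # The context manager closes the file even if the write fails.
    with open(file_name, "wb") as wav_file:
        wav_file.write(audio_bytes)
    return len(audio_bytes)
```

With these helpers, the string returned by `build_payload(text, lang, gender)` can be passed as `data=` in the `requests` call shown in the sample, and the parsed JSON response can be handed to `save_audio`.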
143
+
144
+ ### Citation for the original repo
145
+ If you use this Fastspeech2 Model in your research or work, please consider citing:
146
+
147
+
148
+ COPYRIGHT
149
+ 2023, Speech Technology Consortium,
150
+ Bhashini, MeiTY and by Hema A Murthy & S Umesh,
151
+ DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
152
+ and
153
+ ELECTRICAL ENGINEERING,
154
+ IIT MADRAS. ALL RIGHTS RESERVED
155
+
156
+
157
+ Shield: [![CC BY 4.0][cc-by-shield]][cc-by]
158
+
159
+ This work is licensed under a
160
+ [Creative Commons Attribution 4.0 International License][cc-by].
161
+
162
+ [![CC BY 4.0][cc-by-image]][cc-by]
163
+
164
+ [cc-by]: http://creativecommons.org/licenses/by/4.0/
165
+ [cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
166
+ [cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg