Spaces: shukdevdatta123/Kokoro-TTS

shukdevdatta123 committed on Feb 1

Commit dded7af (verified) · 1 Parent(s): 8c31e4f

Update app.py

Files changed (1)

app.py +33 -36


@@ -9,15 +9,15 @@ st.title("Text-to-Speech Translator with Kokoro")
 # Expander section to display information in multiple languages
 with st.expander("Sample Prompt!"):
-    st.markdown("""
-    - My name is Shukdev. (In English)
-    - Mi nombre es Shukdev. (In Spanish)
-    - Je m'appelle Choukdev. (In French)
-    - मेरा नाम शुकदेव है. (In Hindi)
-    - Il mio nome è Shukdev. (In Italy)
-    - Meu nome é Sukhdev. (In Portuguese, Brazil)
-    - 我叫苏赫德夫。(In Chinese)
-    - 私の名前はスクデフです。(In Japanese)
     """)
 st.sidebar.header("Configuration & Instructions")
@@ -38,23 +38,18 @@ st.sidebar.markdown("""
      - 🇧🇷 **Portuguese** (Brazilian)
      - 🇨🇳 **Chinese** (Mandarin)
      - 🇯🇵 **Japanese**
 3. **Select Voice**:
    - Choose the voice you want for the speech. There are multiple voice styles based on tone and gender (e.g., `af_heart`, `af_joy`, etc.).
 4. **Adjust Speech Speed**:
    - Use the slider to adjust how fast the speech will be generated. The speed can be set from `0.5x` to `2.0x`, with `1.0x` being the default normal speed.
 5. **Generate Speech**:
    - Once you've selected the text, language, voice, and speed, click the **"Generate Audio"** button. The app will process the text and generate the speech.
 6. **Download Audio**:
    - After the audio is generated, you can play it directly within the app or download it as a `.wav` file by clicking the **"Download Audio"** button.
 ### Additional Features:
 - **Text Translation**:
    - The app can automatically translate the text to English before generating audio. After the translation, you will hear the audio in English with your chosen voice.
 Enjoy exploring different languages, voices, and speeds with the text-to-speech conversion!
 """)
@@ -84,8 +79,8 @@ speed = st.slider("Speed", min_value=0.5, max_value=2.0, value=1.0, step=0.1)
 # Initialize the TTS pipeline with user-selected language
 pipeline = KPipeline(lang_code=lang_code)
-# Function to get the OpenAI API key from the user
-openai_api_key = st.text_input("Enter your OpenAI API Key:", type="password")
 # Function to translate text to English using OpenAI's Chat API
 def translate_to_english(api_key, text, lang_code):
@@ -96,12 +91,10 @@ def translate_to_english(api_key, text, lang_code):
         response = openai.ChatCompletion.create(
             model="gpt-4",  # Using ChatGPT model for translation
-            messages=[
-                {"role": "system", "content": "You are a helpful assistant that translates text."},
-                {"role": "user", "content": prompt}
-            ]
         )
         # Extract translated text from response
         translated_text = response['choices'][0]['message']['content'].strip()
         return translated_text
@@ -122,7 +115,9 @@ def generate_audio(text, lang_code, voice, speed):
         return buffer
 # Generate and display the audio file
-if st.button('Generate Audio') and openai_api_key:
     st.write("Generating speech for the original text...")
     audio_buffer = generate_audio(input_text, lang_code, voice, speed)
@@ -137,20 +132,22 @@ if st.button('Generate Audio') and openai_api_key:
         mime="audio/wav"
     )
-    # Translate the input text to English using OpenAI
-    translated_text = translate_to_english(openai_api_key, input_text, lang_code)
-    # Generate audio for the translated English text
-    translated_audio_buffer = generate_audio(translated_text, 'a', voice, speed)
-    # Display Audio for the translated text
-    st.write(f"Translated Text: {translated_text}")
-    st.audio(translated_audio_buffer, format='audio/wav')
-    # Optional: Save the generated audio file for download (Translated Text)
-    st.download_button(
-        label="Download Audio (Translated to English)",
-        data=translated_audio_buffer,
-        file_name="generated_speech_translated.wav",
-        mime="audio/wav"
-    )

 # Expander section to display information in multiple languages
 with st.expander("Sample Prompt!"):
+    st.markdown("""
+    - My name is Shukdev. (In English)
+    - Mi nombre es Shukdev. (In Spanish)
+    - Je m'appelle Choukdev. (In French)
+    - मेरा नाम शुकदेव है. (In Hindi)
+    - Il mio nome è Shukdev. (In Italy)
+    - Meu nome é Sukhdev. (In Portuguese, Brazil)
+    - 我叫苏赫德夫。(In Chinese)
+    - 私の名前はスクデフです。(In Japanese)
     """)
 st.sidebar.header("Configuration & Instructions")
      - 🇧🇷 **Portuguese** (Brazilian)
      - 🇨🇳 **Chinese** (Mandarin)
      - 🇯🇵 **Japanese**
 3. **Select Voice**:
    - Choose the voice you want for the speech. There are multiple voice styles based on tone and gender (e.g., `af_heart`, `af_joy`, etc.).
 4. **Adjust Speech Speed**:
    - Use the slider to adjust how fast the speech will be generated. The speed can be set from `0.5x` to `2.0x`, with `1.0x` being the default normal speed.
 5. **Generate Speech**:
    - Once you've selected the text, language, voice, and speed, click the **"Generate Audio"** button. The app will process the text and generate the speech.
 6. **Download Audio**:
    - After the audio is generated, you can play it directly within the app or download it as a `.wav` file by clicking the **"Download Audio"** button.
 ### Additional Features:
 - **Text Translation**:
    - The app can automatically translate the text to English before generating audio. After the translation, you will hear the audio in English with your chosen voice.
 Enjoy exploring different languages, voices, and speeds with the text-to-speech conversion!
 """)
 # Initialize the TTS pipeline with user-selected language
 pipeline = KPipeline(lang_code=lang_code)
+# Function to get the OpenAI API key from the user (optional for translation)
+openai_api_key = st.text_input("Enter your OpenAI API Key (Optional for Translation)", type="password")
 # Function to translate text to English using OpenAI's Chat API
 def translate_to_english(api_key, text, lang_code):
         response = openai.ChatCompletion.create(
             model="gpt-4",  # Using ChatGPT model for translation
+            messages=[{"role": "system", "content": "You are a helpful assistant that translates text."},
+                      {"role": "user", "content": prompt}]
         )
         # Extract translated text from response
         translated_text = response['choices'][0]['message']['content'].strip()
         return translated_text
         return buffer
 # Generate and display the audio file
+if st.button('Generate Audio'):
+    # Generate audio for the original text
     st.write("Generating speech for the original text...")
     audio_buffer = generate_audio(input_text, lang_code, voice, speed)
         mime="audio/wav"
     )
+    # Check if OpenAI API Key is provided for translation and English audio generation
+    if openai_api_key:
+        # Translate the input text to English using OpenAI
+        translated_text = translate_to_english(openai_api_key, input_text, lang_code)
+        # Generate audio for the translated English text
+        translated_audio_buffer = generate_audio(translated_text, 'a', voice, speed)
+        # Display Audio for the translated text
+        st.write(f"Translated Text: {translated_text}")
+        st.audio(translated_audio_buffer, format='audio/wav')
+        # Optional: Save the generated audio file for download (Translated Text)
+        st.download_button(
+            label="Download Audio (Translated to English)",
+            data=translated_audio_buffer,
+            file_name="generated_speech_translated.wav",
+            mime="audio/wav"
+        )
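
The net effect of this commit is that the original-language audio is always generated when the button is clicked, while translation and the English audio run only if an OpenAI API key was entered. A minimal sketch of that control flow, kept free of Streamlit, Kokoro, and OpenAI so it can run on its own, might look like the following. The names `build_translation_messages`, `extract_translation`, and `run_generation` are hypothetical helpers introduced here for illustration and do not appear in app.py. The `generate_audio` and `translate` callables stand in for the app's functions of the same purpose, and the response is assumed to be dict-shaped, as in the indexing shown in the diff.

```python
from typing import Any, Callable, Dict, List


def build_translation_messages(prompt: str) -> List[Dict[str, str]]:
    """Return chat messages in the shape the diff passes as `messages=`."""
    return [
        {"role": "system", "content": "You are a helpful assistant that translates text."},
        {"role": "user", "content": prompt},
    ]


def extract_translation(response: Dict[str, Any]) -> str:
    """Pull the translated text out of a dict-shaped chat response.

    Assumption: the response supports the same indexing used in the diff,
    response['choices'][0]['message']['content'].
    """
    return response["choices"][0]["message"]["content"].strip()


def run_generation(
    text: str,
    lang_code: str,
    voice: str,
    speed: float,
    api_key: str,
    generate_audio: Callable[[str, str, str, float], Any],
    translate: Callable[[str, str, str], str],
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the post-commit button handler.

    The original audio is always produced. Translation and the English
    audio are produced only when `api_key` is non-empty.
    """
    results: Dict[str, Any] = {
        "original_audio": generate_audio(text, lang_code, voice, speed),
    }
    if api_key:
        translated_text = translate(api_key, text, lang_code)
        results["translated_text"] = translated_text
        # 'a' is the language code the diff uses for the English audio.
        results["translated_audio"] = generate_audio(translated_text, "a", voice, speed)
    return results
```

With an empty key the returned dictionary contains only `original_audio`, which corresponds to the app skipping the "Translated Text" output and the second download button. Passing the audio and translation functions in as arguments is a choice made for this sketch only, so the branching can be exercised without network access or model weights.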