Spaces: RaivisDejus/LatvianSpeechRecognition

Raivis Dejus committed on May 1, 2024

Commit d7b5c0f (1 parent: 622eda7)

Adjusting Youtube tab labels and layout


Files changed (1)

app.py +9 -8


@@ -10,8 +10,8 @@ import tempfile
 import os
 BATCH_SIZE = 8
-FILE_LIMIT_MB = 1000
-YT_LENGTH_LIMIT_S = 3600  # limit to 1 hour YouTube files
+FILE_LIMIT_MB = 10
+YT_LENGTH_LIMIT_S = 300  # limit to 5min YouTube files
 device = 0 if torch.cuda.is_available() else "cpu"
@@ -115,9 +115,9 @@ transcribe = gr.Interface(
         * [small](https://huggingface.co/RaivisDejus/whisper-small-lv) - Reasonably fast, reasonably accurate, requiring reasonable amounts of RAM
-        * [large](https://huggingface.co/AiLab-IMCS-UL/whisper-large-v3-lv-late-cv17) - Most accurate, developed by scientists from [ailab.lv](https://ailab.lv/). Requires most RAM and for best performance should be run on a GPU.
-        To improve speech recognition quality, more data is needed, donate your voice on [Balsu talka](https://balsutalka.lv/)
+        * [large](https://huggingface.co/AiLab-IMCS-UL/whisper-large-v3-lv-late-cv17) - Most accurate, developed by scientists from [ailab.lv](https://ailab.lv/). Requires most RAM and for best performance should be run on a GPU
+        To improve speech recognition quality, more data is needed, add your voice on [Balsu talka](https://balsutalka.lv/)
         """
     ),
     allow_flagging="never",
@@ -131,10 +131,11 @@ yt_transcribe = gr.Interface(
             ("small", "RaivisDejus/whisper-small-lv"),
             ("large", "AiLab-IMCS-UL/whisper-large-v3-lv-late-cv17")
         ], label="Model", value="RaivisDejus/whisper-small-lv"),
-        gr.Textbox(lines=1, placeholder="Paste the URL to a YouTube video here", label="YouTube URL"),
+        gr.Textbox(lines=1, placeholder="Paste the URL to a YouTube video here", label="YouTube URL (max 5min long)"),
         gr.Radio([("Transcribe", "transcribe"), ("Translate to English", "translate",)], label="Task", value="transcribe")
     ],
-    outputs=["html", "text"],
+    # outputs=["html", "text"],
+    outputs=[gr.HTML(), gr.Textbox(label="Transcription", lines=10)],
     title="Latvian speech recognition: Transcribe YouTube",
     description=("""
         Test Latvian speech recognition (STT) models. Three models are available:
@@ -143,9 +144,9 @@ yt_transcribe = gr.Interface(
         * [small](https://huggingface.co/RaivisDejus/whisper-small-lv) - Reasonably fast, reasonably accurate, requiring reasonable amounts of RAM
-        * [large](https://huggingface.co/AiLab-IMCS-UL/whisper-large-v3-lv-late-cv17) - Most accurate, developed by scientists from [ailab.lv](https://ailab.lv/). Requires most RAM and for best performance should be run on a GPU.
-        To improve speech recognition quality, more data is needed, donate your voice on [Balsu talka](https://balsutalka.lv/)
+        * [large](https://huggingface.co/AiLab-IMCS-UL/whisper-large-v3-lv-late-cv17) - Most accurate, developed by scientists from [ailab.lv](https://ailab.lv/). Requires most RAM and for best performance should be run on a GPU
+        To improve speech recognition quality, more data is needed, add your voice on [Balsu talka](https://balsutalka.lv/)
         """
     ),
     allow_flagging="never",
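
The diff lowers FILE_LIMIT_MB to 10 and YT_LENGTH_LIMIT_S to 300, but the hunks shown do not include the code that enforces these limits. The following is a minimal sketch of how such checks might look. It is not taken from app.py: the helper names check_file_size and check_video_length are hypothetical, and only the two constant values come from the commit.

```python
import os

FILE_LIMIT_MB = 10        # value set in this commit
YT_LENGTH_LIMIT_S = 300   # value set in this commit (5 minutes)


def check_file_size(path, limit_mb=FILE_LIMIT_MB):
    """Return the file size in MB, or raise ValueError if it exceeds limit_mb.

    Hypothetical helper for illustration; not part of app.py.
    """
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > limit_mb:
        raise ValueError(f"File is {size_mb:.1f} MB; the limit is {limit_mb} MB.")
    return size_mb


def check_video_length(length_s, limit_s=YT_LENGTH_LIMIT_S):
    """Return length_s, or raise ValueError if the video is longer than limit_s.

    Hypothetical helper for illustration; not part of app.py.
    """
    if length_s > limit_s:
        raise ValueError(f"Video is {length_s} s long; the limit is {limit_s} s.")
    return length_s
```

In a Gradio app, raising gr.Error instead of ValueError would show the message to the user in the interface; plain ValueError is used here so the sketch runs without Gradio installed.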