---
library_name: transformers
tags:
- vietnamese
- multi_lingual
- audio2text
- sp
- speech_to_text
license: mit
language:
- vi
metrics:
- wer
- bleu
base_model:
- openai/whisper-large-v3-turbo
pipeline_tag: automatic-speech-recognition
---


# EraX-WoW-Turbo: Whisper Large-v3 Turbo for Vietnamese and then some, Supercharged and Localized! 🚀

**(A promise fulfilled! MIT License - Absolutely, positively, totally free.)**

Get ready to experience speech recognition that's faster than a caffeinated cheetah and accurate enough to impress even your most skeptical tech-savvy friends. EraX-WoW-Turbo is here, built upon the already impressive Whisper Large-v3 Turbo, but with a special sauce that makes it truly shine. Think of it as Whisper Large-v3 after a rigorous training montage and a *lot* of espresso.

## What's the Big Deal?

* **Blazing Fast:** We're talking *real-time* transcription. Thanks to the clever optimizations in the Turbo architecture, this model chews through 30 seconds of audio in about 350 ms. Forget about waiting; your transcripts will appear practically *before* you finish speaking. (The original Medium model? Bless its heart, it can't keep up.)
* **Multilingual Maestro:** EraX-WoW-Turbo isn't just fast; it's a polyglot. We've fine-tuned it on a diverse dataset covering 11 key languages:
  * Vietnamese (with love from all 8 regions! We didn't forget any accents 😉)
  * Hindi
  * Chinese
  * English
  * Russian
  * German
  * Ukrainian
  * Japanese
  * French
  * Dutch
  * Korean

  We believe this selection provides a strong foundation for a wide range of applications. (Our apologies to our Khmer-speaking and Thai-speaking friends; we'll get you in the next version! Blame it on old age and forgetfulness. 👴👵)
* **Accuracy You Can Trust:** We're still finalizing the benchmark results (coming soon!), but preliminary tests show an impressive WER (Word Error Rate) of around 12% across the major languages, including challenging Vietnamese dialects. This thing understands you, even if you've got a *really* strong regional accent.
* **Trained with Care:** The model was trained on a substantial dataset (300,000 samples, roughly 1,000 hours) covering real-world audio conditions. Noise? No problem!
* **Open Source (MIT License):** Do whatever you want, no restrictions.

## Turbocharging Performance (CTranslate2)

While EraX-WoW-Turbo is already lightning-fast, you can unlock *even more* speed by using it with the CTranslate2 library ([https://github.com/OpenNMT/CTranslate2](https://github.com/OpenNMT/CTranslate2)). We're talking about a potential 2.5x speedup! This makes it ideal for applications requiring the absolute lowest latency.

## Use Cases

* **Real-time Transcription:** Live captioning, meetings, interviews... anything where speed matters.
* **Voice Assistants:** Build responsive and accurate voice-controlled applications.
* **Media Subtitling:** Generate subtitles for videos and podcasts quickly and accurately.
* **Accessibility Tools:** Empower individuals with hearing impairments.
* **Language Learning:** Practice pronunciation and receive instant feedback.
* **Combine it with our upcoming EraX translator (around 100 ms/sentence latency) for a complete multilingual communication powerhouse! Think instant translation for international conferences or even a travel app.**

## Limitations (Honesty is the Best Policy!)

* **Not for Babies (or Whispers):** This model is trained on adult speech. It *might* struggle with the high-pitched cries of infants or very quiet, hushed whispers. (We're working on it!) So pick your use cases accordingly.

## Get Involved!

We're passionate about making speech recognition accessible to everyone. We encourage you to:

* **Try it out!** Download the model and put it to the test.
* **Provide feedback:** Let us know what works, what doesn't, and what features you'd like to see. (Be gentle with the criticisms; we're sensitive! 😉)
* **Contribute:** If you're a developer, consider contributing to the project.

The EraX Team is committed to continuously improving our models. Stay tuned for future updates and even more exciting developments!

The EraX Team.

## License

- **MIT**, following Whisper's license.
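
## A Quick Note on the WER Figure

The preliminary WER of around 12% quoted earlier comes without a stated scoring setup, so the snippet below is only a minimal sketch of how word error rate is commonly defined: the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference. It assumes plain whitespace tokenization and no text normalization, and `word_error_rate` is a hypothetical helper name for illustration, not part of this model or of `transformers`.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly
    (no lowercasing, no punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one row back) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference words is dropped, so the score is 1/4.
    print(word_error_rate("xin chào các bạn", "xin chào bạn"))
```

Whitespace splitting suits languages such as Vietnamese or English; for text written without spaces between words (for example Chinese or Japanese), a character-level error rate is the more usual choice, so treat a single cross-language number with some care.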
## Citation 📝

If you find our project useful, we would appreciate it if you could star our repository and cite our work as follows:

```
@article{title={EraX-WoW-Turbo-V1.0: Lắng nghe để Yêu thương.},
  author={Nguyễn Anh Nguyên - Phạm Huỳnh Nhật - Cty Bảo hiểm AAA (504h)},
  organization={EraX},
  year={2025},
  url={https://huggingface.co/erax-ai/EraX-WoW-Turbo-V1.0}
}
```