---
title: English Accent Classifier
emoji: 🗣️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: false
---

# English Accent Classifier with Video Analysis

This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine.

## How it Works

1. **Input Video:** Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file.
2. **Video Processing:** The application downloads/processes the video.
3. **Audio Extraction:** The full audio and a short segment (15 seconds) are extracted.
4. **Language Detection:** The short audio is transcribed, and the language is detected.
5. **Accent Classification (if English):** A longer audio segment (adjustable duration) is analyzed for English accent.
6. **Results:** The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed.

## Features

* **English Accent Classification:** Predicts the accent in English audio.
* **Language Detection:** Ensures the audio is English before accent analysis.
* **Flexible Video Input:** Supports URLs and file uploads.
* **Adjustable Analysis Duration:** Users can set the audio analysis length.
* **Audio Playback:** Allows users to listen to the extracted audio.

## Tech Stack

* [Gradio](https://gradio.app/): Interactive web UI.
* [Hugging Face Transformers](https://huggingface.co/transformers/): Pre-trained models and pipelines.
* [Requests](https://requests.readthedocs.io/en/latest/): Downloading video files.
* [MoviePy](https://zulko.github.io/moviepy/): Video editing for audio extraction.
* [PyTorch](https://pytorch.org/): Underlying deep learning framework.
* [Soundfile](https://pysoundfile.readthedocs.io/en/latest/): Audio file handling.

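
The gating described in steps 3 to 5 of "How it Works" (detect the language on a short clip, then classify the accent only for English audio) can be sketched as below. This is a minimal illustration, not the code in `app.py`: `analyze`, `clip_length`, `AnalysisResult`, and the two callables are hypothetical names, the callables stand in for the real models and receive only a clip length in seconds, and the `"English"` label is an assumption about what the language detector returns.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Short clip length used for language detection, as stated in "How it Works".
LANGUAGE_CLIP_SECONDS = 15.0


@dataclass
class AnalysisResult:
    language: str
    accent: Optional[str]  # None when the audio is not detected as English


def clip_length(total_seconds: float, requested_seconds: float) -> float:
    """Clamp a requested clip length to the audio that actually exists."""
    return max(0.0, min(total_seconds, requested_seconds))


def analyze(
    total_seconds: float,
    analysis_seconds: float,
    detect_language: Callable[[float], str],   # hypothetical stand-in for the language model
    classify_accent: Callable[[float], str],   # hypothetical stand-in for the accent model
) -> AnalysisResult:
    # Step 4: detect the language from the short clip.
    language = detect_language(clip_length(total_seconds, LANGUAGE_CLIP_SECONDS))

    # Step 5: run accent classification only when the language is English.
    if language != "English":  # assumed label; the real detector may use another
        return AnalysisResult(language=language, accent=None)

    accent = classify_accent(clip_length(total_seconds, analysis_seconds))
    return AnalysisResult(language=language, accent=accent)
```

Passing the models in as callables keeps the control flow testable without downloading any weights; in a real application they would wrap the audio clip itself rather than just its length.
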
## Models Used

* **Accent Classification:** `dima806/english_accents_classification`
* **Language Detection:** `alexneakameni/language_detection`
* **Automatic Speech Recognition:** `openai/whisper-tiny.en`

## Usage

You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results are displayed below the input form.

### Input Formats

* **Uploaded Video Files:** `.mp4`
* **Video URLs:**
  * Direct MP4 links (ending in `.mp4`)
  * Loom video share links (`https://www.loom.com/share/...`)
  * Dropbox direct download links (MP4 links ending in `?dl=1`)
  * Google Drive direct download links (`https://drive.google.com/uc?id=...&export=download`)

### Unsupported Formats

* Webpages embedding videos (e.g., YouTube, news articles).
* Dropbox shared folder links.

## FFmpeg Requirement

This application requires [FFmpeg](https://ffmpeg.org/) to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website.

## Troubleshooting

* **"Invalid URL"**: Ensure the URL matches one of the formats listed under Input Formats.
* **Audio/Video Processing Errors**: Likely due to missing or incorrectly configured FFmpeg.
* **Transcription Errors**: Audio may be unclear or contain little speech in the initial 15 seconds.
* **Non-English Language Detection**: The model is designed for English accent classification only.

## Citation

If you use this application in your work, please consider citing the original models and the libraries used.

```bibtex
@misc{dima806_english_accents_classification,
  author       = {dima806},
  title        = {dima806/english_accents_classification},
  year         = {2024},
  note         = {Oct 19, 2024},
  howpublished = {https://huggingface.co/dima806/english_accents_classification}
}

@misc{alexneakameni_language_detection,
  author       = {alexneakameni},
  title        = {language_detection},
  year         = {2024},
  note         = {Oct 19, 2024},
  howpublished = {https://huggingface.co/alexneakameni/language_detection}
}
```