metadata
title: English Accent Classifier
emoji: 🗣️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: false
English Accent Classifier with Video Analysis
This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine.
How it Works
- Input Video: Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file.
- Video Processing: The application downloads the video from the URL or reads the uploaded file.
- Audio Extraction: The full audio and a short segment (15 seconds) are extracted.
- Language Detection: The 15-second segment is transcribed, and the language is detected.
- Accent Classification (if English): A longer audio segment (adjustable duration) is analyzed for English accent.
- Results: The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed.
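The control flow of these steps can be sketched as follows. This is a minimal illustration, not the code in app.py: the function name `analyze`, the injected callables, and the `english_label` value are all hypothetical, and the sketch assumes the language detector works on the transcript text.

```python
from typing import Callable, Dict, Optional


def analyze(
    short_clip: str,
    long_clip: str,
    transcribe: Callable[[str], str],
    detect_language: Callable[[str], str],
    classify_accent: Callable[[str], Dict[str, float]],
    english_label: str = "en",  # assumption: the real label depends on the model
) -> Dict[str, Optional[object]]:
    """Run language detection first, then accent classification only for English."""
    transcript = transcribe(short_clip)
    language = detect_language(transcript)
    result: Dict[str, Optional[object]] = {
        "transcript": transcript,
        "language": language,
        "accent": None,
        "scores": None,
    }
    if language == english_label:
        scores = classify_accent(long_clip)
        result["scores"] = scores
        # The predicted accent is the label with the highest confidence score.
        result["accent"] = max(scores, key=scores.get)
    return result
```

Passing the three stages in as callables keeps the gating logic separate from the models, so the flow can be exercised with simple stand-in functions.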
Features
- English Accent Classification: Predicts the accent in English audio.
- Language Detection: Ensures the audio is English before accent analysis.
- Flexible Video Input: Supports URLs and file uploads.
- Adjustable Analysis Duration: Users can set the audio analysis length.
- Audio Playback: Allows users to listen to the extracted audio.
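An adjustable analysis duration has to be clamped to the length of the extracted audio. The helper below is a hypothetical sketch of that step; the name `analysis_window`, the minimum length, and the choice to start at the beginning of the audio are assumptions, not details taken from the application.

```python
from typing import Tuple


def analysis_window(
    total_seconds: float,
    requested_seconds: float,
    min_seconds: float = 1.0,  # assumption: a lower bound on the segment length
) -> Tuple[float, float]:
    """Return (start, end) in seconds for the segment to analyze."""
    if total_seconds <= 0:
        raise ValueError("audio has no duration")
    # Never analyze less than the minimum, and never read past the end of the audio.
    length = max(min_seconds, float(requested_seconds))
    end = min(float(total_seconds), length)
    return (0.0, end)
```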
Tech Stack
- Gradio: Interactive web UI.
- Hugging Face Transformers: Pre-trained models and pipelines.
- Requests: Downloading video files.
- MoviePy: Video editing for audio extraction.
- PyTorch: Underlying deep learning framework.
- Soundfile: Audio file handling.
Models Used
- Accent Classification: dima806/english_accents_classification
- Language Detection: alexneakameni/language_detection
- Automatic Speech Recognition: openai/whisper-tiny.en
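One way to load these models is through the Transformers `pipeline` helper. The model IDs below come from the list above; the task names paired with them and the `load_pipelines` function are assumptions made for illustration and may differ from what app.py does.

```python
# Role -> (pipeline task, model ID). Task names are assumptions based on the
# role each model plays in this README.
MODELS = {
    "accent": ("audio-classification", "dima806/english_accents_classification"),
    "language": ("text-classification", "alexneakameni/language_detection"),
    "asr": ("automatic-speech-recognition", "openai/whisper-tiny.en"),
}


def load_pipelines(models=MODELS):
    """Build one pipeline per role. Model weights are downloaded on first use."""
    # Imported lazily so the module can be imported without the heavy dependency.
    from transformers import pipeline

    return {
        role: pipeline(task, model=model_id)
        for role, (task, model_id) in models.items()
    }
```

Calling `load_pipelines()` once at startup and reusing the returned dictionary avoids reloading the models on every request.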
Usage
You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results will be displayed below.
Input Formats
- Uploaded Video Files: .mp4
- Video URLs:
  - Direct MP4 links (ending in .mp4)
  - Loom video share links (https://www.loom.com/share/...)
  - Dropbox direct download links (MP4 links ending in ?dl=1)
  - Google Drive direct download links (https://drive.google.com/uc?id=...&export=download)
Unsupported Formats
- Webpages embedding videos (e.g., YouTube, news articles).
- Dropbox shared folder links.
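The URL rules above can be expressed as a small classifier. This is a sketch under stated assumptions: the function name `classify_video_url` and the returned labels are hypothetical, and the checks are a literal reading of the formats listed in this README rather than the application's actual validation.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def classify_video_url(url: str) -> Optional[str]:
    """Return a label for a supported URL format, or None if unsupported."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    host = parts.hostname.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    query = parse_qs(parts.query)

    if host == "loom.com":
        # Share links look like /share/<id>.
        is_share = path.startswith("/share/") and len(path) > len("/share/")
        return "loom" if is_share else None
    if host == "dropbox.com":
        # Only direct downloads of MP4 files (dl=1); shared folders are rejected.
        is_direct = query.get("dl") == ["1"] and path.lower().endswith(".mp4")
        return "dropbox" if is_direct else None
    if host == "drive.google.com":
        is_direct = (
            path == "/uc" and "id" in query and query.get("export") == ["download"]
        )
        return "google_drive" if is_direct else None
    # Any other host must serve the file directly.
    return "mp4" if path.lower().endswith(".mp4") else None
```

A result of None corresponds to the unsupported cases, such as a YouTube watch page or a Dropbox shared folder link.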
FFmpeg Requirement
This application requires FFmpeg to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website.
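A quick way to confirm that FFmpeg is reachable is to look it up on the PATH before processing any video. The sketch below also builds an FFmpeg command for extracting audio; it calls the FFmpeg CLI directly rather than going through MoviePy, and the function names, the mono channel layout, and the 16 kHz sample rate are assumptions, not settings taken from the application.

```python
import shutil
from typing import Callable, List, Optional


def ffmpeg_available(which: Callable[[str], Optional[str]] = shutil.which) -> bool:
    """Return True if an ffmpeg executable is found on the PATH."""
    return which("ffmpeg") is not None


def extract_audio_command(
    video_path: str, audio_path: str, seconds: Optional[float] = None
) -> List[str]:
    """Build an ffmpeg argument list that writes the video's audio track to a file."""
    cmd = [
        "ffmpeg",
        "-y",              # overwrite the output file if it exists
        "-i", video_path,  # input video
        "-vn",             # drop the video stream
        "-ac", "1",        # mono (assumption)
        "-ar", "16000",    # 16 kHz sample rate (assumption)
    ]
    if seconds is not None:
        cmd += ["-t", str(seconds)]  # limit the output duration
    cmd.append(audio_path)
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; for example, `extract_audio_command("in.mp4", "short.wav", 15)` describes the 15-second segment used for language detection.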
Troubleshooting
- "Invalid URL": Ensure the URL matches one of the supported formats listed under Input Formats.
- Audio/Video Processing Errors: Usually caused by a missing or incorrectly configured FFmpeg installation.
- Transcription Errors: The audio may be unclear or contain little speech in the initial 15 seconds.
- Non-English Language Detection: Accent analysis runs only when the audio is detected as English; the accent model is designed for English accent classification only.
Citation
If you use this application in your work, please consider citing the original models and the libraries used.
@misc{dima806_english_accents_classification,
  author = {dima806},
  title = {dima806/english\_accents\_classification},
  year = {2024},
  note = {Oct 19, 2024},
  howpublished = {\url{https://huggingface.co/dima806/english_accents_classification}}
}
@misc{alexneakameni_language_detection,
  author = {alexneakameni},
  title = {language\_detection},
  year = {2024},
  note = {Oct 19, 2024},
  howpublished = {\url{https://huggingface.co/alexneakameni/language_detection}}
}