metadata

title: English Accent Classifier
emoji: 🗣️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: false

English Accent Classifier with Video Analysis

This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine.

How it Works

Input Video: Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file.
Video Processing: The application downloads the video from the URL, or reads the uploaded file directly.
Audio Extraction: The full audio and a short segment (15 seconds) are extracted.
Language Detection: The short audio is transcribed, and the language is detected.
Accent Classification (if English): If the detected language is English, a longer audio segment (adjustable duration) is analyzed to predict the accent.
Results: The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed.
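The control flow of the steps above can be sketched as follows. This is a minimal illustration, not the code in app.py: the function name `run_analysis`, the three injected callables, the `"English"` label, and the 30-second default are all hypothetical stand-ins for the real transcription, language detection, and accent models.

```python
from typing import Callable, Dict, Optional

SHORT_CLIP_SECONDS = 15  # length of the segment used for language detection


def run_analysis(
    audio_path: str,
    transcribe: Callable[[str, int], str],
    detect_language: Callable[[str], str],
    classify_accent: Callable[[str, int], str],
    analysis_seconds: int = 30,      # hypothetical default for the adjustable duration
    english_label: str = "English",  # assumed label; the real model's labels may differ
) -> Dict[str, Optional[str]]:
    """Transcribe a short clip, detect its language, and classify the accent only if English."""
    text = transcribe(audio_path, SHORT_CLIP_SECONDS)
    language = detect_language(text)
    result: Dict[str, Optional[str]] = {
        "language": language,
        "accent": None,
        "audio_path": audio_path,
    }
    if language == english_label:
        # Accent classification runs on the longer, user-adjustable segment.
        result["accent"] = classify_accent(audio_path, analysis_seconds)
    return result
```

Passing the models in as callables keeps the gating logic (accent classification only for English audio) separate from model loading, which makes it easy to exercise with stubs.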

Features

English Accent Classification: Predicts the accent in English audio.
Language Detection: Ensures the audio is English before accent analysis.
Flexible Video Input: Supports URLs and file uploads.
Adjustable Analysis Duration: Users can set the audio analysis length.
Audio Playback: Allows users to listen to the extracted audio.

Tech Stack

Gradio: Interactive web UI.
Hugging Face Transformers: Pre-trained models and pipelines.
Requests: Downloading video files.
MoviePy: Video editing for audio extraction.
PyTorch: Underlying deep learning framework.
Soundfile: Audio file handling.

Models Used

Accent Classification: dima806/english_accents_classification
Language Detection: alexneakameni/language_detection
Automatic Speech Recognition: openai/whisper-tiny.en
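
As a rough sketch, the three models could be loaded with the Transformers `pipeline` helper as shown below. The task names assigned to each model are assumptions based on the model descriptions above, and `MODEL_IDS` and `load_pipelines` are hypothetical names; app.py may load the models differently.

```python
MODEL_IDS = {
    "accent": "dima806/english_accents_classification",
    "language": "alexneakameni/language_detection",
    "asr": "openai/whisper-tiny.en",
}


def load_pipelines():
    """Build one pipeline per model; weights are downloaded on first use."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return {
        # Assumed task: audio classification over accent labels.
        "accent": pipeline("audio-classification", model=MODEL_IDS["accent"]),
        # Assumed task: text classification over language labels.
        "language": pipeline("text-classification", model=MODEL_IDS["language"]),
        "asr": pipeline("automatic-speech-recognition", model=MODEL_IDS["asr"]),
    }
```

Calling `load_pipelines()` once at startup and reusing the returned objects avoids reloading the models on every request.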

Usage

You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results will be displayed below.

Input Formats

Uploaded Video Files: .mp4
Video URLs:
- Direct MP4 links (ending in .mp4)
- Loom video share links (https://www.loom.com/share/...)
- Dropbox direct download links (MP4 links ending in ?dl=1)
- Google Drive direct download links (https://drive.google.com/uc?id=...&export=download)
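
The URL forms listed above could be checked with a small helper like the one below. This is a sketch under assumptions: the function name `classify_video_url` and its return labels are hypothetical, and the application's actual validation rules may be stricter or looser.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def classify_video_url(url: str) -> Optional[str]:
    """Return a label for a supported URL form, or None if the URL is unsupported."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    host = parsed.netloc.lower()
    path = parsed.path
    query = parse_qs(parsed.query)

    if (host == "loom.com" or host.endswith(".loom.com")) and path.startswith("/share/"):
        return "loom"
    if host == "dropbox.com" or host.endswith(".dropbox.com"):
        # Only direct MP4 downloads (dl=1) qualify; shared folder links do not.
        is_direct = query.get("dl") == ["1"] and path.lower().endswith(".mp4")
        return "dropbox" if is_direct else None
    if host == "drive.google.com":
        is_direct = path == "/uc" and "id" in query and query.get("export") == ["download"]
        return "google_drive" if is_direct else None
    if path.lower().endswith(".mp4"):
        return "direct_mp4"
    return None
```

A page that merely embeds a video, such as a YouTube watch URL, matches none of these branches and returns `None`.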

Unsupported Formats

Webpages embedding videos (e.g., YouTube, news articles).
Dropbox shared folder links.

FFmpeg Requirement

This application requires FFmpeg to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website.
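
To confirm that FFmpeg is reachable before running the app, a check along these lines may help. The helper names are hypothetical, and the command builder only illustrates what an equivalent direct FFmpeg call might look like; the application itself extracts audio through MoviePy.

```python
import shutil
from typing import List, Optional


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


def build_extract_command(
    video_path: str, audio_path: str, duration: Optional[float] = None
) -> List[str]:
    """Build an ffmpeg argument list that writes the audio track of a video to a file."""
    # -y overwrites the output file, -vn drops the video stream.
    cmd = ["ffmpeg", "-y", "-i", video_path, "-vn"]
    if duration is not None:
        # -t before the output limits how many seconds are written.
        cmd += ["-t", str(duration)]
    cmd.append(audio_path)
    return cmd
```

For example, `build_extract_command("in.mp4", "short.wav", 15)` produces the arguments for a 15-second clip, which can be run with `subprocess.run(cmd, check=True)` once `ffmpeg_available()` returns `True`.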

Troubleshooting

"Invalid URL": Ensure the URL meets the specified format requirements.
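Audio/Video Processing Errors: Usually caused by a missing or incorrectly configured FFmpeg installation. Check that FFmpeg is installed and available on your PATH.
Transcription Errors: The audio may be unclear or contain little speech in the first 15 seconds.
Non-English Language Detection: Accent classification runs only when the detected language is English. Audio detected as another language is not analyzed for accent.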

Citation

If you use this application in your work, please consider citing the original models and the libraries used.

@misc{dima806_english_accents_classification,
    author = {dima806},
    title = {dima806/english_accents_classification},
    year = {2024},
    note = {Oct 19, 2024},
    howpublished = {https://huggingface.co/dima806/english_accents_classification}
}

@misc{alexneakameni_language_detection,
    author = {alexneakameni},
    title = {language_detection},
    year = {2024},
    note = {Oct 19, 2024},
    howpublished = {https://huggingface.co/alexneakameni/language_detection}
}