---
title: English Accent Classifier
emoji: 🗣️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: false
---
# English Accent Classifier with Video Analysis
This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine.
## How it Works
1. **Input Video:** Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file.
2. **Video Processing:** The application downloads the video from the URL or reads the uploaded file.
3. **Audio Extraction:** The full audio and a short segment (15 seconds) are extracted.
4. **Language Detection:** The 15-second segment is transcribed, and the language is detected.
5. **Accent Classification (if English):** A longer audio segment (adjustable duration) is analyzed for English accent.
6. **Results:** The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed.
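
The branching in steps 4 and 5 can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function `analyze`, the `AnalysisResult` type, the three callables, and the `"en"` language label are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# From the steps above: the short segment used for language detection.
SHORT_SEGMENT_SECONDS = 15


@dataclass
class AnalysisResult:
    language: str
    accent: Optional[str] = None
    confidence: Optional[float] = None


def analyze(
    audio_path: str,
    analysis_seconds: int,
    transcribe: Callable[[str, int], str],
    detect_language: Callable[[str], str],
    classify_accent: Callable[[str, int], Tuple[str, float]],
) -> AnalysisResult:
    """Detect the language first; classify the accent only if it is English."""
    # Step 4: transcribe the short segment and detect its language.
    text = transcribe(audio_path, SHORT_SEGMENT_SECONDS)
    language = detect_language(text)

    # Assumption: the detector labels English as "en".
    if language != "en":
        return AnalysisResult(language=language)

    # Step 5: analyze the longer, user-adjustable segment.
    accent, confidence = classify_accent(audio_path, analysis_seconds)
    return AnalysisResult(language, accent, confidence)
```

Passing the models in as callables keeps the control flow easy to exercise with stand-in functions before any model weights are downloaded.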
## Features
* **English Accent Classification:** Predicts the accent in English audio.
* **Language Detection:** Ensures the audio is English before accent analysis.
* **Flexible Video Input:** Supports URLs and file uploads.
* **Adjustable Analysis Duration:** Users can set the audio analysis length.
* **Audio Playback:** Allows users to listen to the extracted audio.
## Tech Stack
* [Gradio](https://gradio.app/): Interactive web UI.
* [Hugging Face Transformers](https://huggingface.co/transformers/): Pre-trained models and pipelines.
* [Requests](https://requests.readthedocs.io/en/latest/): Downloading video files.
* [MoviePy](https://zulko.github.io/moviepy/): Video editing for audio extraction.
* [PyTorch](https://pytorch.org/): Underlying deep learning framework.
* [Soundfile](https://pysoundfile.readthedocs.io/en/latest/): Audio file handling.
## Models Used
* **Accent Classification:** `dima806/english_accents_classification`
* **Language Detection:** `alexneakameni/language_detection`
* **Automatic Speech Recognition:** `openai/whisper-tiny.en`
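
As a rough sketch, the three models could be loaded through the Transformers `pipeline` helper. The model IDs come from the list above; the task names paired with them, the `MODELS` mapping, and the `load_pipelines` function are assumptions made for illustration and may not match how `app.py` loads the models.

```python
# Model IDs are taken from the list above; the task names are assumptions.
MODELS = {
    "accent": ("audio-classification", "dima806/english_accents_classification"),
    "language": ("text-classification", "alexneakameni/language_detection"),
    "asr": ("automatic-speech-recognition", "openai/whisper-tiny.en"),
}


def load_pipelines():
    """Build one pipeline per model. Downloads weights on first call."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    return {
        name: pipeline(task, model=model_id)
        for name, (task, model_id) in MODELS.items()
    }
```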
## Usage
You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results will be displayed below.
### Input Formats
* **Uploaded Video Files:** `.mp4`
* **Video URLs:**
    * Direct MP4 links (ending in `.mp4`)
    * Loom video share links (`https://www.loom.com/share/...`)
    * Dropbox direct download links (MP4 links ending in `?dl=1`)
    * Google Drive direct download links (`https://drive.google.com/uc?id=...&export=download`)
### Unsupported Formats
* Webpages embedding videos (e.g., YouTube, news articles).
* Dropbox shared folder links.
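
The URL rules above could be checked with a small helper like the one below. This is a sketch using only the standard library; `is_supported_video_url` is a hypothetical name, and the exact checks the application performs may differ.

```python
from urllib.parse import parse_qs, urlparse


def _host_is(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def is_supported_video_url(url: str) -> bool:
    """Apply the documented URL rules; anything else is rejected."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or not host:
        return False

    path = parsed.path.lower()
    query = parse_qs(parsed.query)

    if _host_is(host, "loom.com"):
        # Loom share links: https://www.loom.com/share/...
        return path.startswith("/share/")
    if _host_is(host, "dropbox.com"):
        # Dropbox: MP4 links with ?dl=1 (shared folder links fail this check).
        return path.endswith(".mp4") and query.get("dl") == ["1"]
    if host == "drive.google.com":
        # Google Drive: /uc?id=...&export=download
        return path == "/uc" and "id" in query and query.get("export") == ["download"]
    # Everything else must be a direct MP4 link.
    return path.endswith(".mp4")
```

Pages that only embed a video, such as YouTube watch pages, fall through to the final check and are rejected because their path does not end in `.mp4`.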
## FFmpeg Requirement
This application requires [FFmpeg](https://ffmpeg.org/) to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website.
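
A quick way to confirm that FFmpeg is reachable is to look it up on `PATH` from Python. The helper names below (`find_ffmpeg`, `ffmpeg_version_line`) are hypothetical and only illustrate the check; they are not part of the application.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the path to the ffmpeg executable on PATH, or None if missing."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line() -> Optional[str]:
    """Return the first line of `ffmpeg -version`, or None if ffmpeg is missing."""
    path = find_ffmpeg()
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(ffmpeg_version_line() or "FFmpeg was not found on PATH.")
```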
## Troubleshooting
* **"Invalid URL"**: Ensure the URL meets the specified format requirements.
* **Audio/Video Processing Errors**: Likely due to missing or incorrectly configured FFmpeg.
* **Transcription Errors**: Audio may be unclear or contain little speech in the initial 15 seconds.
* **Non-English Language Detection**: Accent analysis runs only when the detected language is English; the accent model is designed for English accent classification only.
## Citation
If you use this application in your work, please consider citing the original models and the libraries used.
```bibtex
@misc{dima806_english_accents_classification,
  author       = {dima806},
  title        = {dima806/english_accents_classification},
  year         = {2024},
  note         = {Oct 19, 2024},
  howpublished = {https://huggingface.co/dima806/english_accents_classification}
}

@misc{alexneakameni_language_detection,
  author       = {alexneakameni},
  title        = {language_detection},
  year         = {2024},
  note         = {Oct 19, 2024},
  howpublished = {https://huggingface.co/alexneakameni/language_detection}
}
```