---
title: Audio Emotion Analyzer
emoji: 🎵
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.31.0
app_file: app.py
pinned: false
license: mit
---
# Audio Emotion Analyzer

A Streamlit application that analyzes the emotional tone of speech audio files using a pre-trained Wav2Vec2 model.
## Model

This application uses the [`superb/wav2vec2-base-superb-er`](https://huggingface.co/superb/wav2vec2-base-superb-er) model from Hugging Face, a Wav2Vec2 model fine-tuned for speech emotion recognition.
## Features
- Upload your own .wav audio files for emotion analysis
- Select from existing .wav files in your current directory
- Real-time emotion prediction
- Visual feedback with emojis
## Quick Use
You can use this application in two ways:
### Option 1: Run on Hugging Face Spaces
Click the "Spaces" tab on the model page to access the hosted version of this app.
### Option 2: Run Locally

1. Clone this repository.
2. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Download the pre-trained model:

   ```bash
   python download_model.py
   ```

4. Run the Streamlit app:

   ```bash
   streamlit run app.py
   ```
## Using Audio Files

The application automatically scans for `.wav` files in:
- The current directory where the app is running
- Immediate subdirectories (one level deep)
You can:
- Place .wav files in the same directory as the app
- Place .wav files in subdirectories
- Upload new .wav files directly through the interface
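The one-level scan described above can be sketched with `pathlib` globs. This is an illustrative helper (the `find_wav_files` name is hypothetical, not necessarily the app's actual code):

```python
from pathlib import Path


def find_wav_files(root: str) -> list[str]:
    """Collect .wav files in root and its immediate subdirectories."""
    base = Path(root)
    files = sorted(base.glob("*.wav"))     # current directory
    files += sorted(base.glob("*/*.wav"))  # one level deep
    return [str(f) for f in files]


# Quick demonstration on a throwaway directory tree.
import tempfile

with tempfile.TemporaryDirectory() as d:
    (Path(d) / "a.wav").touch()
    sub = Path(d) / "clips"
    sub.mkdir()
    (sub / "b.wav").touch()
    deep = sub / "nested"
    deep.mkdir()
    (deep / "c.wav").touch()  # two levels deep: intentionally ignored
    names = [Path(p).name for p in find_wav_files(d)]
```

Here `names` contains `a.wav` and `b.wav` but not the deeper `c.wav`, matching the one-level-deep behavior described above.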
## Supported Emotions

The model can detect 7 different emotions:

- Neutral 😐
- Happy 😊
- Sad 😢
- Angry 😠
- Fearful 😨
- Disgusted 🤢
- Surprised 😲
## Technical Details

This application uses:
- The `superb/wav2vec2-base-superb-er` pre-trained model
- `Wav2Vec2ForSequenceClassification` for emotion classification
- `Wav2Vec2FeatureExtractor` for audio feature extraction
- Streamlit for the web interface
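A minimal sketch of how these pieces fit together, assuming the standard 🤗 Transformers API (the `predict_emotion` helper is illustrative; the app's actual code may differ):

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

MODEL_ID = "superb/wav2vec2-base-superb-er"


def predict_emotion(waveform, sampling_rate=16_000) -> str:
    """Return the model's top emotion label for a 1-D audio array."""
    extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForSequenceClassification.from_pretrained(MODEL_ID)
    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[int(logits.argmax(dim=-1))]
```

Note that the first call downloads the model weights, so it needs network access unless the files are already cached locally.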
## Limitations
- The model works best with clear speech audio in English
- Background noise may affect the accuracy of emotion detection
- Short audio clips (1-5 seconds) tend to work better than longer recordings
## Troubleshooting

If you encounter issues with model loading, try:

- Running `python download_model.py` again to re-download the model files
- Ensuring you have a stable internet connection for the initial model download
- Checking that your audio files are in .wav format with a 16 kHz sample rate
- Verifying that the model files (`pytorch_model.bin`, `config.json`, `preprocessor_config.json`) are in your current directory
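To check the 16 kHz requirement without extra dependencies, the sample rate can be read from the WAV header with Python's standard `wave` module. This is a quick diagnostic sketch, not part of the app (here it is demonstrated on a tiny synthetic file):

```python
import os
import struct
import tempfile
import wave


def wav_sample_rate(path: str) -> int:
    """Read the sample rate from a .wav file's header."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate()


# Demonstrate on a synthetic 16 kHz mono file: 10 ms of silence.
tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
tmp.close()
with wave.open(tmp.name, "wb") as wf:
    wf.setnchannels(1)          # mono
    wf.setsampwidth(2)          # 16-bit samples
    wf.setframerate(16_000)     # 16 kHz
    wf.writeframes(struct.pack("<h", 0) * 160)
rate = wav_sample_rate(tmp.name)
os.unlink(tmp.name)
```

A file reporting anything other than 16000 here would need resampling before analysis.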
## Citation

If you use this application or the underlying model in your work, please cite:

```bibtex
@misc{superb2021,
  author       = {SUPERB Team},
  title        = {SUPERB: Speech processing Universal PERformance Benchmark},
  year         = {2021},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/s3prl/s3prl}},
}
```
## License
This project is licensed under the MIT License - see the LICENSE file for details.