---
language: en
tags:
- emotion-classification
- text-classification
- distilbert
datasets:
- dair-ai/emotion
metrics:
- accuracy
---

# Emotion Classification Model

## Model Description

This model fine-tunes **DistilBERT** for **emotion classification**. It assigns input text to one of six emotions: **sadness, joy, love, anger, fear, and surprise**. The model is intended for natural language processing applications where understanding emotion in text is valuable, such as social media analysis, customer feedback triage, and mental health monitoring.

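The label set comes from the dair-ai/emotion dataset. A quick way to inspect the label names and their index order (a minimal sketch, assuming the `datasets` library is installed):

```python
from datasets import load_dataset

# Print the ClassLabel names of dair-ai/emotion in index order
dataset = load_dataset("dair-ai/emotion")
print(dataset["train"].features["label"].names)
# ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
```
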
## Training and Evaluation

- **Training Dataset:** [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion) (16,000 training examples)
- **Validation Accuracy:** 94.5%
- **Test Accuracy:** 93.1%
- **Training Time:** 169.2 seconds (~2 minutes 49 seconds)
- **Hyperparameters** (reflected in the sketch below):
  - Learning Rate: 5e-5
  - Batch Size (Train): 32
  - Batch Size (Validation): 64
  - Epochs: 3
  - Weight Decay: 0.01
  - Optimizer: AdamW
  - Evaluation Strategy: per epoch

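The listed hyperparameters map directly onto `transformers.TrainingArguments`. The following is a minimal sketch of how a comparable fine-tuning run could be set up; the base checkpoint (`distilbert-base-uncased`), `max_length=128`, and the output directory are assumptions, not details stated in this card:

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("dair-ai/emotion")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Fixed-length padding keeps the default data collator simple;
    # max_length=128 is an assumption, not stated in the card
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6
)

def compute_metrics(eval_pred):
    # Accuracy, matching the metric reported above
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

args = TrainingArguments(
    output_dir="emotion-classification-model",
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    num_train_epochs=3,
    weight_decay=0.01,
    eval_strategy="epoch",  # named "evaluation_strategy" in older transformers releases
    # Trainer optimizes with AdamW by default, matching the optimizer listed above
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate(dataset["test"]))  # test-set metrics
```
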
## Usage

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
classifier = pipeline("text-classification", model="your-username/emotion-classification-model")

# Example usage
text = "I'm so happy today!"
result = classifier(text)
print(result)  # a list like [{'label': '<emotion>', 'score': <confidence>}]
```

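To see the model's confidence across all six emotions rather than only the top label, pass `top_k=None` to the pipeline (older `transformers` releases used `return_all_scores=True` instead). This is also useful for the ambiguous inputs discussed under Limitations below:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-username/emotion-classification-model",
    top_k=None,  # return a score for every label instead of only the best one
)

# Ambiguous, very short input: inspect the full score distribution
print(classifier("Wow!"))
# [{'label': ..., 'score': ...}, ...] sorted from highest to lowest score
```
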
## Limitations

**Biases in the Dataset**

The model was trained on the dair-ai/emotion dataset, which may not represent the full diversity of language use across demographics, regions, or cultures. As a result, it may underperform in the following situations:

- **Slang or informal language.** For example, "I'm shook!" may not be accurately classified as an expression of surprise.
- **Non-standard grammar or dialects.** Variants such as African American Vernacular English (AAVE) or regional dialects might lead to misclassifications.
- **Limited contextual understanding.** The model processes inputs as isolated pieces of text, without awareness of surrounding context. Sarcasm, for instance "Oh great, another rainy day!", may not be correctly classified as expressing frustration.
- **Complex or mixed emotions.** Texts expressing multiple emotions (e.g., "I'm angry but also relieved") may be oversimplified into a single label.
- **Short texts and ambiguity.** Performance can degrade on very short texts (e.g., one or two words) due to insufficient context. "Wow!" might be classified as joy or surprise depending on subtle cues absent from such brief inputs, and ambiguous inputs like "Okay" or "Fine" are challenging without additional context. Inspecting the full score distribution (see the `top_k=None` example above) can help flag such low-confidence cases.
- **Domain-specific language.** The model may underperform on text from specialized domains (e.g., legal, medical, or technical writing) or on code-mixed and multilingual inputs. For example, "Estoy feliz!" (Spanish for "I'm happy!") might not be recognized as expressing joy without multilingual training.

## Potential Improvements

- **Data Augmentation:** Including additional datasets or generating synthetic data could improve generalization.
- **Longer Training:** Training for more epochs could marginally increase accuracy, although diminishing returns are likely.
- **Larger Models:** Fine-tuning larger models such as BERT or RoBERTa may yield better results for nuanced understanding.
- **Bias Mitigation:** Incorporating fairness-aware training methods or more balanced datasets could reduce biases.