|
---
license: apache-2.0
language: en
tags:
- sentiment-analysis
- roberta
- fine-tuned
datasets:
- custom
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- FacebookAI/roberta-base
pipeline_tag: text-classification
---
|
|
|
# Final Sentiment Model - Go-Raw |
|
|
|
## Model description |
|
This is a fine-tuned `roberta-base` model for multi-class sentiment classification. |
|
It was trained on a custom dataset of ~240k examples with 3 sentiment classes: |
|
- 0: Negative

- 1: Neutral

- 2: Positive
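
Since downstream code depends on this class order, it is worth confirming it against the mapping stored in the repo's config. A minimal sketch (the expected output mirrors the list above; if no mapping was saved, generic `LABEL_0`-style names are returned instead):

```python
from transformers import AutoConfig

# Load only the configuration; id2label holds the index -> label mapping.
config = AutoConfig.from_pretrained("Go-Raw/final-sentiment-model-go-raw")
print(config.id2label)  # expected: {0: 'Negative', 1: 'Neutral', 2: 'Positive'}
```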
|
|
|
Fine-tuning yields a large improvement over the `roberta-base` baseline on this task, raising accuracy from 34.1% to 88.1% and macro F1 from 24.3% to 87.5% (see Evaluation below).
|
|
|
## Intended uses & limitations |
|
- ✅ Suitable for sentiment analysis of English text.

- 🚫 Not tested on other languages or on domains beyond the training data.

- 🚫 Not suitable for detecting abusive, toxic, or hate speech.
|
|
|
## Training details |
|
- Base model: `roberta-base` |
|
- Epochs: 3 |
|
- Learning rate: 2e-5 |
|
- Batch size: 8 |
|
- Optimizer: AdamW (see the fine-tuning sketch below)
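
The exact training script is not published; the following is a minimal sketch of how these hyperparameters map onto the `transformers` `Trainer` API. The tiny inline dataset is a stand-in for the ~240k-example custom corpus, and `output_dir` is arbitrary:

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy stand-in for the (unpublished) ~240k-example custom dataset.
raw = Dataset.from_dict({
    "text": ["terrible service", "it arrived on time", "absolutely fantastic"],
    "label": [0, 1, 2],  # 0: Negative, 1: Neutral, 2: Positive
})

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=3)

def tokenize(batch):
    # Fixed-length padding keeps the default data collator happy.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = raw.map(tokenize, batched=True)

# Hyperparameters from the list above; Trainer's default optimizer is AdamW.
args = TrainingArguments(
    output_dir="sentiment-roberta",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```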
|
|
|
## Evaluation |
|
|
|
### Dataset |
|
- Train set: 194,038 examples (≈80%)

- Test set: 48,510 examples (≈20%)
|
|
|
### Performance |
|
|
|
| Metric | Base Model | Fine-tuned Model | |
|
|-------|------------|-------------------| |
|
| Accuracy | 34.1% | **88.1%** | |
|
| Macro F1 | 24.3% | **87.5%** | |
|
| Weighted F1 | 27.1% | **88.1%** | |
|
|
|
### Per-class metrics |
|
|
|
| Class | Precision | Recall | F1-score | |
|
|------|-----------|--------|---------| |
|
| **0 (Negative)** | 85.3% | 83.1% | 84.2% | |
|
| **1 (Neutral)** | 91.4% | 89.8% | 90.5% | |
|
| **2 (Positive)** | 86.0% | 89.4% | 87.7% | |
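
For reference, numbers of this shape can be reproduced with scikit-learn once test-set predictions are collected. A hedged sketch, with toy stand-ins for the 48,510 true and predicted test labels:

```python
from sklearn.metrics import accuracy_score, classification_report

# Toy stand-ins for the real test labels and model predictions.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

print(accuracy_score(y_true, y_pred))
# Per-class precision/recall/F1 plus the macro and weighted averages
# reported in the tables above.
print(classification_report(y_true, y_pred,
                            target_names=["Negative", "Neutral", "Positive"]))
```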
|
|
|
## How to use |
|
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("Go-Raw/final-sentiment-model-go-raw")
tokenizer = AutoTokenizer.from_pretrained("Go-Raw/final-sentiment-model-go-raw")

text = "I absolutely love this!"
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring class (0: Negative, 1: Neutral, 2: Positive).
predicted_class = outputs.logits.argmax(dim=-1).item()
print(predicted_class)
```
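
For quick experiments, the same checkpoint can also be driven through the high-level `pipeline` API; the label string it returns comes from the `id2label` mapping stored in the repo's config:

```python
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="Go-Raw/final-sentiment-model-go-raw")
print(classifier("I absolutely love this!"))  # e.g. [{'label': ..., 'score': ...}]
```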