---
library_name: transformers
license: apache-2.0
datasets:
- stanfordnlp/imdb
language:
- en
metrics:
- accuracy
tags:
- minGRU
- sentiment_analysis
- custom_code
- hf_integration
---

# MinGRU Sentiment Analysis

This is the first Hugging Face integration of minGRU models from the paper "[**Were RNNs All We Needed?**](https://arxiv.org/abs/2410.01201)".

This model uses the BERT-Base-Uncased tokenizer.

For the modeling and configuration code, see [**minGRU-hf**](https://github.com/suayptalha/minGRU-hf/tree/main).

# Example Usage:

```py
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the minGRU classifier (custom modeling code lives in the repo, hence trust_remote_code).
model = AutoModelForSequenceClassification.from_pretrained(
    "suayptalha/minGRU-Sentiment-Analysis",
    trust_remote_code=True
).to(device)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "The movie was absolutely wonderful, I loved it!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)

with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    prediction = torch.argmax(logits, dim=-1).item()

sentiment = "positive" if prediction == 1 else "negative"
print(f"Text: {text}")
print(f"Predicted sentiment: {sentiment}")
```

> Text: The movie was absolutely wonderful, I loved it!
>
> Predicted sentiment: positive
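
The same setup also handles several reviews at once. Below is a minimal batched-inference sketch, reusing `model`, `tokenizer`, and `device` from the example above; the example texts are made up for illustration.

```py
texts = [
    "The movie was absolutely wonderful, I loved it!",
    "A dull, predictable plot and wooden acting.",
]

# Tokenize all texts into one padded batch and classify them in a single forward pass.
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)

with torch.no_grad():
    logits = model(**batch).logits

predictions = torch.argmax(logits, dim=-1).tolist()
for text, pred in zip(texts, predictions):
    print(f"{text} -> {'positive' if pred == 1 else 'negative'}")
```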
# Training:

Training code (it assumes a `train_dataloader` over the IMDB training split; see the sketch below):

```py
import torch
from torch.optim import AdamW
from torch.nn import CrossEntropyLoss
import matplotlib.pyplot as plt
from tqdm import tqdm

optimizer = AdamW(model.parameters(), lr=5e-5)
criterion = CrossEntropyLoss()  # unused below: the model computes the loss itself when `labels` is passed

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

num_epochs = 5
loss_values = []

# `train_dataloader` is assumed to yield batches with "input_ids" and "label" tensors
# (see the sketch below for one way to build it from the IMDB dataset).
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    progress_bar = tqdm(train_dataloader, desc=f"Epoch {epoch + 1}")

    for batch in progress_bar:
        input_ids = batch["input_ids"].to(device)
        labels = batch["label"].to(device)

        optimizer.zero_grad()
        outputs = model(input_ids=input_ids, labels=labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()
        progress_bar.set_postfix(loss=epoch_loss / len(progress_bar))

    avg_loss = epoch_loss / len(progress_bar)
    loss_values.append(avg_loss)

# Loss Graph
plt.figure(figsize=(10, 6))
plt.plot(range(1, num_epochs + 1), loss_values, marker='o', label='Training Loss')
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Training Loss Over Epochs")
plt.legend()
plt.grid(True)
plt.show()
```

You can use this code snippet for fine-tuning!
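
The training loop above assumes a `train_dataloader` is already defined. Here is a minimal sketch of one way to build it from the `stanfordnlp/imdb` dataset with the BERT tokenizer; the batch size and `max_length` are illustrative choices, not necessarily the values used to train this checkpoint.

```py
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Load the IMDB training split and tokenize the reviews.
imdb_train = load_dataset("stanfordnlp/imdb", split="train")

def tokenize(batch):
    # max_length=128 is an illustrative choice, matching the inference example above.
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=128)

imdb_train = imdb_train.map(tokenize, batched=True)
imdb_train.set_format(type="torch", columns=["input_ids", "label"])

# Yields batches with "input_ids" and "label" keys, as the training loop expects.
train_dataloader = DataLoader(imdb_train, batch_size=32, shuffle=True)
```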
# Loss Graph:

*(Training loss curve over the epochs, produced by the plotting code above.)*

# Credits:

Paper: [**Were RNNs All We Needed?**](https://arxiv.org/abs/2410.01201)

I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio, and Hossein Hajimirsadeghi for their paper.