---
library_name: transformers
tags:
  - bart-base
  - sentiment classification
  - twitter
  - binary classification
---

Model Card for bart-base-twitter-sent-ft-001

This is a fine-tune of facebook/bart-base on a Twitter sentiment classification dataset.


```python
import torch
from transformers import BartTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned classifier and its matching tokenizer
tokenizer = BartTokenizer.from_pretrained('mylonasc/bart-base-twitter-sent-ft-001')
model = AutoModelForSequenceClassification.from_pretrained('mylonasc/bart-base-twitter-sent-ft-001')

phrases = [
    'this is a great model! I really like it!',
    'Do you call this a model? This is not even 1B parameters! Get outta here!',
    'Fine tuning transformers is very easy if you use all the right tools!',
    "John couldn't write two correct lines of code without ChatGPT if his life depended on it..."
]

toks = tokenizer(phrases, return_tensors='pt', padding='longest')
with torch.no_grad():
    logits = model(**toks).logits

# Probability that each phrase is positive (class index 1)
is_positive = logits.softmax(dim=1)[:, 1]
print(is_positive)
# >> tensor([0.9994, 0.1362, 0.9995, 0.3840])
```

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Model type: BART (encoder-decoder transformer) with a sequence-classification head
  • Language(s) (NLP): English

Model Sources [optional]

  • Repository: [More Information Needed]

Uses

Sentiment classification for English sentences.

Recommendations

This is the output of a short technical implementation project demonstrating model fine-tuning with the 🤗 transformers library.

  • The model was trained on very short sequences (max 65 tokens).
  • Preliminary benchmarking showed that it strongly outperforms a zero-shot BART-large model on sentiment classification.
  • It is uncertain how the model will behave in other contexts.

This model is not recommended for production use - use at your own risk!
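The zero-shot BART-large baseline mentioned above can be reproduced along these lines, as a hedged sketch: the original benchmark setup is not published, so the model id (`facebook/bart-large-mnli`, the standard zero-shot checkpoint) and the candidate labels are assumptions.

```python
from transformers import pipeline

# Hypothetical zero-shot baseline; downloads facebook/bart-large-mnli on first run
zero_shot = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')

result = zero_shot(
    'this is a great model! I really like it!',
    candidate_labels=['positive', 'negative'],  # assumed label set
)
print(result['labels'][0])  # highest-scoring label
```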

How to Get Started with the Model

See the usage example at the top of this card.

Training Details

Training Data

A Twitter sentiment classification dataset - not hosted on Hugging Face.

[More Information Needed]

Training Procedure

5 epochs, learning rate 3e-5, batch size 256. A 10% validation split was held out for early stopping (and not merged back into the training set afterwards).
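The procedure above can be sketched with the 🤗 Trainer API. This is a minimal reconstruction, not the actual training script: the output directory, dataset objects (`train_ds`, `val_ds`), and early-stopping patience are all assumptions; only the epoch count, learning rate, and batch size come from the description above.

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
    EarlyStoppingCallback,
)

# Hyperparameters taken from the training description; everything else is assumed
args = TrainingArguments(
    output_dir='bart-base-twitter-sent-ft',  # hypothetical path
    num_train_epochs=5,
    learning_rate=3e-5,
    per_device_train_batch_size=256,
    evaluation_strategy='epoch',      # evaluate on the 10% held-out split
    save_strategy='epoch',
    load_best_model_at_end=True,      # required for early stopping
    metric_for_best_model='eval_loss',
)

model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-base', num_labels=2)

# `train_ds` / `val_ds`: hypothetical tokenized datasets from a 90/10 split
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # patience is a guess
)
trainer.train()
```

Note that `evaluation_strategy` was renamed `eval_strategy` in recent transformers releases, so the keyword may need adjusting for your installed version.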

Preprocessing

  • Rows containing a lot of special characters were removed.
  • De-duplication was not necessary (at most 4 duplicates, all single-word tweets).
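The two checks above can be sketched in plain Python. The filtering threshold (`max_ratio`) is an assumption; the card does not say how "a lot of special characters" was defined.

```python
import re
from collections import Counter

def special_char_ratio(text: str) -> float:
    """Fraction of characters that are neither alphanumeric nor whitespace."""
    if not text:
        return 0.0
    specials = len(re.findall(r'[^\w\s]', text))
    return specials / len(text)

def filter_rows(tweets, max_ratio=0.3):
    """Drop tweets dominated by special characters (threshold is a guess)."""
    return [t for t in tweets if special_char_ratio(t) <= max_ratio]

def duplicate_counts(tweets):
    """Tally repeated tweets to decide whether de-duplication is needed."""
    return {t: n for t, n in Counter(tweets).items() if n > 1}

tweets = ['great day!', '@#$%^&*!!!', 'nice', 'nice']
print(filter_rows(tweets))       # ['great day!', 'nice', 'nice']
print(duplicate_counts(tweets))  # {'nice': 2}
```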

Hardware

1x RTX4090