---
language:
- en
base_model:
- google-t5/t5-large
library_name: transformers
---
|
Model: t5-DistillingSbS-ABSA

Task: Aspect-Based Sentiment Analysis (ABSA) - specifically, Aspect Pair Sentiment Extraction

Technique: Distilling Step-by-Step (DistillingSbS)

Model Description
|
|
|
t5-DistillingSbS-ABSA is a fine-tuned t5-large model designed to perform Aspect-Based Sentiment Analysis (ABSA), particularly the task of Aspect Pair Sentiment Extraction.

I used a training approach called Distilling Step-by-Step, originally proposed by Hsieh et al. at Google Research in [this paper](https://arxiv.org/abs/2305.02301).
|
|
|
Dataset

The dataset consists of customer reviews of mobile apps that were originally unannotated. They were scraped and collected by Martens et al. for their paper ["On the Emotion of Users in App Reviews"](https://ieeexplore.ieee.org/document/7961885).

The data was annotated with gpt-3.5-turbo via the OpenAI API: each review was labeled for specific aspects (e.g., UI, functionality, performance) and the corresponding sentiment (positive, negative, neutral).

Additionally, sentence-long rationales were extracted to justify the aspect-sentiment pair annotations; these rationales serve as the intermediate supervision for the Distilling Step-by-Step training.
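
For illustration, the annotation step looked roughly like the following. This is only a minimal sketch using the current OpenAI Python SDK; the prompt wording, the helper name, and the output parsing are placeholders, not the exact ones used to build the dataset.

```python
# Hypothetical sketch of the gpt-3.5-turbo annotation call; the real prompt
# and parsing used for the dataset are in the GitHub repo, not reproduced here.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def annotate_review(review: str) -> str:
    prompt = (
        "Extract the aspects mentioned in this app review, the sentiment "
        "(positive, negative or neutral) towards each aspect, and a one-sentence "
        "rationale justifying each aspect-sentiment pair.\n\nReview: " + review
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

print(annotate_review("Bad app, crashes constantly"))
```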
|
|
|
Training

Training was performed using Hugging Face's Trainer API in Google Colaboratory, on a single A100 GPU with 40 GB of VRAM.

Training took around 6 hours and cost about 80 compute units.

A custom loss function, tokenization function, and training loop were used; all code can be found in my [GitHub repository](https://github.com/trichter93/ABSA-LLMs-DistillingSbS/).
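
To give an idea of what the custom loss does, here is a minimal sketch of the combined objective Distilling Step-by-Step trains with: a standard seq2seq loss on the label targets plus a weighted seq2seq loss on the rationale targets. The function name, batch layout, and weight are illustrative; the actual implementation is in the repository.

```python
# Illustrative sketch of the Distilling Step-by-Step objective: the student is
# trained on two targets per review (label and rationale), and the two losses
# are combined. The weight `lam` and the batch layout are assumptions here.
def distilling_sbs_loss(model, label_batch, rationale_batch, lam=1.0):
    # Each batch is a dict with input_ids, attention_mask and labels, where the
    # inputs carry different task prefixes for label vs. rationale generation.
    label_loss = model(**label_batch).loss          # loss on '"aspect": "sentiment"' targets
    rationale_loss = model(**rationale_batch).loss  # loss on the rationale targets
    return label_loss + lam * rationale_loss
```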
|
|
|
Hyperparameters

Some of the key hyperparameters used for fine-tuning (a rough Trainer configuration sketch follows the list):
|
|
|
- Batch Size: 3
- Gradient Accumulation Steps: 12
- Optimizer: AdamW
- Learning Rate: 1e-4
- Epochs: 5
- Max Sequence Length: 512
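
For reference, these settings correspond roughly to the Trainer configuration below. This is a sketch, not the repo's exact arguments, and the max sequence length of 512 is applied at tokenization time (via the tokenizer's model_max_length) rather than through TrainingArguments.

```python
# Approximate Trainer configuration for the hyperparameters listed above
# (illustrative; the exact TrainingArguments live in the GitHub repo).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="t5-DistillingSbS-ABSA",
    per_device_train_batch_size=3,    # batch size 3
    gradient_accumulation_steps=12,   # effective batch size of 36
    learning_rate=1e-4,
    num_train_epochs=5,
    optim="adamw_torch",              # AdamW
)
```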
|
|
|
Example usage:
|
|
|
```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

base_model_name = 't5-large'
tokenizer = T5Tokenizer.from_pretrained(base_model_name, model_max_length=512)
model = T5ForConditionalGeneration.from_pretrained('trichter/t5-DistillingSbS-ABSA')

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# tokenize_function and generate() are helpers from the GitHub repo
# (tokenize_function is in data_utils.py); examples need the fields 'appName' and 'review'.
examples = {'appName': ['Google Chrome', 'Google Chrome'],
            'review': ['This app is great, the speed is unmatched', 'Bad app, crashes constantly']}
model_inputs = tokenize_function(examples)

# generate() returns either labels or rationales depending on return_type (default: 'labels')
outputs = generate(model, model_inputs, return_type='labels')
tokenizer.decode(outputs[0], skip_special_tokens=True)  # '"speed": "positive"'
tokenizer.decode(outputs[1], skip_special_tokens=True)  # '"crashes": "negative"'

outputs = generate(model, model_inputs, return_type='rationales')
tokenizer.decode(outputs[0], skip_special_tokens=True)  # '"speed": "the review explicitly mentions that the speed of the app is unmatched, indicating satisfaction with its performance in terms of speed."'
tokenizer.decode(outputs[1], skip_special_tokens=True)  # '"crashes": "the app crashing constantly is explicitly mentioned as a major issue, indicating dissatisfaction with its stability."'
```