Update README.md
Model Description
t5-DistillingSbS-ABSA is a fine-tuned t5-large model designed to perform Aspect-Based Sentiment Analysis (ABSA), specifically the task of Aspect-Sentiment Pair Extraction.
I used a training approach called Distilling Step-by-Step, originally proposed in [this paper](https://arxiv.org/abs/2305.02301) by Hsieh et al. at Google Research.
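
A minimal inference sketch is shown below. The Hub repo id is a placeholder, and because this card does not spell out the input/output serialization used during fine-tuning, the prompt and the sample output are illustrative only.

```python
# Minimal inference sketch. The repo id is a placeholder, and the prompt/output
# format shown here is an assumption, not the documented serialization.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-username/t5-DistillingSbS-ABSA"  # placeholder Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

review = "The new UI looks great, but the app crashes whenever I open the camera."
inputs = tokenizer(review, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Expected to yield aspect-sentiment pairs, e.g. "UI: positive; stability: negative"
```
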
Dataset
The dataset consisted of customer reviews of mobile apps that were originally unannotated. They were scraped and collected by Martens et al. for their paper titled ["On the Emotion of Users in App Reviews"](https://ieeexplore.ieee.org/document/7961885).
The data was annotated via the OpenAI API using the gpt-3.5-turbo model, with each review labeled for specific aspects (e.g., UI, functionality, performance) and the corresponding sentiment (positive, negative, or neutral).
Additionally, sentence-long rationales were extracted to justify the aspect-sentiment pair annotations, aiding in the Distilling Step-by-Step training.
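
The exact annotation prompt is not reproduced here; the sketch below only illustrates the general shape of such a gpt-3.5-turbo labeling call, assuming the openai Python package (v1+) and an OPENAI_API_KEY in the environment.

```python
# Illustrative annotation call (not the exact prompt used to build this dataset).
# Assumes the openai package v1+ and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

def annotate_review(review: str) -> str:
    """Ask gpt-3.5-turbo for aspect-sentiment pairs plus one-sentence rationales."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": (
                    "Extract (aspect, sentiment) pairs from the app review. "
                    "Sentiment is positive, negative, or neutral. "
                    "Give a one-sentence rationale for each pair."
                ),
            },
            {"role": "user", "content": review},
        ],
    )
    return response.choices[0].message.content

print(annotate_review("Love the dark mode, but sync keeps failing."))
```
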
Training was performed with Hugging Face's Trainer API in Google Colaboratory on a single A100 GPU with 40 GB of VRAM.
Training took around 6 hours with a cost of about 80 compute units.

Hyperparameters

Some of the key hyperparameters used for fine-tuning:
Batch Size: 3
Gradient Accumulation Steps: 12
Optimizer: AdamW
Learning Rate: 1e-4
Epochs: 5
Max Sequence Length: 512
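
As a rough illustration of how the training setup and the hyperparameters above map onto the Trainer API, here is a sketch using Seq2SeqTrainingArguments. Data loading, tokenization, and the rationale (multi-task) loss used by Distilling Step-by-Step are omitted, so this is not the exact training script.

```python
# Sketch of the training configuration implied by the card; dataset preparation
# and the Distilling Step-by-Step rationale loss are omitted.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-large")
# Inputs would be tokenized with truncation=True, max_length=512 (Max Sequence Length).

args = Seq2SeqTrainingArguments(
    output_dir="t5-DistillingSbS-ABSA",
    per_device_train_batch_size=3,   # Batch Size: 3
    gradient_accumulation_steps=12,  # effective batch size of 36
    learning_rate=1e-4,
    num_train_epochs=5,
    optim="adamw_torch",             # AdamW optimizer
)

# A Seq2SeqTrainer would then be built with the model, args, and the tokenized
# dataset of reviews, aspect-sentiment targets, and rationales:
# trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=train_ds)
# trainer.train()
```
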