Update README.md
README.md
CHANGED
@@ -1,71 +1,86 @@
|
|
1 |
-
|
2 |
-
base_model: distilbert-base-uncased
|
3 |
-
library_name: peft
|
4 |
-
license: apache-2.0
|
5 |
-
metrics:
|
6 |
-
- accuracy
|
7 |
-
tags:
|
8 |
-
- generated_from_trainer
|
9 |
-
model-index:
|
10 |
-
- name: distilbert-base-uncased-lora-text-classification
|
11 |
-
results: []
|
12 |
-
---
|
13 |
|
14 |
-
|
15 |
-
should probably proofread and complete it, then remove this comment. -->
|
16 |
|
17 |
-
#
|
18 |
|
19 |
-
|
20 |
-
It achieves the following results on the evaluation set:
|
21 |
-
- Loss: 0.4649
|
22 |
-
- Accuracy: {'accuracy': 0.8416206261510129}
|
23 |
|
24 |
-
|
|
|
25 |
|
26 |
-
|
|
|
27 |
|
28 |
-
##
|
|
|
29 |
|
30 |
-
|
31 |
|
32 |
-
## Training and evaluation data
|
33 |
|
34 |
-
More information needed
|
35 |
|
36 |
-
## Training procedure
|
37 |
-
|
38 |
-
### Training hyperparameters
|
39 |
|
|
|
40 |
The following hyperparameters were used during training:
|
41 |
-
|
42 |
-
-
|
43 |
-
-
|
44 |
-
-
|
45 |
-
-
|
46 |
-
-
|
47 |
-
-
|
48 |
-
|
49 |
-
|
50 |
-
|
51 |
-
|
|
52 |
-
|
53 |
-
|
|
54 |
-
|
|
55 |
-
|
|
56 |
-
|
|
57 |
-
|
|
58 |
-
|
|
59 |
-
|
|
60 |
-
|
|
61 |
-
|
|
62 |
-
|
|
63 |
-
|
64 |
-
|
65 |
-
|
66 |
-
|
67 |
-
-
|
68 |
-
-
|
69 |
-
-
|
70 |
-
|
71 |
-
|
1 |
+
# Fine-tuning an LLM on a custom dataset using LoRA
|
2 |
|
3 |
+
### Run the code in Google Colab: change the runtime to "T4 GPU" for faster training
|
|
|
4 |
|
5 |
+
# DistilBERT-base-uncased LoRA Text Classification Model
|
6 |
|
7 |
+
## Model Description
|
8 |
+
This model is a LoRA fine-tuned version of `distilbert-base-uncased`; the dataset is linked in the Dataset Viewer section. It achieves the following results on the evaluation set:
|
|
|
|
|
9 |
|
10 |
+
- **Loss:** 0.4649
|
11 |
+
- **Accuracy:** 84.16%
|
12 |
|
13 |
+
## Intended Uses & Limitations
|
14 |
+
This model is intended for text classification.
|
15 |
|
16 |
+
## Training and Evaluation Data
|
17 |
+
See the Training Results table below for per-epoch loss and accuracy.
|
18 |
|
19 |
+
## Steps to follow
|
20 |
+
- Install the Libraries
|
21 |
+
- Load the Dataset from Hugging Face
|
22 |
+
- Split the Dataset into Train and Test Sets
|
23 |
+
- Model
|
24 |
+
- Preprocess Data
|
25 |
+
- Evaluation
|
26 |
+
- Apply the untrained base model (`distilbert-base-uncased`) to text
|
27 |
+
- Train Model using LoRA
|
28 |
+
- Generate Predictions
|
29 |
+
- Save the Model and the Tokenizer
|
30 |
+
- Load the Model and the Tokenizer to test
|
31 |
+
- Push the Model to the Hugging Face Hub
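
The "Train Model using LoRA" step in the list above could look roughly like the sketch below. The README does not state the LoRA settings, so the rank, scaling factor, dropout, target modules, and `num_labels=2` are illustrative assumptions and not the values used for this model; `build_lora_model` is a hypothetical helper name.

```python
# Illustrative LoRA settings: the README does not list them, so every value
# here is an assumption rather than the configuration used for this model.
LORA_SETTINGS = {
    "task_type": "SEQ_CLS",       # sequence classification
    "r": 4,                       # assumed rank of the low-rank update
    "lora_alpha": 32,             # assumed scaling factor
    "lora_dropout": 0.01,         # assumed dropout on the LoRA layers
    "target_modules": ["q_lin"],  # DistilBERT query projections (assumed)
}


def build_lora_model(base_model_name="distilbert-base-uncased", num_labels=2):
    """Wrap the base classifier with LoRA adapters (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without
    # transformers or peft installed.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForSequenceClassification

    base_model = AutoModelForSequenceClassification.from_pretrained(
        base_model_name, num_labels=num_labels
    )
    peft_model = get_peft_model(base_model, LoraConfig(**LORA_SETTINGS))
    # Report how many parameters are trainable versus frozen.
    peft_model.print_trainable_parameters()
    return peft_model
```

Only the adapter weights (and the classification head) are trained in this setup; the base DistilBERT weights stay frozen.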
|
32 |
|
|
|
33 |
|
|
|
34 |
|
|
|
|
|
|
|
35 |
|
36 |
+
### Training Hyperparameters
|
37 |
The following hyperparameters were used during training:
|
38 |
+
|
39 |
+
- **Learning Rate:** 0.001
|
40 |
+
- **Train Batch Size:** 4
|
41 |
+
- **Eval Batch Size:** 4
|
42 |
+
- **Seed:** 42
|
43 |
+
- **Optimizer:** Adam with betas=(0.9,0.999) and epsilon=1e-08
|
44 |
+
- **LR Scheduler Type:** Linear
|
45 |
+
- **Number of Epochs:** 10
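
As a rough sketch, the hyperparameters listed above map onto `transformers.TrainingArguments` keyword arguments as shown below. This assumes training on a single device, so the listed batch sizes are treated as per-device values; `make_training_args` and the `output_dir` path are hypothetical names chosen for illustration.

```python
# Keyword arguments mirroring the hyperparameters listed in the README.
# Assumes a single device, so batch size 4 is used as the per-device value.
TRAINING_KWARGS = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}


def make_training_args(output_dir="lora-text-classification-out"):
    """Build TrainingArguments; output_dir is a hypothetical local path."""
    # Imported lazily so the dictionary above is usable without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```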
|
46 |
+
|
47 |
+
### Training Results
|
48 |
+
| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|
49 |
+
|-------|---------------|-----------------|---------------------|
|
50 |
+
| 1.0 | 0.5924 | 0.5523 | 78.45% |
|
51 |
+
| 2.0 | 0.5983 | 0.5236 | 80.29% |
|
52 |
+
| 3.0 | 0.5703 | 0.4498 | 79.56% |
|
53 |
+
| 4.0 | 0.5526 | 0.4976 | 80.66% |
|
54 |
+
| 5.0 | 0.5326 | 0.4317 | 80.85% |
|
55 |
+
| 6.0 | 0.5851 | 0.4562 | 82.87% |
|
56 |
+
| 7.0 | 0.5466 | 0.4713 | 81.95% |
|
57 |
+
| 8.0 | 0.5494 | 0.5072 | 82.50% |
|
58 |
+
| 9.0 | 0.5748 | 0.4802 | 82.87% |
|
59 |
+
| 10.0 | 0.5001 | 0.4649 | 84.16% |
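
The table can be checked for which epoch is best on each metric. The short script below is only an illustration: the numbers are copied from the table, and the variable names are hypothetical.

```python
# (epoch, training loss, validation loss, validation accuracy in %)
# copied from the Training Results table.
RESULTS = [
    (1, 0.5924, 0.5523, 78.45),
    (2, 0.5983, 0.5236, 80.29),
    (3, 0.5703, 0.4498, 79.56),
    (4, 0.5526, 0.4976, 80.66),
    (5, 0.5326, 0.4317, 80.85),
    (6, 0.5851, 0.4562, 82.87),
    (7, 0.5466, 0.4713, 81.95),
    (8, 0.5494, 0.5072, 82.50),
    (9, 0.5748, 0.4802, 82.87),
    (10, 0.5001, 0.4649, 84.16),
]

# Pick the row with the lowest validation loss and the row with the
# highest validation accuracy.
best_by_loss = min(RESULTS, key=lambda row: row[2])
best_by_accuracy = max(RESULTS, key=lambda row: row[3])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]:.4f})")
print(f"Highest validation accuracy: epoch {best_by_accuracy[0]} ({best_by_accuracy[3]:.2f}%)")
```

On these numbers the lowest validation loss occurs at epoch 5 (0.4317), while the reported final checkpoint at epoch 10 has the highest validation accuracy (84.16%).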
|
60 |
+
|
61 |
+
## Framework Versions
|
62 |
+
- **PEFT:** 0.12.0
|
63 |
+
- **Transformers:** 4.42.4
|
64 |
+
- **PyTorch:** 2.4.0+cu121
|
65 |
+
- **Datasets:** 2.21.0
|
66 |
+
- **Tokenizers:** 0.19.1
|
67 |
+
|
68 |
+
## Dataset Viewer
|
69 |
+
|
70 |
+
You can view the dataset using the following link:
|
71 |
+
|
72 |
+
[View Twitter Sentiment Preprocessed Dataset](https://huggingface.co/datasets/shukdevdatta123/twitter_sentiment_preprocessed/)
|
73 |
+
|
74 |
+
Simply click the link to open the dataset viewer in your browser.
|
75 |
+
|
76 |
+
## Model Viewer
|
77 |
+
|
78 |
+
You can view the model using the following link:
|
79 |
+
|
80 |
+
[View Model on Hugging Face](https://huggingface.co/shukdevdatta123/distilbert-base-uncased-lora-text-classification/)
|
81 |
+
|
82 |
+
Simply click the link to open the model page in your browser.
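
Beyond viewing it in the browser, the model can probably be loaded for inference along the lines of the sketch below. It assumes the linked repository holds a PEFT adapter for `distilbert-base-uncased`, that the task has two labels (`num_labels=2`), and that the base model's tokenizer is the right one; none of these are confirmed here, and `classify` is a hypothetical helper name.

```python
# Repository ID taken from the model link above; the base model ID is the
# one named in the Model Description.
ADAPTER_ID = "shukdevdatta123/distilbert-base-uncased-lora-text-classification"
BASE_MODEL_ID = "distilbert-base-uncased"


def classify(texts, num_labels=2):
    """Return the predicted class index for each input text.

    num_labels=2 is an assumption; adjust it to match the trained adapter.
    """
    # Imported lazily so the constants above are usable without these packages.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForSequenceClassification.from_pretrained(
        BASE_MODEL_ID, num_labels=num_labels
    )
    # Attach the LoRA adapter weights to the frozen base model.
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.argmax(dim=-1).tolist()
```

Calling `classify(["some tweet text"])` would download both the base model and the adapter, so it needs network access and the `transformers`, `peft`, and `torch` packages.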
|
83 |
+
|
84 |
+
Check out the "Fine-tune LLM.pptx" file for the theory behind this code.
|
85 |
+
|
86 |
+
|