Model save
- README.md +110 -0
- adapter_model.safetensors +1 -1
- all_results.json +8 -0
- runs/Apr24_01-31-15_poseidon/events.out.tfevents.1713922305.poseidon.732971.0 +2 -2
- train_results.json +8 -0
- trainer_state.json +0 -0
README.md
ADDED
@@ -0,0 +1,110 @@
---
license: apache-2.0
library_name: peft
tags:
- trl
- dpo
- generated_from_trainer
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
model-index:
- name: tinyllama-1.1b-chat-dpo-qlora
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# tinyllama-1.1b-chat-dpo-qlora

This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6085
- Rewards/chosen: -1.0876
- Rewards/rejected: -1.3914
- Rewards/accuracies: 0.6580
- Rewards/margins: 0.3038
- Logps/rejected: -490.8211
- Logps/chosen: -504.9807
- Logits/rejected: -2.6096
- Logits/chosen: -2.6425

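The reported margin is the chosen reward minus the rejected reward, and the first logged losses in the training table (about 0.692) sit close to ln 2, which is the value the sigmoid DPO loss takes when the margin is zero. The following is a minimal sketch of that arithmetic. It assumes the run used the standard sigmoid DPO loss, which the card does not state, and `dpo_sigmoid_loss` is a hypothetical helper written for illustration, not a TRL function.

```python
import math

# Final evaluation rewards, copied from the results listed above.
rewards_chosen = -1.0876
rewards_rejected = -1.3914

# Margin = chosen reward minus rejected reward.
margin = rewards_chosen - rewards_rejected


def dpo_sigmoid_loss(reward_margin: float) -> float:
    """Per-example sigmoid DPO loss, -log(sigmoid(margin)).

    Hypothetical helper; assumes the standard sigmoid loss variant.
    """
    return math.log1p(math.exp(-reward_margin))


# With a zero margin the loss equals ln 2, the expected starting point.
loss_at_zero_margin = dpo_sigmoid_loss(0.0)

print(f"margin = {margin:.4f}")                            # margin = 0.3038
print(f"loss at zero margin = {loss_at_zero_margin:.4f}")  # loss at zero margin = 0.6931
```

The reported evaluation loss is an average over examples, so it does not have to equal `dpo_sigmoid_loss` applied to the average margin.
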
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1

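The total batch size follows from the per-device batch size, the gradient accumulation steps, and the number of processes. The sketch below works through that arithmetic and the step counts it implies. Two assumptions are made: `num_processes = 1` is not listed in the card (it is inferred because 4 x 4 already equals the reported 16), and `train_samples = 61134` is taken from `all_results.json` in this commit. The step and warmup counts are approximations of what the Trainer computes, not values read from its logs.

```python
import math

per_device_batch_size = 4        # train_batch_size
gradient_accumulation_steps = 4  # gradient_accumulation_steps
num_processes = 1                # assumption: not listed in the card

# Samples consumed per optimizer step.
total_train_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_processes
)

train_samples = 61134            # from all_results.json
warmup_ratio = 0.1               # lr_scheduler_warmup_ratio

# Approximate optimizer steps in the single training epoch.
steps_per_epoch = math.ceil(train_samples / total_train_batch_size)

# Approximate number of warmup steps before the cosine decay starts.
warmup_steps = math.ceil(warmup_ratio * steps_per_epoch)

print(total_train_batch_size)  # 16
print(steps_per_epoch)         # 3821
print(warmup_steps)            # 383
```

The estimate of 3821 steps agrees with the results table, where step 3800 is logged at epoch 0.99.
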
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6921 | 0.03 | 100 | 0.6923 | 0.0160 | 0.0142 | 0.5645 | 0.0018 | -350.2683 | -394.6286 | -2.7841 | -2.8363 |
| 0.6894 | 0.05 | 200 | 0.6894 | 0.0433 | 0.0353 | 0.5920 | 0.0080 | -348.1495 | -391.8949 | -2.7811 | -2.8333 |
| 0.6815 | 0.08 | 300 | 0.6844 | 0.0806 | 0.0609 | 0.6025 | 0.0196 | -345.5898 | -388.1692 | -2.7838 | -2.8349 |
| 0.6869 | 0.1 | 400 | 0.6788 | 0.0607 | 0.0269 | 0.6125 | 0.0339 | -348.9979 | -390.1522 | -2.7931 | -2.8423 |
| 0.6744 | 0.13 | 500 | 0.6724 | 0.0243 | -0.0249 | 0.6210 | 0.0492 | -354.1764 | -393.7983 | -2.7889 | -2.8371 |
| 0.6679 | 0.16 | 600 | 0.6625 | -0.0566 | -0.1346 | 0.6265 | 0.0780 | -365.1402 | -401.8826 | -2.7709 | -2.8179 |
| 0.637 | 0.18 | 700 | 0.6555 | -0.2568 | -0.3654 | 0.6290 | 0.1086 | -388.2211 | -421.9038 | -2.7596 | -2.8051 |
| 0.6166 | 0.21 | 800 | 0.6488 | -0.3935 | -0.5223 | 0.6320 | 0.1288 | -403.9116 | -435.5756 | -2.7523 | -2.7961 |
| 0.6335 | 0.24 | 900 | 0.6458 | -0.4516 | -0.6042 | 0.6380 | 0.1527 | -412.1083 | -441.3798 | -2.7325 | -2.7764 |
| 0.6286 | 0.26 | 1000 | 0.6406 | -0.8692 | -1.0442 | 0.625 | 0.1750 | -456.1026 | -483.1429 | -2.7123 | -2.7531 |
| 0.669 | 0.29 | 1100 | 0.6406 | -0.3445 | -0.4984 | 0.6365 | 0.1538 | -401.5222 | -430.6789 | -2.6946 | -2.7354 |
| 0.6723 | 0.31 | 1200 | 0.6358 | -0.4619 | -0.6430 | 0.6425 | 0.1811 | -415.9841 | -442.4163 | -2.6701 | -2.7077 |
| 0.605 | 0.34 | 1300 | 0.6297 | -0.6894 | -0.8903 | 0.6435 | 0.2009 | -440.7144 | -465.1627 | -2.6764 | -2.7122 |
| 0.6361 | 0.37 | 1400 | 0.6267 | -0.7144 | -0.9307 | 0.6505 | 0.2163 | -444.7496 | -467.6648 | -2.6711 | -2.7091 |
| 0.6085 | 0.39 | 1500 | 0.6213 | -1.0532 | -1.3084 | 0.6490 | 0.2552 | -482.5256 | -501.5469 | -2.6435 | -2.6797 |
| 0.6317 | 0.42 | 1600 | 0.6197 | -1.1246 | -1.3825 | 0.6490 | 0.2579 | -489.9323 | -508.6858 | -2.6172 | -2.6506 |
| 0.6702 | 0.44 | 1700 | 0.6182 | -1.0036 | -1.2644 | 0.6530 | 0.2609 | -478.1268 | -496.5815 | -2.6407 | -2.6762 |
| 0.5658 | 0.47 | 1800 | 0.6219 | -1.3479 | -1.6348 | 0.6445 | 0.2869 | -515.1606 | -531.0145 | -2.5866 | -2.6182 |
| 0.6039 | 0.5 | 1900 | 0.6154 | -0.9014 | -1.1716 | 0.6630 | 0.2702 | -468.8458 | -486.3656 | -2.6376 | -2.6742 |
| 0.6173 | 0.52 | 2000 | 0.6121 | -1.1535 | -1.4470 | 0.6575 | 0.2934 | -496.3810 | -511.5793 | -2.6232 | -2.6580 |
| 0.62 | 0.55 | 2100 | 0.6116 | -1.1600 | -1.4523 | 0.6650 | 0.2923 | -496.9117 | -512.2247 | -2.6278 | -2.6629 |
| 0.5957 | 0.58 | 2200 | 0.6132 | -0.9592 | -1.2431 | 0.6655 | 0.2839 | -475.9958 | -492.1489 | -2.6317 | -2.6674 |
| 0.6093 | 0.6 | 2300 | 0.6138 | -1.0935 | -1.3811 | 0.6625 | 0.2876 | -489.7906 | -505.5738 | -2.6283 | -2.6619 |
| 0.6009 | 0.63 | 2400 | 0.6108 | -1.0519 | -1.3479 | 0.6610 | 0.2959 | -486.4695 | -501.4175 | -2.6088 | -2.6432 |
| 0.5988 | 0.65 | 2500 | 0.6108 | -1.0427 | -1.3419 | 0.6590 | 0.2992 | -485.8730 | -500.4982 | -2.6143 | -2.6477 |
| 0.606 | 0.68 | 2600 | 0.6112 | -1.0188 | -1.3192 | 0.6545 | 0.3003 | -483.6013 | -498.1078 | -2.5974 | -2.6304 |
| 0.6118 | 0.71 | 2700 | 0.6106 | -1.0808 | -1.3857 | 0.6595 | 0.3049 | -490.2562 | -504.3045 | -2.5945 | -2.6274 |
| 0.6134 | 0.73 | 2800 | 0.6096 | -1.1549 | -1.4635 | 0.6585 | 0.3086 | -498.0366 | -511.7179 | -2.5978 | -2.6303 |
| 0.6159 | 0.76 | 2900 | 0.6097 | -1.0550 | -1.3509 | 0.6585 | 0.2959 | -486.7739 | -501.7256 | -2.6175 | -2.6500 |
| 0.5815 | 0.79 | 3000 | 0.6091 | -1.1025 | -1.4048 | 0.6570 | 0.3023 | -492.1650 | -506.4727 | -2.6089 | -2.6420 |
| 0.5885 | 0.81 | 3100 | 0.6089 | -1.0977 | -1.4006 | 0.6595 | 0.3029 | -491.7444 | -505.9960 | -2.6001 | -2.6337 |
| 0.6074 | 0.84 | 3200 | 0.6086 | -1.0982 | -1.4029 | 0.6605 | 0.3047 | -491.9724 | -506.0455 | -2.6056 | -2.6388 |
| 0.5981 | 0.86 | 3300 | 0.6087 | -1.0853 | -1.3881 | 0.6610 | 0.3028 | -490.4915 | -504.7571 | -2.6117 | -2.6442 |
| 0.5944 | 0.89 | 3400 | 0.6087 | -1.0897 | -1.3931 | 0.6580 | 0.3034 | -490.9887 | -505.1947 | -2.6026 | -2.6360 |
| 0.5979 | 0.92 | 3500 | 0.6085 | -1.0922 | -1.3962 | 0.6595 | 0.3040 | -491.3070 | -505.4438 | -2.6136 | -2.6460 |
| 0.6154 | 0.94 | 3600 | 0.6086 | -1.0905 | -1.3946 | 0.6595 | 0.3040 | -491.1413 | -505.2781 | -2.6066 | -2.6397 |
| 0.6053 | 0.97 | 3700 | 0.6086 | -1.0907 | -1.3946 | 0.6550 | 0.3039 | -491.1405 | -505.2943 | -2.6094 | -2.6423 |
| 0.602 | 0.99 | 3800 | 0.6085 | -1.0876 | -1.3914 | 0.6580 | 0.3038 | -490.8211 | -504.9807 | -2.6096 | -2.6425 |


### Framework versions

- PEFT 0.7.1
- Transformers 4.39.3
- PyTorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2

adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b011251198adf4be2eea66558165d4087c69b22538e7c24f24de982f5dc1b4a9
 size 201892728

all_results.json
ADDED
@@ -0,0 +1,8 @@
{
    "epoch": 1.0,
    "train_loss": 0.6288731582999011,
    "train_runtime": 37165.2285,
    "train_samples": 61134,
    "train_samples_per_second": 1.645,
    "train_steps_per_second": 0.103
}

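The throughput fields can be cross-checked against the runtime and sample count in the same file. This is a small sketch of that check. The per-step batch size of 16 is taken from the README's `total_train_batch_size`, and treating the runtime as seconds is an assumption based on the usual Trainer convention.

```python
train_runtime = 37165.2285   # assumed to be seconds
train_samples = 61134
total_train_batch_size = 16  # from the README hyperparameters

# Samples processed per second over the whole run.
samples_per_second = train_samples / train_runtime

# Optimizer steps per second, assuming 16 samples per step.
steps_per_second = samples_per_second / total_train_batch_size

# Wall-clock duration in hours.
runtime_hours = train_runtime / 3600

print(round(samples_per_second, 3))  # 1.645
print(round(steps_per_second, 3))    # 0.103
print(round(runtime_hours, 2))       # 10.32
```

Both rounded rates match the reported `train_samples_per_second` and `train_steps_per_second`.
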
runs/Apr24_01-31-15_poseidon/events.out.tfevents.1713922305.poseidon.732971.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:3ca3481bafbf60e98dfc20469b03b8f791af91614b0b7cddb8e2ff9c953ba548
+size 297188

train_results.json
ADDED
@@ -0,0 +1,8 @@
{
    "epoch": 1.0,
    "train_loss": 0.6288731582999011,
    "train_runtime": 37165.2285,
    "train_samples": 61134,
    "train_samples_per_second": 1.645,
    "train_steps_per_second": 0.103
}

trainer_state.json
ADDED
The diff for this file is too large to render.