End of training

Files changed:
- README.md +50 -37
- model-00001-of-00002.safetensors +1 -1
- model-00002-of-00002.safetensors +1 -1
- training_args.bin +1 -1

README.md CHANGED
@@ -18,18 +18,18 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Rewards/chosen: -0.
-- Rewards/rejected: -0.
+- Loss: 0.0975
+- Rewards/chosen: -0.0396
+- Rewards/rejected: -0.4676
 - Rewards/accuracies: 1.0
-- Rewards/margins: 0.
-- Logps/rejected: -
-- Logps/chosen: -0.
-- Logits/rejected: -
-- Logits/chosen: -
-- Nll Loss: 0.
-- Log Odds Ratio: -0.
-- Log Odds Chosen:
+- Rewards/margins: 0.4279
+- Logps/rejected: -4.6755
+- Logps/chosen: -0.3962
+- Logits/rejected: -0.4699
+- Logits/chosen: -0.4486
+- Nll Loss: 0.0962
+- Log Odds Ratio: -0.0125
+- Log Odds Chosen: 5.5764
 
 ## Model description
 
@@ -49,42 +49,55 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 8e-06
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
-- … (two deleted hyperparameter lines, truncated in the diff view)
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 32
-- total_eval_batch_size: 8
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 10
 
 ### Training results
 
-| Training Loss | Epoch | … (previous results table; remaining columns and rows truncated in the diff view)
+| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
+| 2.6683 | 0.3175 | 10 | 1.6958 | -0.1593 | -0.1930 | 1.0 | 0.0337 | -1.9297 | -1.5929 | -0.1829 | -0.1629 | 1.6445 | -0.5135 | 0.4079 |
+| 1.2542 | 0.6349 | 20 | 0.6070 | -0.1099 | -0.1558 | 1.0 | 0.0459 | -1.5585 | -1.0992 | -0.6100 | -0.5897 | 0.5637 | -0.4328 | 0.6314 |
+| 0.4588 | 0.9524 | 30 | 0.2601 | -0.0833 | -0.1420 | 1.0 | 0.0587 | -1.4205 | -0.8335 | -0.6077 | -0.5806 | 0.2247 | -0.3531 | 0.8956 |
+| 0.2048 | 1.2698 | 40 | 0.2038 | -0.0707 | -0.1643 | 1.0 | 0.0936 | -1.6427 | -0.7072 | -0.4567 | -0.4212 | 0.1807 | -0.2310 | 1.4245 |
+| 0.2328 | 1.5873 | 50 | 0.1706 | -0.0627 | -0.2262 | 1.0 | 0.1635 | -2.2622 | -0.6271 | -0.5442 | -0.4956 | 0.1594 | -0.1123 | 2.3222 |
+| 0.1908 | 1.9048 | 60 | 0.1504 | -0.0563 | -0.2841 | 1.0 | 0.2278 | -2.8407 | -0.5630 | -0.7924 | -0.7412 | 0.1443 | -0.0613 | 3.1099 |
+| 0.1374 | 2.2222 | 70 | 0.1419 | -0.0541 | -0.3223 | 1.0 | 0.2682 | -3.2228 | -0.5409 | -0.5342 | -0.4924 | 0.1374 | -0.0447 | 3.5674 |
+| 0.1295 | 2.5397 | 80 | 0.1353 | -0.0521 | -0.3585 | 1.0 | 0.3064 | -3.5854 | -0.5211 | -0.2921 | -0.2549 | 0.1320 | -0.0333 | 3.9906 |
+| 0.1346 | 2.8571 | 90 | 0.1268 | -0.0485 | -0.3594 | 1.0 | 0.3109 | -3.5940 | -0.4848 | -0.6749 | -0.6326 | 0.1236 | -0.0314 | 4.1171 |
+| 0.1066 | 3.1746 | 100 | 0.1205 | -0.0467 | -0.3698 | 1.0 | 0.3231 | -3.6984 | -0.4669 | -0.6432 | -0.6000 | 0.1178 | -0.0277 | 4.2827 |
+| 0.0897 | 3.4921 | 110 | 0.1166 | -0.0456 | -0.3783 | 1.0 | 0.3326 | -3.7826 | -0.4562 | -0.6700 | -0.6309 | 0.1140 | -0.0255 | 4.3985 |
+| 0.1036 | 3.8095 | 120 | 0.1142 | -0.0447 | -0.4046 | 1.0 | 0.3599 | -4.0459 | -0.4468 | -0.5417 | -0.5106 | 0.1121 | -0.0207 | 4.6892 |
+| 0.1117 | 4.1270 | 130 | 0.1103 | -0.0433 | -0.4254 | 1.0 | 0.3821 | -4.2539 | -0.4334 | -0.4460 | -0.4096 | 0.1086 | -0.0174 | 4.9551 |
+| 0.0763 | 4.4444 | 140 | 0.1060 | -0.0421 | -0.4359 | 1.0 | 0.3938 | -4.3594 | -0.4209 | -0.5615 | -0.5271 | 0.1045 | -0.0156 | 5.0922 |
+| 0.0732 | 4.7619 | 150 | 0.1058 | -0.0421 | -0.4417 | 1.0 | 0.3996 | -4.4165 | -0.4209 | -0.5553 | -0.5250 | 0.1043 | -0.0150 | 5.1857 |
+| 0.0926 | 5.0794 | 160 | 0.1029 | -0.0410 | -0.4408 | 1.0 | 0.3999 | -4.4082 | -0.4096 | -0.4512 | -0.4244 | 0.1014 | -0.0151 | 5.2026 |
+| 0.0624 | 5.3968 | 170 | 0.1025 | -0.0409 | -0.4495 | 1.0 | 0.4086 | -4.4950 | -0.4090 | -0.4781 | -0.4570 | 0.1010 | -0.0144 | 5.2920 |
+| 0.0781 | 5.7143 | 180 | 0.1021 | -0.0409 | -0.4554 | 1.0 | 0.4145 | -4.5536 | -0.4087 | -0.5630 | -0.5409 | 0.1008 | -0.0137 | 5.3576 |
+| 0.0833 | 6.0317 | 190 | 0.0996 | -0.0401 | -0.4499 | 1.0 | 0.4098 | -4.4989 | -0.4006 | -0.4745 | -0.4485 | 0.0982 | -0.0141 | 5.3305 |
+| 0.0541 | 6.3492 | 200 | 0.0998 | -0.0402 | -0.4648 | 1.0 | 0.4246 | -4.6481 | -0.4022 | -0.4330 | -0.4110 | 0.0985 | -0.0128 | 5.5083 |
+| 0.0715 | 6.6667 | 210 | 0.0972 | -0.0392 | -0.4586 | 1.0 | 0.4194 | -4.5862 | -0.3918 | -0.5684 | -0.5454 | 0.0959 | -0.0129 | 5.4637 |
+| 0.066 | 6.9841 | 220 | 0.0998 | -0.0406 | -0.4732 | 1.0 | 0.4325 | -4.7317 | -0.4062 | -0.4060 | -0.3847 | 0.0986 | -0.0121 | 5.5880 |
+| 0.058 | 7.3016 | 230 | 0.0979 | -0.0397 | -0.4643 | 1.0 | 0.4246 | -4.6434 | -0.3972 | -0.4258 | -0.4043 | 0.0966 | -0.0127 | 5.5285 |
+| 0.058 | 7.6190 | 240 | 0.0974 | -0.0395 | -0.4618 | 1.0 | 0.4224 | -4.6183 | -0.3945 | -0.4604 | -0.4380 | 0.0961 | -0.0129 | 5.5167 |
+| 0.072 | 7.9365 | 250 | 0.0966 | -0.0392 | -0.4607 | 1.0 | 0.4215 | -4.6070 | -0.3916 | -0.5075 | -0.4844 | 0.0953 | -0.0129 | 5.5101 |
+| 0.0636 | 8.2540 | 260 | 0.0969 | -0.0392 | -0.4642 | 1.0 | 0.4249 | -4.6417 | -0.3922 | -0.5018 | -0.4805 | 0.0957 | -0.0127 | 5.5503 |
+| 0.0501 | 8.5714 | 270 | 0.0973 | -0.0395 | -0.4655 | 1.0 | 0.4261 | -4.6554 | -0.3949 | -0.4806 | -0.4591 | 0.0960 | -0.0127 | 5.5554 |
+| 0.0634 | 8.8889 | 280 | 0.0973 | -0.0396 | -0.4663 | 1.0 | 0.4267 | -4.6630 | -0.3957 | -0.4770 | -0.4557 | 0.0960 | -0.0126 | 5.5671 |
+| 0.0547 | 9.2063 | 290 | 0.0973 | -0.0395 | -0.4663 | 1.0 | 0.4268 | -4.6630 | -0.3950 | -0.4694 | -0.4479 | 0.0960 | -0.0127 | 5.5673 |
+| 0.072 | 9.5238 | 300 | 0.0972 | -0.0395 | -0.4671 | 1.0 | 0.4276 | -4.6709 | -0.3951 | -0.4752 | -0.4533 | 0.0960 | -0.0126 | 5.5766 |
+| 0.0508 | 9.8413 | 310 | 0.0975 | -0.0396 | -0.4676 | 1.0 | 0.4279 | -4.6755 | -0.3962 | -0.4699 | -0.4486 | 0.0962 | -0.0125 | 5.5764 |
 
 ### Framework versions
 
 - Transformers 4.44.2
-- Pytorch 2.
+- Pytorch 2.4.1+cu121
 - Datasets 3.0.0
 - Tokenizers 0.19.1
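The metric names in this card (Rewards/*, Nll Loss, Log Odds Ratio, Log Odds Chosen) are the ones trl's ORPOTrainer logs, and in the final table each reward is exactly 0.1 × the corresponding logp, which matches trl's default ORPO beta of 0.1. The commit never names the training method, so the following is only a hedged reproduction sketch under that inference; the preference dataset (the card says "unknown dataset") and output directory are placeholders:

```python
# Hedged reproduction sketch -- ORPO via trl is inferred from the logged
# metrics, not stated in the commit. Dataset id and output_dir are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

BASE = "meta-llama/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# ORPOTrainer expects a preference dataset with prompt/chosen/rejected
# columns; the actual dataset used here is unknown.
train_ds = load_dataset("org/preference-dataset", split="train")  # placeholder

args = ORPOConfig(
    output_dir="llama-3.2-3b-orpo",   # placeholder
    learning_rate=8e-6,               # from the card
    per_device_train_batch_size=2,    # from the card
    per_device_eval_batch_size=2,     # from the card
    gradient_accumulation_steps=2,    # gives total_train_batch_size = 4
    num_train_epochs=10,              # from the card
    lr_scheduler_type="linear",       # from the card
    seed=42,                          # from the card
    beta=0.1,                         # inferred: rewards = 0.1 * logps in the table
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    tokenizer=tokenizer,  # `processing_class` in newer trl releases
)
trainer.train()
```

As a cross-check on the hyperparameters: the epoch column advances by 0.3175 per 10 steps, i.e. about 31.5 optimizer steps per epoch, which at the effective batch size of 4 implies a training set of roughly 126 preference pairs.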
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:e5d5aab14184e2968d95b5725229c87ad40c8ec3a92f0d73c1fd138f8ef53a50
 size 4965811384
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:968c134a2fc6d77e5b2d0b620f4d7db33fb93436b78dd31d8149d95d199c56aa
 size 1459729952
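The two shard sizes in these pointers line up with the 3B-parameter base model stored at 2 bytes per weight; a quick arithmetic check (the bf16, 2-bytes-per-parameter assumption is mine, not stated in the commit):

```python
# Sanity check: shard byte counts taken from the two LFS pointers above.
shards = [4_965_811_384, 1_459_729_952]
total = sum(shards)   # 6_425_541_336 bytes, about 5.98 GiB
params = total / 2    # assuming bf16 weights at 2 bytes per parameter
print(f"{total / 2**30:.2f} GiB -> ~{params / 1e9:.2f}B parameters")
# ~3.21B parameters, consistent with Llama-3.2-3B
```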
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0fc4362a98b5bebb0bf0a25ebb1ec27eaf959249d885baa5568a26772c2f9a80
 size 5496
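Each binary in this commit is stored as a Git LFS pointer: a spec version line, a sha256 oid, and the byte size. A small sketch for verifying a downloaded shard against its pointer, assuming the local filename matches the repo path:

```python
# Verify a downloaded file against the sha256 oid in its LFS pointer.
import hashlib

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB shards fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

# oid from the model-00002 pointer above
EXPECTED = "968c134a2fc6d77e5b2d0b620f4d7db33fb93436b78dd31d8149d95d199c56aa"
assert sha256_of("model-00002-of-00002.safetensors") == EXPECTED
```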