Rakuto committed · Commit 46008d7 · verified · 1 Parent(s): 291bdc3

End of training

README.md CHANGED
@@ -18,18 +18,18 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.8915
- - Rewards/chosen: -0.0850
- - Rewards/rejected: -0.1471
+ - Loss: 0.0975
+ - Rewards/chosen: -0.0396
+ - Rewards/rejected: -0.4676
  - Rewards/accuracies: 1.0
- - Rewards/margins: 0.0620
- - Logps/rejected: -1.4709
- - Logps/chosen: -0.8505
- - Logits/rejected: -1.4712
- - Logits/chosen: -1.3832
- - Nll Loss: 0.9223
- - Log Odds Ratio: -0.3568
- - Log Odds Chosen: 0.9048
+ - Rewards/margins: 0.4279
+ - Logps/rejected: -4.6755
+ - Logps/chosen: -0.3962
+ - Logits/rejected: -0.4699
+ - Logits/chosen: -0.4486
+ - Nll Loss: 0.0962
+ - Log Odds Ratio: -0.0125
+ - Log Odds Chosen: 5.5764
 
  ## Model description
 
@@ -49,42 +49,55 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 8e-06
- - train_batch_size: 4
- - eval_batch_size: 4
+ - train_batch_size: 2
+ - eval_batch_size: 2
  - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 32
- - total_eval_batch_size: 8
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 4
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 15
+ - num_epochs: 10
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
- |:-------------:|:-------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
- | No log | 0.9231 | 3 | 3.1180 | -0.2042 | -0.2350 | 1.0 | 0.0308 | -2.3500 | -2.0422 | -0.3548 | -0.2877 | 3.2236 | -0.5406 | 0.3484 |
- | No log | 1.8462 | 6 | 2.4173 | -0.1672 | -0.2020 | 1.0 | 0.0348 | -2.0201 | -1.6718 | -0.7942 | -0.7314 | 2.5066 | -0.5118 | 0.4131 |
- | No log | 2.7692 | 9 | 1.8963 | -0.1456 | -0.1843 | 1.0 | 0.0387 | -1.8430 | -1.4562 | -1.1387 | -1.0656 | 1.9719 | -0.4870 | 0.4769 |
- | 2.7866 | 3.6923 | 12 | 1.5463 | -0.1295 | -0.1719 | 1.0 | 0.0424 | -1.7186 | -1.2949 | -1.2124 | -1.1379 | 1.6132 | -0.4640 | 0.5419 |
- | 2.7866 | 4.6154 | 15 | 1.3213 | -0.1167 | -0.1620 | 1.0 | 0.0453 | -1.6199 | -1.1667 | -1.2722 | -1.1925 | 1.3803 | -0.4448 | 0.6009 |
- | 2.7866 | 5.5385 | 18 | 1.1909 | -0.1070 | -0.1556 | 1.0 | 0.0486 | -1.5558 | -1.0697 | -1.2959 | -1.2153 | 1.2451 | -0.4253 | 0.6636 |
- | 1.2163 | 6.4615 | 21 | 1.0941 | -0.0991 | -0.1514 | 1.0 | 0.0522 | -1.5137 | -0.9913 | -1.3331 | -1.2576 | 1.1398 | -0.4056 | 0.7311 |
- | 1.2163 | 7.3846 | 24 | 1.0280 | -0.0940 | -0.1491 | 1.0 | 0.0552 | -1.4912 | -0.9396 | -1.3397 | -1.2570 | 1.0687 | -0.3904 | 0.7840 |
- | 1.2163 | 8.3077 | 27 | 0.9800 | -0.0906 | -0.1479 | 1.0 | 0.0573 | -1.4791 | -0.9060 | -1.3772 | -1.2975 | 1.0183 | -0.3796 | 0.8225 |
- | 0.8526 | 9.2308 | 30 | 0.9480 | -0.0884 | -0.1475 | 1.0 | 0.0591 | -1.4746 | -0.8835 | -1.4089 | -1.3263 | 0.9836 | -0.3707 | 0.8542 |
- | 0.8526 | 10.1538 | 33 | 0.9230 | -0.0869 | -0.1470 | 1.0 | 0.0600 | -1.4695 | -0.8691 | -1.4306 | -1.3510 | 0.9565 | -0.3656 | 0.8716 |
- | 0.8526 | 11.0769 | 36 | 0.9042 | -0.0857 | -0.1470 | 1.0 | 0.0613 | -1.4703 | -0.8569 | -1.4566 | -1.3714 | 0.9358 | -0.3599 | 0.8933 |
- | 0.8526 | 12.0 | 39 | 0.8960 | -0.0852 | -0.1469 | 1.0 | 0.0617 | -1.4695 | -0.8524 | -1.4646 | -1.3772 | 0.9270 | -0.3583 | 0.8997 |
- | 0.7594 | 12.9231 | 42 | 0.8919 | -0.0850 | -0.1469 | 1.0 | 0.0619 | -1.4690 | -0.8498 | -1.4733 | -1.3842 | 0.9230 | -0.3571 | 0.9036 |
- | 0.7594 | 13.8462 | 45 | 0.8915 | -0.0850 | -0.1471 | 1.0 | 0.0620 | -1.4709 | -0.8505 | -1.4712 | -1.3832 | 0.9223 | -0.3568 | 0.9048 |
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
+ |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
+ | 2.6683 | 0.3175 | 10 | 1.6958 | -0.1593 | -0.1930 | 1.0 | 0.0337 | -1.9297 | -1.5929 | -0.1829 | -0.1629 | 1.6445 | -0.5135 | 0.4079 |
+ | 1.2542 | 0.6349 | 20 | 0.6070 | -0.1099 | -0.1558 | 1.0 | 0.0459 | -1.5585 | -1.0992 | -0.6100 | -0.5897 | 0.5637 | -0.4328 | 0.6314 |
+ | 0.4588 | 0.9524 | 30 | 0.2601 | -0.0833 | -0.1420 | 1.0 | 0.0587 | -1.4205 | -0.8335 | -0.6077 | -0.5806 | 0.2247 | -0.3531 | 0.8956 |
+ | 0.2048 | 1.2698 | 40 | 0.2038 | -0.0707 | -0.1643 | 1.0 | 0.0936 | -1.6427 | -0.7072 | -0.4567 | -0.4212 | 0.1807 | -0.2310 | 1.4245 |
+ | 0.2328 | 1.5873 | 50 | 0.1706 | -0.0627 | -0.2262 | 1.0 | 0.1635 | -2.2622 | -0.6271 | -0.5442 | -0.4956 | 0.1594 | -0.1123 | 2.3222 |
+ | 0.1908 | 1.9048 | 60 | 0.1504 | -0.0563 | -0.2841 | 1.0 | 0.2278 | -2.8407 | -0.5630 | -0.7924 | -0.7412 | 0.1443 | -0.0613 | 3.1099 |
+ | 0.1374 | 2.2222 | 70 | 0.1419 | -0.0541 | -0.3223 | 1.0 | 0.2682 | -3.2228 | -0.5409 | -0.5342 | -0.4924 | 0.1374 | -0.0447 | 3.5674 |
+ | 0.1295 | 2.5397 | 80 | 0.1353 | -0.0521 | -0.3585 | 1.0 | 0.3064 | -3.5854 | -0.5211 | -0.2921 | -0.2549 | 0.1320 | -0.0333 | 3.9906 |
+ | 0.1346 | 2.8571 | 90 | 0.1268 | -0.0485 | -0.3594 | 1.0 | 0.3109 | -3.5940 | -0.4848 | -0.6749 | -0.6326 | 0.1236 | -0.0314 | 4.1171 |
+ | 0.1066 | 3.1746 | 100 | 0.1205 | -0.0467 | -0.3698 | 1.0 | 0.3231 | -3.6984 | -0.4669 | -0.6432 | -0.6000 | 0.1178 | -0.0277 | 4.2827 |
+ | 0.0897 | 3.4921 | 110 | 0.1166 | -0.0456 | -0.3783 | 1.0 | 0.3326 | -3.7826 | -0.4562 | -0.6700 | -0.6309 | 0.1140 | -0.0255 | 4.3985 |
+ | 0.1036 | 3.8095 | 120 | 0.1142 | -0.0447 | -0.4046 | 1.0 | 0.3599 | -4.0459 | -0.4468 | -0.5417 | -0.5106 | 0.1121 | -0.0207 | 4.6892 |
+ | 0.1117 | 4.1270 | 130 | 0.1103 | -0.0433 | -0.4254 | 1.0 | 0.3821 | -4.2539 | -0.4334 | -0.4460 | -0.4096 | 0.1086 | -0.0174 | 4.9551 |
+ | 0.0763 | 4.4444 | 140 | 0.1060 | -0.0421 | -0.4359 | 1.0 | 0.3938 | -4.3594 | -0.4209 | -0.5615 | -0.5271 | 0.1045 | -0.0156 | 5.0922 |
+ | 0.0732 | 4.7619 | 150 | 0.1058 | -0.0421 | -0.4417 | 1.0 | 0.3996 | -4.4165 | -0.4209 | -0.5553 | -0.5250 | 0.1043 | -0.0150 | 5.1857 |
+ | 0.0926 | 5.0794 | 160 | 0.1029 | -0.0410 | -0.4408 | 1.0 | 0.3999 | -4.4082 | -0.4096 | -0.4512 | -0.4244 | 0.1014 | -0.0151 | 5.2026 |
+ | 0.0624 | 5.3968 | 170 | 0.1025 | -0.0409 | -0.4495 | 1.0 | 0.4086 | -4.4950 | -0.4090 | -0.4781 | -0.4570 | 0.1010 | -0.0144 | 5.2920 |
+ | 0.0781 | 5.7143 | 180 | 0.1021 | -0.0409 | -0.4554 | 1.0 | 0.4145 | -4.5536 | -0.4087 | -0.5630 | -0.5409 | 0.1008 | -0.0137 | 5.3576 |
+ | 0.0833 | 6.0317 | 190 | 0.0996 | -0.0401 | -0.4499 | 1.0 | 0.4098 | -4.4989 | -0.4006 | -0.4745 | -0.4485 | 0.0982 | -0.0141 | 5.3305 |
+ | 0.0541 | 6.3492 | 200 | 0.0998 | -0.0402 | -0.4648 | 1.0 | 0.4246 | -4.6481 | -0.4022 | -0.4330 | -0.4110 | 0.0985 | -0.0128 | 5.5083 |
+ | 0.0715 | 6.6667 | 210 | 0.0972 | -0.0392 | -0.4586 | 1.0 | 0.4194 | -4.5862 | -0.3918 | -0.5684 | -0.5454 | 0.0959 | -0.0129 | 5.4637 |
+ | 0.066 | 6.9841 | 220 | 0.0998 | -0.0406 | -0.4732 | 1.0 | 0.4325 | -4.7317 | -0.4062 | -0.4060 | -0.3847 | 0.0986 | -0.0121 | 5.5880 |
+ | 0.058 | 7.3016 | 230 | 0.0979 | -0.0397 | -0.4643 | 1.0 | 0.4246 | -4.6434 | -0.3972 | -0.4258 | -0.4043 | 0.0966 | -0.0127 | 5.5285 |
+ | 0.058 | 7.6190 | 240 | 0.0974 | -0.0395 | -0.4618 | 1.0 | 0.4224 | -4.6183 | -0.3945 | -0.4604 | -0.4380 | 0.0961 | -0.0129 | 5.5167 |
+ | 0.072 | 7.9365 | 250 | 0.0966 | -0.0392 | -0.4607 | 1.0 | 0.4215 | -4.6070 | -0.3916 | -0.5075 | -0.4844 | 0.0953 | -0.0129 | 5.5101 |
+ | 0.0636 | 8.2540 | 260 | 0.0969 | -0.0392 | -0.4642 | 1.0 | 0.4249 | -4.6417 | -0.3922 | -0.5018 | -0.4805 | 0.0957 | -0.0127 | 5.5503 |
+ | 0.0501 | 8.5714 | 270 | 0.0973 | -0.0395 | -0.4655 | 1.0 | 0.4261 | -4.6554 | -0.3949 | -0.4806 | -0.4591 | 0.0960 | -0.0127 | 5.5554 |
+ | 0.0634 | 8.8889 | 280 | 0.0973 | -0.0396 | -0.4663 | 1.0 | 0.4267 | -4.6630 | -0.3957 | -0.4770 | -0.4557 | 0.0960 | -0.0126 | 5.5671 |
+ | 0.0547 | 9.2063 | 290 | 0.0973 | -0.0395 | -0.4663 | 1.0 | 0.4268 | -4.6630 | -0.3950 | -0.4694 | -0.4479 | 0.0960 | -0.0127 | 5.5673 |
+ | 0.072 | 9.5238 | 300 | 0.0972 | -0.0395 | -0.4671 | 1.0 | 0.4276 | -4.6709 | -0.3951 | -0.4752 | -0.4533 | 0.0960 | -0.0126 | 5.5766 |
+ | 0.0508 | 9.8413 | 310 | 0.0975 | -0.0396 | -0.4676 | 1.0 | 0.4279 | -4.6755 | -0.3962 | -0.4699 | -0.4486 | 0.0962 | -0.0125 | 5.5764 |
 
 
  ### Framework versions
 
  - Transformers 4.44.2
- - Pytorch 2.2.0+cu121
+ - Pytorch 2.4.1+cu121
  - Datasets 3.0.0
  - Tokenizers 0.19.1
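
The metric names in this card (Rewards/chosen, Nll Loss, Log Odds Ratio, Log Odds Chosen) match those logged by TRL's ORPOTrainer. Assuming ORPO training with the trainer's default beta = 0.1 (an inference from the numbers, not something the card states), the reported eval metrics are internally consistent: each reward column is beta times the corresponding mean log-probability, and the loss decomposes into the NLL term minus beta times the log odds ratio. A minimal sketch checking this against the final evaluation row:

```python
# Hedged sanity check of the reported ORPO-style eval metrics.
# beta = 0.1 is an assumption (TRL ORPOTrainer's default), inferred from
# rewards ~= beta * logps in the table; the card itself never states it.
beta = 0.1

# Final evaluation row (epoch 9.8413, step 310) of the new training run.
logps_chosen, logps_rejected = -0.3962, -4.6755
rewards_chosen, rewards_rejected = -0.0396, -0.4676
nll_loss, log_odds_ratio, loss = 0.0962, -0.0125, 0.0975

# Rewards are beta-scaled mean log-probabilities.
assert abs(beta * logps_chosen - rewards_chosen) < 1e-3
assert abs(beta * logps_rejected - rewards_rejected) < 1e-3

# The loss decomposes as NLL minus beta times the log odds ratio
# (only approximately, since eval metrics are averaged per batch).
assert abs((nll_loss - beta * log_odds_ratio) - loss) < 1e-3

# Effective batch sizes implied by the hyperparameter lists:
# old run: 4 per device * 2 GPUs * 4 accumulation steps = 32
# new run: 2 per device * 2 accumulation steps = 4
```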
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e7033dd4e58decd2dfe4545c763214f5571d778073671a1eadb6f4a8574bb488
+ oid sha256:e5d5aab14184e2968d95b5725229c87ad40c8ec3a92f0d73c1fd138f8ef53a50
  size 4965811384
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:99915d7c9148040728275b3de567b0b26aedb33650cccc994f97b31dbff768a5
+ oid sha256:968c134a2fc6d77e5b2d0b620f4d7db33fb93436b78dd31d8149d95d199c56aa
  size 1459729952
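
Both weight shards changed in this commit, so any locally cached copy of the checkpoint is stale after pulling. A minimal sketch of reloading the updated sharded weights with transformers; the repo id below is a placeholder, since the diff does not show the repository name:

```python
# Hedged sketch: reload the updated sharded checkpoint with Transformers.
# The repo id is a placeholder -- the diff does not show the repository name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/your-finetuned-llama"  # hypothetical; replace with the real repo

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# from_pretrained resolves model.safetensors.index.json and loads both
# model-0000{1,2}-of-00002.safetensors shards listed in this commit.
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
```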
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bc9dd47dc376ffb2e7bff7f25092b8eb91359a3957f4092ef5f084f3f33915cc
+ oid sha256:0fc4362a98b5bebb0bf0a25ebb1ec27eaf959249d885baa5568a26772c2f9a80
  size 5496
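
Each of the three binary files above is stored as a Git LFS pointer: a three-line stub recording the spec version, the sha256 oid of the actual blob, and its size in bytes. A minimal sketch, with an illustrative local path, for checking a downloaded blob against its pointer:

```python
# Hedged sketch: verify a downloaded file against its Git LFS pointer.
# The local path is an illustrative placeholder.
import hashlib
from pathlib import Path

def verify_lfs(blob_path: str, expected_oid: str, expected_size: int) -> bool:
    """Compare a local file's sha256 digest and byte size with the LFS pointer values."""
    blob = Path(blob_path)
    if blob.stat().st_size != expected_size:
        return False
    digest = hashlib.sha256()
    with blob.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest() == expected_oid

# Values from the new training_args.bin pointer above.
ok = verify_lfs(
    "training_args.bin",
    "0fc4362a98b5bebb0bf0a25ebb1ec27eaf959249d885baa5568a26772c2f9a80",
    5496,
)
print("pointer matches" if ok else "mismatch")
```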