yalhessi committed on
Commit 00f3231 · verified · 1 Parent(s): b9487c5

End of training

Files changed (4)
  1. README.md +32 -2
  2. adapter_model.safetensors +1 -1
  3. loss_plot.png +0 -0
  4. training_args.bin +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1734
+ - Loss: 0.1692
 
  ## Model description
 
@@ -45,7 +45,7 @@ The following hyperparameters were used during training:
  - total_eval_batch_size: 16
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 12
+ - num_epochs: 18
  - mixed_precision_training: Native AMP
 
  ### Training results
@@ -111,6 +111,36 @@ The following hyperparameters were used during training:
  | 0.1566 | 11.4016 | 165414 | 0.1770 |
  | 0.1563 | 11.6016 | 168316 | 0.1738 |
  | 0.1549 | 11.8016 | 171218 | 0.1734 |
+ | 0.1568 | 12.0017 | 174120 | 0.1779 |
+ | 0.1814 | 12.2017 | 177022 | 0.1961 |
+ | 0.1855 | 12.4017 | 179924 | 0.1945 |
+ | 0.1863 | 12.6017 | 182826 | 0.1942 |
+ | 0.186 | 12.8018 | 185728 | 0.1949 |
+ | 0.1855 | 13.0018 | 188630 | 0.1927 |
+ | 0.1791 | 13.2018 | 191532 | 0.1921 |
+ | 0.1785 | 13.4018 | 194434 | 0.1915 |
+ | 0.179 | 13.6019 | 197336 | 0.1902 |
+ | 0.1775 | 13.8019 | 200238 | 0.1895 |
+ | 0.1791 | 14.0019 | 203140 | 0.1873 |
+ | 0.169 | 14.2020 | 206042 | 0.1900 |
+ | 0.1719 | 14.4020 | 208944 | 0.1858 |
+ | 0.1698 | 14.6020 | 211846 | 0.1825 |
+ | 0.1707 | 14.8020 | 214748 | 0.1810 |
+ | 0.168 | 15.0021 | 217650 | 0.1814 |
+ | 0.1614 | 15.2021 | 220552 | 0.1810 |
+ | 0.1611 | 15.4021 | 223454 | 0.1780 |
+ | 0.1615 | 15.6022 | 226356 | 0.1768 |
+ | 0.1636 | 15.8022 | 229258 | 0.1769 |
+ | 0.1595 | 16.0022 | 232160 | 0.1771 |
+ | 0.1527 | 16.2022 | 235062 | 0.1749 |
+ | 0.1521 | 16.4023 | 237964 | 0.1738 |
+ | 0.1527 | 16.6023 | 240866 | 0.1727 |
+ | 0.1529 | 16.8023 | 243768 | 0.1713 |
+ | 0.1494 | 17.0023 | 246670 | 0.1721 |
+ | 0.1452 | 17.2024 | 249572 | 0.1719 |
+ | 0.1436 | 17.4024 | 252474 | 0.1699 |
+ | 0.1445 | 17.6024 | 255376 | 0.1696 |
+ | 0.1424 | 17.8025 | 258278 | 0.1692 |
 
 
  ### Framework versions
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4fe538ed48e03ef73df208ea98ad0edd6677afe9244928af321a739dc17c7757
+ oid sha256:d1f3ad3e47fc183f9af16d103ceacd82c920734cb9dc252c6eceb5ab5df51cd8
  size 6304096
loss_plot.png CHANGED
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a95bbcd9b847e6f84db4dbc79007a4cfe4fa7e533b83cc7ef69380a66d974730
+ oid sha256:30deb38333ef48ef6bb9c9eb59500ca52d27a2305a7967595b7b320e38211594
  size 5432
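
The README.md change in this commit extends the training results table from epoch 12 to epoch 18 and updates the reported evaluation loss from 0.1734 to 0.1692. As a rough aid for reading that table, the sketch below copies a handful of its rows and derives the evaluation interval, an approximate steps-per-epoch figure, and the row with the lowest validation loss. The helper names (`best_eval`, `eval_interval`, `steps_per_epoch`) are hypothetical and not part of the repository, and the steps-per-epoch value is only an estimate because the epoch column is rounded.

```python
# Rows copied from the training results table in README.md:
# (training loss, epoch, step, validation loss)
rows = [
    (0.1549, 11.8016, 171218, 0.1734),  # last evaluation before this commit
    (0.1568, 12.0017, 174120, 0.1779),  # first row added by this commit
    (0.1814, 12.2017, 177022, 0.1961),
    (0.1494, 17.0023, 246670, 0.1721),
    (0.1424, 17.8025, 258278, 0.1692),  # last row added by this commit
]


def best_eval(table):
    """Return the row with the lowest validation loss."""
    return min(table, key=lambda row: row[3])


def eval_interval(table):
    """Steps between the first two rows, which are consecutive evaluations."""
    return table[1][2] - table[0][2]


def steps_per_epoch(row):
    """Estimate optimizer steps per epoch from one row (epoch is rounded)."""
    _, epoch, step, _ = row
    return step / epoch


if __name__ == "__main__":
    best = best_eval(rows)
    print("best validation loss:", best[3], "at step", best[2])
    print("steps between evaluations:", eval_interval(rows))
    print("approx. steps per epoch:", round(steps_per_epoch(rows[-1])))
    print("change in reported loss:", round(rows[0][3] - rows[-1][3], 4))
```

Only five rows are included to keep the sketch short; pasting the full table into `rows` works the same way, provided the first two entries remain consecutive evaluations.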