willtensora committed on
Commit 595ba88 · verified · 1 Parent(s): 08424fe

End of training

Files changed (2):
  1. README.md +45 -45
  2. pytorch_model.bin +1 -1
README.md CHANGED
@@ -76,7 +76,7 @@ xformers_attention: true
 
 This model is a fine-tuned version of [fxmarty/small-llama-testing](https://huggingface.co/fxmarty/small-llama-testing) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 6.0837
+- Loss: 6.0841
 
 ## Model description
 
@@ -113,50 +113,50 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | No log | 0.0112 | 1 | 10.4228 |
-| 10.127 | 0.2247 | 20 | 9.8631 |
-| 9.0392 | 0.4494 | 40 | 8.7402 |
-| 8.1127 | 0.6742 | 60 | 7.9189 |
-| 7.5512 | 0.8989 | 80 | 7.4579 |
-| 7.2769 | 1.1236 | 100 | 7.2771 |
-| 7.1384 | 1.3483 | 120 | 7.1768 |
-| 7.0578 | 1.5730 | 140 | 7.0576 |
-| 6.9566 | 1.7978 | 160 | 6.9379 |
-| 6.8784 | 2.0225 | 180 | 6.8207 |
-| 6.7025 | 2.2472 | 200 | 6.7209 |
-| 6.5909 | 2.4719 | 220 | 6.6360 |
-| 6.4977 | 2.6966 | 240 | 6.5571 |
-| 6.445 | 2.9213 | 260 | 6.4721 |
-| 6.2635 | 3.1461 | 280 | 6.4128 |
-| 6.2356 | 3.3708 | 300 | 6.3661 |
-| 6.2742 | 3.5955 | 320 | 6.3160 |
-| 6.2487 | 3.8202 | 340 | 6.2879 |
-| 6.175 | 4.0449 | 360 | 6.2412 |
-| 6.051 | 4.2697 | 380 | 6.2180 |
-| 6.0159 | 4.4944 | 400 | 6.1959 |
-| 6.004 | 4.7191 | 420 | 6.1758 |
-| 6.0235 | 4.9438 | 440 | 6.1537 |
-| 5.9599 | 5.1685 | 460 | 6.1438 |
-| 6.0208 | 5.3933 | 480 | 6.1282 |
-| 5.9372 | 5.6180 | 500 | 6.1229 |
-| 5.8251 | 5.8427 | 520 | 6.1083 |
-| 5.875 | 6.0674 | 540 | 6.1066 |
-| 5.8374 | 6.2921 | 560 | 6.1013 |
-| 5.8515 | 6.5169 | 580 | 6.0970 |
-| 5.9295 | 6.7416 | 600 | 6.1001 |
-| 5.8929 | 6.9663 | 620 | 6.0978 |
-| 5.8917 | 7.1910 | 640 | 6.0958 |
-| 5.8256 | 7.4157 | 660 | 6.0939 |
-| 5.8201 | 7.6404 | 680 | 6.0887 |
-| 5.8619 | 7.8652 | 700 | 6.0871 |
-| 5.8934 | 8.0899 | 720 | 6.0946 |
-| 5.868 | 8.3146 | 740 | 6.0894 |
-| 5.9062 | 8.5393 | 760 | 6.0878 |
-| 5.8131 | 8.7640 | 780 | 6.0883 |
-| 5.7868 | 8.9888 | 800 | 6.0904 |
-| 5.8077 | 9.2135 | 820 | 6.0926 |
-| 5.8301 | 9.4382 | 840 | 6.0845 |
-| 5.8489 | 9.6629 | 860 | 6.0890 |
-| 5.8501 | 9.8876 | 880 | 6.0837 |
+| 10.127 | 0.2247 | 20 | 9.8630 |
+| 9.039 | 0.4494 | 40 | 8.7400 |
+| 8.1126 | 0.6742 | 60 | 7.9190 |
+| 7.5515 | 0.8989 | 80 | 7.4582 |
+| 7.2771 | 1.1236 | 100 | 7.2773 |
+| 7.1388 | 1.3483 | 120 | 7.1772 |
+| 7.0583 | 1.5730 | 140 | 7.0581 |
+| 6.957 | 1.7978 | 160 | 6.9383 |
+| 6.8789 | 2.0225 | 180 | 6.8210 |
+| 6.7029 | 2.2472 | 200 | 6.7213 |
+| 6.5913 | 2.4719 | 220 | 6.6364 |
+| 6.4981 | 2.6966 | 240 | 6.5572 |
+| 6.4453 | 2.9213 | 260 | 6.4724 |
+| 6.2642 | 3.1461 | 280 | 6.4135 |
+| 6.2365 | 3.3708 | 300 | 6.3668 |
+| 6.2746 | 3.5955 | 320 | 6.3166 |
+| 6.2488 | 3.8202 | 340 | 6.2882 |
+| 6.1749 | 4.0449 | 360 | 6.2413 |
+| 6.0514 | 4.2697 | 380 | 6.2183 |
+| 6.0162 | 4.4944 | 400 | 6.1961 |
+| 6.0043 | 4.7191 | 420 | 6.1763 |
+| 6.0239 | 4.9438 | 440 | 6.1544 |
+| 5.9605 | 5.1685 | 460 | 6.1444 |
+| 6.0214 | 5.3933 | 480 | 6.1288 |
+| 5.9376 | 5.6180 | 500 | 6.1234 |
+| 5.8258 | 5.8427 | 520 | 6.1087 |
+| 5.8755 | 6.0674 | 540 | 6.1070 |
+| 5.838 | 6.2921 | 560 | 6.1017 |
+| 5.852 | 6.5169 | 580 | 6.0974 |
+| 5.9298 | 6.7416 | 600 | 6.1005 |
+| 5.8933 | 6.9663 | 620 | 6.0981 |
+| 5.8921 | 7.1910 | 640 | 6.0962 |
+| 5.8261 | 7.4157 | 660 | 6.0942 |
+| 5.8207 | 7.6404 | 680 | 6.0891 |
+| 5.8623 | 7.8652 | 700 | 6.0875 |
+| 5.8938 | 8.0899 | 720 | 6.0950 |
+| 5.8684 | 8.3146 | 740 | 6.0897 |
+| 5.9065 | 8.5393 | 760 | 6.0881 |
+| 5.8135 | 8.7640 | 780 | 6.0886 |
+| 5.7873 | 8.9888 | 800 | 6.0908 |
+| 5.8081 | 9.2135 | 820 | 6.0929 |
+| 5.8307 | 9.4382 | 840 | 6.0849 |
+| 5.8492 | 9.6629 | 860 | 6.0893 |
+| 5.8504 | 9.8876 | 880 | 6.0841 |
 
 
 ### Framework versions
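The Epoch column in the training table is consistent with roughly 89 optimizer steps per epoch (e.g. step 20 → epoch 0.2247, step 880 → epoch 9.8876). A quick sanity check, assuming that step count (which is an inference from the logged values, not stated in the commit):

```python
# Reproduce the Epoch column of the training log, assuming 89 optimizer
# steps per epoch (inferred from the logged step/epoch pairs).
STEPS_PER_EPOCH = 89

def epoch_at(step: int) -> float:
    """Fractional epoch at a given optimizer step, rounded like the log."""
    return round(step / STEPS_PER_EPOCH, 4)

# Spot-check against (step, logged epoch) pairs taken from the table.
for step, logged in [(1, 0.0112), (20, 0.2247), (440, 4.9438), (880, 9.8876)]:
    assert epoch_at(step) == logged, (step, epoch_at(step), logged)
```

All 41 logged rows match under this assumption, which also implies the run covered ten epochs of a ~89-step dataloader.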
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:792a629f10a6767c348ead6e69da42089e762a50be74f5c86c6ca4960d976065
+oid sha256:e0ca7b52c986706bf756645d07c30afd308c503a6107eb9b2a833f383cecd660
 size 34219693
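The `pytorch_model.bin` diff does not change the weights file in-place; it rewrites a git-lfs pointer file, whose three `key value` lines identify the real blob by SHA-256 and byte size. A minimal sketch of parsing such a pointer, using the new pointer contents from this commit (the parser itself is an illustration, not part of the git-lfs tooling):

```python
# Minimal parser for a git-lfs pointer file (spec: git-lfs.github.com/spec/v1).
# POINTER holds the post-commit pointer contents shown in the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e0ca7b52c986706bf756645d07c30afd308c503a6107eb9b2a833f383cecd660
size 34219693
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line, then split the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "oid_algo": algo,          # hash algorithm, sha256 here
        "oid": digest,             # content address of the weights blob
        "size": int(fields["size"]),  # blob size in bytes
    }

ptr = parse_lfs_pointer(POINTER)
assert ptr["oid_algo"] == "sha256" and ptr["size"] == 34219693
```

Since the `size` line is identical before and after (34219693 bytes), only the oid changed: the checkpoint was overwritten with new weights of the same serialized size, as expected for an end-of-training save of the same architecture.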