Anish13 committed

Commit 1dd7067 · verified · 1 Parent(s): cec4708

Model save
README.md ADDED
@@ -0,0 +1,406 @@
---
tags:
- generated_from_trainer
model-index:
- name: results_model3
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# results_model3

This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 4.5574

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 30
- num_epochs: 40
- mixed_precision_training: Native AMP
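These settings map one-to-one onto `transformers.TrainingArguments`. A minimal sketch of the implied setup, not the author's actual script: the `output_dir` name and the 1024-step evaluation cadence are inferred from the card and the results table below; everything else comes from the list above.

```python
# Hedged reconstruction of the training configuration listed above; assumes the
# standard Trainer API that the auto-generated card references.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results_model3",       # assumed from the model name
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,                    # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=30,
    num_train_epochs=40,
    fp16=True,                         # "Native AMP" mixed precision
    evaluation_strategy="steps",       # evaluated every 1024 steps per the table below
    eval_steps=1024,
)
```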
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log | 0.1141 | 1024 | 6.0292 |
| No log | 0.2282 | 2048 | 5.6435 |
| No log | 0.3422 | 3072 | 5.3662 |
| No log | 0.4563 | 4096 | 5.1810 |
| No log | 0.5704 | 5120 | 5.0305 |
| No log | 0.6845 | 6144 | 4.9323 |
| No log | 0.7986 | 7168 | 4.8141 |
| No log | 0.9127 | 8192 | 4.7390 |
| 5.3905 | 1.0267 | 9216 | 4.7245 |
| 5.3905 | 1.1408 | 10240 | 4.5491 |
| 5.3905 | 1.2549 | 11264 | 4.5115 |
| 5.3905 | 1.3690 | 12288 | 4.5454 |
| 5.3905 | 1.4831 | 13312 | 4.4235 |
| 5.3905 | 1.5971 | 14336 | 4.3937 |
| 5.3905 | 1.7112 | 15360 | 4.3675 |
| 5.3905 | 1.8253 | 16384 | 4.3321 |
| 5.3905 | 1.9394 | 17408 | 4.3125 |
| 4.3288 | 2.0535 | 18432 | 4.2470 |
| 4.3288 | 2.1676 | 19456 | 4.3321 |
| 4.3288 | 2.2816 | 20480 | 4.2336 |
| 4.3288 | 2.3957 | 21504 | 4.2677 |
| 4.3288 | 2.5098 | 22528 | 4.2491 |
| 4.3288 | 2.6239 | 23552 | 4.3101 |
| 4.3288 | 2.7380 | 24576 | 4.3300 |
| 4.3288 | 2.8520 | 25600 | 4.3947 |
| 4.3288 | 2.9661 | 26624 | 4.3634 |
| 3.853 | 3.0802 | 27648 | 4.2981 |
| 3.853 | 3.1943 | 28672 | 4.4073 |
| 3.853 | 3.3084 | 29696 | 4.3586 |
| 3.853 | 3.4225 | 30720 | 4.5024 |
| 3.853 | 3.5365 | 31744 | 4.6206 |
| 3.853 | 3.6506 | 32768 | 4.5310 |
| 3.853 | 3.7647 | 33792 | 4.6789 |
| 3.853 | 3.8788 | 34816 | 4.5824 |
| 3.853 | 3.9929 | 35840 | 4.4508 |
| 3.5631 | 4.1070 | 36864 | 4.5873 |
| 3.5631 | 4.2210 | 37888 | 4.5861 |
| 3.5631 | 4.3351 | 38912 | 4.6930 |
| 3.5631 | 4.4492 | 39936 | 4.6392 |
| 3.5631 | 4.5633 | 40960 | 4.5271 |
| 3.5631 | 4.6774 | 41984 | 4.8197 |
| 3.5631 | 4.7914 | 43008 | 4.7696 |
| 3.5631 | 4.9055 | 44032 | 4.6841 |
| 3.3655 | 5.0196 | 45056 | 4.8238 |
| 3.3655 | 5.1337 | 46080 | 4.8119 |
| 3.3655 | 5.2478 | 47104 | 4.8520 |
| 3.3655 | 5.3619 | 48128 | 4.8713 |
| 3.3655 | 5.4759 | 49152 | 4.8548 |
| 3.3655 | 5.5900 | 50176 | 4.7750 |
| 3.3655 | 5.7041 | 51200 | 5.0075 |
| 3.3655 | 5.8182 | 52224 | 4.9843 |
| 3.3655 | 5.9323 | 53248 | 4.8895 |
| 3.2228 | 6.0463 | 54272 | 5.0877 |
| 3.2228 | 6.1604 | 55296 | 4.6181 |
| 3.2228 | 6.2745 | 56320 | 4.7398 |
| 3.2228 | 6.3886 | 57344 | 4.6617 |
| 3.2228 | 6.5027 | 58368 | 4.8633 |
| 3.2228 | 6.6168 | 59392 | 4.9870 |
| 3.2228 | 6.7308 | 60416 | 5.0021 |
| 3.2228 | 6.8449 | 61440 | 4.7422 |
| 3.2228 | 6.9590 | 62464 | 4.9250 |
| 3.1155 | 7.0731 | 63488 | 4.7348 |
| 3.1155 | 7.1872 | 64512 | 4.8952 |
| 3.1155 | 7.3012 | 65536 | 4.8318 |
| 3.1155 | 7.4153 | 66560 | 4.8476 |
| 3.1155 | 7.5294 | 67584 | 5.0057 |
| 3.1155 | 7.6435 | 68608 | 4.9427 |
| 3.1155 | 7.7576 | 69632 | 4.8623 |
| 3.1155 | 7.8717 | 70656 | 4.8452 |
| 3.1155 | 7.9857 | 71680 | 4.8021 |
| 3.0315 | 8.0998 | 72704 | 4.7947 |
| 3.0315 | 8.2139 | 73728 | 4.8884 |
| 3.0315 | 8.3280 | 74752 | 4.8440 |
| 3.0315 | 8.4421 | 75776 | 4.8452 |
| 3.0315 | 8.5561 | 76800 | 4.8929 |
| 3.0315 | 8.6702 | 77824 | 4.8412 |
| 3.0315 | 8.7843 | 78848 | 4.8926 |
| 3.0315 | 8.8984 | 79872 | 4.6749 |
| 2.9626 | 9.0125 | 80896 | 4.9204 |
| 2.9626 | 9.1266 | 81920 | 4.7448 |
| 2.9626 | 9.2406 | 82944 | 4.6707 |
| 2.9626 | 9.3547 | 83968 | 4.7409 |
| 2.9626 | 9.4688 | 84992 | 4.7350 |
| 2.9626 | 9.5829 | 86016 | 4.7875 |
| 2.9626 | 9.6970 | 87040 | 4.7588 |
| 2.9626 | 9.8111 | 88064 | 4.6576 |
| 2.9626 | 9.9251 | 89088 | 4.6979 |
| 2.9057 | 10.0392 | 90112 | 4.7358 |
| 2.9057 | 10.1533 | 91136 | 4.7766 |
| 2.9057 | 10.2674 | 92160 | 4.7182 |
| 2.9057 | 10.3815 | 93184 | 4.7157 |
| 2.9057 | 10.4955 | 94208 | 4.6437 |
| 2.9057 | 10.6096 | 95232 | 4.6506 |
| 2.9057 | 10.7237 | 96256 | 4.6468 |
| 2.9057 | 10.8378 | 97280 | 4.6038 |
| 2.9057 | 10.9519 | 98304 | 4.7661 |
| 2.857 | 11.0660 | 99328 | 4.7447 |
| 2.857 | 11.1800 | 100352 | 4.5949 |
| 2.857 | 11.2941 | 101376 | 4.6705 |
| 2.857 | 11.4082 | 102400 | 4.7022 |
| 2.857 | 11.5223 | 103424 | 4.6394 |
| 2.857 | 11.6364 | 104448 | 4.7558 |
| 2.857 | 11.7504 | 105472 | 4.7065 |
| 2.857 | 11.8645 | 106496 | 4.4721 |
| 2.857 | 11.9786 | 107520 | 4.6075 |
| 2.8152 | 12.0927 | 108544 | 4.6613 |
| 2.8152 | 12.2068 | 109568 | 4.6763 |
| 2.8152 | 12.3209 | 110592 | 4.5310 |
| 2.8152 | 12.4349 | 111616 | 4.6142 |
| 2.8152 | 12.5490 | 112640 | 4.5820 |
| 2.8152 | 12.6631 | 113664 | 4.6034 |
| 2.8152 | 12.7772 | 114688 | 4.6213 |
| 2.8152 | 12.8913 | 115712 | 4.6057 |
| 2.7789 | 13.0053 | 116736 | 4.6278 |
| 2.7789 | 13.1194 | 117760 | 4.5454 |
| 2.7789 | 13.2335 | 118784 | 4.6711 |
| 2.7789 | 13.3476 | 119808 | 4.5148 |
| 2.7789 | 13.4617 | 120832 | 4.5424 |
| 2.7789 | 13.5758 | 121856 | 4.5074 |
| 2.7789 | 13.6898 | 122880 | 4.5448 |
| 2.7789 | 13.8039 | 123904 | 4.5084 |
| 2.7789 | 13.9180 | 124928 | 4.4962 |
| 2.7457 | 14.0321 | 125952 | 4.5270 |
| 2.7457 | 14.1462 | 126976 | 4.3997 |
| 2.7457 | 14.2602 | 128000 | 4.5798 |
| 2.7457 | 14.3743 | 129024 | 4.5839 |
| 2.7457 | 14.4884 | 130048 | 4.5679 |
| 2.7457 | 14.6025 | 131072 | 4.4674 |
| 2.7457 | 14.7166 | 132096 | 4.4471 |
| 2.7457 | 14.8307 | 133120 | 4.3811 |
| 2.7457 | 14.9447 | 134144 | 4.4387 |
| 2.7165 | 15.0588 | 135168 | 4.4756 |
| 2.7165 | 15.1729 | 136192 | 4.5638 |
| 2.7165 | 15.2870 | 137216 | 4.4033 |
| 2.7165 | 15.4011 | 138240 | 4.4876 |
| 2.7165 | 15.5152 | 139264 | 4.3874 |
| 2.7165 | 15.6292 | 140288 | 4.4200 |
| 2.7165 | 15.7433 | 141312 | 4.5077 |
| 2.7165 | 15.8574 | 142336 | 4.4537 |
| 2.7165 | 15.9715 | 143360 | 4.4381 |
| 2.6895 | 16.0856 | 144384 | 4.5636 |
| 2.6895 | 16.1996 | 145408 | 4.3530 |
| 2.6895 | 16.3137 | 146432 | 4.3760 |
| 2.6895 | 16.4278 | 147456 | 4.4327 |
| 2.6895 | 16.5419 | 148480 | 4.3666 |
| 2.6895 | 16.6560 | 149504 | 4.3708 |
| 2.6895 | 16.7701 | 150528 | 4.3945 |
| 2.6895 | 16.8841 | 151552 | 4.3781 |
| 2.6895 | 16.9982 | 152576 | 4.4506 |
| 2.6652 | 17.1123 | 153600 | 4.3923 |
| 2.6652 | 17.2264 | 154624 | 4.4244 |
| 2.6652 | 17.3405 | 155648 | 4.4576 |
| 2.6652 | 17.4545 | 156672 | 4.5156 |
| 2.6652 | 17.5686 | 157696 | 4.4249 |
| 2.6652 | 17.6827 | 158720 | 4.3867 |
| 2.6652 | 17.7968 | 159744 | 4.4360 |
| 2.6652 | 17.9109 | 160768 | 4.4036 |
| 2.6426 | 18.0250 | 161792 | 4.3103 |
| 2.6426 | 18.1390 | 162816 | 4.4384 |
| 2.6426 | 18.2531 | 163840 | 4.4340 |
| 2.6426 | 18.3672 | 164864 | 4.4168 |
| 2.6426 | 18.4813 | 165888 | 4.3282 |
| 2.6426 | 18.5954 | 166912 | 4.3200 |
| 2.6426 | 18.7094 | 167936 | 4.2999 |
| 2.6426 | 18.8235 | 168960 | 4.4347 |
| 2.6426 | 18.9376 | 169984 | 4.4230 |
| 2.6219 | 19.0517 | 171008 | 4.4185 |
| 2.6219 | 19.1658 | 172032 | 4.3904 |
| 2.6219 | 19.2799 | 173056 | 4.4376 |
| 2.6219 | 19.3939 | 174080 | 4.3366 |
| 2.6219 | 19.5080 | 175104 | 4.4409 |
| 2.6219 | 19.6221 | 176128 | 4.3827 |
| 2.6219 | 19.7362 | 177152 | 4.4327 |
| 2.6219 | 19.8503 | 178176 | 4.4141 |
| 2.6219 | 19.9643 | 179200 | 4.4321 |
| 2.6027 | 20.0784 | 180224 | 4.2911 |
| 2.6027 | 20.1925 | 181248 | 4.3532 |
| 2.6027 | 20.3066 | 182272 | 4.3809 |
| 2.6027 | 20.4207 | 183296 | 4.3316 |
| 2.6027 | 20.5348 | 184320 | 4.4209 |
| 2.6027 | 20.6488 | 185344 | 4.4665 |
| 2.6027 | 20.7629 | 186368 | 4.4491 |
| 2.6027 | 20.8770 | 187392 | 4.5202 |
| 2.6027 | 20.9911 | 188416 | 4.3736 |
| 2.5844 | 21.1052 | 189440 | 4.3502 |
| 2.5844 | 21.2193 | 190464 | 4.4119 |
| 2.5844 | 21.3333 | 191488 | 4.5101 |
| 2.5844 | 21.4474 | 192512 | 4.4317 |
| 2.5844 | 21.5615 | 193536 | 4.4820 |
| 2.5844 | 21.6756 | 194560 | 4.3390 |
| 2.5844 | 21.7897 | 195584 | 4.5056 |
| 2.5844 | 21.9037 | 196608 | 4.3455 |
| 2.567 | 22.0178 | 197632 | 4.4092 |
| 2.567 | 22.1319 | 198656 | 4.4035 |
| 2.567 | 22.2460 | 199680 | 4.3419 |
| 2.567 | 22.3601 | 200704 | 4.3855 |
| 2.567 | 22.4742 | 201728 | 4.4563 |
| 2.567 | 22.5882 | 202752 | 4.3289 |
| 2.567 | 22.7023 | 203776 | 4.3813 |
| 2.567 | 22.8164 | 204800 | 4.4430 |
| 2.567 | 22.9305 | 205824 | 4.4219 |
| 2.5508 | 23.0446 | 206848 | 4.3792 |
| 2.5508 | 23.1586 | 207872 | 4.3852 |
| 2.5508 | 23.2727 | 208896 | 4.3416 |
| 2.5508 | 23.3868 | 209920 | 4.4151 |
| 2.5508 | 23.5009 | 210944 | 4.4419 |
| 2.5508 | 23.6150 | 211968 | 4.3499 |
| 2.5508 | 23.7291 | 212992 | 4.3682 |
| 2.5508 | 23.8431 | 214016 | 4.4015 |
| 2.5508 | 23.9572 | 215040 | 4.4304 |
| 2.5357 | 24.0713 | 216064 | 4.3552 |
| 2.5357 | 24.1854 | 217088 | 4.4245 |
| 2.5357 | 24.2995 | 218112 | 4.3834 |
| 2.5357 | 24.4135 | 219136 | 4.4137 |
| 2.5357 | 24.5276 | 220160 | 4.3576 |
| 2.5357 | 24.6417 | 221184 | 4.4199 |
| 2.5357 | 24.7558 | 222208 | 4.3972 |
| 2.5357 | 24.8699 | 223232 | 4.3985 |
| 2.5357 | 24.9840 | 224256 | 4.4293 |
| 2.5209 | 25.0980 | 225280 | 4.4578 |
| 2.5209 | 25.2121 | 226304 | 4.4607 |
| 2.5209 | 25.3262 | 227328 | 4.4757 |
| 2.5209 | 25.4403 | 228352 | 4.4839 |
| 2.5209 | 25.5544 | 229376 | 4.4599 |
| 2.5209 | 25.6684 | 230400 | 4.4425 |
| 2.5209 | 25.7825 | 231424 | 4.4190 |
| 2.5209 | 25.8966 | 232448 | 4.4437 |
| 2.5076 | 26.0107 | 233472 | 4.4285 |
| 2.5076 | 26.1248 | 234496 | 4.4859 |
| 2.5076 | 26.2389 | 235520 | 4.4197 |
| 2.5076 | 26.3529 | 236544 | 4.4417 |
| 2.5076 | 26.4670 | 237568 | 4.3522 |
| 2.5076 | 26.5811 | 238592 | 4.3813 |
| 2.5076 | 26.6952 | 239616 | 4.4286 |
| 2.5076 | 26.8093 | 240640 | 4.4190 |
| 2.5076 | 26.9234 | 241664 | 4.4582 |
| 2.4948 | 27.0374 | 242688 | 4.4244 |
| 2.4948 | 27.1515 | 243712 | 4.4591 |
| 2.4948 | 27.2656 | 244736 | 4.3931 |
| 2.4948 | 27.3797 | 245760 | 4.3863 |
| 2.4948 | 27.4938 | 246784 | 4.4838 |
| 2.4948 | 27.6078 | 247808 | 4.4084 |
| 2.4948 | 27.7219 | 248832 | 4.4773 |
| 2.4948 | 27.8360 | 249856 | 4.5461 |
| 2.4948 | 27.9501 | 250880 | 4.4207 |
| 2.4821 | 28.0642 | 251904 | 4.4936 |
| 2.4821 | 28.1783 | 252928 | 4.4527 |
| 2.4821 | 28.2923 | 253952 | 4.5058 |
| 2.4821 | 28.4064 | 254976 | 4.4861 |
| 2.4821 | 28.5205 | 256000 | 4.4809 |
| 2.4821 | 28.6346 | 257024 | 4.4766 |
| 2.4821 | 28.7487 | 258048 | 4.4536 |
| 2.4821 | 28.8627 | 259072 | 4.4361 |
| 2.4821 | 28.9768 | 260096 | 4.4896 |
| 2.4703 | 29.0909 | 261120 | 4.4662 |
| 2.4703 | 29.2050 | 262144 | 4.4739 |
| 2.4703 | 29.3191 | 263168 | 4.4503 |
| 2.4703 | 29.4332 | 264192 | 4.4752 |
| 2.4703 | 29.5472 | 265216 | 4.4353 |
| 2.4703 | 29.6613 | 266240 | 4.5091 |
| 2.4703 | 29.7754 | 267264 | 4.5209 |
| 2.4703 | 29.8895 | 268288 | 4.4949 |
| 2.4586 | 30.0036 | 269312 | 4.4982 |
| 2.4586 | 30.1176 | 270336 | 4.4525 |
| 2.4586 | 30.2317 | 271360 | 4.4358 |
| 2.4586 | 30.3458 | 272384 | 4.4653 |
| 2.4586 | 30.4599 | 273408 | 4.4581 |
| 2.4586 | 30.5740 | 274432 | 4.4629 |
| 2.4586 | 30.6881 | 275456 | 4.4738 |
| 2.4586 | 30.8021 | 276480 | 4.4619 |
| 2.4586 | 30.9162 | 277504 | 4.4367 |
| 2.4484 | 31.0303 | 278528 | 4.4268 |
| 2.4484 | 31.1444 | 279552 | 4.4684 |
| 2.4484 | 31.2585 | 280576 | 4.5192 |
| 2.4484 | 31.3725 | 281600 | 4.4801 |
| 2.4484 | 31.4866 | 282624 | 4.4949 |
| 2.4484 | 31.6007 | 283648 | 4.5451 |
| 2.4484 | 31.7148 | 284672 | 4.4388 |
| 2.4484 | 31.8289 | 285696 | 4.4732 |
| 2.4484 | 31.9430 | 286720 | 4.5024 |
| 2.4377 | 32.0570 | 287744 | 4.4771 |
| 2.4377 | 32.1711 | 288768 | 4.4566 |
| 2.4377 | 32.2852 | 289792 | 4.5154 |
| 2.4377 | 32.3993 | 290816 | 4.4805 |
| 2.4377 | 32.5134 | 291840 | 4.4655 |
| 2.4377 | 32.6275 | 292864 | 4.4778 |
| 2.4377 | 32.7415 | 293888 | 4.4634 |
| 2.4377 | 32.8556 | 294912 | 4.4257 |
| 2.4377 | 32.9697 | 295936 | 4.4286 |
| 2.4278 | 33.0838 | 296960 | 4.4249 |
| 2.4278 | 33.1979 | 297984 | 4.4411 |
| 2.4278 | 33.3119 | 299008 | 4.5060 |
| 2.4278 | 33.4260 | 300032 | 4.5065 |
| 2.4278 | 33.5401 | 301056 | 4.5621 |
| 2.4278 | 33.6542 | 302080 | 4.5226 |
| 2.4278 | 33.7683 | 303104 | 4.5252 |
| 2.4278 | 33.8824 | 304128 | 4.5045 |
| 2.4278 | 33.9964 | 305152 | 4.5031 |
| 2.4191 | 34.1105 | 306176 | 4.5350 |
| 2.4191 | 34.2246 | 307200 | 4.5197 |
| 2.4191 | 34.3387 | 308224 | 4.4793 |
| 2.4191 | 34.4528 | 309248 | 4.4792 |
| 2.4191 | 34.5668 | 310272 | 4.4857 |
| 2.4191 | 34.6809 | 311296 | 4.5201 |
| 2.4191 | 34.7950 | 312320 | 4.5428 |
| 2.4191 | 34.9091 | 313344 | 4.5224 |
| 2.4105 | 35.0232 | 314368 | 4.5737 |
| 2.4105 | 35.1373 | 315392 | 4.5060 |
| 2.4105 | 35.2513 | 316416 | 4.5159 |
| 2.4105 | 35.3654 | 317440 | 4.5259 |
| 2.4105 | 35.4795 | 318464 | 4.5303 |
| 2.4105 | 35.5936 | 319488 | 4.5362 |
| 2.4105 | 35.7077 | 320512 | 4.4980 |
| 2.4105 | 35.8217 | 321536 | 4.5364 |
| 2.4105 | 35.9358 | 322560 | 4.5109 |
| 2.4027 | 36.0499 | 323584 | 4.5286 |
| 2.4027 | 36.1640 | 324608 | 4.5492 |
| 2.4027 | 36.2781 | 325632 | 4.5502 |
| 2.4027 | 36.3922 | 326656 | 4.5376 |
| 2.4027 | 36.5062 | 327680 | 4.5548 |
| 2.4027 | 36.6203 | 328704 | 4.5585 |
| 2.4027 | 36.7344 | 329728 | 4.5391 |
| 2.4027 | 36.8485 | 330752 | 4.5169 |
| 2.4027 | 36.9626 | 331776 | 4.5645 |
| 2.3949 | 37.0766 | 332800 | 4.5315 |
| 2.3949 | 37.1907 | 333824 | 4.5602 |
| 2.3949 | 37.3048 | 334848 | 4.5836 |
| 2.3949 | 37.4189 | 335872 | 4.5543 |
| 2.3949 | 37.5330 | 336896 | 4.5701 |
| 2.3949 | 37.6471 | 337920 | 4.5637 |
| 2.3949 | 37.7611 | 338944 | 4.5593 |
| 2.3949 | 37.8752 | 339968 | 4.5251 |
| 2.3949 | 37.9893 | 340992 | 4.5588 |
| 2.3886 | 38.1034 | 342016 | 4.5381 |
| 2.3886 | 38.2175 | 343040 | 4.5579 |
| 2.3886 | 38.3316 | 344064 | 4.5815 |
| 2.3886 | 38.4456 | 345088 | 4.5365 |
| 2.3886 | 38.5597 | 346112 | 4.5599 |
| 2.3886 | 38.6738 | 347136 | 4.5676 |
| 2.3886 | 38.7879 | 348160 | 4.5568 |
| 2.3886 | 38.9020 | 349184 | 4.5606 |
| 2.3827 | 39.0160 | 350208 | 4.5582 |
| 2.3827 | 39.1301 | 351232 | 4.5555 |
| 2.3827 | 39.2442 | 352256 | 4.5742 |
| 2.3827 | 39.3583 | 353280 | 4.5567 |
| 2.3827 | 39.4724 | 354304 | 4.5556 |
| 2.3827 | 39.5865 | 355328 | 4.5705 |
| 2.3827 | 39.7005 | 356352 | 4.5531 |
| 2.3827 | 39.8146 | 357376 | 4.5598 |
| 2.3827 | 39.9287 | 358400 | 4.5574 |


### Framework versions

- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1
best/config.json ADDED
@@ -0,0 +1,17 @@
{
  "architectures": [
    "Transformer"
  ],
  "d_model": 512,
  "dim_feedforward": 1024,
  "dropout": 0.1,
  "input_dim": 30000,
  "max_seq_len": 2000,
  "model_type": "transformer",
  "nhead": 8,
  "num_decoder_layers": 4,
  "num_encoder_layers": 4,
  "output_dim": 30000,
  "torch_dtype": "float32",
  "transformers_version": "4.40.2"
}
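The config names a custom `Transformer` architecture rather than a stock `transformers` class, so the fields are not self-describing. A hypothetical sketch of the encoder-decoder model these fields suggest, built on `torch.nn.Transformer`; the module layout and the shared source/target embedding are assumptions read off the field names, not the author's actual class:

```python
# Hypothetical reconstruction of the architecture described by config.json.
# input_dim/output_dim (both 30000) look like a shared vocabulary; max_seq_len
# (2000) bounds learned positional embeddings. None of this is confirmed upstream.
import torch
import torch.nn as nn

class Seq2SeqTransformer(nn.Module):
    def __init__(self, cfg: dict):
        super().__init__()
        self.token_embed = nn.Embedding(cfg["input_dim"], cfg["d_model"])    # 30000 -> 512
        self.pos_embed = nn.Embedding(cfg["max_seq_len"], cfg["d_model"])    # up to 2000 positions
        self.transformer = nn.Transformer(
            d_model=cfg["d_model"],                        # 512
            nhead=cfg["nhead"],                            # 8
            num_encoder_layers=cfg["num_encoder_layers"],  # 4
            num_decoder_layers=cfg["num_decoder_layers"],  # 4
            dim_feedforward=cfg["dim_feedforward"],        # 1024
            dropout=cfg["dropout"],                        # 0.1
            batch_first=True,
        )
        self.lm_head = nn.Linear(cfg["d_model"], cfg["output_dim"])          # 512 -> 30000

    def embed(self, ids: torch.Tensor) -> torch.Tensor:
        # Token embedding plus learned absolute position embedding.
        positions = torch.arange(ids.size(1), device=ids.device)
        return self.token_embed(ids) + self.pos_embed(positions)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.transformer(self.embed(src_ids), self.embed(tgt_ids))
        return self.lm_head(hidden)
```

At float32 this layout comes to roughly 53M parameters (the two 30000×512 embedding/projection matrices dominate), which is consistent with the ~211 MB `model.safetensors` below, though that consistency does not confirm the exact module layout.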
best/model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:003434216bf2d6d80d5a96c2e7c47a11efda762540401639ed352d27e7407972
size 211234576
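This is a Git LFS pointer file: the repository stores only the hash and byte count here, and the ~211 MB weight blob is fetched separately (e.g. via `git lfs pull`). Note that the top-level `model.safetensors` later in this commit carries the same `oid`, so the two checkpoint files are byte-identical. Once fetched, the weights can be inspected with the `safetensors` library; a minimal sketch (the tensor names depend on the custom model class, so none are assumed here):

```python
# Sketch: list the tensors stored in the LFS-tracked checkpoint.
# Requires the actual blob, not just the pointer file (run `git lfs pull` first).
from safetensors.torch import load_file

state_dict = load_file("best/model.safetensors")
for name, tensor in state_dict.items():
    print(f"{name}: shape={tuple(tensor.shape)}, dtype={tensor.dtype}")
```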
best/training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66df40aa38654c384f06406e1519aa64c20a000003dea0465f50174d4a352725
size 5112
config.json ADDED
@@ -0,0 +1,17 @@
{
  "architectures": [
    "Transformer"
  ],
  "d_model": 512,
  "dim_feedforward": 1024,
  "dropout": 0.1,
  "input_dim": 30000,
  "max_seq_len": 2000,
  "model_type": "transformer",
  "nhead": 8,
  "num_decoder_layers": 4,
  "num_encoder_layers": 4,
  "output_dim": 30000,
  "torch_dtype": "float32",
  "transformers_version": "4.40.2"
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:003434216bf2d6d80d5a96c2e7c47a11efda762540401639ed352d27e7407972
size 211234576
training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66df40aa38654c384f06406e1519aa64c20a000003dea0465f50174d4a352725
size 5112
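Like the checkpoints above, `training_args.bin` is LFS-tracked (and shares its `oid` with `best/training_args.bin`, so the two copies are identical). It is the pickled `TrainingArguments` object that `Trainer` saves alongside checkpoints, so the exact run configuration can be recovered from it. A short sketch; unpickling assumes you trust the repository, since `torch.load` without `weights_only=True` executes arbitrary pickle code:

```python
# Sketch: recover the saved TrainingArguments to cross-check the card's
# hyperparameter list against what the Trainer actually ran with.
# Requires transformers installed, since unpickling rebuilds that class.
import torch

training_args = torch.load("training_args.bin", weights_only=False)
print(training_args.learning_rate)     # expected: 0.0001 per the model card
print(training_args.num_train_epochs)  # expected: 40 per the model card
```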