Model save
- README.md +406 -0
- best/config.json +17 -0
- best/model.safetensors +3 -0
- best/training_args.bin +3 -0
- config.json +17 -0
- model.safetensors +3 -0
- training_args.bin +3 -0
README.md
ADDED
@@ -0,0 +1,406 @@
---
tags:
- generated_from_trainer
model-index:
- name: results_model3
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# results_model3

This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 4.5574

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 30
- num_epochs: 40
- mixed_precision_training: Native AMP

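As a rough illustration of what `lr_scheduler_type: linear` with 30 warmup steps implies, the sketch below computes the learning rate at a given optimizer step. The function name `linear_warmup_lr` and the `total_steps` default are assumptions made for this example: the total is inferred as 40 epochs times roughly 8,976 steps per epoch, which is what the step and epoch columns of the training results suggest (step 358400 at epoch 39.9287). It follows the usual linear-warmup, linear-decay rule and is not a copy of the Trainer's own code.

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=30, total_steps=359_040):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    `total_steps` is an inferred value (40 epochs x ~8,976 steps), not a
    number stated in the model card.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 15, 30, 179_520, 359_040):
        print(s, linear_warmup_lr(s))
```

With only 30 warmup steps out of several hundred thousand, the schedule is effectively a straight linear decay from `1e-4` to zero.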
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log | 0.1141 | 1024 | 6.0292 |
| No log | 0.2282 | 2048 | 5.6435 |
| No log | 0.3422 | 3072 | 5.3662 |
| No log | 0.4563 | 4096 | 5.1810 |
| No log | 0.5704 | 5120 | 5.0305 |
| No log | 0.6845 | 6144 | 4.9323 |
| No log | 0.7986 | 7168 | 4.8141 |
| No log | 0.9127 | 8192 | 4.7390 |
| 5.3905 | 1.0267 | 9216 | 4.7245 |
| 5.3905 | 1.1408 | 10240 | 4.5491 |
| 5.3905 | 1.2549 | 11264 | 4.5115 |
| 5.3905 | 1.3690 | 12288 | 4.5454 |
| 5.3905 | 1.4831 | 13312 | 4.4235 |
| 5.3905 | 1.5971 | 14336 | 4.3937 |
| 5.3905 | 1.7112 | 15360 | 4.3675 |
| 5.3905 | 1.8253 | 16384 | 4.3321 |
| 5.3905 | 1.9394 | 17408 | 4.3125 |
| 4.3288 | 2.0535 | 18432 | 4.2470 |
| 4.3288 | 2.1676 | 19456 | 4.3321 |
| 4.3288 | 2.2816 | 20480 | 4.2336 |
| 4.3288 | 2.3957 | 21504 | 4.2677 |
| 4.3288 | 2.5098 | 22528 | 4.2491 |
| 4.3288 | 2.6239 | 23552 | 4.3101 |
| 4.3288 | 2.7380 | 24576 | 4.3300 |
| 4.3288 | 2.8520 | 25600 | 4.3947 |
| 4.3288 | 2.9661 | 26624 | 4.3634 |
| 3.853 | 3.0802 | 27648 | 4.2981 |
| 3.853 | 3.1943 | 28672 | 4.4073 |
| 3.853 | 3.3084 | 29696 | 4.3586 |
| 3.853 | 3.4225 | 30720 | 4.5024 |
| 3.853 | 3.5365 | 31744 | 4.6206 |
| 3.853 | 3.6506 | 32768 | 4.5310 |
| 3.853 | 3.7647 | 33792 | 4.6789 |
| 3.853 | 3.8788 | 34816 | 4.5824 |
| 3.853 | 3.9929 | 35840 | 4.4508 |
| 3.5631 | 4.1070 | 36864 | 4.5873 |
| 3.5631 | 4.2210 | 37888 | 4.5861 |
| 3.5631 | 4.3351 | 38912 | 4.6930 |
| 3.5631 | 4.4492 | 39936 | 4.6392 |
| 3.5631 | 4.5633 | 40960 | 4.5271 |
| 3.5631 | 4.6774 | 41984 | 4.8197 |
| 3.5631 | 4.7914 | 43008 | 4.7696 |
| 3.5631 | 4.9055 | 44032 | 4.6841 |
| 3.3655 | 5.0196 | 45056 | 4.8238 |
| 3.3655 | 5.1337 | 46080 | 4.8119 |
| 3.3655 | 5.2478 | 47104 | 4.8520 |
| 3.3655 | 5.3619 | 48128 | 4.8713 |
| 3.3655 | 5.4759 | 49152 | 4.8548 |
| 3.3655 | 5.5900 | 50176 | 4.7750 |
| 3.3655 | 5.7041 | 51200 | 5.0075 |
| 3.3655 | 5.8182 | 52224 | 4.9843 |
| 3.3655 | 5.9323 | 53248 | 4.8895 |
| 3.2228 | 6.0463 | 54272 | 5.0877 |
| 3.2228 | 6.1604 | 55296 | 4.6181 |
| 3.2228 | 6.2745 | 56320 | 4.7398 |
| 3.2228 | 6.3886 | 57344 | 4.6617 |
| 3.2228 | 6.5027 | 58368 | 4.8633 |
| 3.2228 | 6.6168 | 59392 | 4.9870 |
| 3.2228 | 6.7308 | 60416 | 5.0021 |
| 3.2228 | 6.8449 | 61440 | 4.7422 |
| 3.2228 | 6.9590 | 62464 | 4.9250 |
| 3.1155 | 7.0731 | 63488 | 4.7348 |
| 3.1155 | 7.1872 | 64512 | 4.8952 |
| 3.1155 | 7.3012 | 65536 | 4.8318 |
| 3.1155 | 7.4153 | 66560 | 4.8476 |
| 3.1155 | 7.5294 | 67584 | 5.0057 |
| 3.1155 | 7.6435 | 68608 | 4.9427 |
| 3.1155 | 7.7576 | 69632 | 4.8623 |
| 3.1155 | 7.8717 | 70656 | 4.8452 |
| 3.1155 | 7.9857 | 71680 | 4.8021 |
| 3.0315 | 8.0998 | 72704 | 4.7947 |
| 3.0315 | 8.2139 | 73728 | 4.8884 |
| 3.0315 | 8.3280 | 74752 | 4.8440 |
| 3.0315 | 8.4421 | 75776 | 4.8452 |
| 3.0315 | 8.5561 | 76800 | 4.8929 |
| 3.0315 | 8.6702 | 77824 | 4.8412 |
| 3.0315 | 8.7843 | 78848 | 4.8926 |
| 3.0315 | 8.8984 | 79872 | 4.6749 |
| 2.9626 | 9.0125 | 80896 | 4.9204 |
| 2.9626 | 9.1266 | 81920 | 4.7448 |
| 2.9626 | 9.2406 | 82944 | 4.6707 |
| 2.9626 | 9.3547 | 83968 | 4.7409 |
| 2.9626 | 9.4688 | 84992 | 4.7350 |
| 2.9626 | 9.5829 | 86016 | 4.7875 |
| 2.9626 | 9.6970 | 87040 | 4.7588 |
| 2.9626 | 9.8111 | 88064 | 4.6576 |
| 2.9626 | 9.9251 | 89088 | 4.6979 |
| 2.9057 | 10.0392 | 90112 | 4.7358 |
| 2.9057 | 10.1533 | 91136 | 4.7766 |
| 2.9057 | 10.2674 | 92160 | 4.7182 |
| 2.9057 | 10.3815 | 93184 | 4.7157 |
| 2.9057 | 10.4955 | 94208 | 4.6437 |
| 2.9057 | 10.6096 | 95232 | 4.6506 |
| 2.9057 | 10.7237 | 96256 | 4.6468 |
| 2.9057 | 10.8378 | 97280 | 4.6038 |
| 2.9057 | 10.9519 | 98304 | 4.7661 |
| 2.857 | 11.0660 | 99328 | 4.7447 |
| 2.857 | 11.1800 | 100352 | 4.5949 |
| 2.857 | 11.2941 | 101376 | 4.6705 |
| 2.857 | 11.4082 | 102400 | 4.7022 |
| 2.857 | 11.5223 | 103424 | 4.6394 |
| 2.857 | 11.6364 | 104448 | 4.7558 |
| 2.857 | 11.7504 | 105472 | 4.7065 |
| 2.857 | 11.8645 | 106496 | 4.4721 |
| 2.857 | 11.9786 | 107520 | 4.6075 |
| 2.8152 | 12.0927 | 108544 | 4.6613 |
| 2.8152 | 12.2068 | 109568 | 4.6763 |
| 2.8152 | 12.3209 | 110592 | 4.5310 |
| 2.8152 | 12.4349 | 111616 | 4.6142 |
| 2.8152 | 12.5490 | 112640 | 4.5820 |
| 2.8152 | 12.6631 | 113664 | 4.6034 |
| 2.8152 | 12.7772 | 114688 | 4.6213 |
| 2.8152 | 12.8913 | 115712 | 4.6057 |
| 2.7789 | 13.0053 | 116736 | 4.6278 |
| 2.7789 | 13.1194 | 117760 | 4.5454 |
| 2.7789 | 13.2335 | 118784 | 4.6711 |
| 2.7789 | 13.3476 | 119808 | 4.5148 |
| 2.7789 | 13.4617 | 120832 | 4.5424 |
| 2.7789 | 13.5758 | 121856 | 4.5074 |
| 2.7789 | 13.6898 | 122880 | 4.5448 |
| 2.7789 | 13.8039 | 123904 | 4.5084 |
| 2.7789 | 13.9180 | 124928 | 4.4962 |
| 2.7457 | 14.0321 | 125952 | 4.5270 |
| 2.7457 | 14.1462 | 126976 | 4.3997 |
| 2.7457 | 14.2602 | 128000 | 4.5798 |
| 2.7457 | 14.3743 | 129024 | 4.5839 |
| 2.7457 | 14.4884 | 130048 | 4.5679 |
| 2.7457 | 14.6025 | 131072 | 4.4674 |
| 2.7457 | 14.7166 | 132096 | 4.4471 |
| 2.7457 | 14.8307 | 133120 | 4.3811 |
| 2.7457 | 14.9447 | 134144 | 4.4387 |
| 2.7165 | 15.0588 | 135168 | 4.4756 |
| 2.7165 | 15.1729 | 136192 | 4.5638 |
| 2.7165 | 15.2870 | 137216 | 4.4033 |
| 2.7165 | 15.4011 | 138240 | 4.4876 |
| 2.7165 | 15.5152 | 139264 | 4.3874 |
| 2.7165 | 15.6292 | 140288 | 4.4200 |
| 2.7165 | 15.7433 | 141312 | 4.5077 |
| 2.7165 | 15.8574 | 142336 | 4.4537 |
| 2.7165 | 15.9715 | 143360 | 4.4381 |
| 2.6895 | 16.0856 | 144384 | 4.5636 |
| 2.6895 | 16.1996 | 145408 | 4.3530 |
| 2.6895 | 16.3137 | 146432 | 4.3760 |
| 2.6895 | 16.4278 | 147456 | 4.4327 |
| 2.6895 | 16.5419 | 148480 | 4.3666 |
| 2.6895 | 16.6560 | 149504 | 4.3708 |
| 2.6895 | 16.7701 | 150528 | 4.3945 |
| 2.6895 | 16.8841 | 151552 | 4.3781 |
| 2.6895 | 16.9982 | 152576 | 4.4506 |
| 2.6652 | 17.1123 | 153600 | 4.3923 |
| 2.6652 | 17.2264 | 154624 | 4.4244 |
| 2.6652 | 17.3405 | 155648 | 4.4576 |
| 2.6652 | 17.4545 | 156672 | 4.5156 |
| 2.6652 | 17.5686 | 157696 | 4.4249 |
| 2.6652 | 17.6827 | 158720 | 4.3867 |
| 2.6652 | 17.7968 | 159744 | 4.4360 |
| 2.6652 | 17.9109 | 160768 | 4.4036 |
| 2.6426 | 18.0250 | 161792 | 4.3103 |
| 2.6426 | 18.1390 | 162816 | 4.4384 |
| 2.6426 | 18.2531 | 163840 | 4.4340 |
| 2.6426 | 18.3672 | 164864 | 4.4168 |
| 2.6426 | 18.4813 | 165888 | 4.3282 |
| 2.6426 | 18.5954 | 166912 | 4.3200 |
| 2.6426 | 18.7094 | 167936 | 4.2999 |
| 2.6426 | 18.8235 | 168960 | 4.4347 |
| 2.6426 | 18.9376 | 169984 | 4.4230 |
| 2.6219 | 19.0517 | 171008 | 4.4185 |
| 2.6219 | 19.1658 | 172032 | 4.3904 |
| 2.6219 | 19.2799 | 173056 | 4.4376 |
| 2.6219 | 19.3939 | 174080 | 4.3366 |
| 2.6219 | 19.5080 | 175104 | 4.4409 |
| 2.6219 | 19.6221 | 176128 | 4.3827 |
| 2.6219 | 19.7362 | 177152 | 4.4327 |
| 2.6219 | 19.8503 | 178176 | 4.4141 |
| 2.6219 | 19.9643 | 179200 | 4.4321 |
| 2.6027 | 20.0784 | 180224 | 4.2911 |
| 2.6027 | 20.1925 | 181248 | 4.3532 |
| 2.6027 | 20.3066 | 182272 | 4.3809 |
| 2.6027 | 20.4207 | 183296 | 4.3316 |
| 2.6027 | 20.5348 | 184320 | 4.4209 |
| 2.6027 | 20.6488 | 185344 | 4.4665 |
| 2.6027 | 20.7629 | 186368 | 4.4491 |
| 2.6027 | 20.8770 | 187392 | 4.5202 |
| 2.6027 | 20.9911 | 188416 | 4.3736 |
| 2.5844 | 21.1052 | 189440 | 4.3502 |
| 2.5844 | 21.2193 | 190464 | 4.4119 |
| 2.5844 | 21.3333 | 191488 | 4.5101 |
| 2.5844 | 21.4474 | 192512 | 4.4317 |
| 2.5844 | 21.5615 | 193536 | 4.4820 |
| 2.5844 | 21.6756 | 194560 | 4.3390 |
| 2.5844 | 21.7897 | 195584 | 4.5056 |
| 2.5844 | 21.9037 | 196608 | 4.3455 |
| 2.567 | 22.0178 | 197632 | 4.4092 |
| 2.567 | 22.1319 | 198656 | 4.4035 |
| 2.567 | 22.2460 | 199680 | 4.3419 |
| 2.567 | 22.3601 | 200704 | 4.3855 |
| 2.567 | 22.4742 | 201728 | 4.4563 |
| 2.567 | 22.5882 | 202752 | 4.3289 |
| 2.567 | 22.7023 | 203776 | 4.3813 |
| 2.567 | 22.8164 | 204800 | 4.4430 |
| 2.567 | 22.9305 | 205824 | 4.4219 |
| 2.5508 | 23.0446 | 206848 | 4.3792 |
| 2.5508 | 23.1586 | 207872 | 4.3852 |
| 2.5508 | 23.2727 | 208896 | 4.3416 |
| 2.5508 | 23.3868 | 209920 | 4.4151 |
| 2.5508 | 23.5009 | 210944 | 4.4419 |
| 2.5508 | 23.6150 | 211968 | 4.3499 |
| 2.5508 | 23.7291 | 212992 | 4.3682 |
| 2.5508 | 23.8431 | 214016 | 4.4015 |
| 2.5508 | 23.9572 | 215040 | 4.4304 |
| 2.5357 | 24.0713 | 216064 | 4.3552 |
| 2.5357 | 24.1854 | 217088 | 4.4245 |
| 2.5357 | 24.2995 | 218112 | 4.3834 |
| 2.5357 | 24.4135 | 219136 | 4.4137 |
| 2.5357 | 24.5276 | 220160 | 4.3576 |
| 2.5357 | 24.6417 | 221184 | 4.4199 |
| 2.5357 | 24.7558 | 222208 | 4.3972 |
| 2.5357 | 24.8699 | 223232 | 4.3985 |
| 2.5357 | 24.9840 | 224256 | 4.4293 |
| 2.5209 | 25.0980 | 225280 | 4.4578 |
| 2.5209 | 25.2121 | 226304 | 4.4607 |
| 2.5209 | 25.3262 | 227328 | 4.4757 |
| 2.5209 | 25.4403 | 228352 | 4.4839 |
| 2.5209 | 25.5544 | 229376 | 4.4599 |
| 2.5209 | 25.6684 | 230400 | 4.4425 |
| 2.5209 | 25.7825 | 231424 | 4.4190 |
| 2.5209 | 25.8966 | 232448 | 4.4437 |
| 2.5076 | 26.0107 | 233472 | 4.4285 |
| 2.5076 | 26.1248 | 234496 | 4.4859 |
| 2.5076 | 26.2389 | 235520 | 4.4197 |
| 2.5076 | 26.3529 | 236544 | 4.4417 |
| 2.5076 | 26.4670 | 237568 | 4.3522 |
| 2.5076 | 26.5811 | 238592 | 4.3813 |
| 2.5076 | 26.6952 | 239616 | 4.4286 |
| 2.5076 | 26.8093 | 240640 | 4.4190 |
| 2.5076 | 26.9234 | 241664 | 4.4582 |
| 2.4948 | 27.0374 | 242688 | 4.4244 |
| 2.4948 | 27.1515 | 243712 | 4.4591 |
| 2.4948 | 27.2656 | 244736 | 4.3931 |
| 2.4948 | 27.3797 | 245760 | 4.3863 |
| 2.4948 | 27.4938 | 246784 | 4.4838 |
| 2.4948 | 27.6078 | 247808 | 4.4084 |
| 2.4948 | 27.7219 | 248832 | 4.4773 |
| 2.4948 | 27.8360 | 249856 | 4.5461 |
| 2.4948 | 27.9501 | 250880 | 4.4207 |
| 2.4821 | 28.0642 | 251904 | 4.4936 |
| 2.4821 | 28.1783 | 252928 | 4.4527 |
| 2.4821 | 28.2923 | 253952 | 4.5058 |
| 2.4821 | 28.4064 | 254976 | 4.4861 |
| 2.4821 | 28.5205 | 256000 | 4.4809 |
| 2.4821 | 28.6346 | 257024 | 4.4766 |
| 2.4821 | 28.7487 | 258048 | 4.4536 |
| 2.4821 | 28.8627 | 259072 | 4.4361 |
| 2.4821 | 28.9768 | 260096 | 4.4896 |
| 2.4703 | 29.0909 | 261120 | 4.4662 |
| 2.4703 | 29.2050 | 262144 | 4.4739 |
| 2.4703 | 29.3191 | 263168 | 4.4503 |
| 2.4703 | 29.4332 | 264192 | 4.4752 |
| 2.4703 | 29.5472 | 265216 | 4.4353 |
| 2.4703 | 29.6613 | 266240 | 4.5091 |
| 2.4703 | 29.7754 | 267264 | 4.5209 |
| 2.4703 | 29.8895 | 268288 | 4.4949 |
| 2.4586 | 30.0036 | 269312 | 4.4982 |
| 2.4586 | 30.1176 | 270336 | 4.4525 |
| 2.4586 | 30.2317 | 271360 | 4.4358 |
| 2.4586 | 30.3458 | 272384 | 4.4653 |
| 2.4586 | 30.4599 | 273408 | 4.4581 |
| 2.4586 | 30.5740 | 274432 | 4.4629 |
| 2.4586 | 30.6881 | 275456 | 4.4738 |
| 2.4586 | 30.8021 | 276480 | 4.4619 |
| 2.4586 | 30.9162 | 277504 | 4.4367 |
| 2.4484 | 31.0303 | 278528 | 4.4268 |
| 2.4484 | 31.1444 | 279552 | 4.4684 |
| 2.4484 | 31.2585 | 280576 | 4.5192 |
| 2.4484 | 31.3725 | 281600 | 4.4801 |
| 2.4484 | 31.4866 | 282624 | 4.4949 |
| 2.4484 | 31.6007 | 283648 | 4.5451 |
| 2.4484 | 31.7148 | 284672 | 4.4388 |
| 2.4484 | 31.8289 | 285696 | 4.4732 |
| 2.4484 | 31.9430 | 286720 | 4.5024 |
| 2.4377 | 32.0570 | 287744 | 4.4771 |
| 2.4377 | 32.1711 | 288768 | 4.4566 |
| 2.4377 | 32.2852 | 289792 | 4.5154 |
| 2.4377 | 32.3993 | 290816 | 4.4805 |
| 2.4377 | 32.5134 | 291840 | 4.4655 |
| 2.4377 | 32.6275 | 292864 | 4.4778 |
| 2.4377 | 32.7415 | 293888 | 4.4634 |
| 2.4377 | 32.8556 | 294912 | 4.4257 |
| 2.4377 | 32.9697 | 295936 | 4.4286 |
| 2.4278 | 33.0838 | 296960 | 4.4249 |
| 2.4278 | 33.1979 | 297984 | 4.4411 |
| 2.4278 | 33.3119 | 299008 | 4.5060 |
| 2.4278 | 33.4260 | 300032 | 4.5065 |
| 2.4278 | 33.5401 | 301056 | 4.5621 |
| 2.4278 | 33.6542 | 302080 | 4.5226 |
| 2.4278 | 33.7683 | 303104 | 4.5252 |
| 2.4278 | 33.8824 | 304128 | 4.5045 |
| 2.4278 | 33.9964 | 305152 | 4.5031 |
| 2.4191 | 34.1105 | 306176 | 4.5350 |
| 2.4191 | 34.2246 | 307200 | 4.5197 |
| 2.4191 | 34.3387 | 308224 | 4.4793 |
| 2.4191 | 34.4528 | 309248 | 4.4792 |
| 2.4191 | 34.5668 | 310272 | 4.4857 |
| 2.4191 | 34.6809 | 311296 | 4.5201 |
| 2.4191 | 34.7950 | 312320 | 4.5428 |
| 2.4191 | 34.9091 | 313344 | 4.5224 |
| 2.4105 | 35.0232 | 314368 | 4.5737 |
| 2.4105 | 35.1373 | 315392 | 4.5060 |
| 2.4105 | 35.2513 | 316416 | 4.5159 |
| 2.4105 | 35.3654 | 317440 | 4.5259 |
| 2.4105 | 35.4795 | 318464 | 4.5303 |
| 2.4105 | 35.5936 | 319488 | 4.5362 |
| 2.4105 | 35.7077 | 320512 | 4.4980 |
| 2.4105 | 35.8217 | 321536 | 4.5364 |
| 2.4105 | 35.9358 | 322560 | 4.5109 |
| 2.4027 | 36.0499 | 323584 | 4.5286 |
| 2.4027 | 36.1640 | 324608 | 4.5492 |
| 2.4027 | 36.2781 | 325632 | 4.5502 |
| 2.4027 | 36.3922 | 326656 | 4.5376 |
| 2.4027 | 36.5062 | 327680 | 4.5548 |
| 2.4027 | 36.6203 | 328704 | 4.5585 |
| 2.4027 | 36.7344 | 329728 | 4.5391 |
| 2.4027 | 36.8485 | 330752 | 4.5169 |
| 2.4027 | 36.9626 | 331776 | 4.5645 |
| 2.3949 | 37.0766 | 332800 | 4.5315 |
| 2.3949 | 37.1907 | 333824 | 4.5602 |
| 2.3949 | 37.3048 | 334848 | 4.5836 |
| 2.3949 | 37.4189 | 335872 | 4.5543 |
| 2.3949 | 37.5330 | 336896 | 4.5701 |
| 2.3949 | 37.6471 | 337920 | 4.5637 |
| 2.3949 | 37.7611 | 338944 | 4.5593 |
| 2.3949 | 37.8752 | 339968 | 4.5251 |
| 2.3949 | 37.9893 | 340992 | 4.5588 |
| 2.3886 | 38.1034 | 342016 | 4.5381 |
| 2.3886 | 38.2175 | 343040 | 4.5579 |
| 2.3886 | 38.3316 | 344064 | 4.5815 |
| 2.3886 | 38.4456 | 345088 | 4.5365 |
| 2.3886 | 38.5597 | 346112 | 4.5599 |
| 2.3886 | 38.6738 | 347136 | 4.5676 |
| 2.3886 | 38.7879 | 348160 | 4.5568 |
| 2.3886 | 38.9020 | 349184 | 4.5606 |
| 2.3827 | 39.0160 | 350208 | 4.5582 |
| 2.3827 | 39.1301 | 351232 | 4.5555 |
| 2.3827 | 39.2442 | 352256 | 4.5742 |
| 2.3827 | 39.3583 | 353280 | 4.5567 |
| 2.3827 | 39.4724 | 354304 | 4.5556 |
| 2.3827 | 39.5865 | 355328 | 4.5705 |
| 2.3827 | 39.7005 | 356352 | 4.5531 |
| 2.3827 | 39.8146 | 357376 | 4.5598 |
| 2.3827 | 39.9287 | 358400 | 4.5574 |

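The reported evaluation loss (4.5574) is the value from the last row of the table, not the lowest one logged. A minimal sketch of picking the best evaluation step from such a log is shown below. It uses only a handful of rows copied from the table, and the names `EvalRow`, `ROWS`, and `best_row` are hypothetical helpers introduced for this example.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float


# A small sample of rows copied from the training results table.
ROWS = [
    EvalRow(0.1141, 1024, 6.0292),
    EvalRow(2.0535, 18432, 4.2470),
    EvalRow(2.2816, 20480, 4.2336),
    EvalRow(20.0784, 180224, 4.2911),
    EvalRow(39.9287, 358400, 4.5574),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


best = best_row(ROWS)
final = ROWS[-1]
# Positive gap: the final checkpoint is worse than the best sampled one.
gap = final.val_loss - best.val_loss

if __name__ == "__main__":
    print(f"best step {best.step} (epoch {best.epoch}): {best.val_loss}")
    print(f"final step {final.step}: {final.val_loss} (gap {gap:.4f})")
```

In this sample the lowest validation loss comes early (epoch 2.28) while the training loss keeps falling through epoch 40, a pattern that usually points to overfitting. Whether the `best/` directory in this commit corresponds to that step is not stated in the source.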

### Framework versions

- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1
best/config.json
ADDED
@@ -0,0 +1,17 @@
{
  "architectures": [
    "Transformer"
  ],
  "d_model": 512,
  "dim_feedforward": 1024,
  "dropout": 0.1,
  "input_dim": 30000,
  "max_seq_len": 2000,
  "model_type": "transformer",
  "nhead": 8,
  "num_decoder_layers": 4,
  "num_encoder_layers": 4,
  "output_dim": 30000,
  "torch_dtype": "float32",
  "transformers_version": "4.40.2"
}
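A quick sanity check of the configuration values can make their relationships explicit. The sketch below assumes the model uses standard multi-head attention, where `d_model` must be divisible by `nhead`; the commit does not include the model code, so this is an assumption. `check_config` and `CONFIG_JSON` are hypothetical names, and the JSON string holds only a subset of the fields from `config.json`.

```python
import json

# Subset of the fields from config.json, copied as-is.
CONFIG_JSON = """
{
  "d_model": 512,
  "dim_feedforward": 1024,
  "nhead": 8,
  "num_encoder_layers": 4,
  "num_decoder_layers": 4,
  "input_dim": 30000,
  "output_dim": 30000,
  "max_seq_len": 2000
}
"""


def check_config(cfg):
    """Derive a few quantities implied by the config.

    Assumes standard multi-head attention (d_model split evenly over heads).
    """
    if cfg["d_model"] % cfg["nhead"] != 0:
        raise ValueError("d_model must be divisible by nhead")
    return {
        "head_dim": cfg["d_model"] // cfg["nhead"],
        "ffn_ratio": cfg["dim_feedforward"] / cfg["d_model"],
        "same_vocab_size": cfg["input_dim"] == cfg["output_dim"],
    }


summary = check_config(json.loads(CONFIG_JSON))

if __name__ == "__main__":
    print(summary)
```

Under that assumption each of the 8 heads works on 64 dimensions, and the feed-forward layer is twice the model width.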
best/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:003434216bf2d6d80d5a96c2e7c47a11efda762540401639ed352d27e7407972
size 211234576
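The three lines above are a Git LFS pointer, not the weights themselves: the repository stores the spec version, a SHA-256 object ID, and the byte size, while the real file lives in LFS storage. A minimal sketch of reading such a pointer follows. `parse_lfs_pointer` is a hypothetical helper, and the parameter estimate assumes every tensor is stored as 4-byte float32 (as `torch_dtype` in `config.json` suggests) and ignores the safetensors header, so it is only an upper bound.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:003434216bf2d6d80d5a96c2e7c47a11efda762540401639ed352d27e7407972
size 211234576
"""


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
# Upper bound on parameter count, assuming 4 bytes per float32 value.
max_float32_params = pointer["size"] // 4

if __name__ == "__main__":
    print(pointer["hash_algo"], pointer["size"], max_float32_params)
```

By this estimate the checkpoint holds at most about 52.8 million float32 values. The pointers for `best/model.safetensors` and `model.safetensors` carry the same object ID, so both paths refer to identical file contents.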
best/training_args.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66df40aa38654c384f06406e1519aa64c20a000003dea0465f50174d4a352725
size 5112
config.json
ADDED
@@ -0,0 +1,17 @@
{
  "architectures": [
    "Transformer"
  ],
  "d_model": 512,
  "dim_feedforward": 1024,
  "dropout": 0.1,
  "input_dim": 30000,
  "max_seq_len": 2000,
  "model_type": "transformer",
  "nhead": 8,
  "num_decoder_layers": 4,
  "num_encoder_layers": 4,
  "output_dim": 30000,
  "torch_dtype": "float32",
  "transformers_version": "4.40.2"
}
model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:003434216bf2d6d80d5a96c2e7c47a11efda762540401639ed352d27e7407972
size 211234576
training_args.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66df40aa38654c384f06406e1519aa64c20a000003dea0465f50174d4a352725
size 5112