End of training
README.md (CHANGED)
This student model is distilled from the teacher model roneneldan/TinyStories-3….

The [Distily](https://github.com/lapp0/distily) library was used for this distillation.

It achieves the following results on the evaluation set:
- eval_enwikippl: 184.3409
- eval_frwikippl: 58809.4336
- eval_zhwikippl: 498418.7812
- eval_tinystoriesppl: 10.4219
- eval_loss: 1.3030
- eval_runtime: 6.5437
- eval_samples_per_second: 76.409
- eval_steps_per_second: 9.628
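The `*ppl` entries are perplexities on the corpora their names suggest (English, French, and Chinese Wikipedia, plus TinyStories). As a minimal sketch of the usual convention, assuming (the card does not confirm this) that each figure is `exp` of the mean per-token negative log-likelihood:

```python
import math

# Hedged sketch of the standard perplexity convention, not Distily's code:
# perplexity = exp(mean per-token negative log-likelihood).
def perplexity(mean_nll: float) -> float:
    return math.exp(mean_nll)

# Under that convention, eval_enwikippl = 184.3409 corresponds to a mean
# NLL of ln(184.3409) ≈ 5.2168 nats per token on English Wikipedia.
print(round(perplexity(5.2168), 1))  # -> 184.3
```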
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
The following hyperparameters were used during training:
- distillation_objective: DistillationObjective(logits_loss_component=LossComponent(label=logits, weight=1, loss_fn=kl, layer_mapper=None, projector=None), hs_loss_component=LossComponent(label=hs, weight=0, loss_fn=None, layer_mapper=None, projector=None), attn_loss_component=LossComponent(label=attn, weight=0, loss_fn=None, layer_mapper=None, projector=None))
- train_embeddings: True
- learning_rate: 0.004
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- num_epochs: 1.0
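The `distillation_objective` above puts weight 1 on a KL loss over the logits and weight 0 on the hidden-state and attention components, i.e. pure logit distillation. A minimal PyTorch sketch of such a KL logits loss (an illustration of the technique, not Distily's actual `DistillationObjective` implementation):

```python
import torch
import torch.nn.functional as F

def logits_kl_loss(student_logits: torch.Tensor,
                   teacher_logits: torch.Tensor) -> torch.Tensor:
    """KL(teacher || student) over the vocabulary, averaged across tokens."""
    student_logp = F.log_softmax(student_logits, dim=-1)
    teacher_logp = F.log_softmax(teacher_logits, dim=-1)
    return F.kl_div(student_logp, teacher_logp,
                    log_target=True, reduction="batchmean")

# Toy check: identical logits give zero loss.
logits = torch.randn(16, 50257)  # 16 token positions, GPT-2-size vocab
assert logits_kl_loss(logits, logits) < 1e-6
```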
### Resource Usage
Peak GPU Memory: 6.6047 GB
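The card does not say how the peak-memory figure was measured; a common way to produce such a number in PyTorch (an assumption about the methodology, not Distily's code) is:

```python
import torch

# Hedged sketch: track peak allocated CUDA memory around a training run.
torch.cuda.reset_peak_memory_stats()
# ... training loop would run here ...
peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"Peak GPU Memory: {peak_gb:.4f} GB")  # e.g. the 6.6047 GB above
```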
### Eval-Phase Metrics
| step | epoch | enwikippl | frwikippl | loss | runtime | samples_per_second | steps_per_second | tinystoriesppl | zhwikippl |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **teacher eval** | | 169.9865 | 47377.9414 | | | | | 3.9789 | 4998.1294 |
| 0 | 0 | 9095.8965 | 52350.1094 | 6.1545 | 6.5255 | 76.623 | 9.654 | 3753.9187 | 61167.5938 |
| 500 | 0.0101 | 243.8593 | 125008.3047 | 1.4706 | 6.4961 | 76.969 | 9.698 | 11.1718 | 994172.1875 |
| 1000 | 0.0202 | 201.1267 | 66485.8203 | 1.3342 | 6.5302 | 76.567 | 9.647 | 10.9148 | 584009.625 |
| 1500 | 0.0303 | 185.1781 | 63189.5391 | 1.3091 | 6.4906 | 77.035 | 9.706 | 10.4357 | 522659.3125 |
| 2000 | 0.0404 | 184.7626 | 62235.5703 | 1.3033 | 6.5025 | 76.893 | 9.689 | 10.3532 | 500283.6562 |
| 2500 | 0.0505 | 190.2160 | 64720.9258 | 1.3022 | 6.502 | 76.9 | 9.689 | 10.4971 | 539807.5625 |
| 3000 | 0.0606 | 187.8947 | 62657.7461 | 1.3021 | 6.5475 | 76.365 | 9.622 | 10.4232 | 546764.75 |
| 3500 | 0.0707 | 187.1394 | 64284.8477 | 1.3019 | 6.5077 | 76.832 | 9.681 | 10.3626 | 568179.5 |
| 4000 | 0.0808 | 187.8584 | 63189.5391 | 1.3016 | 6.5082 | 76.826 | 9.68 | 10.5624 | 544435.75 |
| 4500 | 0.0909 | 185.7313 | 64230.4922 | 1.3024 | 6.5949 | 75.816 | 9.553 | 10.3738 | 557964.625 |
| 5000 | 0.1010 | 188.6459 | 64407.2109 | 1.3027 | 6.5061 | 76.851 | 9.683 | 10.5568 | 520155.6562 |
| 5500 | 0.1111 | 184.7412 | 58677.0 | 1.3028 | 6.5056 | 76.857 | 9.684 | 10.5502 | 500017.0312 |
| 6000 | 0.1212 | 187.4877 | 65491.2656 | 1.3021 | 6.4967 | 76.962 | 9.697 | 10.3348 | 554995.375 |
| 6500 | 0.1313 | 186.3582 | 64266.7031 | 1.3021 | 6.5693 | 76.112 | 9.59 | 10.3983 | 553221.4375 |
| 7000 | 0.1414 | 187.1901 | 63242.9531 | 1.3018 | 6.544 | 76.406 | 9.627 | 10.5450 | 506461.375 |
| 7500 | 0.1515 | 186.2933 | 59626.8398 | 1.3022 | 6.5016 | 76.904 | 9.69 | 10.5895 | 530102.25 |
| 8000 | 0.1616 | 186.3437 | 63546.5742 | 1.3018 | 6.5127 | 76.773 | 9.673 | 10.3340 | 530385.4375 |
| 8500 | 0.1717 | 185.1208 | 60319.5742 | 1.3025 | 6.4954 | 76.977 | 9.699 | 10.4469 | 510258.875 |
| 9000 | 0.1818 | 181.6476 | 60549.3633 | 1.3035 | 6.5012 | 76.909 | 9.691 | 10.2438 | 524055.8125 |
| 9500 | 0.1919 | 184.3409 | 58809.4336 | 1.3030 | 6.5437 | 76.409 | 9.628 | 10.4219 | 498418.7812 |
| 10000 | 0.2020 | 186.4449 | 62923.0820 | 1.3019 | 6.5079 | 76.83 | 9.681 | 10.4482 | 542695.5625 |
| 10500 | 0.2121 | 188.1788 | 63318.7539 | 1.3030 | 6.4961 | 76.969 | 9.698 | 10.5502 | 541827.25 |
| 11000 | 0.2222 | 187.6693 | 64289.3242 | 1.3024 | 6.5016 | 76.905 | 9.69 | 10.4733 | 553221.4375 |
| 11500 | 0.2323 | 185.5802 | 64248.625 | 1.3021 | 6.5073 | 76.837 | 9.681 | 10.2012 | 533080.6875 |
| 12000 | 0.2424 | 189.5834 | 63672.0234 | 1.3022 | 6.5265 | 76.611 | 9.653 | 10.5284 | 528407.875 |
| 12500 | 0.2525 | 180.2249 | 59601.6523 | 1.3024 | 6.5297 | 76.573 | 9.648 | 10.1570 | 495766.4062 |
| 13000 | 0.2626 | 182.8547 | 61321.8477 | 1.3034 | 6.5338 | 76.525 | 9.642 | 10.2544 | 525455.5625 |
| 13500 | 0.2727 | 187.0524 | 64284.8477 | 1.3028 | 6.4889 | 77.054 | 9.709 | 10.3267 | 553516.4375 |
| 14000 | 0.2828 | 185.0635 | 62042.9727 | 1.3031 | 6.4841 | 77.112 | 9.716 | 10.4310 | 506461.375 |
| 14500 | 0.2929 | 186.0697 | 62437.5156 | 1.3017 | 6.5099 | 76.806 | 9.678 | 10.3867 | 528267.3125 |
| 15000 | 0.3030 | 184.3052 | 59929.9531 | 1.3035 | 6.4988 | 76.937 | 9.694 | 10.4081 | 489719.125 |
| 15500 | 0.3131 | 186.6616 | 62604.8242 | 1.3016 | 6.4906 | 77.035 | 9.706 | 10.4435 | 530668.25 |
| 16000 | 0.3232 | 189.1726 | 65017.875 | 1.3019 | 6.4903 | 77.038 | 9.707 | 10.4482 | 533792.375 |
| 16500 | 0.3333 | 187.0235 | 61572.8906 | 1.3026 | 6.5061 | 76.851 | 9.683 | 10.3716 | 514908.6875 |
| 17000 | 0.3434 | 187.6693 | 63296.4727 | 1.3018 | 6.5151 | 76.745 | 9.67 | 10.4456 | 564854.625 |
| 17500 | 0.3535 | 186.0697 | 63887.6484 | 1.3026 | 6.5001 | 76.922 | 9.692 | 10.4984 | 529254.625 |
| 18000 | 0.3636 | 185.4366 | 61538.1992 | 1.3022 | 6.5727 | 76.072 | 9.585 | 10.4500 | 524475.3125 |
| 18500 | 0.3737 | 184.0769 | 60874.3516 | 1.3021 | 6.5196 | 76.692 | 9.663 | 10.2820 | 501887.875 |
| 19000 | 0.3838 | 187.3134 | 62834.5117 | 1.3021 | 6.5518 | 76.315 | 9.616 | 10.5154 | 515458.4688 |
| 19500 | 0.3939 | 187.0235 | 61851.0312 | 1.3027 | 6.5237 | 76.643 | 9.657 | 10.4323 | 531234.875 |
| 20000 | 0.4040 | 185.3360 | 59694.0898 | 1.3029 | 6.4861 | 77.088 | 9.713 | 10.4937 | 489719.125 |
| 20500 | 0.4141 | 185.0204 | 61746.5977 | 1.3032 | 6.4963 | 76.967 | 9.698 | 10.3011 | 531802.625 |
| 21000 | 0.4242 | 186.9511 | 65935.5312 | 1.3018 | 6.5055 | 76.858 | 9.684 | 10.3541 | 560351.5 |
| 21500 | 0.4343 | 184.4909 | 61642.2812 | 1.3018 | 6.4899 | 77.042 | 9.707 | 10.4206 | 499483.7188 |
| 22000 | 0.4444 | 186.8424 | 60728.7539 | 1.3014 | 6.4896 | 77.046 | 9.708 | 10.5957 | 501085.375 |
| 22500 | 0.4545 | 186.3437 | 60711.6719 | 1.3017 | 6.6171 | 75.561 | 9.521 | 10.4323 | 501620.375 |
| 23000 | 0.4646 | 184.1268 | 61494.8438 | 1.3016 | 6.5961 | 75.803 | 9.551 | 10.3293 | 515458.4688 |
| 23500 | 0.4747 | 187.1104 | 64284.8477 | 1.3023 | 6.4979 | 76.948 | 9.696 | 10.4508 | 528690.125 |
| 24000 | 0.4848 | 185.9689 | 62332.0352 | 1.3023 | 6.4867 | 77.08 | 9.712 | 10.3674 | 536362.1875 |
| 24500 | 0.4949 | 186.0769 | 62402.3359 | 1.3023 | 6.5117 | 76.785 | 9.675 | 10.3545 | 517940.0 |
| 25000 | 0.5051 | 185.3647 | 62587.2148 | 1.3016 | 6.4943 | 76.99 | 9.701 | 10.3721 | 528831.3125 |
| 25500 | 0.5152 | 186.1995 | 63011.7734 | 1.3026 | 6.515 | 76.746 | 9.67 | 10.4357 | 506461.375 |
| 26000 | 0.5253 | 187.0524 | 62799.1641 | 1.3022 | 6.5338 | 76.526 | 9.642 | 10.4137 | 506731.9375 |
| 26500 | 0.5354 | 186.1995 | 62657.7461 | 1.3022 | 6.5427 | 76.421 | 9.629 | 10.3879 | 500817.8125 |
| 27000 | 0.5455 | 186.0697 | 61999.3203 | 1.3021 | 6.4978 | 76.949 | 9.696 | 10.4167 | 514359.4688 |
| 27500 | 0.5556 | 187.0235 | 63296.4727 | 1.3026 | 6.4941 | 76.993 | 9.701 | 10.4297 | 520155.6562 |
| 28000 | 0.5657 | 187.0307 | 62112.9492 | 1.3019 | 6.491 | 77.03 | 9.706 | 10.4850 | 505651.5625 |
| 28500 | 0.5758 | 185.9976 | 62976.2695 | 1.3018 | 6.4918 | 77.021 | 9.705 | 10.3618 | 507814.625 |
| 29000 | 0.5859 | 187.1829 | 64575.2383 | 1.3024 | 6.4998 | 76.926 | 9.693 | 10.4586 | 538369.0625 |
| 29500 | 0.5960 | 186.5315 | 64176.2422 | 1.3023 | 6.5299 | 76.571 | 9.648 | 10.3652 | 549397.0 |
| 30000 | 0.6061 | 187.8438 | 62463.9023 | 1.3026 | 6.528 | 76.594 | 9.651 | 10.5163 | 528972.0625 |
| 30500 | 0.6162 | 186.2427 | 62025.5195 | 1.3021 | 6.5132 | 76.768 | 9.673 | 10.3738 | 520433.0 |
| 31000 | 0.6263 | 187.3425 | 63510.7656 | 1.3027 | 6.4955 | 76.976 | 9.699 | 10.4129 | 526297.625 |
| 31500 | 0.6364 | 186.8569 | 61964.3867 | 1.3025 | 6.4926 | 77.011 | 9.703 | 10.4850 | 507814.625 |
| 32000 | 0.6465 | 187.5240 | 63332.0977 | 1.3023 | 6.4876 | 77.07 | 9.711 | 10.4742 | 522101.8438 |
| 32500 | 0.6566 | 186.2861 | 63367.8047 | 1.3026 | 6.4849 | 77.102 | 9.715 | 10.3554 | 519323.4375 |
| 33000 | 0.6667 | 186.8931 | 63689.9375 | 1.3017 | 6.4831 | 77.124 | 9.718 | 10.3918 | 512715.4062 |
| 33500 | 0.6768 | 186.5604 | 62481.4766 | 1.3019 | 6.4927 | 77.009 | 9.703 | 10.4603 | 514084.8438 |
| 34000 | 0.6869 | 184.0769 | 61546.8828 | 1.3023 | 6.4923 | 77.015 | 9.704 | 10.2982 | 496560.4062 |
| 34500 | 0.6970 | 186.6327 | 63082.8438 | 1.3019 | 6.5074 | 76.835 | 9.681 | 10.4202 | 525455.5625 |
| 35000 | 0.7071 | 186.0409 | 62640.1211 | 1.3019 | 6.5011 | 76.91 | 9.691 | 10.3639 | 514084.8438 |
| 35500 | 0.7172 | 187.6039 | 63707.9219 | 1.3019 | 6.4981 | 76.946 | 9.695 | 10.4348 | 522380.75 |
| 36000 | 0.7273 | 186.3293 | 61503.5234 | 1.3017 | 6.5057 | 76.855 | 9.684 | 10.4405 | 502155.9688 |
| 36500 | 0.7374 | 185.9689 | 62104.1836 | 1.3017 | 6.5178 | 76.713 | 9.666 | 10.4500 | 502960.2188 |
| 37000 | 0.7475 | 187.1611 | 63672.0234 | 1.3019 | 6.5124 | 76.776 | 9.674 | 10.4625 | 526578.75 |
| 37500 | 0.7576 | 187.6329 | 64212.4258 | 1.3021 | 6.5257 | 76.62 | 9.654 | 10.4219 | 535504.0625 |
| 38000 | 0.7677 | 186.3005 | 63367.8047 | 1.3022 | 6.5192 | 76.696 | 9.664 | 10.3669 | 527703.875 |
| 38500 | 0.7778 | 186.4738 | 63618.2461 | 1.3023 | 6.5003 | 76.919 | 9.692 | 10.3712 | 526016.625 |
| 39000 | 0.7879 | 187.4296 | 63528.6992 | 1.3019 | 6.5045 | 76.87 | 9.686 | 10.4172 | 528267.3125 |
| 39500 | 0.7980 | 186.7194 | 63403.5312 | 1.3022 | 6.4969 | 76.96 | 9.697 | 10.3845 | 524895.125 |
| 40000 | 0.8081 | 187.0089 | 62905.3789 | 1.3020 | 6.5261 | 76.616 | 9.654 | 10.4500 | 521267.0 |
| 40500 | 0.8182 | 186.1706 | 62384.7812 | 1.3023 | 6.5159 | 76.735 | 9.669 | 10.3442 | 511895.0938 |
| 41000 | 0.8283 | 186.3582 | 62728.4141 | 1.3019 | 6.5084 | 76.824 | 9.68 | 10.4120 | 516835.75 |
| 41500 | 0.8384 | 187.4441 | 63995.7031 | 1.3019 | 6.5048 | 76.866 | 9.685 | 10.4211 | 529396.0 |
| 42000 | 0.8485 | 187.6329 | 64104.0039 | 1.3021 | 6.4997 | 76.926 | 9.693 | 10.4370 | 530102.25 |
| 42500 | 0.8586 | 186.6761 | 63296.4727 | 1.3021 | 6.5063 | 76.849 | 9.683 | 10.3721 | 520988.6875 |
| 43000 | 0.8687 | 187.4877 | 63582.3984 | 1.3020 | 6.4889 | 77.055 | 9.709 | 10.4103 | 523217.375 |
| 43500 | 0.8788 | 186.7484 | 63047.3008 | 1.3021 | 6.489 | 77.054 | 9.709 | 10.4034 | 521267.0 |
| 44000 | 0.8889 | 187.2409 | 62940.8438 | 1.3019 | 6.4957 | 76.974 | 9.699 | 10.4370 | 520711.0312 |
| 44500 | 0.8990 | 186.4449 | 62905.3789 | 1.3021 | 6.5115 | 76.787 | 9.675 | 10.3978 | 521544.9688 |
| 45000 | 0.9091 | 186.5315 | 63047.3008 | 1.3019 | 6.5135 | 76.764 | 9.672 | 10.4008 | 520711.0312 |
| 45500 | 0.9192 | 186.4449 | 62869.9336 | 1.3019 | 6.5176 | 76.715 | 9.666 | 10.4013 | 514633.75 |
| 46000 | 0.9293 | 186.5460 | 62799.1641 | 1.3019 | 6.505 | 76.864 | 9.685 | 10.4120 | 513810.875 |
| 46500 | 0.9394 | 186.4449 | 62551.9492 | 1.3020 | 6.5435 | 76.412 | 9.628 | 10.3922 | 513810.875 |
| 47000 | 0.9495 | 186.5315 | 62587.2148 | 1.3020 | 6.5419 | 76.431 | 9.63 | 10.4094 | 515458.4688 |
| 47500 | 0.9596 | 186.5460 | 62587.2148 | 1.3019 | 6.5054 | 76.859 | 9.684 | 10.4094 | 516008.8438 |
| 48000 | 0.9697 | 186.8352 | 62763.7812 | 1.3019 | 6.5119 | 76.783 | 9.675 | 10.4262 | 517111.3438 |
| 48500 | 0.9798 | 186.7339 | 62728.4141 | 1.3019 | 6.5272 | 76.603 | 9.652 | 10.4155 | 516008.8438 |
| 49000 | 0.9899 | 186.7918 | 62728.4141 | 1.3019 | 6.5054 | 76.86 | 9.684 | 10.4172 | 515733.8125 |
| 49500 | 1.0 | 186.7918 | 62728.4141 | 1.3019 | 6.5072 | 76.838 | 9.682 | 10.4215 | 515733.8125 |
### Framework versions
- Distily 0.2.0
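The auto-generated card stops short of a usage example. A hedged sketch using the standard `transformers` API; `REPO_ID` below is a placeholder, since this section does not state the model's actual Hub id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "your-username/your-distilled-model"  # placeholder, not the real id

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(REPO_ID)

# Generate a short continuation with the distilled student model.
inputs = tokenizer("Once upon a time", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```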
logs/dropout=0, learning_rate=0.004, per_device_train_batch_size=1, weight_decay=0.001/events.out.tfevents.1723906923.5f530b1cf724 (ADDED)

version https://git-lfs.github.com/spec/v1
oid sha256:2cf00ae38431dc17de9bdf10c5b94497dd15b92b6626d0caee0314f373893ff3
size 312