2023-10-11 09:39:50,455 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,457 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 09:39:50,457 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,458 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
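The corpus line corresponds to Flair's built-in HIPE-2022 loader for the NewsEye Swedish bundle. A rough sketch of loading it and deriving the label dictionary follows; the exact keyword names of the loader (and the document-separator flag suggested by the cache path) are assumptions, so check the NER_HIPE_2022 signature of your Flair version.

# Sketch: loading the NewsEye/Swedish HIPE-2022 data with Flair's dataset class.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(
    dataset_name="newseye",        # HIPE-2022 NewsEye bundle
    language="sv",                 # Swedish split: 1085 train / 148 dev / 364 test sentences
    add_document_separator=True,   # assumed; mirrors the ".../with_doc_seperator" cache path
)
print(corpus)

# Label dictionary for the "ner" layer (the 17 BIOES tags in this corpus).
label_dict = corpus.make_label_dictionary(label_type="ner")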
2023-10-11 09:39:50,458 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,458 Train: 1085 sentences
2023-10-11 09:39:50,458 (train_with_dev=False, train_with_test=False)
2023-10-11 09:39:50,458 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,458 Training Params:
2023-10-11 09:39:50,458 - learning_rate: "0.00015"
2023-10-11 09:39:50,458 - mini_batch_size: "4"
2023-10-11 09:39:50,458 - max_epochs: "10"
2023-10-11 09:39:50,458 - shuffle: "True"
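These four parameters map directly onto a ModelTrainer.fine_tune() call. A minimal sketch, reusing the tagger and corpus from the sketches above; the argument names follow Flair's public trainer API, but the exact call behind this log is not shown, so treat the combination as an assumption.

# Sketch: fine-tuning with the logged hyperparameters ("tagger" and "corpus" as above).
from flair.trainers import ModelTrainer

base_path = (
    "hmbench-newseye/sv-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    base_path,
    learning_rate=0.00015,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    embeddings_storage_mode="none",                    # "embedding storage: none" below
    main_evaluation_metric=("micro avg", "f1-score"),  # best-model selection metric below
)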
2023-10-11 09:39:50,458 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 Plugins:
2023-10-11 09:39:50,459 - TensorboardLogger
2023-10-11 09:39:50,459 - LinearScheduler | warmup_fraction: '0.1'
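Of the two plugins, the linear warmup/decay scheduler is what fine_tune() installs by itself in recent Flair versions (its warmup_fraction default is 0.1, matching the log); the TensorBoard logger has to be passed in explicitly. A sketch, with the plugin import path and log directory as assumptions:

# Sketch: attaching the TensorBoard plugin; import path and log_dir are assumptions.
from flair.trainers.plugins import TensorboardLogger

plugins = [TensorboardLogger(log_dir=f"{base_path}/tensorboard")]
# passed to the call above, e.g.:
# trainer.fine_tune(base_path, ..., warmup_fraction=0.1, plugins=plugins)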
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 09:39:50,459 - metric: "('micro avg', 'f1-score')"
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 Computation:
2023-10-11 09:39:50,459 - compute on device: cuda:0
2023-10-11 09:39:50,459 - embedding storage: none
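Device selection in Flair is a module-level setting; a two-line sketch matching the logged device is shown below. The embedding storage mode is the embeddings_storage_mode argument already included in the fine_tune sketch earlier.

# Sketch: pinning training to the first GPU, matching "compute on device: cuda:0".
import torch
import flair

flair.device = torch.device("cuda:0")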
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
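Flair writes best-model.pt (and final-model.pt) under this base path, so once training is done the checkpoint can be loaded for tagging. A minimal sketch using Flair's standard load/predict API; the example sentence is made up.

# Sketch: loading the saved best checkpoint for inference; the sentence is a made-up example.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/sv-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("Konungen reste från Stockholm till Uppsala .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 3))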
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,460 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 09:40:00,360 epoch 1 - iter 27/272 - loss 2.84978598 - time (sec): 9.90 - samples/sec: 546.87 - lr: 0.000014 - momentum: 0.000000
2023-10-11 09:40:09,671 epoch 1 - iter 54/272 - loss 2.83993499 - time (sec): 19.21 - samples/sec: 513.66 - lr: 0.000029 - momentum: 0.000000
2023-10-11 09:40:19,438 epoch 1 - iter 81/272 - loss 2.82028870 - time (sec): 28.98 - samples/sec: 522.42 - lr: 0.000044 - momentum: 0.000000
2023-10-11 09:40:29,609 epoch 1 - iter 108/272 - loss 2.75600317 - time (sec): 39.15 - samples/sec: 536.66 - lr: 0.000059 - momentum: 0.000000
2023-10-11 09:40:39,237 epoch 1 - iter 135/272 - loss 2.66472267 - time (sec): 48.78 - samples/sec: 540.07 - lr: 0.000074 - momentum: 0.000000
2023-10-11 09:40:48,005 epoch 1 - iter 162/272 - loss 2.58070954 - time (sec): 57.54 - samples/sec: 532.22 - lr: 0.000089 - momentum: 0.000000
2023-10-11 09:40:57,344 epoch 1 - iter 189/272 - loss 2.47427847 - time (sec): 66.88 - samples/sec: 532.94 - lr: 0.000104 - momentum: 0.000000
2023-10-11 09:41:06,659 epoch 1 - iter 216/272 - loss 2.36367265 - time (sec): 76.20 - samples/sec: 534.61 - lr: 0.000119 - momentum: 0.000000
2023-10-11 09:41:17,028 epoch 1 - iter 243/272 - loss 2.22144061 - time (sec): 86.57 - samples/sec: 539.03 - lr: 0.000133 - momentum: 0.000000
2023-10-11 09:41:26,523 epoch 1 - iter 270/272 - loss 2.10748412 - time (sec): 96.06 - samples/sec: 540.43 - lr: 0.000148 - momentum: 0.000000
2023-10-11 09:41:26,853 ----------------------------------------------------------------------------------------------------
2023-10-11 09:41:26,853 EPOCH 1 done: loss 2.1060 - lr: 0.000148
2023-10-11 09:41:32,029 DEV : loss 0.8020414710044861 - f1-score (micro avg) 0.0
2023-10-11 09:41:32,037 ----------------------------------------------------------------------------------------------------
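The learning-rate column makes the schedule visible: with 272 mini-batches per epoch over 10 epochs, a warmup fraction of 0.1 means the rate ramps up over exactly the first epoch (peaking near 0.00015) and then decays linearly towards zero. A small sketch that reproduces the logged values, up to rounding and step-offset conventions:

# Sketch: the linear warmup/decay schedule implied by the log (272 iters/epoch, 10 epochs,
# peak lr 0.00015, warmup_fraction 0.1). Logged values may differ by one step of offset.
def linear_lr(step, total_steps=272 * 10, peak_lr=0.00015, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)  # 272 steps = exactly epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(round(linear_lr(27), 6))   # ~0.000015 (log: 0.000014 at epoch 1, iter 27)
print(round(linear_lr(270), 6))  # ~0.000149 (log: 0.000148 at epoch 1, iter 270)
print(round(linear_lr(816), 6))  # ~0.000117 (log: 0.000117 at the end of epoch 3)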
2023-10-11 09:41:42,064 epoch 2 - iter 27/272 - loss 0.76893037 - time (sec): 10.03 - samples/sec: 577.84 - lr: 0.000148 - momentum: 0.000000
2023-10-11 09:41:51,392 epoch 2 - iter 54/272 - loss 0.69007728 - time (sec): 19.35 - samples/sec: 556.09 - lr: 0.000147 - momentum: 0.000000
2023-10-11 09:42:01,289 epoch 2 - iter 81/272 - loss 0.67431997 - time (sec): 29.25 - samples/sec: 568.69 - lr: 0.000145 - momentum: 0.000000
2023-10-11 09:42:10,880 epoch 2 - iter 108/272 - loss 0.63443527 - time (sec): 38.84 - samples/sec: 561.67 - lr: 0.000143 - momentum: 0.000000
2023-10-11 09:42:20,670 epoch 2 - iter 135/272 - loss 0.61291708 - time (sec): 48.63 - samples/sec: 554.52 - lr: 0.000142 - momentum: 0.000000
2023-10-11 09:42:29,890 epoch 2 - iter 162/272 - loss 0.58952666 - time (sec): 57.85 - samples/sec: 542.86 - lr: 0.000140 - momentum: 0.000000
2023-10-11 09:42:39,479 epoch 2 - iter 189/272 - loss 0.57453427 - time (sec): 67.44 - samples/sec: 534.95 - lr: 0.000138 - momentum: 0.000000
2023-10-11 09:42:49,535 epoch 2 - iter 216/272 - loss 0.54891050 - time (sec): 77.50 - samples/sec: 533.83 - lr: 0.000137 - momentum: 0.000000
2023-10-11 09:42:59,160 epoch 2 - iter 243/272 - loss 0.53267587 - time (sec): 87.12 - samples/sec: 531.19 - lr: 0.000135 - momentum: 0.000000
2023-10-11 09:43:09,206 epoch 2 - iter 270/272 - loss 0.52022697 - time (sec): 97.17 - samples/sec: 533.36 - lr: 0.000134 - momentum: 0.000000
2023-10-11 09:43:09,606 ----------------------------------------------------------------------------------------------------
2023-10-11 09:43:09,606 EPOCH 2 done: loss 0.5193 - lr: 0.000134
2023-10-11 09:43:15,513 DEV : loss 0.3020303547382355 - f1-score (micro avg) 0.2903
2023-10-11 09:43:15,522 saving best model
2023-10-11 09:43:16,375 ----------------------------------------------------------------------------------------------------
2023-10-11 09:43:25,815 epoch 3 - iter 27/272 - loss 0.38410607 - time (sec): 9.44 - samples/sec: 557.12 - lr: 0.000132 - momentum: 0.000000
2023-10-11 09:43:35,064 epoch 3 - iter 54/272 - loss 0.37333646 - time (sec): 18.69 - samples/sec: 549.52 - lr: 0.000130 - momentum: 0.000000
2023-10-11 09:43:44,340 epoch 3 - iter 81/272 - loss 0.34926921 - time (sec): 27.96 - samples/sec: 543.47 - lr: 0.000128 - momentum: 0.000000
2023-10-11 09:43:53,690 epoch 3 - iter 108/272 - loss 0.34301949 - time (sec): 37.31 - samples/sec: 544.64 - lr: 0.000127 - momentum: 0.000000
2023-10-11 09:44:03,560 epoch 3 - iter 135/272 - loss 0.33694811 - time (sec): 47.18 - samples/sec: 551.32 - lr: 0.000125 - momentum: 0.000000
2023-10-11 09:44:12,886 epoch 3 - iter 162/272 - loss 0.32617109 - time (sec): 56.51 - samples/sec: 548.75 - lr: 0.000123 - momentum: 0.000000
2023-10-11 09:44:23,467 epoch 3 - iter 189/272 - loss 0.32458657 - time (sec): 67.09 - samples/sec: 555.56 - lr: 0.000122 - momentum: 0.000000
2023-10-11 09:44:33,403 epoch 3 - iter 216/272 - loss 0.31370703 - time (sec): 77.03 - samples/sec: 554.78 - lr: 0.000120 - momentum: 0.000000
2023-10-11 09:44:42,131 epoch 3 - iter 243/272 - loss 0.31287050 - time (sec): 85.75 - samples/sec: 546.57 - lr: 0.000119 - momentum: 0.000000
2023-10-11 09:44:51,444 epoch 3 - iter 270/272 - loss 0.31168254 - time (sec): 95.07 - samples/sec: 544.09 - lr: 0.000117 - momentum: 0.000000
2023-10-11 09:44:51,938 ----------------------------------------------------------------------------------------------------
2023-10-11 09:44:51,938 EPOCH 3 done: loss 0.3122 - lr: 0.000117
2023-10-11 09:44:57,957 DEV : loss 0.2577632665634155 - f1-score (micro avg) 0.3305
2023-10-11 09:44:57,965 saving best model
2023-10-11 09:45:00,499 ----------------------------------------------------------------------------------------------------
2023-10-11 09:45:09,969 epoch 4 - iter 27/272 - loss 0.29354878 - time (sec): 9.47 - samples/sec: 530.04 - lr: 0.000115 - momentum: 0.000000
2023-10-11 09:45:18,936 epoch 4 - iter 54/272 - loss 0.26285013 - time (sec): 18.43 - samples/sec: 518.92 - lr: 0.000113 - momentum: 0.000000
2023-10-11 09:45:28,972 epoch 4 - iter 81/272 - loss 0.24810936 - time (sec): 28.47 - samples/sec: 548.02 - lr: 0.000112 - momentum: 0.000000
2023-10-11 09:45:38,709 epoch 4 - iter 108/272 - loss 0.24358018 - time (sec): 38.21 - samples/sec: 552.25 - lr: 0.000110 - momentum: 0.000000
2023-10-11 09:45:47,860 epoch 4 - iter 135/272 - loss 0.23901215 - time (sec): 47.36 - samples/sec: 549.15 - lr: 0.000108 - momentum: 0.000000
2023-10-11 09:45:57,856 epoch 4 - iter 162/272 - loss 0.23297876 - time (sec): 57.35 - samples/sec: 551.33 - lr: 0.000107 - momentum: 0.000000
2023-10-11 09:46:06,942 epoch 4 - iter 189/272 - loss 0.23528726 - time (sec): 66.44 - samples/sec: 546.24 - lr: 0.000105 - momentum: 0.000000
2023-10-11 09:46:16,464 epoch 4 - iter 216/272 - loss 0.23243463 - time (sec): 75.96 - samples/sec: 545.52 - lr: 0.000103 - momentum: 0.000000
2023-10-11 09:46:26,378 epoch 4 - iter 243/272 - loss 0.23708393 - time (sec): 85.87 - samples/sec: 543.87 - lr: 0.000102 - momentum: 0.000000
2023-10-11 09:46:35,978 epoch 4 - iter 270/272 - loss 0.23306336 - time (sec): 95.47 - samples/sec: 542.45 - lr: 0.000100 - momentum: 0.000000
2023-10-11 09:46:36,427 ----------------------------------------------------------------------------------------------------
2023-10-11 09:46:36,428 EPOCH 4 done: loss 0.2329 - lr: 0.000100
2023-10-11 09:46:42,295 DEV : loss 0.19623495638370514 - f1-score (micro avg) 0.5471
2023-10-11 09:46:42,304 saving best model
2023-10-11 09:46:44,891 ----------------------------------------------------------------------------------------------------
2023-10-11 09:46:54,037 epoch 5 - iter 27/272 - loss 0.18848352 - time (sec): 9.14 - samples/sec: 510.71 - lr: 0.000098 - momentum: 0.000000
2023-10-11 09:47:03,653 epoch 5 - iter 54/272 - loss 0.17630539 - time (sec): 18.76 - samples/sec: 535.19 - lr: 0.000097 - momentum: 0.000000
2023-10-11 09:47:13,136 epoch 5 - iter 81/272 - loss 0.16835017 - time (sec): 28.24 - samples/sec: 543.11 - lr: 0.000095 - momentum: 0.000000
2023-10-11 09:47:22,285 epoch 5 - iter 108/272 - loss 0.16954062 - time (sec): 37.39 - samples/sec: 541.39 - lr: 0.000093 - momentum: 0.000000
2023-10-11 09:47:31,421 epoch 5 - iter 135/272 - loss 0.15990292 - time (sec): 46.53 - samples/sec: 537.48 - lr: 0.000092 - momentum: 0.000000
2023-10-11 09:47:41,467 epoch 5 - iter 162/272 - loss 0.15873693 - time (sec): 56.57 - samples/sec: 546.17 - lr: 0.000090 - momentum: 0.000000
2023-10-11 09:47:50,820 epoch 5 - iter 189/272 - loss 0.16239154 - time (sec): 65.93 - samples/sec: 544.54 - lr: 0.000088 - momentum: 0.000000
2023-10-11 09:48:00,587 epoch 5 - iter 216/272 - loss 0.16394956 - time (sec): 75.69 - samples/sec: 545.05 - lr: 0.000087 - momentum: 0.000000
2023-10-11 09:48:10,258 epoch 5 - iter 243/272 - loss 0.16439694 - time (sec): 85.36 - samples/sec: 547.01 - lr: 0.000085 - momentum: 0.000000
2023-10-11 09:48:19,506 epoch 5 - iter 270/272 - loss 0.16219348 - time (sec): 94.61 - samples/sec: 546.56 - lr: 0.000084 - momentum: 0.000000
2023-10-11 09:48:20,008 ----------------------------------------------------------------------------------------------------
2023-10-11 09:48:20,009 EPOCH 5 done: loss 0.1620 - lr: 0.000084
2023-10-11 09:48:25,523 DEV : loss 0.16873042285442352 - f1-score (micro avg) 0.6128
2023-10-11 09:48:25,531 saving best model
2023-10-11 09:48:28,074 ----------------------------------------------------------------------------------------------------
2023-10-11 09:48:37,600 epoch 6 - iter 27/272 - loss 0.14413262 - time (sec): 9.52 - samples/sec: 575.59 - lr: 0.000082 - momentum: 0.000000
2023-10-11 09:48:46,412 epoch 6 - iter 54/272 - loss 0.14208677 - time (sec): 18.33 - samples/sec: 556.61 - lr: 0.000080 - momentum: 0.000000
2023-10-11 09:48:55,754 epoch 6 - iter 81/272 - loss 0.13479681 - time (sec): 27.68 - samples/sec: 563.64 - lr: 0.000078 - momentum: 0.000000
2023-10-11 09:49:05,615 epoch 6 - iter 108/272 - loss 0.13231184 - time (sec): 37.54 - samples/sec: 572.69 - lr: 0.000077 - momentum: 0.000000
2023-10-11 09:49:14,856 epoch 6 - iter 135/272 - loss 0.12931757 - time (sec): 46.78 - samples/sec: 553.45 - lr: 0.000075 - momentum: 0.000000
2023-10-11 09:49:24,535 epoch 6 - iter 162/272 - loss 0.12321844 - time (sec): 56.46 - samples/sec: 558.64 - lr: 0.000073 - momentum: 0.000000
2023-10-11 09:49:33,780 epoch 6 - iter 189/272 - loss 0.11952690 - time (sec): 65.70 - samples/sec: 557.41 - lr: 0.000072 - momentum: 0.000000
2023-10-11 09:49:42,914 epoch 6 - iter 216/272 - loss 0.12316648 - time (sec): 74.84 - samples/sec: 556.15 - lr: 0.000070 - momentum: 0.000000
2023-10-11 09:49:52,434 epoch 6 - iter 243/272 - loss 0.12004095 - time (sec): 84.36 - samples/sec: 555.46 - lr: 0.000069 - momentum: 0.000000
2023-10-11 09:50:01,550 epoch 6 - iter 270/272 - loss 0.12026909 - time (sec): 93.47 - samples/sec: 553.33 - lr: 0.000067 - momentum: 0.000000
2023-10-11 09:50:02,085 ----------------------------------------------------------------------------------------------------
2023-10-11 09:50:02,085 EPOCH 6 done: loss 0.1200 - lr: 0.000067
2023-10-11 09:50:07,714 DEV : loss 0.1488361954689026 - f1-score (micro avg) 0.6112
2023-10-11 09:50:07,722 ----------------------------------------------------------------------------------------------------
2023-10-11 09:50:17,152 epoch 7 - iter 27/272 - loss 0.09847727 - time (sec): 9.43 - samples/sec: 511.97 - lr: 0.000065 - momentum: 0.000000
2023-10-11 09:50:27,127 epoch 7 - iter 54/272 - loss 0.09016802 - time (sec): 19.40 - samples/sec: 541.04 - lr: 0.000063 - momentum: 0.000000
2023-10-11 09:50:37,100 epoch 7 - iter 81/272 - loss 0.08601675 - time (sec): 29.38 - samples/sec: 544.14 - lr: 0.000062 - momentum: 0.000000
2023-10-11 09:50:46,118 epoch 7 - iter 108/272 - loss 0.08647169 - time (sec): 38.39 - samples/sec: 549.45 - lr: 0.000060 - momentum: 0.000000
2023-10-11 09:50:55,554 epoch 7 - iter 135/272 - loss 0.09090164 - time (sec): 47.83 - samples/sec: 554.50 - lr: 0.000058 - momentum: 0.000000
2023-10-11 09:51:04,902 epoch 7 - iter 162/272 - loss 0.08989021 - time (sec): 57.18 - samples/sec: 549.21 - lr: 0.000057 - momentum: 0.000000
2023-10-11 09:51:13,584 epoch 7 - iter 189/272 - loss 0.08994632 - time (sec): 65.86 - samples/sec: 543.30 - lr: 0.000055 - momentum: 0.000000
2023-10-11 09:51:23,911 epoch 7 - iter 216/272 - loss 0.08845138 - time (sec): 76.19 - samples/sec: 544.27 - lr: 0.000053 - momentum: 0.000000
2023-10-11 09:51:34,418 epoch 7 - iter 243/272 - loss 0.09342558 - time (sec): 86.69 - samples/sec: 534.59 - lr: 0.000052 - momentum: 0.000000
2023-10-11 09:51:45,463 epoch 7 - iter 270/272 - loss 0.09226835 - time (sec): 97.74 - samples/sec: 529.60 - lr: 0.000050 - momentum: 0.000000
2023-10-11 09:51:46,024 ----------------------------------------------------------------------------------------------------
2023-10-11 09:51:46,025 EPOCH 7 done: loss 0.0920 - lr: 0.000050
2023-10-11 09:51:51,630 DEV : loss 0.14688394963741302 - f1-score (micro avg) 0.6654
2023-10-11 09:51:51,638 saving best model
2023-10-11 09:51:54,157 ----------------------------------------------------------------------------------------------------
2023-10-11 09:52:03,922 epoch 8 - iter 27/272 - loss 0.08937485 - time (sec): 9.76 - samples/sec: 470.65 - lr: 0.000048 - momentum: 0.000000
2023-10-11 09:52:13,039 epoch 8 - iter 54/272 - loss 0.07136459 - time (sec): 18.88 - samples/sec: 497.63 - lr: 0.000047 - momentum: 0.000000
2023-10-11 09:52:23,152 epoch 8 - iter 81/272 - loss 0.07727213 - time (sec): 28.99 - samples/sec: 530.55 - lr: 0.000045 - momentum: 0.000000
2023-10-11 09:52:32,583 epoch 8 - iter 108/272 - loss 0.07738875 - time (sec): 38.42 - samples/sec: 538.24 - lr: 0.000043 - momentum: 0.000000
2023-10-11 09:52:41,991 epoch 8 - iter 135/272 - loss 0.07911098 - time (sec): 47.83 - samples/sec: 541.09 - lr: 0.000042 - momentum: 0.000000
2023-10-11 09:52:50,431 epoch 8 - iter 162/272 - loss 0.08340609 - time (sec): 56.27 - samples/sec: 533.28 - lr: 0.000040 - momentum: 0.000000
2023-10-11 09:53:00,002 epoch 8 - iter 189/272 - loss 0.08129659 - time (sec): 65.84 - samples/sec: 541.02 - lr: 0.000038 - momentum: 0.000000
2023-10-11 09:53:10,477 epoch 8 - iter 216/272 - loss 0.07877295 - time (sec): 76.32 - samples/sec: 551.20 - lr: 0.000037 - momentum: 0.000000
2023-10-11 09:53:19,525 epoch 8 - iter 243/272 - loss 0.07667042 - time (sec): 85.36 - samples/sec: 548.01 - lr: 0.000035 - momentum: 0.000000
2023-10-11 09:53:29,019 epoch 8 - iter 270/272 - loss 0.07426787 - time (sec): 94.86 - samples/sec: 546.00 - lr: 0.000034 - momentum: 0.000000
2023-10-11 09:53:29,435 ----------------------------------------------------------------------------------------------------
2023-10-11 09:53:29,435 EPOCH 8 done: loss 0.0742 - lr: 0.000034
2023-10-11 09:53:34,876 DEV : loss 0.14787980914115906 - f1-score (micro avg) 0.7432
2023-10-11 09:53:34,884 saving best model
2023-10-11 09:53:37,399 ----------------------------------------------------------------------------------------------------
2023-10-11 09:53:46,140 epoch 9 - iter 27/272 - loss 0.09102105 - time (sec): 8.74 - samples/sec: 491.25 - lr: 0.000032 - momentum: 0.000000
2023-10-11 09:53:55,379 epoch 9 - iter 54/272 - loss 0.08935916 - time (sec): 17.98 - samples/sec: 526.14 - lr: 0.000030 - momentum: 0.000000
2023-10-11 09:54:04,769 epoch 9 - iter 81/272 - loss 0.07481512 - time (sec): 27.37 - samples/sec: 529.16 - lr: 0.000028 - momentum: 0.000000
2023-10-11 09:54:14,186 epoch 9 - iter 108/272 - loss 0.07867968 - time (sec): 36.78 - samples/sec: 531.27 - lr: 0.000027 - momentum: 0.000000
2023-10-11 09:54:22,976 epoch 9 - iter 135/272 - loss 0.07834757 - time (sec): 45.57 - samples/sec: 525.22 - lr: 0.000025 - momentum: 0.000000
2023-10-11 09:54:33,025 epoch 9 - iter 162/272 - loss 0.07369543 - time (sec): 55.62 - samples/sec: 537.67 - lr: 0.000023 - momentum: 0.000000
2023-10-11 09:54:42,330 epoch 9 - iter 189/272 - loss 0.07182289 - time (sec): 64.93 - samples/sec: 539.05 - lr: 0.000022 - momentum: 0.000000
2023-10-11 09:54:52,050 epoch 9 - iter 216/272 - loss 0.06958838 - time (sec): 74.65 - samples/sec: 540.33 - lr: 0.000020 - momentum: 0.000000
2023-10-11 09:55:02,157 epoch 9 - iter 243/272 - loss 0.06633126 - time (sec): 84.75 - samples/sec: 545.83 - lr: 0.000019 - momentum: 0.000000
2023-10-11 09:55:11,700 epoch 9 - iter 270/272 - loss 0.06401853 - time (sec): 94.30 - samples/sec: 548.89 - lr: 0.000017 - momentum: 0.000000
2023-10-11 09:55:12,130 ----------------------------------------------------------------------------------------------------
2023-10-11 09:55:12,130 EPOCH 9 done: loss 0.0641 - lr: 0.000017
2023-10-11 09:55:17,955 DEV : loss 0.14741504192352295 - f1-score (micro avg) 0.7505
2023-10-11 09:55:17,963 saving best model
2023-10-11 09:55:20,524 ----------------------------------------------------------------------------------------------------
2023-10-11 09:55:29,829 epoch 10 - iter 27/272 - loss 0.07188521 - time (sec): 9.30 - samples/sec: 575.75 - lr: 0.000015 - momentum: 0.000000
2023-10-11 09:55:38,625 epoch 10 - iter 54/272 - loss 0.07716434 - time (sec): 18.10 - samples/sec: 554.04 - lr: 0.000013 - momentum: 0.000000
2023-10-11 09:55:47,519 epoch 10 - iter 81/272 - loss 0.06791166 - time (sec): 26.99 - samples/sec: 553.49 - lr: 0.000012 - momentum: 0.000000
2023-10-11 09:55:56,393 epoch 10 - iter 108/272 - loss 0.06458565 - time (sec): 35.86 - samples/sec: 549.24 - lr: 0.000010 - momentum: 0.000000
2023-10-11 09:56:06,420 epoch 10 - iter 135/272 - loss 0.06332167 - time (sec): 45.89 - samples/sec: 565.03 - lr: 0.000008 - momentum: 0.000000
2023-10-11 09:56:15,810 epoch 10 - iter 162/272 - loss 0.05986959 - time (sec): 55.28 - samples/sec: 557.44 - lr: 0.000007 - momentum: 0.000000
2023-10-11 09:56:25,615 epoch 10 - iter 189/272 - loss 0.05919230 - time (sec): 65.09 - samples/sec: 553.40 - lr: 0.000005 - momentum: 0.000000
2023-10-11 09:56:35,088 epoch 10 - iter 216/272 - loss 0.05792051 - time (sec): 74.56 - samples/sec: 555.55 - lr: 0.000003 - momentum: 0.000000
2023-10-11 09:56:45,050 epoch 10 - iter 243/272 - loss 0.05609555 - time (sec): 84.52 - samples/sec: 555.36 - lr: 0.000002 - momentum: 0.000000
2023-10-11 09:56:54,243 epoch 10 - iter 270/272 - loss 0.05776757 - time (sec): 93.71 - samples/sec: 551.58 - lr: 0.000000 - momentum: 0.000000
2023-10-11 09:56:54,775 ----------------------------------------------------------------------------------------------------
2023-10-11 09:56:54,775 EPOCH 10 done: loss 0.0576 - lr: 0.000000
2023-10-11 09:57:00,361 DEV : loss 0.14719465374946594 - f1-score (micro avg) 0.7401
2023-10-11 09:57:01,192 ----------------------------------------------------------------------------------------------------
2023-10-11 09:57:01,194 Loading model from best epoch ...
2023-10-11 09:57:05,952 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-11 09:57:18,000
Results:
- F-score (micro) 0.7087
- F-score (macro) 0.6243
- Accuracy 0.5811

By class:
              precision    recall  f1-score   support

         LOC     0.6899    0.8558    0.7639       312
         PER     0.6667    0.7212    0.6928       208
         ORG     0.5263    0.3636    0.4301        55
   HumanProd     0.4865    0.8182    0.6102        22

   micro avg     0.6623    0.7621    0.7087       597
   macro avg     0.5923    0.6897    0.6243       597
weighted avg     0.6593    0.7621    0.7028       597
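As a consistency check, the micro-averaged F1 reported above is the harmonic mean of the pooled precision and recall from the micro avg row:

# Quick check: micro F1 = 2PR/(P+R) with the micro-avg precision/recall from the table.
precision, recall = 0.6623, 0.7621
print(round(2 * precision * recall / (precision + recall), 4))  # 0.7087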
2023-10-11 09:57:18,000 ----------------------------------------------------------------------------------------------------