|
2023-10-11 03:16:41,684 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,686 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-11 03:16:41,686 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,686 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
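The data is the Finnish NewsEye subset of HIPE-2022, loaded through Flair's dataset API. A rough sketch follows; any keyword arguments beyond dataset_name/language (dataset version v2.1, document-separator handling visible in the path) were set by the original script and are not reproduced here.

```python
# Hedged sketch: load the corpus above via Flair's HIPE-2022 loader.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
print(corpus)  # expected: 1166 train + 165 dev + 415 test sentences

# the 17-tag label dictionary printed later in this log would come from:
label_dict = corpus.make_label_dictionary(label_type="ner")
```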
|
2023-10-11 03:16:41,686 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,687 Train: 1166 sentences |
|
2023-10-11 03:16:41,687 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 03:16:41,687 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,687 Training Params: |
|
2023-10-11 03:16:41,687 - learning_rate: "0.00015" |
|
2023-10-11 03:16:41,687 - mini_batch_size: "4" |
|
2023-10-11 03:16:41,687 - max_epochs: "10" |
|
2023-10-11 03:16:41,687 - shuffle: "True" |
|
2023-10-11 03:16:41,687 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,687 Plugins: |
|
2023-10-11 03:16:41,687 - TensorboardLogger |
|
2023-10-11 03:16:41,687 - LinearScheduler | warmup_fraction: '0.1' |
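The LinearScheduler warms the learning rate up over the first 10% of all optimizer steps and then decays it linearly to zero. With 292 iterations per epoch and 10 epochs, that is 292 warmup steps, i.e. exactly the first epoch, which is consistent with the per-iteration lr values printed below (about 1.5e-5 at iter 29, peaking near 1.5e-4 at the end of epoch 1, and reaching 0 at the end of epoch 10). A small sketch of that schedule:

```python
# Sketch of the linear warmup/decay schedule implied by the log:
# 292 iterations/epoch x 10 epochs = 2920 optimizer steps,
# warmup_fraction 0.1 -> 292 warmup steps (exactly epoch 1).
peak_lr, steps_per_epoch, epochs = 0.00015, 292, 10
total_steps = steps_per_epoch * epochs
warmup_steps = int(0.1 * total_steps)   # 292

def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(round(lr_at(29), 6))           # ~1.5e-05, close to the lr printed at iter 29 of epoch 1
print(round(lr_at(290), 6))          # ~0.000149, near the peak reported at the end of epoch 1
print(round(lr_at(total_steps), 6))  # 0.0 at the very last step
```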
|
2023-10-11 03:16:41,687 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,687 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 03:16:41,688 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 03:16:41,688 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,688 Computation: |
|
2023-10-11 03:16:41,688 - compute on device: cuda:0 |
|
2023-10-11 03:16:41,688 - embedding storage: none |
|
2023-10-11 03:16:41,688 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,688 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5" |
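The hyperparameters, plugins, device, and base path above map onto a standard Flair fine-tuning call. A hedged sketch is shown below; fine_tune's AdamW optimizer is consistent with the zero momentum printed per iteration, and the TensorBoard and LinearScheduler plugins from the log are attached by the trainer, so their wiring is omitted here.

```python
# Hedged sketch of the training call matching the "Training Params",
# "Plugins" and "Computation" sections above.
import torch
import flair
from flair.trainers import ModelTrainer

flair.device = torch.device("cuda:0")     # "compute on device: cuda:0"

trainer = ModelTrainer(tagger, corpus)    # tagger / corpus as sketched earlier

trainer.fine_tune(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-"
    "span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00015,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    embeddings_storage_mode="none",       # "embedding storage: none"
)
```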
|
2023-10-11 03:16:41,688 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,688 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:16:41,688 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-11 03:16:51,822 epoch 1 - iter 29/292 - loss 2.81977350 - time (sec): 10.13 - samples/sec: 425.59 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 03:17:01,805 epoch 1 - iter 58/292 - loss 2.80688640 - time (sec): 20.12 - samples/sec: 410.33 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-11 03:17:12,096 epoch 1 - iter 87/292 - loss 2.78279036 - time (sec): 30.41 - samples/sec: 401.92 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-11 03:17:23,906 epoch 1 - iter 116/292 - loss 2.71003813 - time (sec): 42.22 - samples/sec: 413.94 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-11 03:17:34,740 epoch 1 - iter 145/292 - loss 2.62475305 - time (sec): 53.05 - samples/sec: 411.03 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-11 03:17:45,468 epoch 1 - iter 174/292 - loss 2.52043100 - time (sec): 63.78 - samples/sec: 419.27 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-11 03:17:56,403 epoch 1 - iter 203/292 - loss 2.39946272 - time (sec): 74.71 - samples/sec: 432.12 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-11 03:18:06,391 epoch 1 - iter 232/292 - loss 2.29647806 - time (sec): 84.70 - samples/sec: 430.10 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-11 03:18:16,949 epoch 1 - iter 261/292 - loss 2.17952313 - time (sec): 95.26 - samples/sec: 427.15 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-11 03:18:26,723 epoch 1 - iter 290/292 - loss 2.07842445 - time (sec): 105.03 - samples/sec: 422.13 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 03:18:27,176 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:18:27,176 EPOCH 1 done: loss 2.0752 - lr: 0.000148 |
|
2023-10-11 03:18:32,996 DEV : loss 0.7023366689682007 - f1-score (micro avg) 0.0 |
|
2023-10-11 03:18:33,006 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:18:42,653 epoch 2 - iter 29/292 - loss 0.73892904 - time (sec): 9.65 - samples/sec: 453.80 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 03:18:52,419 epoch 2 - iter 58/292 - loss 0.67633155 - time (sec): 19.41 - samples/sec: 442.23 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-11 03:19:02,214 epoch 2 - iter 87/292 - loss 0.66210392 - time (sec): 29.21 - samples/sec: 434.94 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-11 03:19:12,942 epoch 2 - iter 116/292 - loss 0.60646101 - time (sec): 39.93 - samples/sec: 435.65 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-11 03:19:23,013 epoch 2 - iter 145/292 - loss 0.65710441 - time (sec): 50.00 - samples/sec: 445.08 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-11 03:19:33,309 epoch 2 - iter 174/292 - loss 0.62278384 - time (sec): 60.30 - samples/sec: 443.37 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-11 03:19:43,005 epoch 2 - iter 203/292 - loss 0.59076254 - time (sec): 70.00 - samples/sec: 446.10 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-11 03:19:51,530 epoch 2 - iter 232/292 - loss 0.56424704 - time (sec): 78.52 - samples/sec: 444.22 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 03:20:01,917 epoch 2 - iter 261/292 - loss 0.55417964 - time (sec): 88.91 - samples/sec: 438.36 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-11 03:20:13,521 epoch 2 - iter 290/292 - loss 0.53594965 - time (sec): 100.51 - samples/sec: 439.75 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-11 03:20:14,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:20:14,066 EPOCH 2 done: loss 0.5351 - lr: 0.000134 |
|
2023-10-11 03:20:20,195 DEV : loss 0.297545462846756 - f1-score (micro avg) 0.0 |
|
2023-10-11 03:20:20,205 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:20:30,000 epoch 3 - iter 29/292 - loss 0.37459476 - time (sec): 9.79 - samples/sec: 407.46 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-11 03:20:38,908 epoch 3 - iter 58/292 - loss 0.36871425 - time (sec): 18.70 - samples/sec: 425.66 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-11 03:20:48,697 epoch 3 - iter 87/292 - loss 0.34885003 - time (sec): 28.49 - samples/sec: 439.40 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-11 03:20:58,273 epoch 3 - iter 116/292 - loss 0.32808207 - time (sec): 38.06 - samples/sec: 451.47 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-11 03:21:07,349 epoch 3 - iter 145/292 - loss 0.32374212 - time (sec): 47.14 - samples/sec: 452.13 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-11 03:21:17,462 epoch 3 - iter 174/292 - loss 0.30495891 - time (sec): 57.25 - samples/sec: 455.18 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-11 03:21:27,598 epoch 3 - iter 203/292 - loss 0.32430122 - time (sec): 67.39 - samples/sec: 455.47 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-11 03:21:36,787 epoch 3 - iter 232/292 - loss 0.32165796 - time (sec): 76.58 - samples/sec: 452.27 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-11 03:21:47,217 epoch 3 - iter 261/292 - loss 0.31557249 - time (sec): 87.01 - samples/sec: 456.71 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-11 03:21:56,871 epoch 3 - iter 290/292 - loss 0.31177726 - time (sec): 96.66 - samples/sec: 456.37 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-11 03:21:57,461 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:21:57,461 EPOCH 3 done: loss 0.3101 - lr: 0.000117 |
|
2023-10-11 03:22:03,017 DEV : loss 0.2035694271326065 - f1-score (micro avg) 0.4298 |
|
2023-10-11 03:22:03,027 saving best model |
|
2023-10-11 03:22:04,096 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:22:14,038 epoch 4 - iter 29/292 - loss 0.20110119 - time (sec): 9.94 - samples/sec: 475.37 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-11 03:22:24,401 epoch 4 - iter 58/292 - loss 0.17826557 - time (sec): 20.30 - samples/sec: 495.26 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-11 03:22:34,413 epoch 4 - iter 87/292 - loss 0.20996155 - time (sec): 30.31 - samples/sec: 484.16 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-11 03:22:44,346 epoch 4 - iter 116/292 - loss 0.21692770 - time (sec): 40.25 - samples/sec: 482.16 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-11 03:22:53,827 epoch 4 - iter 145/292 - loss 0.21814470 - time (sec): 49.73 - samples/sec: 479.47 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-11 03:23:03,233 epoch 4 - iter 174/292 - loss 0.21089514 - time (sec): 59.14 - samples/sec: 474.44 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-11 03:23:12,483 epoch 4 - iter 203/292 - loss 0.21084487 - time (sec): 68.38 - samples/sec: 467.12 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-11 03:23:22,118 epoch 4 - iter 232/292 - loss 0.20482475 - time (sec): 78.02 - samples/sec: 463.58 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-11 03:23:30,759 epoch 4 - iter 261/292 - loss 0.20151559 - time (sec): 86.66 - samples/sec: 456.03 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-11 03:23:40,714 epoch 4 - iter 290/292 - loss 0.20188943 - time (sec): 96.62 - samples/sec: 458.75 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-11 03:23:41,130 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:23:41,131 EPOCH 4 done: loss 0.2023 - lr: 0.000100 |
|
2023-10-11 03:23:46,947 DEV : loss 0.15248891711235046 - f1-score (micro avg) 0.6021 |
|
2023-10-11 03:23:46,957 saving best model |
|
2023-10-11 03:23:49,630 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:23:59,366 epoch 5 - iter 29/292 - loss 0.15498313 - time (sec): 9.73 - samples/sec: 459.85 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-11 03:24:09,829 epoch 5 - iter 58/292 - loss 0.13333345 - time (sec): 20.19 - samples/sec: 473.04 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-11 03:24:20,042 epoch 5 - iter 87/292 - loss 0.14821866 - time (sec): 30.41 - samples/sec: 470.84 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-11 03:24:30,243 epoch 5 - iter 116/292 - loss 0.14730098 - time (sec): 40.61 - samples/sec: 470.32 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-11 03:24:40,156 epoch 5 - iter 145/292 - loss 0.15044218 - time (sec): 50.52 - samples/sec: 466.81 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-11 03:24:50,125 epoch 5 - iter 174/292 - loss 0.14812398 - time (sec): 60.49 - samples/sec: 460.27 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-11 03:24:59,576 epoch 5 - iter 203/292 - loss 0.14525037 - time (sec): 69.94 - samples/sec: 454.03 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-11 03:25:09,036 epoch 5 - iter 232/292 - loss 0.14314177 - time (sec): 79.40 - samples/sec: 446.58 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-11 03:25:19,457 epoch 5 - iter 261/292 - loss 0.13766737 - time (sec): 89.82 - samples/sec: 448.05 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-11 03:25:29,102 epoch 5 - iter 290/292 - loss 0.13520449 - time (sec): 99.47 - samples/sec: 445.86 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-11 03:25:29,548 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:25:29,549 EPOCH 5 done: loss 0.1354 - lr: 0.000084 |
|
2023-10-11 03:25:35,482 DEV : loss 0.12166187167167664 - f1-score (micro avg) 0.7455 |
|
2023-10-11 03:25:35,492 saving best model |
|
2023-10-11 03:25:38,127 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:25:47,622 epoch 6 - iter 29/292 - loss 0.08960265 - time (sec): 9.49 - samples/sec: 393.43 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-11 03:25:57,785 epoch 6 - iter 58/292 - loss 0.12767978 - time (sec): 19.65 - samples/sec: 396.46 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-11 03:26:08,293 epoch 6 - iter 87/292 - loss 0.11619278 - time (sec): 30.16 - samples/sec: 412.04 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-11 03:26:18,684 epoch 6 - iter 116/292 - loss 0.11102256 - time (sec): 40.55 - samples/sec: 410.65 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-11 03:26:29,492 epoch 6 - iter 145/292 - loss 0.10620087 - time (sec): 51.36 - samples/sec: 420.53 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-11 03:26:40,798 epoch 6 - iter 174/292 - loss 0.09868087 - time (sec): 62.67 - samples/sec: 424.26 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-11 03:26:50,496 epoch 6 - iter 203/292 - loss 0.10008914 - time (sec): 72.36 - samples/sec: 420.45 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-11 03:27:00,772 epoch 6 - iter 232/292 - loss 0.09776359 - time (sec): 82.64 - samples/sec: 412.02 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-11 03:27:12,362 epoch 6 - iter 261/292 - loss 0.09830523 - time (sec): 94.23 - samples/sec: 418.52 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-11 03:27:22,801 epoch 6 - iter 290/292 - loss 0.09807018 - time (sec): 104.67 - samples/sec: 421.15 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-11 03:27:23,552 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:27:23,553 EPOCH 6 done: loss 0.0974 - lr: 0.000067 |
|
2023-10-11 03:27:29,960 DEV : loss 0.11362636834383011 - f1-score (micro avg) 0.7778 |
|
2023-10-11 03:27:29,970 saving best model |
|
2023-10-11 03:27:32,668 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:27:42,352 epoch 7 - iter 29/292 - loss 0.07419722 - time (sec): 9.67 - samples/sec: 394.50 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-11 03:27:52,002 epoch 7 - iter 58/292 - loss 0.07402345 - time (sec): 19.32 - samples/sec: 410.05 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-11 03:28:01,047 epoch 7 - iter 87/292 - loss 0.07690943 - time (sec): 28.37 - samples/sec: 401.83 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-11 03:28:12,534 epoch 7 - iter 116/292 - loss 0.07959332 - time (sec): 39.85 - samples/sec: 427.33 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-11 03:28:22,410 epoch 7 - iter 145/292 - loss 0.07518449 - time (sec): 49.73 - samples/sec: 424.22 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-11 03:28:32,859 epoch 7 - iter 174/292 - loss 0.07895677 - time (sec): 60.18 - samples/sec: 419.98 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-11 03:28:42,965 epoch 7 - iter 203/292 - loss 0.07832199 - time (sec): 70.29 - samples/sec: 422.31 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-11 03:28:53,183 epoch 7 - iter 232/292 - loss 0.07419731 - time (sec): 80.50 - samples/sec: 428.64 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-11 03:29:03,235 epoch 7 - iter 261/292 - loss 0.07485255 - time (sec): 90.56 - samples/sec: 433.91 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-11 03:29:13,519 epoch 7 - iter 290/292 - loss 0.07487242 - time (sec): 100.84 - samples/sec: 438.77 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-11 03:29:14,009 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:29:14,010 EPOCH 7 done: loss 0.0749 - lr: 0.000050 |
|
2023-10-11 03:29:19,721 DEV : loss 0.1100909486413002 - f1-score (micro avg) 0.7894 |
|
2023-10-11 03:29:19,730 saving best model |
|
2023-10-11 03:29:22,380 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:29:32,069 epoch 8 - iter 29/292 - loss 0.06858685 - time (sec): 9.68 - samples/sec: 423.14 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-11 03:29:41,880 epoch 8 - iter 58/292 - loss 0.06385058 - time (sec): 19.50 - samples/sec: 433.12 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-11 03:29:51,908 epoch 8 - iter 87/292 - loss 0.06960465 - time (sec): 29.52 - samples/sec: 433.10 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-11 03:30:03,134 epoch 8 - iter 116/292 - loss 0.07164164 - time (sec): 40.75 - samples/sec: 437.11 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-11 03:30:13,585 epoch 8 - iter 145/292 - loss 0.07036557 - time (sec): 51.20 - samples/sec: 432.87 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-11 03:30:22,910 epoch 8 - iter 174/292 - loss 0.06603532 - time (sec): 60.53 - samples/sec: 427.80 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-11 03:30:33,264 epoch 8 - iter 203/292 - loss 0.06513753 - time (sec): 70.88 - samples/sec: 433.78 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-11 03:30:43,326 epoch 8 - iter 232/292 - loss 0.06712616 - time (sec): 80.94 - samples/sec: 436.36 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-11 03:30:52,519 epoch 8 - iter 261/292 - loss 0.06210932 - time (sec): 90.13 - samples/sec: 434.89 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-11 03:31:02,720 epoch 8 - iter 290/292 - loss 0.05877080 - time (sec): 100.34 - samples/sec: 440.52 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-11 03:31:03,298 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:31:03,298 EPOCH 8 done: loss 0.0588 - lr: 0.000034 |
|
2023-10-11 03:31:08,969 DEV : loss 0.11329486221075058 - f1-score (micro avg) 0.7913 |
|
2023-10-11 03:31:08,978 saving best model |
|
2023-10-11 03:31:11,608 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:31:21,665 epoch 9 - iter 29/292 - loss 0.04382080 - time (sec): 10.05 - samples/sec: 457.28 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-11 03:31:30,877 epoch 9 - iter 58/292 - loss 0.05356168 - time (sec): 19.26 - samples/sec: 427.78 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-11 03:31:40,507 epoch 9 - iter 87/292 - loss 0.05086497 - time (sec): 28.89 - samples/sec: 429.42 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-11 03:31:50,143 epoch 9 - iter 116/292 - loss 0.05195064 - time (sec): 38.53 - samples/sec: 426.77 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-11 03:32:00,316 epoch 9 - iter 145/292 - loss 0.05449964 - time (sec): 48.70 - samples/sec: 441.65 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-11 03:32:09,886 epoch 9 - iter 174/292 - loss 0.05152749 - time (sec): 58.27 - samples/sec: 448.59 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-11 03:32:19,473 epoch 9 - iter 203/292 - loss 0.04877327 - time (sec): 67.86 - samples/sec: 452.81 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-11 03:32:29,586 epoch 9 - iter 232/292 - loss 0.04997456 - time (sec): 77.97 - samples/sec: 459.62 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-11 03:32:38,528 epoch 9 - iter 261/292 - loss 0.04881578 - time (sec): 86.92 - samples/sec: 459.93 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-11 03:32:47,820 epoch 9 - iter 290/292 - loss 0.04899412 - time (sec): 96.21 - samples/sec: 460.02 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-11 03:32:48,264 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:32:48,264 EPOCH 9 done: loss 0.0489 - lr: 0.000017 |
|
2023-10-11 03:32:53,792 DEV : loss 0.11269763857126236 - f1-score (micro avg) 0.7904 |
|
2023-10-11 03:32:53,801 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:33:02,891 epoch 10 - iter 29/292 - loss 0.04691422 - time (sec): 9.09 - samples/sec: 463.71 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-11 03:33:12,045 epoch 10 - iter 58/292 - loss 0.05143091 - time (sec): 18.24 - samples/sec: 457.69 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 03:33:22,171 epoch 10 - iter 87/292 - loss 0.04523086 - time (sec): 28.37 - samples/sec: 465.70 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-11 03:33:31,686 epoch 10 - iter 116/292 - loss 0.04088964 - time (sec): 37.88 - samples/sec: 465.65 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-11 03:33:41,168 epoch 10 - iter 145/292 - loss 0.04323243 - time (sec): 47.36 - samples/sec: 465.18 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-11 03:33:50,228 epoch 10 - iter 174/292 - loss 0.04318784 - time (sec): 56.43 - samples/sec: 457.83 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-11 03:34:00,927 epoch 10 - iter 203/292 - loss 0.04500761 - time (sec): 67.12 - samples/sec: 461.65 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-11 03:34:10,593 epoch 10 - iter 232/292 - loss 0.04463928 - time (sec): 76.79 - samples/sec: 458.99 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-11 03:34:20,237 epoch 10 - iter 261/292 - loss 0.04516349 - time (sec): 86.43 - samples/sec: 460.74 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-11 03:34:29,953 epoch 10 - iter 290/292 - loss 0.04578337 - time (sec): 96.15 - samples/sec: 457.82 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-11 03:34:30,663 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:34:30,664 EPOCH 10 done: loss 0.0454 - lr: 0.000000 |
|
2023-10-11 03:34:36,393 DEV : loss 0.11296340078115463 - f1-score (micro avg) 0.7913 |
|
2023-10-11 03:34:37,317 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 03:34:37,318 Loading model from best epoch ... |
|
2023-10-11 03:34:41,397 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
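The final step reloads best-model.pt and evaluates it on the test split, which produces the Results block below. A sketch of the equivalent calls, plus an illustrative prediction on new text (the example sentence is made up):

```python
# Hedged sketch of the final step: reload the best checkpoint and re-run the
# test-set evaluation that produces the "Results" block below, then tag an
# illustrative (made-up) sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

best = SequenceTagger.load(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-"
    "span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5/"
    "best-model.pt"
)

result = best.evaluate(corpus.test, gold_label_type="ner", mini_batch_size=4)
print(result.detailed_results)   # per-class precision/recall/F1, as printed below
print(result.main_score)         # micro-avg F1, the selection metric

sentence = Sentence("Helsingin Sanomat kertoi asiasta.")
best.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)
```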
|
2023-10-11 03:34:54,193 |
|
Results: |
|
- F-score (micro) 0.7433 |
|
- F-score (macro) 0.6945 |
|
- Accuracy 0.6094 |
|
|
|
By class:

              precision    recall  f1-score   support

         PER     0.8065    0.8621    0.8333       348
         LOC     0.6047    0.7854    0.6833       261
         ORG     0.4255    0.3846    0.4040        52
   HumanProd     0.9000    0.8182    0.8571        22

   micro avg     0.6979    0.7950    0.7433       683
   macro avg     0.6842    0.7126    0.6945       683
weighted avg     0.7034    0.7950    0.7441       683
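As a quick arithmetic check, the micro-average F1 in the table is the harmonic mean of the micro precision and recall:

```python
# Sanity check: micro-avg F1 is the harmonic mean of micro precision and recall.
p, r = 0.6979, 0.7950
f1 = 2 * p * r / (p + r)
print(round(f1, 4))   # 0.7433, matching the reported micro F-score
```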
|
|
|
2023-10-11 03:34:54,193 ---------------------------------------------------------------------------------------------------- |
|
|