2023-10-06 10:36:36,735 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,737 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-06 10:36:36,737 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,737 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-06 10:36:36,737 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,737 Train: 1214 sentences
2023-10-06 10:36:36,737 (train_with_dev=False, train_with_test=False)
2023-10-06 10:36:36,737 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,737 Training Params:
2023-10-06 10:36:36,737 - learning_rate: "0.00016"
2023-10-06 10:36:36,737 - mini_batch_size: "8"
2023-10-06 10:36:36,738 - max_epochs: "10"
2023-10-06 10:36:36,738 - shuffle: "True"
2023-10-06 10:36:36,738 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,738 Plugins:
2023-10-06 10:36:36,738 - TensorboardLogger
2023-10-06 10:36:36,738 - LinearScheduler | warmup_fraction: '0.1'
2023-10-06 10:36:36,738 ----------------------------------------------------------------------------------------------------
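The LinearScheduler plugin listed above warms the learning rate up over the first 10% of optimizer steps and then decays it linearly toward zero, which is what the `lr:` column in the iteration logs below traces (rising through epoch 1, then falling to ~0 by epoch 10). A minimal sketch of that schedule, assuming 152 iterations per epoch over 10 epochs as logged; the function name is illustrative, not Flair's actual API, and Flair's implementation may differ by an off-by-one in step counting:

```python
def linear_warmup_decay_lr(step: int, total_steps: int, peak_lr: float,
                           warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# This run: 152 iterations/epoch x 10 epochs = 1520 optimizer steps,
# so warmup covers the first 152 steps (roughly epoch 1).
total = 1520
print(round(linear_warmup_decay_lr(152, total, 0.00016), 6))  # peak lr after warmup
```

With these numbers the schedule peaks at 0.00016 right at the end of epoch 1, approximately matching the logged values (e.g. lr 0.000157 at epoch 1, iter 150).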
2023-10-06 10:36:36,738 Final evaluation on model from best epoch (best-model.pt)
2023-10-06 10:36:36,738 - metric: "('micro avg', 'f1-score')"
2023-10-06 10:36:36,738 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,738 Computation:
2023-10-06 10:36:36,738 - compute on device: cuda:0
2023-10-06 10:36:36,738 - embedding storage: none
2023-10-06 10:36:36,738 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,738 Model training base path: "hmbench-ajmc/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1"
2023-10-06 10:36:36,738 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,738 ----------------------------------------------------------------------------------------------------
2023-10-06 10:36:36,738 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-06 10:36:47,687 epoch 1 - iter 15/152 - loss 3.23077494 - time (sec): 10.95 - samples/sec: 313.78 - lr: 0.000015 - momentum: 0.000000
2023-10-06 10:36:58,305 epoch 1 - iter 30/152 - loss 3.22454695 - time (sec): 21.56 - samples/sec: 312.64 - lr: 0.000031 - momentum: 0.000000
2023-10-06 10:37:08,595 epoch 1 - iter 45/152 - loss 3.21420746 - time (sec): 31.85 - samples/sec: 305.70 - lr: 0.000046 - momentum: 0.000000
2023-10-06 10:37:18,994 epoch 1 - iter 60/152 - loss 3.19424225 - time (sec): 42.25 - samples/sec: 302.24 - lr: 0.000062 - momentum: 0.000000
2023-10-06 10:37:28,705 epoch 1 - iter 75/152 - loss 3.15403999 - time (sec): 51.97 - samples/sec: 299.03 - lr: 0.000078 - momentum: 0.000000
2023-10-06 10:37:38,390 epoch 1 - iter 90/152 - loss 3.08972177 - time (sec): 61.65 - samples/sec: 295.56 - lr: 0.000094 - momentum: 0.000000
2023-10-06 10:37:48,210 epoch 1 - iter 105/152 - loss 3.00602256 - time (sec): 71.47 - samples/sec: 295.45 - lr: 0.000109 - momentum: 0.000000
2023-10-06 10:37:58,330 epoch 1 - iter 120/152 - loss 2.90945200 - time (sec): 81.59 - samples/sec: 295.44 - lr: 0.000125 - momentum: 0.000000
2023-10-06 10:38:08,943 epoch 1 - iter 135/152 - loss 2.80389987 - time (sec): 92.20 - samples/sec: 294.96 - lr: 0.000141 - momentum: 0.000000
2023-10-06 10:38:20,063 epoch 1 - iter 150/152 - loss 2.68417092 - time (sec): 103.32 - samples/sec: 297.09 - lr: 0.000157 - momentum: 0.000000
2023-10-06 10:38:21,189 ----------------------------------------------------------------------------------------------------
2023-10-06 10:38:21,189 EPOCH 1 done: loss 2.6745 - lr: 0.000157
2023-10-06 10:38:28,166 DEV : loss 1.5482313632965088 - f1-score (micro avg) 0.0
2023-10-06 10:38:28,175 ----------------------------------------------------------------------------------------------------
2023-10-06 10:38:38,485 epoch 2 - iter 15/152 - loss 1.48139355 - time (sec): 10.31 - samples/sec: 298.18 - lr: 0.000158 - momentum: 0.000000
2023-10-06 10:38:48,691 epoch 2 - iter 30/152 - loss 1.35369894 - time (sec): 20.51 - samples/sec: 294.42 - lr: 0.000157 - momentum: 0.000000
2023-10-06 10:38:59,298 epoch 2 - iter 45/152 - loss 1.24409791 - time (sec): 31.12 - samples/sec: 292.17 - lr: 0.000155 - momentum: 0.000000
2023-10-06 10:39:10,239 epoch 2 - iter 60/152 - loss 1.16139545 - time (sec): 42.06 - samples/sec: 298.13 - lr: 0.000153 - momentum: 0.000000
2023-10-06 10:39:19,980 epoch 2 - iter 75/152 - loss 1.09378766 - time (sec): 51.80 - samples/sec: 295.94 - lr: 0.000151 - momentum: 0.000000
2023-10-06 10:39:30,460 epoch 2 - iter 90/152 - loss 1.01498657 - time (sec): 62.28 - samples/sec: 295.85 - lr: 0.000150 - momentum: 0.000000
2023-10-06 10:39:41,231 epoch 2 - iter 105/152 - loss 0.95207140 - time (sec): 73.05 - samples/sec: 292.34 - lr: 0.000148 - momentum: 0.000000
2023-10-06 10:39:52,183 epoch 2 - iter 120/152 - loss 0.91102583 - time (sec): 84.01 - samples/sec: 292.27 - lr: 0.000146 - momentum: 0.000000
2023-10-06 10:40:02,908 epoch 2 - iter 135/152 - loss 0.88786100 - time (sec): 94.73 - samples/sec: 292.37 - lr: 0.000144 - momentum: 0.000000
2023-10-06 10:40:13,063 epoch 2 - iter 150/152 - loss 0.85409298 - time (sec): 104.89 - samples/sec: 292.08 - lr: 0.000143 - momentum: 0.000000
2023-10-06 10:40:14,328 ----------------------------------------------------------------------------------------------------
2023-10-06 10:40:14,329 EPOCH 2 done: loss 0.8485 - lr: 0.000143
2023-10-06 10:40:21,610 DEV : loss 0.5165471434593201 - f1-score (micro avg) 0.0
2023-10-06 10:40:21,618 ----------------------------------------------------------------------------------------------------
2023-10-06 10:40:32,342 epoch 3 - iter 15/152 - loss 0.44320415 - time (sec): 10.72 - samples/sec: 298.44 - lr: 0.000141 - momentum: 0.000000
2023-10-06 10:40:43,307 epoch 3 - iter 30/152 - loss 0.41138517 - time (sec): 21.69 - samples/sec: 295.16 - lr: 0.000139 - momentum: 0.000000
2023-10-06 10:40:53,650 epoch 3 - iter 45/152 - loss 0.39893368 - time (sec): 32.03 - samples/sec: 286.42 - lr: 0.000137 - momentum: 0.000000
2023-10-06 10:41:04,770 epoch 3 - iter 60/152 - loss 0.40794230 - time (sec): 43.15 - samples/sec: 288.48 - lr: 0.000135 - momentum: 0.000000
2023-10-06 10:41:15,060 epoch 3 - iter 75/152 - loss 0.39913912 - time (sec): 53.44 - samples/sec: 286.34 - lr: 0.000134 - momentum: 0.000000
2023-10-06 10:41:25,555 epoch 3 - iter 90/152 - loss 0.39467611 - time (sec): 63.94 - samples/sec: 286.41 - lr: 0.000132 - momentum: 0.000000
2023-10-06 10:41:36,568 epoch 3 - iter 105/152 - loss 0.38667067 - time (sec): 74.95 - samples/sec: 286.37 - lr: 0.000130 - momentum: 0.000000
2023-10-06 10:41:48,141 epoch 3 - iter 120/152 - loss 0.37563552 - time (sec): 86.52 - samples/sec: 286.24 - lr: 0.000128 - momentum: 0.000000
2023-10-06 10:41:58,950 epoch 3 - iter 135/152 - loss 0.36645295 - time (sec): 97.33 - samples/sec: 284.34 - lr: 0.000127 - momentum: 0.000000
2023-10-06 10:42:09,413 epoch 3 - iter 150/152 - loss 0.35848153 - time (sec): 107.79 - samples/sec: 283.52 - lr: 0.000125 - momentum: 0.000000
2023-10-06 10:42:10,784 ----------------------------------------------------------------------------------------------------
2023-10-06 10:42:10,785 EPOCH 3 done: loss 0.3560 - lr: 0.000125
2023-10-06 10:42:18,617 DEV : loss 0.31865638494491577 - f1-score (micro avg) 0.4764
2023-10-06 10:42:18,625 saving best model
2023-10-06 10:42:19,514 ----------------------------------------------------------------------------------------------------
2023-10-06 10:42:30,303 epoch 4 - iter 15/152 - loss 0.24285578 - time (sec): 10.79 - samples/sec: 282.11 - lr: 0.000123 - momentum: 0.000000
2023-10-06 10:42:40,864 epoch 4 - iter 30/152 - loss 0.26141956 - time (sec): 21.35 - samples/sec: 284.53 - lr: 0.000121 - momentum: 0.000000
2023-10-06 10:42:51,120 epoch 4 - iter 45/152 - loss 0.25055128 - time (sec): 31.60 - samples/sec: 284.71 - lr: 0.000119 - momentum: 0.000000
2023-10-06 10:43:01,399 epoch 4 - iter 60/152 - loss 0.23512877 - time (sec): 41.88 - samples/sec: 287.52 - lr: 0.000118 - momentum: 0.000000
2023-10-06 10:43:12,679 epoch 4 - iter 75/152 - loss 0.23508447 - time (sec): 53.16 - samples/sec: 292.55 - lr: 0.000116 - momentum: 0.000000
2023-10-06 10:43:23,062 epoch 4 - iter 90/152 - loss 0.23015700 - time (sec): 63.55 - samples/sec: 291.92 - lr: 0.000114 - momentum: 0.000000
2023-10-06 10:43:32,983 epoch 4 - iter 105/152 - loss 0.22449265 - time (sec): 73.47 - samples/sec: 292.13 - lr: 0.000112 - momentum: 0.000000
2023-10-06 10:43:43,611 epoch 4 - iter 120/152 - loss 0.21748428 - time (sec): 84.10 - samples/sec: 292.45 - lr: 0.000111 - momentum: 0.000000
2023-10-06 10:43:53,638 epoch 4 - iter 135/152 - loss 0.21356606 - time (sec): 94.12 - samples/sec: 293.17 - lr: 0.000109 - momentum: 0.000000
2023-10-06 10:44:04,141 epoch 4 - iter 150/152 - loss 0.21344885 - time (sec): 104.63 - samples/sec: 293.34 - lr: 0.000107 - momentum: 0.000000
2023-10-06 10:44:05,250 ----------------------------------------------------------------------------------------------------
2023-10-06 10:44:05,251 EPOCH 4 done: loss 0.2129 - lr: 0.000107
2023-10-06 10:44:12,442 DEV : loss 0.2149021476507187 - f1-score (micro avg) 0.6836
2023-10-06 10:44:12,450 saving best model
2023-10-06 10:44:16,790 ----------------------------------------------------------------------------------------------------
2023-10-06 10:44:27,541 epoch 5 - iter 15/152 - loss 0.15301831 - time (sec): 10.75 - samples/sec: 298.07 - lr: 0.000105 - momentum: 0.000000
2023-10-06 10:44:38,109 epoch 5 - iter 30/152 - loss 0.14184898 - time (sec): 21.32 - samples/sec: 294.22 - lr: 0.000104 - momentum: 0.000000
2023-10-06 10:44:48,451 epoch 5 - iter 45/152 - loss 0.15985133 - time (sec): 31.66 - samples/sec: 293.63 - lr: 0.000102 - momentum: 0.000000
2023-10-06 10:44:58,504 epoch 5 - iter 60/152 - loss 0.15135900 - time (sec): 41.71 - samples/sec: 294.18 - lr: 0.000100 - momentum: 0.000000
2023-10-06 10:45:08,146 epoch 5 - iter 75/152 - loss 0.14901894 - time (sec): 51.35 - samples/sec: 291.39 - lr: 0.000098 - momentum: 0.000000
2023-10-06 10:45:19,540 epoch 5 - iter 90/152 - loss 0.15111303 - time (sec): 62.75 - samples/sec: 293.71 - lr: 0.000097 - momentum: 0.000000
2023-10-06 10:45:29,747 epoch 5 - iter 105/152 - loss 0.14954997 - time (sec): 72.96 - samples/sec: 292.96 - lr: 0.000095 - momentum: 0.000000
2023-10-06 10:45:40,847 epoch 5 - iter 120/152 - loss 0.14401373 - time (sec): 84.06 - samples/sec: 292.88 - lr: 0.000093 - momentum: 0.000000
2023-10-06 10:45:51,375 epoch 5 - iter 135/152 - loss 0.14204276 - time (sec): 94.58 - samples/sec: 293.21 - lr: 0.000091 - momentum: 0.000000
2023-10-06 10:46:01,710 epoch 5 - iter 150/152 - loss 0.13868961 - time (sec): 104.92 - samples/sec: 293.11 - lr: 0.000090 - momentum: 0.000000
2023-10-06 10:46:02,711 ----------------------------------------------------------------------------------------------------
2023-10-06 10:46:02,711 EPOCH 5 done: loss 0.1385 - lr: 0.000090
2023-10-06 10:46:10,041 DEV : loss 0.16952356696128845 - f1-score (micro avg) 0.7264
2023-10-06 10:46:10,049 saving best model
2023-10-06 10:46:14,377 ----------------------------------------------------------------------------------------------------
2023-10-06 10:46:25,025 epoch 6 - iter 15/152 - loss 0.10506643 - time (sec): 10.65 - samples/sec: 292.49 - lr: 0.000088 - momentum: 0.000000
2023-10-06 10:46:35,160 epoch 6 - iter 30/152 - loss 0.11538607 - time (sec): 20.78 - samples/sec: 285.49 - lr: 0.000086 - momentum: 0.000000
2023-10-06 10:46:46,117 epoch 6 - iter 45/152 - loss 0.10653796 - time (sec): 31.74 - samples/sec: 286.56 - lr: 0.000084 - momentum: 0.000000
2023-10-06 10:46:57,052 epoch 6 - iter 60/152 - loss 0.10037363 - time (sec): 42.67 - samples/sec: 285.42 - lr: 0.000082 - momentum: 0.000000
2023-10-06 10:47:08,048 epoch 6 - iter 75/152 - loss 0.10875511 - time (sec): 53.67 - samples/sec: 287.35 - lr: 0.000081 - momentum: 0.000000
2023-10-06 10:47:18,387 epoch 6 - iter 90/152 - loss 0.10497693 - time (sec): 64.01 - samples/sec: 288.27 - lr: 0.000079 - momentum: 0.000000
2023-10-06 10:47:28,644 epoch 6 - iter 105/152 - loss 0.10119167 - time (sec): 74.27 - samples/sec: 288.30 - lr: 0.000077 - momentum: 0.000000
2023-10-06 10:47:39,058 epoch 6 - iter 120/152 - loss 0.10230620 - time (sec): 84.68 - samples/sec: 289.24 - lr: 0.000075 - momentum: 0.000000
2023-10-06 10:47:49,899 epoch 6 - iter 135/152 - loss 0.10043280 - time (sec): 95.52 - samples/sec: 290.05 - lr: 0.000074 - momentum: 0.000000
2023-10-06 10:47:59,798 epoch 6 - iter 150/152 - loss 0.09832421 - time (sec): 105.42 - samples/sec: 289.94 - lr: 0.000072 - momentum: 0.000000
2023-10-06 10:48:01,178 ----------------------------------------------------------------------------------------------------
2023-10-06 10:48:01,178 EPOCH 6 done: loss 0.0979 - lr: 0.000072
2023-10-06 10:48:08,349 DEV : loss 0.1445915400981903 - f1-score (micro avg) 0.8051
2023-10-06 10:48:08,359 saving best model
2023-10-06 10:48:12,901 ----------------------------------------------------------------------------------------------------
2023-10-06 10:48:22,907 epoch 7 - iter 15/152 - loss 0.08013514 - time (sec): 10.00 - samples/sec: 281.56 - lr: 0.000070 - momentum: 0.000000
2023-10-06 10:48:33,282 epoch 7 - iter 30/152 - loss 0.08839889 - time (sec): 20.38 - samples/sec: 291.08 - lr: 0.000068 - momentum: 0.000000
2023-10-06 10:48:43,712 epoch 7 - iter 45/152 - loss 0.09550744 - time (sec): 30.81 - samples/sec: 294.45 - lr: 0.000066 - momentum: 0.000000
2023-10-06 10:48:54,487 epoch 7 - iter 60/152 - loss 0.08419124 - time (sec): 41.58 - samples/sec: 297.92 - lr: 0.000065 - momentum: 0.000000
2023-10-06 10:49:05,150 epoch 7 - iter 75/152 - loss 0.08067927 - time (sec): 52.25 - samples/sec: 296.99 - lr: 0.000063 - momentum: 0.000000
2023-10-06 10:49:15,273 epoch 7 - iter 90/152 - loss 0.07739641 - time (sec): 62.37 - samples/sec: 296.13 - lr: 0.000061 - momentum: 0.000000
2023-10-06 10:49:25,152 epoch 7 - iter 105/152 - loss 0.07718528 - time (sec): 72.25 - samples/sec: 295.05 - lr: 0.000059 - momentum: 0.000000
2023-10-06 10:49:35,796 epoch 7 - iter 120/152 - loss 0.07610410 - time (sec): 82.89 - samples/sec: 295.47 - lr: 0.000058 - momentum: 0.000000
2023-10-06 10:49:45,987 epoch 7 - iter 135/152 - loss 0.07588772 - time (sec): 93.08 - samples/sec: 295.27 - lr: 0.000056 - momentum: 0.000000
2023-10-06 10:49:56,537 epoch 7 - iter 150/152 - loss 0.07225452 - time (sec): 103.63 - samples/sec: 296.11 - lr: 0.000054 - momentum: 0.000000
2023-10-06 10:49:57,691 ----------------------------------------------------------------------------------------------------
2023-10-06 10:49:57,692 EPOCH 7 done: loss 0.0729 - lr: 0.000054
2023-10-06 10:50:04,932 DEV : loss 0.1375311315059662 - f1-score (micro avg) 0.814
2023-10-06 10:50:04,944 saving best model
2023-10-06 10:50:09,270 ----------------------------------------------------------------------------------------------------
2023-10-06 10:50:20,418 epoch 8 - iter 15/152 - loss 0.03500955 - time (sec): 11.15 - samples/sec: 299.13 - lr: 0.000052 - momentum: 0.000000
2023-10-06 10:50:31,051 epoch 8 - iter 30/152 - loss 0.05377114 - time (sec): 21.78 - samples/sec: 296.90 - lr: 0.000050 - momentum: 0.000000
2023-10-06 10:50:41,578 epoch 8 - iter 45/152 - loss 0.04797285 - time (sec): 32.31 - samples/sec: 292.36 - lr: 0.000049 - momentum: 0.000000
2023-10-06 10:50:51,943 epoch 8 - iter 60/152 - loss 0.05364760 - time (sec): 42.67 - samples/sec: 289.68 - lr: 0.000047 - momentum: 0.000000
2023-10-06 10:51:02,373 epoch 8 - iter 75/152 - loss 0.05156892 - time (sec): 53.10 - samples/sec: 289.06 - lr: 0.000045 - momentum: 0.000000
2023-10-06 10:51:13,509 epoch 8 - iter 90/152 - loss 0.05741928 - time (sec): 64.24 - samples/sec: 289.91 - lr: 0.000043 - momentum: 0.000000
2023-10-06 10:51:23,299 epoch 8 - iter 105/152 - loss 0.05857020 - time (sec): 74.03 - samples/sec: 285.95 - lr: 0.000042 - momentum: 0.000000
2023-10-06 10:51:34,227 epoch 8 - iter 120/152 - loss 0.05802200 - time (sec): 84.95 - samples/sec: 286.16 - lr: 0.000040 - momentum: 0.000000
2023-10-06 10:51:45,386 epoch 8 - iter 135/152 - loss 0.05733830 - time (sec): 96.11 - samples/sec: 285.91 - lr: 0.000038 - momentum: 0.000000
2023-10-06 10:51:56,488 epoch 8 - iter 150/152 - loss 0.05892501 - time (sec): 107.22 - samples/sec: 285.76 - lr: 0.000036 - momentum: 0.000000
2023-10-06 10:51:57,826 ----------------------------------------------------------------------------------------------------
2023-10-06 10:51:57,827 EPOCH 8 done: loss 0.0585 - lr: 0.000036
2023-10-06 10:52:05,611 DEV : loss 0.13956110179424286 - f1-score (micro avg) 0.8178
2023-10-06 10:52:05,619 saving best model
2023-10-06 10:52:09,954 ----------------------------------------------------------------------------------------------------
2023-10-06 10:52:20,776 epoch 9 - iter 15/152 - loss 0.05536458 - time (sec): 10.82 - samples/sec: 282.69 - lr: 0.000034 - momentum: 0.000000
2023-10-06 10:52:32,130 epoch 9 - iter 30/152 - loss 0.05063668 - time (sec): 22.18 - samples/sec: 279.64 - lr: 0.000033 - momentum: 0.000000
2023-10-06 10:52:43,053 epoch 9 - iter 45/152 - loss 0.05309384 - time (sec): 33.10 - samples/sec: 279.59 - lr: 0.000031 - momentum: 0.000000
2023-10-06 10:52:54,355 epoch 9 - iter 60/152 - loss 0.05357927 - time (sec): 44.40 - samples/sec: 279.64 - lr: 0.000029 - momentum: 0.000000
2023-10-06 10:53:04,615 epoch 9 - iter 75/152 - loss 0.04918157 - time (sec): 54.66 - samples/sec: 277.15 - lr: 0.000027 - momentum: 0.000000
2023-10-06 10:53:16,113 epoch 9 - iter 90/152 - loss 0.05066183 - time (sec): 66.16 - samples/sec: 276.90 - lr: 0.000026 - momentum: 0.000000
2023-10-06 10:53:27,065 epoch 9 - iter 105/152 - loss 0.04845656 - time (sec): 77.11 - samples/sec: 276.44 - lr: 0.000024 - momentum: 0.000000
2023-10-06 10:53:38,519 epoch 9 - iter 120/152 - loss 0.04875877 - time (sec): 88.56 - samples/sec: 275.97 - lr: 0.000022 - momentum: 0.000000
2023-10-06 10:53:49,695 epoch 9 - iter 135/152 - loss 0.05165390 - time (sec): 99.74 - samples/sec: 275.34 - lr: 0.000020 - momentum: 0.000000
2023-10-06 10:54:01,351 epoch 9 - iter 150/152 - loss 0.04922872 - time (sec): 111.40 - samples/sec: 274.50 - lr: 0.000019 - momentum: 0.000000
2023-10-06 10:54:02,798 ----------------------------------------------------------------------------------------------------
2023-10-06 10:54:02,798 EPOCH 9 done: loss 0.0488 - lr: 0.000019
2023-10-06 10:54:10,475 DEV : loss 0.1437169760465622 - f1-score (micro avg) 0.8098
2023-10-06 10:54:10,483 ----------------------------------------------------------------------------------------------------
2023-10-06 10:54:20,789 epoch 10 - iter 15/152 - loss 0.08375301 - time (sec): 10.30 - samples/sec: 282.70 - lr: 0.000017 - momentum: 0.000000
2023-10-06 10:54:30,604 epoch 10 - iter 30/152 - loss 0.05636889 - time (sec): 20.12 - samples/sec: 278.78 - lr: 0.000015 - momentum: 0.000000
2023-10-06 10:54:41,888 epoch 10 - iter 45/152 - loss 0.04796998 - time (sec): 31.40 - samples/sec: 284.24 - lr: 0.000013 - momentum: 0.000000
2023-10-06 10:54:52,723 epoch 10 - iter 60/152 - loss 0.04742423 - time (sec): 42.24 - samples/sec: 287.91 - lr: 0.000012 - momentum: 0.000000
2023-10-06 10:55:03,255 epoch 10 - iter 75/152 - loss 0.04356268 - time (sec): 52.77 - samples/sec: 289.08 - lr: 0.000010 - momentum: 0.000000
2023-10-06 10:55:13,771 epoch 10 - iter 90/152 - loss 0.04465371 - time (sec): 63.29 - samples/sec: 290.72 - lr: 0.000008 - momentum: 0.000000
2023-10-06 10:55:24,658 epoch 10 - iter 105/152 - loss 0.04590660 - time (sec): 74.17 - samples/sec: 291.91 - lr: 0.000006 - momentum: 0.000000
2023-10-06 10:55:34,986 epoch 10 - iter 120/152 - loss 0.04507310 - time (sec): 84.50 - samples/sec: 291.39 - lr: 0.000005 - momentum: 0.000000
2023-10-06 10:55:45,232 epoch 10 - iter 135/152 - loss 0.04686558 - time (sec): 94.75 - samples/sec: 292.19 - lr: 0.000003 - momentum: 0.000000
2023-10-06 10:55:55,427 epoch 10 - iter 150/152 - loss 0.04543216 - time (sec): 104.94 - samples/sec: 291.99 - lr: 0.000001 - momentum: 0.000000
2023-10-06 10:55:56,609 ----------------------------------------------------------------------------------------------------
2023-10-06 10:55:56,609 EPOCH 10 done: loss 0.0452 - lr: 0.000001
2023-10-06 10:56:03,637 DEV : loss 0.14325198531150818 - f1-score (micro avg) 0.8163
2023-10-06 10:56:04,493 ----------------------------------------------------------------------------------------------------
2023-10-06 10:56:04,494 Loading model from best epoch ...
2023-10-06 10:56:07,099 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
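The 25-tag dictionary above is the BIOES scheme: one O tag plus four positional tags (S-, B-, E-, I-) for each of the six entity types in this corpus. A quick sketch of how that inventory is enumerated; this is plain Python for illustration, not Flair's `Dictionary` API:

```python
# The six entity types from the tag dictionary logged above.
ENTITY_TYPES = ["scope", "pers", "work", "loc", "date", "object"]

def bioes_tags(entity_types):
    """Enumerate the BIOES tag inventory: O plus S/B/E/I per entity type."""
    tags = ["O"]
    for t in entity_types:
        tags += [f"{p}-{t}" for p in ("S", "B", "E", "I")]
    return tags

print(len(bioes_tags(ENTITY_TYPES)))  # 25
```

This also explains the classifier head: `Linear(in_features=1472, out_features=25)` in the model repr maps each token embedding onto these 25 tags.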
2023-10-06 10:56:13,878
Results:
- F-score (micro) 0.7946
- F-score (macro) 0.4833
- Accuracy 0.6697
By class:
              precision    recall  f1-score   support

       scope     0.7469    0.8013    0.7732       151
        pers     0.7377    0.9375    0.8257        96
        work     0.7685    0.8737    0.8177        95
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7500    0.8448    0.7946       348
   macro avg     0.4506    0.5225    0.4833       348
weighted avg     0.7374    0.8448    0.7865       348
2023-10-06 10:56:13,878 ----------------------------------------------------------------------------------------------------
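The averages in the final table follow directly from the per-class rows: macro F1 is the unweighted mean of the five class F1 scores (dragged down by the zero-F1 loc and date classes, which have only 3 test entities each), while micro F1 is the harmonic mean of the pooled precision and recall. A short sketch verifying both from the logged values:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Per-class (precision, recall, f1, support) from the table above.
per_class = {
    "scope": (0.7469, 0.8013, 0.7732, 151),
    "pers":  (0.7377, 0.9375, 0.8257, 96),
    "work":  (0.7685, 0.8737, 0.8177, 95),
    "loc":   (0.0000, 0.0000, 0.0000, 3),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)
# Micro F1: harmonic mean of the pooled (micro avg) precision and recall.
micro_f1 = f1(0.7500, 0.8448)
print(round(macro_f1, 4), round(micro_f1, 4))  # 0.4833 0.7946
```

Both recomputed values match the logged "F-score (macro) 0.4833" and "F-score (micro) 0.7946".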