|
2023-10-13 01:11:03,794 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,796 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 01:11:03,796 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,796 MultiCorpus: 7936 train + 992 dev + 992 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr |
|
2023-10-13 01:11:03,797 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,797 Train: 7936 sentences |
|
2023-10-13 01:11:03,797 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 01:11:03,797 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,797 Training Params: |
|
2023-10-13 01:11:03,797 - learning_rate: "0.00016" |
|
2023-10-13 01:11:03,797 - mini_batch_size: "4" |
|
2023-10-13 01:11:03,797 - max_epochs: "10" |
|
2023-10-13 01:11:03,797 - shuffle: "True" |
|
2023-10-13 01:11:03,797 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,797 Plugins: |
|
2023-10-13 01:11:03,797 - TensorboardLogger |
|
2023-10-13 01:11:03,797 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 01:11:03,797 ---------------------------------------------------------------------------------------------------- |
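The `lr` column in the iteration lines below can be cross-checked against the parameters logged above (learning_rate 0.00016, mini_batch_size 4, max_epochs 10, warmup_fraction 0.1, 7936 training sentences). The following is a minimal sketch, assuming a standard linear warmup followed by linear decay to zero; the function name `linear_schedule_lr` is hypothetical and this is not taken from the scheduler plugin's source.

```python
def linear_schedule_lr(step, peak_lr=0.00016, total_steps=19840, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps.

    Assumption: linear warmup from 0 to peak_lr over the first
    warmup_fraction of training, then linear decay to 0.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# Values taken from the log header: 7936 sentences, batch size 4, 10 epochs.
iters_per_epoch = 7936 // 4           # 1984 iterations per epoch
total_steps = iters_per_epoch * 10    # 19840 optimizer steps in total

# Print the scheduled rate at a few logged positions (epoch, iteration).
for epoch, iteration in [(1, 198), (1, 1980), (2, 198), (5, 990), (10, 1980)]:
    step = (epoch - 1) * iters_per_epoch + iteration
    print(f"epoch {epoch} iter {iteration}: lr {linear_schedule_lr(step):.6f}")
```

Under these assumptions the warmup spans exactly the first epoch (1984 steps), which is consistent with the logged rate peaking at 0.000160 at the end of epoch 1 and decaying afterwards.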
|
2023-10-13 01:11:03,797 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 01:11:03,798 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 01:11:03,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,798 Computation: |
|
2023-10-13 01:11:03,798 - compute on device: cuda:0 |
|
2023-10-13 01:11:03,798 - embedding storage: none |
|
2023-10-13 01:11:03,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,798 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-13 01:11:03,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:11:03,798 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-13 01:11:57,074 epoch 1 - iter 198/1984 - loss 2.52843731 - time (sec): 53.27 - samples/sec: 301.54 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 01:12:49,896 epoch 1 - iter 396/1984 - loss 2.33063285 - time (sec): 106.10 - samples/sec: 299.47 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 01:13:43,828 epoch 1 - iter 594/1984 - loss 2.00558938 - time (sec): 160.03 - samples/sec: 304.60 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 01:14:37,623 epoch 1 - iter 792/1984 - loss 1.68962898 - time (sec): 213.82 - samples/sec: 305.19 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-13 01:15:32,254 epoch 1 - iter 990/1984 - loss 1.42576251 - time (sec): 268.45 - samples/sec: 305.85 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 01:16:30,374 epoch 1 - iter 1188/1984 - loss 1.22765584 - time (sec): 326.57 - samples/sec: 301.16 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-13 01:17:25,376 epoch 1 - iter 1386/1984 - loss 1.09207250 - time (sec): 381.58 - samples/sec: 298.82 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-13 01:18:19,155 epoch 1 - iter 1584/1984 - loss 0.98238438 - time (sec): 435.36 - samples/sec: 299.79 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-13 01:19:13,147 epoch 1 - iter 1782/1984 - loss 0.89132216 - time (sec): 489.35 - samples/sec: 301.35 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-13 01:20:07,717 epoch 1 - iter 1980/1984 - loss 0.81919035 - time (sec): 543.92 - samples/sec: 300.96 - lr: 0.000160 - momentum: 0.000000 |
|
2023-10-13 01:20:08,902 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:20:08,902 EPOCH 1 done: loss 0.8182 - lr: 0.000160 |
|
2023-10-13 01:20:33,569 DEV : loss 0.13464027643203735 - f1-score (micro avg) 0.6822 |
|
2023-10-13 01:20:33,608 saving best model |
|
2023-10-13 01:20:34,548 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:21:27,518 epoch 2 - iter 198/1984 - loss 0.13398518 - time (sec): 52.97 - samples/sec: 317.95 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-13 01:22:20,320 epoch 2 - iter 396/1984 - loss 0.13653987 - time (sec): 105.77 - samples/sec: 312.54 - lr: 0.000156 - momentum: 0.000000 |
|
2023-10-13 01:23:13,784 epoch 2 - iter 594/1984 - loss 0.13426706 - time (sec): 159.23 - samples/sec: 309.94 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-13 01:24:07,847 epoch 2 - iter 792/1984 - loss 0.12961703 - time (sec): 213.30 - samples/sec: 310.15 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-13 01:25:00,723 epoch 2 - iter 990/1984 - loss 0.12845842 - time (sec): 266.17 - samples/sec: 308.58 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-13 01:25:57,368 epoch 2 - iter 1188/1984 - loss 0.12621724 - time (sec): 322.82 - samples/sec: 306.90 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-13 01:26:53,050 epoch 2 - iter 1386/1984 - loss 0.12382592 - time (sec): 378.50 - samples/sec: 306.09 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-13 01:27:47,084 epoch 2 - iter 1584/1984 - loss 0.12243932 - time (sec): 432.53 - samples/sec: 304.10 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-13 01:28:40,225 epoch 2 - iter 1782/1984 - loss 0.12044255 - time (sec): 485.67 - samples/sec: 303.81 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-13 01:29:33,733 epoch 2 - iter 1980/1984 - loss 0.11911028 - time (sec): 539.18 - samples/sec: 303.69 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-13 01:29:34,760 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:29:34,760 EPOCH 2 done: loss 0.1191 - lr: 0.000142 |
|
2023-10-13 01:30:00,644 DEV : loss 0.09056346118450165 - f1-score (micro avg) 0.7464 |
|
2023-10-13 01:30:00,691 saving best model |
|
2023-10-13 01:30:03,443 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:30:58,918 epoch 3 - iter 198/1984 - loss 0.06739931 - time (sec): 55.47 - samples/sec: 309.34 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-13 01:31:51,844 epoch 3 - iter 396/1984 - loss 0.07425965 - time (sec): 108.40 - samples/sec: 306.04 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-13 01:32:45,346 epoch 3 - iter 594/1984 - loss 0.07267235 - time (sec): 161.90 - samples/sec: 302.08 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-13 01:33:40,251 epoch 3 - iter 792/1984 - loss 0.07060065 - time (sec): 216.80 - samples/sec: 298.37 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-13 01:34:34,939 epoch 3 - iter 990/1984 - loss 0.07092729 - time (sec): 271.49 - samples/sec: 299.52 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-13 01:35:31,126 epoch 3 - iter 1188/1984 - loss 0.07276706 - time (sec): 327.68 - samples/sec: 297.65 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-13 01:36:27,297 epoch 3 - iter 1386/1984 - loss 0.07298504 - time (sec): 383.85 - samples/sec: 295.50 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-13 01:37:24,493 epoch 3 - iter 1584/1984 - loss 0.07331926 - time (sec): 441.05 - samples/sec: 294.46 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-13 01:38:20,383 epoch 3 - iter 1782/1984 - loss 0.07273039 - time (sec): 496.94 - samples/sec: 294.70 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-13 01:39:13,609 epoch 3 - iter 1980/1984 - loss 0.07386661 - time (sec): 550.16 - samples/sec: 297.68 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-13 01:39:14,597 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:39:14,597 EPOCH 3 done: loss 0.0738 - lr: 0.000125 |
|
2023-10-13 01:39:39,918 DEV : loss 0.10628753900527954 - f1-score (micro avg) 0.7431 |
|
2023-10-13 01:39:39,957 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:40:34,507 epoch 4 - iter 198/1984 - loss 0.05002684 - time (sec): 54.55 - samples/sec: 319.28 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-13 01:41:28,212 epoch 4 - iter 396/1984 - loss 0.04642258 - time (sec): 108.25 - samples/sec: 306.78 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-13 01:42:24,209 epoch 4 - iter 594/1984 - loss 0.04922220 - time (sec): 164.25 - samples/sec: 300.82 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-13 01:43:20,372 epoch 4 - iter 792/1984 - loss 0.04903358 - time (sec): 220.41 - samples/sec: 298.20 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-13 01:44:16,017 epoch 4 - iter 990/1984 - loss 0.05035239 - time (sec): 276.06 - samples/sec: 297.50 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-13 01:45:10,220 epoch 4 - iter 1188/1984 - loss 0.05166238 - time (sec): 330.26 - samples/sec: 297.17 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-13 01:46:04,647 epoch 4 - iter 1386/1984 - loss 0.05299053 - time (sec): 384.69 - samples/sec: 295.60 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-13 01:46:57,739 epoch 4 - iter 1584/1984 - loss 0.05360553 - time (sec): 437.78 - samples/sec: 298.38 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-13 01:47:51,321 epoch 4 - iter 1782/1984 - loss 0.05404042 - time (sec): 491.36 - samples/sec: 298.68 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-13 01:48:46,615 epoch 4 - iter 1980/1984 - loss 0.05469939 - time (sec): 546.66 - samples/sec: 299.43 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-13 01:48:47,680 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:48:47,680 EPOCH 4 done: loss 0.0547 - lr: 0.000107 |
|
2023-10-13 01:49:13,494 DEV : loss 0.12986594438552856 - f1-score (micro avg) 0.7629 |
|
2023-10-13 01:49:13,542 saving best model |
|
2023-10-13 01:49:16,237 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:50:11,842 epoch 5 - iter 198/1984 - loss 0.03937109 - time (sec): 55.60 - samples/sec: 279.25 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-13 01:51:04,893 epoch 5 - iter 396/1984 - loss 0.03149180 - time (sec): 108.65 - samples/sec: 293.70 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-13 01:51:57,276 epoch 5 - iter 594/1984 - loss 0.03526855 - time (sec): 161.03 - samples/sec: 300.86 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-13 01:52:48,839 epoch 5 - iter 792/1984 - loss 0.04002136 - time (sec): 212.59 - samples/sec: 299.54 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-13 01:53:41,046 epoch 5 - iter 990/1984 - loss 0.04019836 - time (sec): 264.80 - samples/sec: 299.44 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-13 01:54:33,468 epoch 5 - iter 1188/1984 - loss 0.03965078 - time (sec): 317.22 - samples/sec: 302.34 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-13 01:55:27,904 epoch 5 - iter 1386/1984 - loss 0.03976973 - time (sec): 371.66 - samples/sec: 306.05 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-13 01:56:21,617 epoch 5 - iter 1584/1984 - loss 0.03952830 - time (sec): 425.37 - samples/sec: 306.42 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-13 01:57:14,389 epoch 5 - iter 1782/1984 - loss 0.04052637 - time (sec): 478.14 - samples/sec: 306.45 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-13 01:58:06,695 epoch 5 - iter 1980/1984 - loss 0.04037425 - time (sec): 530.45 - samples/sec: 308.53 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-13 01:58:07,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:58:07,772 EPOCH 5 done: loss 0.0403 - lr: 0.000089 |
|
2023-10-13 01:58:31,953 DEV : loss 0.15890514850616455 - f1-score (micro avg) 0.7578 |
|
2023-10-13 01:58:31,995 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 01:59:24,863 epoch 6 - iter 198/1984 - loss 0.03383294 - time (sec): 52.87 - samples/sec: 307.40 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-13 02:00:17,092 epoch 6 - iter 396/1984 - loss 0.03126055 - time (sec): 105.09 - samples/sec: 306.52 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-13 02:01:09,827 epoch 6 - iter 594/1984 - loss 0.03079949 - time (sec): 157.83 - samples/sec: 309.23 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-13 02:02:02,146 epoch 6 - iter 792/1984 - loss 0.03162031 - time (sec): 210.15 - samples/sec: 311.53 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-13 02:02:54,554 epoch 6 - iter 990/1984 - loss 0.03146134 - time (sec): 262.56 - samples/sec: 312.79 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 02:03:46,567 epoch 6 - iter 1188/1984 - loss 0.03128278 - time (sec): 314.57 - samples/sec: 311.49 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-13 02:04:39,641 epoch 6 - iter 1386/1984 - loss 0.03149198 - time (sec): 367.64 - samples/sec: 313.24 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-13 02:05:33,650 epoch 6 - iter 1584/1984 - loss 0.02992630 - time (sec): 421.65 - samples/sec: 312.39 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 02:06:26,646 epoch 6 - iter 1782/1984 - loss 0.02996741 - time (sec): 474.65 - samples/sec: 312.51 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-13 02:07:18,444 epoch 6 - iter 1980/1984 - loss 0.02958364 - time (sec): 526.45 - samples/sec: 311.08 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-13 02:07:19,461 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:07:19,461 EPOCH 6 done: loss 0.0296 - lr: 0.000071 |
|
2023-10-13 02:07:43,565 DEV : loss 0.16922588646411896 - f1-score (micro avg) 0.7613 |
|
2023-10-13 02:07:43,608 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:08:34,306 epoch 7 - iter 198/1984 - loss 0.01507566 - time (sec): 50.70 - samples/sec: 318.70 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-13 02:09:25,504 epoch 7 - iter 396/1984 - loss 0.01573958 - time (sec): 101.89 - samples/sec: 323.20 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-13 02:10:18,310 epoch 7 - iter 594/1984 - loss 0.01664591 - time (sec): 154.70 - samples/sec: 316.50 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-13 02:11:10,262 epoch 7 - iter 792/1984 - loss 0.01785772 - time (sec): 206.65 - samples/sec: 316.65 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-13 02:12:01,198 epoch 7 - iter 990/1984 - loss 0.01876625 - time (sec): 257.59 - samples/sec: 316.49 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-13 02:12:52,573 epoch 7 - iter 1188/1984 - loss 0.02003134 - time (sec): 308.96 - samples/sec: 318.37 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-13 02:13:43,862 epoch 7 - iter 1386/1984 - loss 0.01967015 - time (sec): 360.25 - samples/sec: 318.42 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-13 02:14:37,995 epoch 7 - iter 1584/1984 - loss 0.02069974 - time (sec): 414.38 - samples/sec: 316.77 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-13 02:15:35,237 epoch 7 - iter 1782/1984 - loss 0.02041747 - time (sec): 471.63 - samples/sec: 312.32 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-13 02:16:29,103 epoch 7 - iter 1980/1984 - loss 0.02160503 - time (sec): 525.49 - samples/sec: 311.51 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-13 02:16:30,210 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:16:30,210 EPOCH 7 done: loss 0.0217 - lr: 0.000053 |
|
2023-10-13 02:16:54,770 DEV : loss 0.19541531801223755 - f1-score (micro avg) 0.7587 |
|
2023-10-13 02:16:54,810 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:17:51,202 epoch 8 - iter 198/1984 - loss 0.01913314 - time (sec): 56.39 - samples/sec: 293.35 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-13 02:18:47,470 epoch 8 - iter 396/1984 - loss 0.01758474 - time (sec): 112.66 - samples/sec: 283.41 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-13 02:19:44,479 epoch 8 - iter 594/1984 - loss 0.01501596 - time (sec): 169.67 - samples/sec: 287.41 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 02:20:38,244 epoch 8 - iter 792/1984 - loss 0.01524608 - time (sec): 223.43 - samples/sec: 293.67 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 02:21:28,379 epoch 8 - iter 990/1984 - loss 0.01572225 - time (sec): 273.57 - samples/sec: 299.81 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 02:22:18,130 epoch 8 - iter 1188/1984 - loss 0.01543476 - time (sec): 323.32 - samples/sec: 303.46 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 02:23:07,803 epoch 8 - iter 1386/1984 - loss 0.01514254 - time (sec): 372.99 - samples/sec: 306.00 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 02:23:58,125 epoch 8 - iter 1584/1984 - loss 0.01446280 - time (sec): 423.31 - samples/sec: 309.49 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-13 02:24:47,909 epoch 8 - iter 1782/1984 - loss 0.01453879 - time (sec): 473.10 - samples/sec: 310.57 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 02:25:39,611 epoch 8 - iter 1980/1984 - loss 0.01586744 - time (sec): 524.80 - samples/sec: 311.79 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 02:25:40,640 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:25:40,641 EPOCH 8 done: loss 0.0159 - lr: 0.000036 |
|
2023-10-13 02:26:07,441 DEV : loss 0.20716699957847595 - f1-score (micro avg) 0.7565 |
|
2023-10-13 02:26:07,484 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:27:01,686 epoch 9 - iter 198/1984 - loss 0.00740229 - time (sec): 54.20 - samples/sec: 285.24 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 02:27:55,342 epoch 9 - iter 396/1984 - loss 0.00741304 - time (sec): 107.86 - samples/sec: 286.26 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 02:28:48,271 epoch 9 - iter 594/1984 - loss 0.00953282 - time (sec): 160.78 - samples/sec: 293.98 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 02:29:41,888 epoch 9 - iter 792/1984 - loss 0.00987460 - time (sec): 214.40 - samples/sec: 299.90 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 02:30:34,733 epoch 9 - iter 990/1984 - loss 0.00972689 - time (sec): 267.25 - samples/sec: 302.63 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 02:31:26,785 epoch 9 - iter 1188/1984 - loss 0.00991641 - time (sec): 319.30 - samples/sec: 308.78 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 02:32:18,304 epoch 9 - iter 1386/1984 - loss 0.01028782 - time (sec): 370.82 - samples/sec: 311.33 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 02:33:10,088 epoch 9 - iter 1584/1984 - loss 0.01070081 - time (sec): 422.60 - samples/sec: 312.62 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 02:34:02,588 epoch 9 - iter 1782/1984 - loss 0.01055651 - time (sec): 475.10 - samples/sec: 312.75 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 02:34:54,501 epoch 9 - iter 1980/1984 - loss 0.01028953 - time (sec): 527.01 - samples/sec: 310.49 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 02:34:55,569 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:34:55,570 EPOCH 9 done: loss 0.0103 - lr: 0.000018 |
|
2023-10-13 02:35:21,677 DEV : loss 0.2197086364030838 - f1-score (micro avg) 0.7675 |
|
2023-10-13 02:35:21,722 saving best model |
|
2023-10-13 02:35:24,408 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:36:16,693 epoch 10 - iter 198/1984 - loss 0.00313690 - time (sec): 52.28 - samples/sec: 315.59 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 02:37:09,099 epoch 10 - iter 396/1984 - loss 0.00558899 - time (sec): 104.69 - samples/sec: 317.42 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 02:38:01,060 epoch 10 - iter 594/1984 - loss 0.00556445 - time (sec): 156.65 - samples/sec: 319.86 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 02:38:53,915 epoch 10 - iter 792/1984 - loss 0.00552369 - time (sec): 209.50 - samples/sec: 315.53 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 02:39:46,052 epoch 10 - iter 990/1984 - loss 0.00607246 - time (sec): 261.64 - samples/sec: 315.00 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 02:40:39,710 epoch 10 - iter 1188/1984 - loss 0.00683260 - time (sec): 315.30 - samples/sec: 311.80 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 02:41:32,898 epoch 10 - iter 1386/1984 - loss 0.00736140 - time (sec): 368.49 - samples/sec: 312.14 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 02:42:26,307 epoch 10 - iter 1584/1984 - loss 0.00744783 - time (sec): 421.89 - samples/sec: 310.11 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 02:43:19,462 epoch 10 - iter 1782/1984 - loss 0.00740743 - time (sec): 475.05 - samples/sec: 311.13 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 02:44:11,558 epoch 10 - iter 1980/1984 - loss 0.00755919 - time (sec): 527.14 - samples/sec: 310.50 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 02:44:12,617 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 02:44:12,618 EPOCH 10 done: loss 0.0076 - lr: 0.000000 |
|
2023-10-13 02:44:38,438 DEV : loss 0.22709771990776062 - f1-score (micro avg) 0.769 |
|
2023-10-13 02:44:38,479 saving best model |
|
2023-10-13 02:44:42,054 ---------------------------------------------------------------------------------------------------- |
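The per-epoch DEV lines above can be summarised programmatically. The following is a minimal sketch, assuming the line layout shown in this log; the regular expression and the helper name `dev_scores` are illustrative and not part of Flair. The sample strings are copied from the DEV lines of this run.

```python
import re

# Matches lines such as:
#   "... DEV : loss 0.1346 - f1-score (micro avg) 0.6822"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>\d+\.\d+) - f1-score \(micro avg\)\s+(?P<f1>\d+\.\d+)"
)


def dev_scores(lines):
    """Return [(epoch, dev_loss, dev_f1), ...]; epochs are numbered by order of appearance."""
    scores = []
    for line in lines:
        match = DEV_LINE.search(line)
        if match:
            scores.append((len(scores) + 1, float(match["loss"]), float(match["f1"])))
    return scores


sample = [
    "2023-10-13 01:20:33,569 DEV : loss 0.13464027643203735 - f1-score (micro avg) 0.6822",
    "2023-10-13 01:30:00,644 DEV : loss 0.09056346118450165 - f1-score (micro avg) 0.7464",
    "2023-10-13 01:39:39,918 DEV : loss 0.10628753900527954 - f1-score (micro avg) 0.7431",
    "2023-10-13 01:49:13,494 DEV : loss 0.12986594438552856 - f1-score (micro avg) 0.7629",
    "2023-10-13 01:58:31,953 DEV : loss 0.15890514850616455 - f1-score (micro avg) 0.7578",
    "2023-10-13 02:07:43,565 DEV : loss 0.16922588646411896 - f1-score (micro avg) 0.7613",
    "2023-10-13 02:16:54,770 DEV : loss 0.19541531801223755 - f1-score (micro avg) 0.7587",
    "2023-10-13 02:26:07,441 DEV : loss 0.20716699957847595 - f1-score (micro avg) 0.7565",
    "2023-10-13 02:35:21,677 DEV : loss 0.2197086364030838 - f1-score (micro avg) 0.7675",
    "2023-10-13 02:44:38,438 DEV : loss 0.22709771990776062 - f1-score (micro avg) 0.769",
]

scores = dev_scores(sample)
best_epoch, best_loss, best_f1 = max(scores, key=lambda s: s[2])
lowest_loss_epoch = min(scores, key=lambda s: s[1])[0]
print(f"best dev F1 {best_f1} at epoch {best_epoch}; lowest dev loss at epoch {lowest_loss_epoch}")
```

In this run the dev loss is lowest after epoch 2 and rises afterwards, while the micro F1 used for model selection keeps improving slowly, which is why `best-model.pt` is last saved at epoch 10.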
|
2023-10-13 02:44:42,057 Loading model from best epoch ... |
|
2023-10-13 02:44:47,171 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-13 02:45:11,248 |
|
Results: |
|
- F-score (micro) 0.7627 |
|
- F-score (macro) 0.6811 |
|
- Accuracy 0.6449 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.7997    0.8351    0.8170       655 |
|
         PER     0.6965    0.8027    0.7458       223 |
|
         ORG     0.5392    0.4331    0.4803       127 |
|
|
|
   micro avg     0.7488    0.7771    0.7627      1005 |
|
   macro avg     0.6785    0.6903    0.6811      1005 |
|
weighted avg     0.7439    0.7771    0.7587      1005 |
|
|
|
2023-10-13 02:45:11,248 ---------------------------------------------------------------------------------------------------- |
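The aggregate rows of the test report can be re-derived from the per-class rows. The following is a minimal sketch using only the rounded figures printed above, so the results agree with the logged averages to roughly three decimal places rather than exactly; the variable names are illustrative.

```python
# Per-class test results copied from the report:
# label -> (precision, recall, f1, support)
per_class = {
    "LOC": (0.7997, 0.8351, 0.8170, 655),
    "PER": (0.6965, 0.8027, 0.7458, 223),
    "ORG": (0.5392, 0.4331, 0.4803, 127),
}

total_support = sum(support for _, _, _, support in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total_support

# Micro F1: harmonic mean of the logged micro precision and recall.
micro_precision, micro_recall = 0.7488, 0.7771
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support {total_support}")
print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
```

The gap between micro and macro F1 comes from ORG: it has the lowest F1 (0.4803) and the smallest support (127 of 1005), so it pulls the unweighted macro average down more than the support-sensitive micro and weighted averages.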
|
|