# litbank-coref-mem-small
This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small); the fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set:
- Loss: 0.1085
## Model description
More information needed
## Intended uses & limitations
More information needed
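The checkpoint loads with the standard Transformers seq2seq API. The model name suggests a coreference task over LitBank, but the card does not document the expected input format, so the prompt in the sketch below is purely illustrative.

```python
# Minimal sketch: loading the checkpoint with the standard
# Transformers seq2seq API. The prompt is illustrative only;
# the card does not document the model's expected input format.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "eddieman78/litbank-coref-mem-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Coreference: Alice met Bob. She greeted him."  # hypothetical prompt
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```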
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30
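For reference, these hyperparameters correspond roughly to the `Seq2SeqTrainingArguments` configuration sketched below. The original training script is not published with this card, so the output directory and the dataset objects `train_ds` and `eval_ds` are placeholders.

```python
# Hedged sketch: a Trainer configuration matching the hyperparameters
# listed above. Dataset objects and the output path are placeholders;
# the original training script is not part of this card.
from transformers import (
    AutoModelForSeq2SeqLM,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

args = Seq2SeqTrainingArguments(
    output_dir="litbank-coref-mem-small",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    eval_strategy="epoch",  # validation loss was logged once per epoch (82 steps)
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # placeholder: tokenized training split
    eval_dataset=eval_ds,    # placeholder: tokenized evaluation split
)
trainer.train()
```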
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
No log | 1.0 | 82 | 2.7534 |
7.7417 | 2.0 | 164 | 1.4656 |
2.1731 | 3.0 | 246 | 1.1132 |
1.4758 | 4.0 | 328 | 0.8573 |
1.1465 | 5.0 | 410 | 0.6490 |
1.1465 | 6.0 | 492 | 0.4874 |
0.8867 | 7.0 | 574 | 0.3676 |
0.6824 | 8.0 | 656 | 0.2896 |
0.5269 | 9.0 | 738 | 0.2404 |
0.4199 | 10.0 | 820 | 0.2039 |
0.3455 | 11.0 | 902 | 0.1803 |
0.3455 | 12.0 | 984 | 0.1615 |
0.2939 | 13.0 | 1066 | 0.1490 |
0.2537 | 14.0 | 1148 | 0.1402 |
0.2272 | 15.0 | 1230 | 0.1320 |
0.2143 | 16.0 | 1312 | 0.1279 |
0.2143 | 17.0 | 1394 | 0.1232 |
0.2022 | 18.0 | 1476 | 0.1206 |
0.1956 | 19.0 | 1558 | 0.1190 |
0.1895 | 20.0 | 1640 | 0.1161 |
0.1847 | 21.0 | 1722 | 0.1143 |
0.181 | 22.0 | 1804 | 0.1127 |
0.181 | 23.0 | 1886 | 0.1121 |
0.1786 | 24.0 | 1968 | 0.1110 |
0.175 | 25.0 | 2050 | 0.1099 |
0.1718 | 26.0 | 2132 | 0.1094 |
0.1724 | 27.0 | 2214 | 0.1090 |
0.1724 | 28.0 | 2296 | 0.1087 |
0.1718 | 29.0 | 2378 | 0.1086 |
0.1704 | 30.0 | 2460 | 0.1085 |
### Framework versions
- Transformers 4.47.0
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
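To check a local environment against the versions listed above, a quick comparison like the following can help (the expected strings are taken directly from this card):

```python
# Compare installed library versions against those listed in the card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.47.0",
    "torch": "2.5.1+cu121",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name in expected:
    print(f"{name}: installed {installed[name]}, card lists {expected[name]}")
```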