flan-t5-base

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4829
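
Since the framework versions below list PEFT, this checkpoint is a lightweight adapter that rides on top of google/flan-t5-base rather than a full set of weights. A minimal loading sketch, assuming the adapter is published under the Hub repo ID Pentyala/flan-t5-base; the prompt is purely illustrative, since the training task is undocumented:

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForSeq2SeqLM

# Loads the base google/flan-t5-base weights plus this adapter in one call.
# "Pentyala/flan-t5-base" is the assumed Hub repo ID for the adapter.
model = AutoPeftModelForSeq2SeqLM.from_pretrained("Pentyala/flan-t5-base")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

# Illustrative prompt only; the actual task/dataset is not documented.
inputs = tokenizer("Translate to German: Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```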

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
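
For reference, a minimal sketch of how these hyperparameters map onto Seq2SeqTrainingArguments. The adapter configuration is not documented on this card, so the LoRA config below is an assumption, and the output directory name is arbitrary:

```python
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments
from peft import LoraConfig, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# The card only states that PEFT 0.14.0 was used; the exact adapter setup is
# undocumented, so this LoRA config is an assumption for illustration.
peft_config = LoraConfig(task_type="SEQ_2_SEQ_LM", r=8, lora_alpha=16, lora_dropout=0.1)
model = get_peft_model(base, peft_config)

# Hyperparameters exactly as listed above.
args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-adapter",  # arbitrary name
    learning_rate=0.01,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",  # AdamW with betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```

These arguments would then be passed to a Seq2SeqTrainer along with the (undocumented) train and eval datasets.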

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 7    | 4.4814          |
| 14.0527       | 2.0   | 14   | 2.8827          |
| 2.9577        | 3.0   | 21   | 0.9385          |
| 2.9577        | 4.0   | 28   | 0.7474          |
| 1.1968        | 5.0   | 35   | 0.6469          |
| 0.856         | 6.0   | 42   | 0.6204          |
| 0.856         | 7.0   | 49   | 0.5960          |
| 0.8294        | 8.0   | 56   | 0.5871          |
| 0.7195        | 9.0   | 63   | 0.5555          |
| 0.6591        | 10.0  | 70   | 0.5468          |
| 0.6591        | 11.0  | 77   | 0.5375          |
| 0.6328        | 12.0  | 84   | 0.5498          |
| 0.5922        | 13.0  | 91   | 0.5306          |
| 0.5922        | 14.0  | 98   | 0.5075          |
| 0.5789        | 15.0  | 105  | 0.4910          |
| 0.5288        | 16.0  | 112  | 0.4886          |
| 0.5288        | 17.0  | 119  | 0.4802          |
| 0.5109        | 18.0  | 126  | 0.4869          |
| 0.5001        | 19.0  | 133  | 0.4834          |
| 0.4811        | 20.0  | 140  | 0.4829          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Tokenizers 0.21.0