BART-LARGE-DIALOGSUM

This model is a fine-tuned version of ainize/bart-base-cnn on the None dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 5
mixed_precision_training: Native AMP

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.5752	1.0	779	1.3340	44.5563	19.2131	36.7114	39.6611	50.176
1.3484	2.0	1558	1.2787	45.6688	20.9682	38.0344	40.8801	49.748
1.3058	3.0	2337	1.2614	45.9742	21.0722	38.2515	41.207	43.842
1.2514	4.0	3116	1.2537	46.0688	21.2466	38.5075	41.3072	45.766
1.2278	5.0	3895	1.2520	45.9023	21.1512	38.0547	41.0074	53.296