# deductor-flant5-large
This model is a fine-tuned version of google/flan-t5-large on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.2461
- Rouge1: 92.1213
- Rouge2: 86.4281
- Rougel: 90.5846
- Rougelsum: 90.5294
- Gen Len: 11.2014
## Model description
More information needed
## Intended uses & limitations
More information needed
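Although the card leaves uses undocumented, this is a standard sequence-to-sequence fine-tune of google/flan-t5-large, so ordinary `transformers` seq2seq inference should apply. The sketch below is a minimal, hedged example: the repository id comes from the model listing, while the input text and generation settings are illustrative assumptions.

```python
# Minimal inference sketch; input text and generation settings are illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "navidmadani/deductor_flant5_large_v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# FLAN-T5 takes plain text in and generates text out.
inputs = tokenizer("your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)  # eval Gen Len averages ~11 tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```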
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a hedged `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
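For reference, these values map onto `transformers`' `Seq2SeqTrainingArguments` roughly as follows. This is a hedged reconstruction, not the authors' actual training script; the `output_dir` is an illustrative assumption.

```python
# Hedged reconstruction of the reported hyperparameters; not the authors' script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="deductor-flant5-large",  # assumption: output path is illustrative
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,       # effective train batch size: 16 * 4 = 64
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
    adam_beta1=0.9,                      # reported Adam betas/epsilon match the defaults
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,          # needed to compute ROUGE / Gen Len at eval time
)
```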
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.306 | 0.19 | 50 | 0.2959 | 89.3028 | 82.5127 | 87.4173 | 87.3544 | 11.2211 |
| 0.2774 | 0.38 | 100 | 0.2717 | 90.8414 | 84.2378 | 88.9385 | 88.9058 | 11.2571 |
| 0.2366 | 0.57 | 150 | 0.2613 | 91.0152 | 84.6687 | 89.2107 | 89.1735 | 11.2081 |
| 0.2166 | 0.77 | 200 | 0.2585 | 91.5215 | 85.4308 | 89.7742 | 89.7422 | 11.2802 |
| 0.22 | 0.96 | 250 | 0.2517 | 91.5587 | 85.6107 | 89.8835 | 89.8621 | 11.2655 |
| 0.1564 | 1.15 | 300 | 0.2630 | 91.999 | 86.0835 | 90.3611 | 90.3168 | 11.2039 |
| 0.1803 | 1.34 | 350 | 0.2546 | 91.5183 | 85.6214 | 89.9752 | 89.9323 | 11.2462 |
| 0.1737 | 1.53 | 400 | 0.2483 | 91.8342 | 86.0171 | 90.3042 | 90.2641 | 11.1943 |
| 0.157 | 1.72 | 450 | 0.2493 | 91.6585 | 85.4651 | 90.0181 | 89.9991 | 10.9376 |
| 0.1561 | 1.92 | 500 | 0.2461 | 92.1213 | 86.4281 | 90.5846 | 90.5294 | 11.2014 |
| 0.1191 | 2.11 | 550 | 0.2585 | 92.4493 | 86.6961 | 90.9293 | 90.8761 | 11.2416 |
| 0.1134 | 2.3 | 600 | 0.2633 | 92.4707 | 86.833 | 90.9516 | 90.9195 | 11.1675 |
| 0.1227 | 2.49 | 650 | 0.2592 | 92.2738 | 86.5064 | 90.7556 | 90.6998 | 11.2642 |
| 0.1175 | 2.68 | 700 | 0.2657 | 92.0861 | 86.2203 | 90.6168 | 90.5657 | 11.1700 |
| 0.1132 | 2.87 | 750 | 0.2644 | 92.3834 | 86.7237 | 90.8761 | 90.8389 | 11.2123 |
| 0.1097 | 3.07 | 800 | 0.2692 | 92.3356 | 86.7021 | 90.8717 | 90.8185 | 11.1822 |
| 0.0949 | 3.26 | 850 | 0.2690 | 92.5746 | 87.001 | 91.1734 | 91.1222 | 11.2785 |
| 0.0813 | 3.45 | 900 | 0.2875 | 92.5641 | 86.9813 | 91.0881 | 91.0411 | 11.2257 |
| 0.0861 | 3.64 | 950 | 0.2800 | 92.4738 | 86.9379 | 91.0384 | 90.9995 | 11.2136 |
| 0.0879 | 3.83 | 1000 | 0.2770 | 92.6025 | 87.105 | 91.1632 | 91.1292 | 11.2303 |
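The ROUGE columns above match the keys returned by the `evaluate` library's ROUGE metric, scaled by 100. The snippet below is a hedged sketch of how such scores are commonly computed; the predictions and references are placeholders, not the actual evaluation set.

```python
# Hedged sketch of computing ROUGE as reported above; data below is placeholder.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["a generated answer"]
references = ["a gold answer"]
scores = rouge.compute(predictions=predictions, references=references)
# compute() returns rouge1, rouge2, rougeL, rougeLsum in [0, 1]; the card reports x100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```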
### Framework versions
- Transformers 4.36.2
- Pytorch 2.0.1
- Datasets 2.18.0
- Tokenizers 0.15.2