# t5-summarization-one-shot-better-prompt

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3104
- Rouge1: 43.2304
- Rouge2: 20.0106
- RougeL: 19.6358
- RougeLsum: 19.6358
- Bert Score: 0.8771
- Bleurt 20: -0.8213
- Gen Len: 13.765
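A minimal inference sketch, assuming the checkpoint is available on the Hub under `sarahahtee/t5-summarization-one-shot-better-prompt`; the input text and generation settings below are illustrative, not part of the card:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a summarization pipeline.
summarizer = pipeline(
    "summarization",
    model="sarahahtee/t5-summarization-one-shot-better-prompt",
)

text = (
    "The committee met on Tuesday to review the quarterly budget. "
    "Members discussed rising infrastructure costs and agreed to "
    "postpone two construction projects until next year."
)

# max_new_tokens roughly matches the ~14-token average generation length reported above.
summary = summarizer(text, max_new_tokens=32)[0]["summary_text"]
print(summary)
```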
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 7
- eval_batch_size: 7
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bert Score | Bleurt 20 | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|
| 2.6246 | 1.0 | 172 | 2.4757 | 44.9153 | 18.5718 | 19.4145 | 19.4145 | 0.8727 | -0.8454 | 14.15 |
| 2.4225 | 2.0 | 344 | 2.3819 | 43.3173 | 18.3383 | 18.49 | 18.49 | 0.87 | -0.8976 | 13.895 |
| 2.2187 | 3.0 | 516 | 2.3301 | 40.6109 | 16.9766 | 18.1673 | 18.1673 | 0.8703 | -0.9236 | 13.47 |
| 2.1314 | 4.0 | 688 | 2.2950 | 43.6761 | 18.7232 | 18.9003 | 18.9003 | 0.8726 | -0.8971 | 14.095 |
| 1.9971 | 5.0 | 860 | 2.2897 | 42.3985 | 18.5244 | 18.5991 | 18.5991 | 0.8719 | -0.9075 | 13.585 |
| 1.9511 | 6.0 | 1032 | 2.2926 | 42.4732 | 18.4346 | 19.0653 | 19.0653 | 0.8767 | -0.8847 | 13.45 |
| 1.8624 | 7.0 | 1204 | 2.2883 | 41.6345 | 18.3464 | 19.0603 | 19.0603 | 0.8767 | -0.8899 | 13.21 |
| 1.8361 | 8.0 | 1376 | 2.2816 | 44.2794 | 19.6559 | 19.7029 | 19.7029 | 0.8767 | -0.8561 | 14.045 |
| 1.7869 | 9.0 | 1548 | 2.2807 | 43.4703 | 18.9051 | 19.5826 | 19.5826 | 0.8781 | -0.8442 | 13.82 |
| 1.7048 | 10.0 | 1720 | 2.2624 | 42.8553 | 18.5641 | 19.3523 | 19.3523 | 0.8769 | -0.8679 | 13.665 |
| 1.6718 | 11.0 | 1892 | 2.2757 | 43.1539 | 19.3463 | 19.8968 | 19.8968 | 0.8783 | -0.8662 | 13.61 |
| 1.6599 | 12.0 | 2064 | 2.2909 | 43.5918 | 19.7046 | 19.673 | 19.673 | 0.8767 | -0.8587 | 13.79 |
| 1.5895 | 13.0 | 2236 | 2.2927 | 43.787 | 20.2026 | 19.8817 | 19.8817 | 0.878 | -0.8296 | 13.95 |
| 1.5527 | 14.0 | 2408 | 2.2987 | 43.3703 | 19.6301 | 19.8362 | 19.8362 | 0.8776 | -0.8473 | 13.635 |
| 1.5892 | 15.0 | 2580 | 2.2942 | 43.3245 | 19.7414 | 19.7625 | 19.7625 | 0.8772 | -0.8295 | 13.805 |
| 1.5273 | 16.0 | 2752 | 2.3007 | 43.0611 | 19.8572 | 19.7198 | 19.7198 | 0.877 | -0.8293 | 13.795 |
| 1.5546 | 17.0 | 2924 | 2.3019 | 43.1098 | 19.699 | 19.8258 | 19.8258 | 0.8763 | -0.841 | 13.785 |
| 1.5363 | 18.0 | 3096 | 2.3070 | 43.9332 | 20.1972 | 19.8764 | 19.8764 | 0.878 | -0.8167 | 13.86 |
| 1.5118 | 19.0 | 3268 | 2.3105 | 43.2268 | 20.074 | 19.831 | 19.831 | 0.8778 | -0.8134 | 13.78 |
| 1.502 | 20.0 | 3440 | 2.3104 | 43.2304 | 20.0106 | 19.6358 | 19.6358 | 0.8771 | -0.8213 | 13.765 |
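For context on the Rouge columns: ROUGE-1 is the unigram-overlap F1 between a candidate summary and a reference, reported above scaled to 0–100. A minimal pure-Python illustration of the idea (the card's scores come from a full ROUGE implementation with stemming and other preprocessing, so this sketch is illustrative only):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat sat")
print(round(score, 4))  # precision 1.0, recall 0.5 -> F1 ~0.6667
```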
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0