mqy/mt5-small-text-sum-1

+---
+license: apache-2.0
+tags:
+- summarization
+- generated_from_trainer
+metrics:
+- rouge
+model-index:
+- name: mt5-small-text-sum-1
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mt5-small-text-sum-1
+This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
+It achieves the following results on the evaluation set (these values match the step 9000 row, epoch 23.2, of the training results table):
+- Loss: 2.3715
+- Rouge1: 20.75
+- Rouge2: 6.54
+- Rougel: 20.33
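For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-1 F1 score can be computed from unigram overlap. It is an illustration only: the card does not say which ROUGE implementation, tokenizer, or stemming settings produced the numbers above, so the whitespace tokenization and the helper name `rouge1_f1` are assumptions, and the scores here are on a 0 to 1 scale rather than the 0 to 100 scale the card appears to use.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Assumption: lowercase whitespace tokenization, no stemming. Real ROUGE
    implementations usually tokenize and normalize differently.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each word counts at most as often as it appears
    # in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-1 F1: {score:.4f}")
```

ROUGE-2 follows the same pattern with bigrams in place of single words, and ROUGE-L replaces the overlap count with the length of the longest common subsequence.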
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 40
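As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. Two things are assumptions rather than facts from the card: there is no warmup (the card lists none), and there are about 388 steps per epoch, a figure estimated from the training results table (step 3500 is logged at epoch 9.02). The card lists 40 epochs while the table stops at epoch 29.64, so the total step count used here may not match the actual run.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-4
NUM_EPOCHS = 40

# Assumption: steps per epoch estimated from the results table,
# where step 3500 corresponds to epoch 9.02.
STEPS_PER_EPOCH = round(3500 / 9.02)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 500, 9000, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls steadily from 1e-4 at step 0 to zero at the final step, so later checkpoints are trained with progressively smaller updates.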
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel |
+|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|
+| 4.936         | 1.29  | 500   | 2.6226          | 15.38  | 5.14   | 15.22  |
+| 3.1573        | 2.58  | 1000  | 2.5081          | 18.02  | 5.53   | 17.8   |
+| 2.9258        | 3.87  | 1500  | 2.4499          | 17.19  | 5.3    | 17.0   |
+| 2.7786        | 5.15  | 2000  | 2.4264          | 18.17  | 5.02   | 17.99  |
+| 2.6786        | 6.44  | 2500  | 2.4088          | 17.98  | 5.48   | 17.6   |
+| 2.5824        | 7.73  | 3000  | 2.3909          | 19.43  | 6.32   | 19.07  |
+| 2.5261        | 9.02  | 3500  | 2.3691          | 19.06  | 5.94   | 18.76  |
+| 2.4372        | 10.31 | 4000  | 2.3580          | 19.76  | 6.37   | 19.49  |
+| 2.3727        | 11.6  | 4500  | 2.3595          | 19.96  | 6.52   | 19.68  |
+| 2.3488        | 12.89 | 5000  | 2.3580          | 19.63  | 6.14   | 19.31  |
+| 2.2868        | 14.18 | 5500  | 2.3595          | 19.93  | 6.4    | 19.72  |
+| 2.2268        | 15.46 | 6000  | 2.3632          | 19.95  | 6.13   | 19.55  |
+| 2.2081        | 16.75 | 6500  | 2.3631          | 20.47  | 6.34   | 20.1   |
+| 2.1583        | 18.04 | 7000  | 2.3562          | 20.04  | 6.13   | 19.71  |
+| 2.1178        | 19.33 | 7500  | 2.3615          | 19.55  | 5.8    | 19.1   |
+| 2.0904        | 20.62 | 8000  | 2.3549          | 20.37  | 6.6    | 20.05  |
+| 2.0697        | 21.91 | 8500  | 2.3859          | 20.53  | 6.64   | 20.22  |
+| 2.0256        | 23.2  | 9000  | 2.3715          | 20.75  | 6.54   | 20.33  |
+| 2.0011        | 24.48 | 9500  | 2.3713          | 20.55  | 6.72   | 20.25  |
+| 1.9899        | 25.77 | 10000 | 2.3582          | 19.82  | 5.82   | 19.4   |
+| 1.965         | 27.06 | 10500 | 2.3789          | 20.48  | 5.8    | 20.23  |
+| 1.9518        | 28.35 | 11000 | 2.3822          | 20.03  | 6.07   | 19.67  |
+| 1.9089        | 29.64 | 11500 | 2.3743          | 19.62  | 6.1    | 19.3   |
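The table can be scanned programmatically to see which logged checkpoint is best under each criterion. The sketch below is a minimal example, with the rows transcribed from the table above; the helper name `best_row` and the tuple layout are illustrative choices, not part of the original training code.

```python
# (step, validation_loss, rouge1, rouge2, rougel), transcribed from the table.
ROWS = [
    (500, 2.6226, 15.38, 5.14, 15.22),
    (1000, 2.5081, 18.02, 5.53, 17.8),
    (1500, 2.4499, 17.19, 5.3, 17.0),
    (2000, 2.4264, 18.17, 5.02, 17.99),
    (2500, 2.4088, 17.98, 5.48, 17.6),
    (3000, 2.3909, 19.43, 6.32, 19.07),
    (3500, 2.3691, 19.06, 5.94, 18.76),
    (4000, 2.3580, 19.76, 6.37, 19.49),
    (4500, 2.3595, 19.96, 6.52, 19.68),
    (5000, 2.3580, 19.63, 6.14, 19.31),
    (5500, 2.3595, 19.93, 6.4, 19.72),
    (6000, 2.3632, 19.95, 6.13, 19.55),
    (6500, 2.3631, 20.47, 6.34, 20.1),
    (7000, 2.3562, 20.04, 6.13, 19.71),
    (7500, 2.3615, 19.55, 5.8, 19.1),
    (8000, 2.3549, 20.37, 6.6, 20.05),
    (8500, 2.3859, 20.53, 6.64, 20.22),
    (9000, 2.3715, 20.75, 6.54, 20.33),
    (9500, 2.3713, 20.55, 6.72, 20.25),
    (10000, 2.3582, 19.82, 5.82, 19.4),
    (10500, 2.3789, 20.48, 5.8, 20.23),
    (11000, 2.3822, 20.03, 6.07, 19.67),
    (11500, 2.3743, 19.62, 6.1, 19.3),
]

COLUMNS = {"val_loss": 1, "rouge1": 2, "rouge2": 3, "rougel": 4}


def best_row(rows, column, lowest=False):
    """Return the row with the lowest or highest value in the named column."""
    index = COLUMNS[column]
    pick = min if lowest else max
    return pick(rows, key=lambda row: row[index])


if __name__ == "__main__":
    print("lowest validation loss:", best_row(ROWS, "val_loss", lowest=True))
    print("highest Rouge1:", best_row(ROWS, "rouge1"))
    print("highest Rouge2:", best_row(ROWS, "rouge2"))
    print("highest Rougel:", best_row(ROWS, "rougel"))
```

In this table the lowest validation loss and the highest ROUGE scores fall on different steps, so the choice of selection metric changes which checkpoint counts as best.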
+### Framework versions
+- Transformers 4.26.1
+- Pytorch 1.13.1+cu116
+- Datasets 2.10.1
+- Tokenizers 0.13.2