---
license: apache-2.0
tags:
- summarization
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-small-text-sum-1
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mt5-small-text-sum-1

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
It achieves the following results on the evaluation set (these match the step 9000 row, epoch 23.2, of the training results table below):
- Loss: 2.3715
- Rouge1: 20.75
- Rouge2: 6.54
- Rougel: 20.33
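
For readers unfamiliar with the metric, the sketch below shows roughly what a ROUGE-1 score measures: the F1 of clipped unigram overlap between a generated summary and a reference. It is an illustration only. The function name `rouge1_f1`, the whitespace tokenization, and the scaling to a percentage are assumptions; the card does not say which ROUGE implementation or tokenizer produced the numbers above, so this sketch will not reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two strings, returned as a percentage.

    Assumption: simple lowercase whitespace tokenization, which is cruder
    than the tokenizers used by real ROUGE implementations.
    """
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Five of the six unigrams match on each side, so precision = recall = 5/6.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(f"ROUGE-1 F1: {score:.2f}")
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L replaces n-gram overlap with the longest common subsequence.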

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
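
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes a linearly decaying learning rate. Several values are assumptions rather than facts from this card: no warmup steps are listed, so zero warmup is assumed; `STEPS_PER_EPOCH = 388` is inferred from the training results table (500 steps correspond to about 1.29 epochs); and the helper `linear_lr` is a hypothetical name, not a Transformers API. Note also that the logged results stop near epoch 29.64 although `num_epochs` is 40, so the full schedule may not have been run.

```python
LEARNING_RATE = 1e-4    # from the hyperparameter list
NUM_EPOCHS = 40         # from the hyperparameter list
STEPS_PER_EPOCH = 388   # assumption: inferred from 500 steps ~ 1.29 epochs


def linear_lr(step: int, total_steps: int, base_lr: float = LEARNING_RATE) -> float:
    """Learning rate after `step` updates under linear decay to zero (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)


total_steps = NUM_EPOCHS * STEPS_PER_EPOCH

for step in (0, 500, 9000, 11500, total_steps):
    print(f"step {step:>5}: lr = {linear_lr(step, total_steps):.2e}")
```

Under these assumptions the learning rate at the last logged step (11500) would still be roughly a quarter of its initial value.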

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|
| 4.936         | 1.29  | 500   | 2.6226          | 15.38  | 5.14   | 15.22  |
| 3.1573        | 2.58  | 1000  | 2.5081          | 18.02  | 5.53   | 17.8   |
| 2.9258        | 3.87  | 1500  | 2.4499          | 17.19  | 5.3    | 17.0   |
| 2.7786        | 5.15  | 2000  | 2.4264          | 18.17  | 5.02   | 17.99  |
| 2.6786        | 6.44  | 2500  | 2.4088          | 17.98  | 5.48   | 17.6   |
| 2.5824        | 7.73  | 3000  | 2.3909          | 19.43  | 6.32   | 19.07  |
| 2.5261        | 9.02  | 3500  | 2.3691          | 19.06  | 5.94   | 18.76  |
| 2.4372        | 10.31 | 4000  | 2.3580          | 19.76  | 6.37   | 19.49  |
| 2.3727        | 11.6  | 4500  | 2.3595          | 19.96  | 6.52   | 19.68  |
| 2.3488        | 12.89 | 5000  | 2.3580          | 19.63  | 6.14   | 19.31  |
| 2.2868        | 14.18 | 5500  | 2.3595          | 19.93  | 6.4    | 19.72  |
| 2.2268        | 15.46 | 6000  | 2.3632          | 19.95  | 6.13   | 19.55  |
| 2.2081        | 16.75 | 6500  | 2.3631          | 20.47  | 6.34   | 20.1   |
| 2.1583        | 18.04 | 7000  | 2.3562          | 20.04  | 6.13   | 19.71  |
| 2.1178        | 19.33 | 7500  | 2.3615          | 19.55  | 5.8    | 19.1   |
| 2.0904        | 20.62 | 8000  | 2.3549          | 20.37  | 6.6    | 20.05  |
| 2.0697        | 21.91 | 8500  | 2.3859          | 20.53  | 6.64   | 20.22  |
| 2.0256        | 23.2  | 9000  | 2.3715          | 20.75  | 6.54   | 20.33  |
| 2.0011        | 24.48 | 9500  | 2.3713          | 20.55  | 6.72   | 20.25  |
| 1.9899        | 25.77 | 10000 | 2.3582          | 19.82  | 5.82   | 19.4   |
| 1.965         | 27.06 | 10500 | 2.3789          | 20.48  | 5.8    | 20.23  |
| 1.9518        | 28.35 | 11000 | 2.3822          | 20.03  | 6.07   | 19.67  |
| 1.9089        | 29.64 | 11500 | 2.3743          | 19.62  | 6.1    | 19.3   |
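
The table can also be scanned programmatically to see which checkpoint is best under each criterion. The sketch below copies the step, validation loss, and ROUGE columns from the table above; the variable names are illustrative, and the card does not state which criterion was actually used to choose the reported checkpoint.

```python
# (step, validation_loss, rouge1, rouge2, rougeL), copied from the table above.
ROWS = [
    (500, 2.6226, 15.38, 5.14, 15.22),
    (1000, 2.5081, 18.02, 5.53, 17.8),
    (1500, 2.4499, 17.19, 5.3, 17.0),
    (2000, 2.4264, 18.17, 5.02, 17.99),
    (2500, 2.4088, 17.98, 5.48, 17.6),
    (3000, 2.3909, 19.43, 6.32, 19.07),
    (3500, 2.3691, 19.06, 5.94, 18.76),
    (4000, 2.3580, 19.76, 6.37, 19.49),
    (4500, 2.3595, 19.96, 6.52, 19.68),
    (5000, 2.3580, 19.63, 6.14, 19.31),
    (5500, 2.3595, 19.93, 6.4, 19.72),
    (6000, 2.3632, 19.95, 6.13, 19.55),
    (6500, 2.3631, 20.47, 6.34, 20.1),
    (7000, 2.3562, 20.04, 6.13, 19.71),
    (7500, 2.3615, 19.55, 5.8, 19.1),
    (8000, 2.3549, 20.37, 6.6, 20.05),
    (8500, 2.3859, 20.53, 6.64, 20.22),
    (9000, 2.3715, 20.75, 6.54, 20.33),
    (9500, 2.3713, 20.55, 6.72, 20.25),
    (10000, 2.3582, 19.82, 5.82, 19.4),
    (10500, 2.3789, 20.48, 5.8, 20.23),
    (11000, 2.3822, 20.03, 6.07, 19.67),
    (11500, 2.3743, 19.62, 6.1, 19.3),
]

# Lowest validation loss versus highest ROUGE-1 can point at different checkpoints.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest ROUGE-1:", best_by_rouge1)
```

Validation loss bottoms out at step 8000 (2.3549), while ROUGE-1 peaks at step 9000 (20.75), which is the row whose numbers match the evaluation results reported at the top of this card. After roughly step 4000 the validation loss is essentially flat while training loss keeps falling.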


### Framework versions

- Transformers 4.26.1
- Pytorch 1.13.1+cu116
- Datasets 2.10.1
- Tokenizers 0.13.2