---
library_name: transformers
license: apache-2.0
base_model: google/mt5-small
tags:
- summarization
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-small
  results: []
datasets:
- srvmishra832/multilingual-amazon-reviews-6-languages
language:
- en
- de
---


# Amazon_MultiLingual_Review_Summarization_with_google_mT5_small

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on a multilingual Amazon reviews dataset.
It achieves the following results on the evaluation set (an evaluation sketch follows the list):
- Loss: 2.9368
- Model Preparation Time: 0.0038
- Rouge1: 16.1955
- Rouge2: 8.1292
- Rougel: 15.9218
- Rougelsum: 15.9516
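
For reference, here is a minimal sketch of how ROUGE scores like those above can be computed with the `evaluate` library. The example strings are placeholders, not outputs of this model; the `evaluate` usage itself is standard.

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder prediction/reference pair, not actual model output.
predictions = ["great keyboard for the price"]
references = ["great keyboard, well worth the price"]

scores = rouge.compute(predictions=predictions, references=references)

# Scores are fractions in [0, 1]; the table in this card reports them x100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```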

## Model description

[google/mt5-small](https://huggingface.co/google/mt5-small) is the small checkpoint of mT5, a multilingual variant of T5 pretrained on the mC4 corpus covering 101 languages.

## Intended uses & limitations

This model is intended for multilingual product review summarization. Supported languages: English and German. A minimal usage sketch is shown below.
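
A minimal inference sketch using the `transformers` summarization pipeline. The model id below is a placeholder; substitute the actual Hub repository id for this checkpoint. The generation lengths are illustrative, chosen for short title-style summaries.

```python
from transformers import pipeline

# Placeholder repository id; replace with this checkpoint's actual Hub id.
summarizer = pipeline(
    "summarization",
    model="your-username/mt5-small-amazon-review-summarization",
)

review = (
    "I bought this keyboard a month ago. The keys feel great and the battery "
    "lasts much longer than my previous one. Shipping was fast too."
)

# Short min/max lengths because the targets are title-style summaries.
print(summarizer(review, max_length=30, min_length=5)[0]["summary_text"])
```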

## Training and evaluation data

The original multilingual Amazon product reviews dataset on the Hugging Face Hub is defunct, so we use the version available on [Kaggle](https://www.kaggle.com/datasets/mexwell/amazon-reviews-multi).

The original dataset covers six languages: English, German, French, Spanish, Japanese, and Chinese.

Each language has 20,000 training samples, 5,000 validation samples, and 5,000 testing samples.

We upload this dataset to the Hugging Face Hub at [srvmishra832/multilingual-amazon-reviews-6-languages](https://huggingface.co/datasets/srvmishra832/multilingual-amazon-reviews-6-languages).

Here, we select only the English and German reviews in the `pc` and `electronics` product categories.

We use the review titles as summaries, and to prevent the model from generating very short summaries, we filter out examples with extremely short review titles.

Finally, we downsample the resulting dataset so that training is feasible on a Google Colab T4 GPU in a reasonable amount of time.

The final downsampled and concatenated dataset contains 8,000 training samples, 452 validation samples, and 422 test samples. A sketch of this preparation is shown below.
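
A minimal sketch of the preparation steps, assuming the dataset keeps the original `amazon_reviews_multi` column names (`language`, `product_category`, `review_title`); the title-length threshold is an illustrative assumption, not the exact value used.

```python
from datasets import load_dataset

ds = load_dataset("srvmishra832/multilingual-amazon-reviews-6-languages")

def keep(example):
    # Keep English/German reviews in the two product categories used here,
    # and drop extremely short titles (threshold below is assumed).
    return (
        example["language"] in {"en", "de"}
        and example["product_category"] in {"pc", "electronics"}
        and len(example["review_title"].split()) > 2
    )

filtered = ds.filter(keep)

# Downsample the training split so fine-tuning fits on a Colab T4 GPU.
train = filtered["train"].shuffle(seed=42).select(range(8000))
```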

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
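
A sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`; the argument names are the standard `transformers` ones, and the output directory is a placeholder.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned",   # placeholder output directory
    learning_rate=5.6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",                # AdamW with default betas/epsilon
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,         # generate summaries so ROUGE can be computed
)
```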

### Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:-------:|:------:|:-------:|:---------:|
| 9.0889        | 1.0   | 500  | 3.4117          | 0.0038                 | 12.541  | 5.1023 | 11.9039 | 11.8749   |
| 4.3977        | 2.0   | 1000 | 3.1900          | 0.0038                 | 15.342  | 6.747  | 14.9223 | 14.8598   |
| 3.9595        | 3.0   | 1500 | 3.0817          | 0.0038                 | 15.3976 | 6.2063 | 15.0635 | 15.069    |
| 3.7525        | 4.0   | 2000 | 3.0560          | 0.0038                 | 15.7991 | 6.8536 | 15.4657 | 15.5263   |
| 3.6191        | 5.0   | 2500 | 3.0048          | 0.0038                 | 16.3791 | 7.3671 | 16.0817 | 16.059    |
| 3.5155        | 6.0   | 3000 | 2.9779          | 0.0038                 | 16.2311 | 7.5629 | 15.7492 | 15.758    |
| 3.4497        | 7.0   | 3500 | 2.9663          | 0.0038                 | 16.2554 | 8.1464 | 15.9499 | 15.9152   |
| 3.3889        | 8.0   | 4000 | 2.9438          | 0.0038                 | 16.5764 | 8.3698 | 16.3225 | 16.2848   |
| 3.3656        | 9.0   | 4500 | 2.9365          | 0.0038                 | 16.1416 | 8.0266 | 15.8921 | 15.8913   |
| 3.3562        | 10.0  | 5000 | 2.9368          | 0.0038                 | 16.1955 | 8.1292 | 15.9218 | 15.9516   |


### Framework versions

- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1