mqy/mt5-small-text-sum-3

	---
	license: apache-2.0
	tags:
	- summarization
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: mt5-small-text-sum-3
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mt5-small-text-sum-3

	This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
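	It achieves the following results on the evaluation set. These figures match the checkpoint at step 5000 (epoch 16.08), which has the lowest validation loss in the training results table:
	- Loss: 2.3392
	- Rouge1: 21.71
	- Rouge2: 6.65
	- RougeL: 21.31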
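
The card does not include a usage snippet, so the following is a minimal sketch of how the checkpoint might be loaded for inference. It assumes the repository id is `mqy/mt5-small-text-sum-3` (taken from the page header), that the checkpoint works with the standard `summarization` pipeline from Transformers, and that `sentencepiece` may need to be installed for the mT5 tokenizer. The helper name `summarize` and the `max_length=64` default are illustrative choices, not values from the card.

```python
# Assumption: repo id as shown in the page header of this model card.
MODEL_ID = "mqy/mt5-small-text-sum-3"


def summarize(text, max_length=64):
    """Return a summary of `text` using the fine-tuned checkpoint.

    The first call downloads the model weights from the Hugging Face Hub.
    `max_length` is an illustrative default, not a value from the card.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the tokenizer's maximum length.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` on a string returns the generated summary as a string. Because the training data is listed as unknown, the output quality on any particular language or domain should be checked before relying on it.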

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 10
	- eval_batch_size: 10
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 40
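
As a rough guide to reproducing this setup, the hyperparameters listed above could be mapped onto `Seq2SeqTrainingArguments` as sketched below. This is a reconstruction, not the author's script: it assumes the reported batch sizes are per-device values, and `predict_with_generate=True` is an assumption added because ROUGE is normally computed on generated text. The names `TRAINING_KWARGS` and `build_training_args` are hypothetical.

```python
# Values copied from the hyperparameter list in this card, keyed by the
# corresponding Seq2SeqTrainingArguments parameter names.
TRAINING_KWARGS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 10,  # assumes a single device
    "per_device_eval_batch_size": 10,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
    "predict_with_generate": True,  # assumption: not stated in the card
}


def build_training_args(output_dir="mt5-small-text-sum-3"):
    """Build training arguments from the card's hyperparameters."""
    # Imported lazily so the dict above is usable without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

The training results table shows an evaluation every 500 steps. The evaluation schedule is left out of the sketch because the card does not list it among the hyperparameters.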

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|
	| 4.6563        | 1.61  | 500  | 2.5975          | 16.78  | 5.15   | 16.64  |
	| 3.1112        | 3.22  | 1000 | 2.4856          | 17.05  | 5.31   | 16.8   |
	| 2.876         | 4.82  | 1500 | 2.4217          | 18.1   | 5.36   | 17.85  |
	| 2.7557        | 6.43  | 2000 | 2.4423          | 18.65  | 5.76   | 18.27  |
	| 2.6327        | 8.04  | 2500 | 2.4024          | 19.44  | 6.02   | 19.16  |
	| 2.5444        | 9.65  | 3000 | 2.3581          | 18.76  | 5.58   | 18.4   |
	| 2.4373        | 11.25 | 3500 | 2.3654          | 19.87  | 6.48   | 19.43  |
	| 2.4058        | 12.86 | 4000 | 2.3767          | 19.87  | 5.96   | 19.43  |
	| 2.3404        | 14.47 | 4500 | 2.3602          | 20.01  | 5.94   | 19.64  |
	| 2.2882        | 16.08 | 5000 | 2.3392          | 21.71  | 6.65   | 21.31  |
	| 2.2263        | 17.68 | 5500 | 2.3520          | 20.31  | 6.3    | 20.04  |
	| 2.1948        | 19.29 | 6000 | 2.3699          | 21.2   | 6.84   | 20.81  |
	| 2.154         | 20.9  | 6500 | 2.3472          | 20.39  | 5.82   | 19.94  |
	| 2.1218        | 22.51 | 7000 | 2.3679          | 20.07  | 6.38   | 19.69  |
	| 2.073         | 24.12 | 7500 | 2.3457          | 19.7   | 5.8    | 19.2   |
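
The epoch and step columns above allow a back-of-the-envelope estimate of the run's scale. The sketch below derives the steps per epoch, an approximate training set size, and the step count a full 40-epoch run would need. It assumes a single device with no gradient accumulation and a linear schedule with no warmup, none of which the card states explicitly. All function and variable names are illustrative.

```python
# (epoch, step) pairs copied from the training results table.
LOGGED = [
    (1.61, 500), (3.22, 1000), (4.82, 1500), (6.43, 2000), (8.04, 2500),
    (9.65, 3000), (11.25, 3500), (12.86, 4000), (14.47, 4500), (16.08, 5000),
    (17.68, 5500), (19.29, 6000), (20.9, 6500), (22.51, 7000), (24.12, 7500),
]
TRAIN_BATCH_SIZE = 10   # from the hyperparameter list
NUM_EPOCHS = 40         # from the hyperparameter list
BASE_LR = 1e-4          # from the hyperparameter list

# The last row has the smallest relative rounding error in the epoch column.
last_epoch, last_step = LOGGED[-1]
steps_per_epoch = round(last_step / last_epoch)        # 311

# Upper bound on the training set size; the final batch may be partial.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Steps a complete 40-epoch run would take, and the share that was logged.
planned_steps = steps_per_epoch * NUM_EPOCHS           # 12440
fraction_logged = last_step / planned_steps


def linear_lr(step, total_steps=planned_steps, base_lr=BASE_LR):
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"approx. training examples: {approx_train_examples}")
    print(f"planned steps: {planned_steps}, logged: {fraction_logged:.0%}")
    print(f"lr at step 5000: {linear_lr(5000):.2e}")
```

Under these assumptions the training set holds roughly 3,100 examples, and the logged results cover about 60% of the configured 40 epochs. The card does not say whether training stopped early or the table was truncated.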


	### Framework versions

	- Transformers 4.26.1
	- PyTorch 1.13.1+cu116
	- Datasets 2.10.1
	- Tokenizers 0.13.2
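
To check a local environment against the versions listed above, a small comparison along the following lines may help. It is a sketch using only the standard library. The PyPI distribution names (`torch` for PyTorch in particular) and the helper name `compare_versions` are assumptions, and an exact string match may be stricter than needed, since other CUDA builds or patch releases could work as well.

```python
from importlib import metadata

# Versions copied from this card, keyed by the assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "transformers": "4.26.1",
    "torch": "1.13.1+cu116",
    "datasets": "2.10.1",
    "tokenizers": "0.13.2",
}


def compare_versions(expected=EXPECTED_VERSIONS):
    """Map each package to (expected, installed, matches).

    `installed` is None when the package is not installed.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report
```

Iterating over `compare_versions().items()` gives one tuple per package, which can be printed or used to fail fast before a training run.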