metadata
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
- simplification
task_categories:
- text2text-generation
task_ids:
- text-simplification
language:
- nl
datasets:
- BramVanroy/chatgpt-dutch-simplification
metrics:
- rouge
- sari
ul2-small-dutch-simplification-mai-2023
This model is intended to simplify Dutch sentences.
This model is a fine-tuned version of yhavinga/ul2-small-dutch on the BramVanroy/chatgpt-dutch-simplification dataset.
The model was created as part of Charlotte Van de Velde's master's thesis in the Master of Science in Artificial Intelligence (MAI) at KU Leuven in 2023. Charlotte created the dataset; Bram trained the model.
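As a rough illustration of how a fine-tuned sequence-to-sequence model like this one might be called, here is a minimal sketch using the Transformers auto classes. The repository id in MODEL_ID is an assumption (the card only states the model name), the helper names load and simplify are hypothetical, and no task prefix is added to the input because the card does not mention one.

```python
from functools import lru_cache

# Assumed Hub repository id; the card itself only gives the model name.
MODEL_ID = "BramVanroy/ul2-small-dutch-simplification-mai-2023"


@lru_cache(maxsize=1)
def load(model_id: str = MODEL_ID):
    """Load tokenizer and model once (hypothetical helper)."""
    # Imported lazily so that importing this module stays cheap.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def simplify(sentence: str, max_new_tokens: int = 64) -> str:
    """Generate a simplified version of one Dutch sentence."""
    tokenizer, model = load()
    # Assumption: the raw sentence is passed as-is, without a task prefix.
    inputs = tokenizer(sentence, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling simplify with a Dutch sentence downloads the weights on first use and returns the generated text. The generation settings here are defaults chosen for illustration, not the ones used for the reported scores.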
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0006370158604635734
- train_batch_size: 20
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 37
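To make the linear scheduler setting concrete, the following sketch computes the learning rate at a given optimizer step under a linear decay to zero. The warmup length and the total number of steps are not stated in this card, so warmup_steps defaults to 0 and total_steps is a hypothetical input; this shows the general shape of such a schedule, not the exact values of the training run.

```python
INITIAL_LR = 0.0006370158604635734  # learning_rate from the list above


def linear_lr(step: int, total_steps: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to 0.

    Assumptions: `total_steps` and `warmup_steps` are illustrative inputs;
    the card does not report them.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to INITIAL_LR during warmup.
        return INITIAL_LR * (step / max(1, warmup_steps))
    # Decay linearly from INITIAL_LR to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return INITIAL_LR * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Hypothetical run of 1000 optimizer steps without warmup.
    for step in (0, 250, 500, 750, 1000):
        print(step, linear_lr(step, total_steps=1000))
```

With no warmup, the rate starts at the configured value, reaches half of it at the midpoint, and hits zero at the final step.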
Training results
Metrics prefixed with eval are computed on the evaluation set; metrics prefixed with predict are computed on the test set.
{
"eval_gen_len": 21.555555555555557,
"eval_loss": 3.2290523052215576,
"eval_rouge1": 40.9663,
"eval_rouge2": 18.499,
"eval_rougeL": 34.9342,
"eval_rougeLsum": 34.9752,
"eval_sari": 52.4509,
"predict_gen_len": 21.796875,
"predict_loss": 3.063812494277954,
"predict_rouge1": 39.6138,
"predict_rouge2": 17.1242,
"predict_rougeL": 35.4629,
"predict_rougeLsum": 35.3679,
"predict_sari": 51.7538
}
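To compare the two splits side by side, the results above can be parsed and the per-metric gap computed. This is a small sketch using only the standard library; the function name split_gap is hypothetical, and the numbers are copied from the results block above.

```python
import json

# Values copied verbatim from the results block in this card.
RESULTS_JSON = """
{
  "eval_gen_len": 21.555555555555557,
  "eval_loss": 3.2290523052215576,
  "eval_rouge1": 40.9663,
  "eval_rouge2": 18.499,
  "eval_rougeL": 34.9342,
  "eval_rougeLsum": 34.9752,
  "eval_sari": 52.4509,
  "predict_gen_len": 21.796875,
  "predict_loss": 3.063812494277954,
  "predict_rouge1": 39.6138,
  "predict_rouge2": 17.1242,
  "predict_rougeL": 35.4629,
  "predict_rougeLsum": 35.3679,
  "predict_sari": 51.7538
}
"""


def split_gap(results: dict) -> dict:
    """Return eval minus predict for every metric present in both splits."""
    gaps = {}
    for key, eval_value in results.items():
        if not key.startswith("eval_"):
            continue
        metric = key[len("eval_"):]
        predict_key = f"predict_{metric}"
        if predict_key in results:
            gaps[metric] = round(eval_value - results[predict_key], 4)
    return gaps


results = json.loads(RESULTS_JSON)
gaps = split_gap(results)
for metric, gap in gaps.items():
    print(f"{metric:>10}: eval - predict = {gap:+.4f}")
```

A positive gap means the score on the evaluation set is higher than on the test set. For loss, lower is better, so the sign reads the other way around.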
Framework versions
- Transformers 4.29.2
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3