metadata
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - bleu
model-index:
  - name: duo-predict-gpt2-medium-wikitext
    results: []

duo-predict-gpt2-medium-wikitext

This model is a fine-tuned version of a base model that the card does not name, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5713
  • Accuracy: 0.0073
  • Perplexity: 4.8128
  • Bleu: 1.0
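
The reported perplexity appears to be the exponential of the evaluation loss. The sketch below checks that relationship against two loss/perplexity pairs from this card, assuming the loss is a mean per-token cross-entropy in nats; the function name `perplexity_from_loss` is illustrative and not part of the training code.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# (loss, perplexity) pairs copied from this card: final and first evaluation.
reported = [(1.5713, 4.8128), (3.7041, 40.6115)]

for loss, ppl in reported:
    computed = perplexity_from_loss(loss)
    print(f"loss={loss:.4f}  exp(loss)={computed:.4f}  reported={ppl:.4f}")
```

Small differences in the last digit are expected, because the card rounds the loss to four decimal places before it is exponentiated here.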

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
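
To make the scheduler settings above concrete, here is a minimal sketch of a linear schedule with warmup: the learning rate rises linearly from 0 to the peak over the first 10% of steps, then decays linearly to 0. It is a plain-Python approximation, not the training code itself. The total step count of 17815 is an assumption, estimated from the logged step/epoch ratio (about 3563 steps per epoch over 5 epochs), and the function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 1e-4,
                       warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: about 3563 steps per epoch * 5 epochs, estimated from the log.
TOTAL_STEPS = 17815

for step in (0, 500, 1781, 10000, 17500):
    print(f"step {step:>5}: lr = {linear_schedule_lr(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the peak learning rate of 0.0001 is reached around step 1781, so every evaluation from step 2000 onward was logged during the decay phase.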

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Accuracy | Perplexity | Bleu   |
|---------------|--------|-------|-----------------|----------|------------|--------|
| 7.6316        | 0.1403 | 500   | 3.7041          | 0.0073   | 40.6115    | 1.0    |
| 6.4879        | 0.2807 | 1000  | 3.1196          | 0.0073   | 22.6384    | 0.9995 |
| 5.3189        | 0.4210 | 1500  | 2.5976          | 0.0073   | 13.4321    | 1.0    |
| 4.7557        | 0.5613 | 2000  | 2.3369          | 0.0073   | 10.3487    | 1.0    |
| 4.351         | 0.7017 | 2500  | 2.1234          | 0.0073   | 8.3592     | 1.0    |
| 4.0701        | 0.8420 | 3000  | 2.0021          | 0.0073   | 7.4045     | 1.0    |
| 3.9112        | 0.9823 | 3500  | 1.9280          | 0.0073   | 6.8756     | 1.0    |
| 3.7775        | 1.1226 | 4000  | 1.8769          | 0.0073   | 6.5331     | 1.0    |
| 3.703         | 1.2630 | 4500  | 1.8369          | 0.0073   | 6.2768     | 1.0    |
| 3.6443        | 1.4033 | 5000  | 1.8061          | 0.0073   | 6.0866     | 1.0    |
| 3.5659        | 1.5436 | 5500  | 1.7789          | 0.0073   | 5.9232     | 1.0    |
| 3.5303        | 1.6840 | 6000  | 1.7552          | 0.0073   | 5.7848     | 1.0    |
| 3.4926        | 1.8243 | 6500  | 1.7348          | 0.0073   | 5.6676     | 1.0    |
| 3.464         | 1.9646 | 7000  | 1.7167          | 0.0073   | 5.5660     | 1.0    |
| 3.3432        | 2.1050 | 7500  | 1.7016          | 0.0073   | 5.4826     | 1.0    |
| 3.3215        | 2.2453 | 8000  | 1.6883          | 0.0073   | 5.4104     | 1.0    |
| 3.3122        | 2.3856 | 8500  | 1.6768          | 0.0073   | 5.3483     | 1.0    |
| 3.2836        | 2.5260 | 9000  | 1.6651          | 0.0073   | 5.2860     | 1.0    |
| 3.2582        | 2.6663 | 9500  | 1.6541          | 0.0073   | 5.2281     | 1.0    |
| 3.2387        | 2.8066 | 10000 | 1.6434          | 0.0073   | 5.1726     | 1.0    |
| 3.223         | 2.9470 | 10500 | 1.6338          | 0.0073   | 5.1232     | 1.0    |
| 3.126         | 3.0873 | 11000 | 1.6289          | 0.0073   | 5.0984     | 1.0    |
| 3.1149        | 3.2276 | 11500 | 1.6212          | 0.0073   | 5.0590     | 1.0    |
| 3.1048        | 3.3679 | 12000 | 1.6149          | 0.0073   | 5.0276     | 1.0    |
| 3.0966        | 3.5083 | 12500 | 1.6074          | 0.0073   | 4.9897     | 1.0    |
| 3.0904        | 3.6486 | 13000 | 1.6013          | 0.0073   | 4.9597     | 1.0    |
| 3.0979        | 3.7889 | 13500 | 1.5959          | 0.0073   | 4.9326     | 1.0    |
| 3.0772        | 3.9293 | 14000 | 1.5907          | 0.0073   | 4.9074     | 1.0    |
| 2.9795        | 4.0696 | 14500 | 1.5887          | 0.0073   | 4.8975     | 1.0    |
| 2.9807        | 4.2099 | 15000 | 1.5847          | 0.0073   | 4.8779     | 1.0    |
| 2.976         | 4.3503 | 15500 | 1.5817          | 0.0073   | 4.8634     | 1.0    |
| 2.9795        | 4.4906 | 16000 | 1.5778          | 0.0073   | 4.8444     | 1.0    |
| 2.9594        | 4.6309 | 16500 | 1.5754          | 0.0073   | 4.8328     | 1.0    |
| 2.9698        | 4.7713 | 17000 | 1.5731          | 0.0073   | 4.8217     | 1.0    |
| 2.9604        | 4.9116 | 17500 | 1.5713          | 0.0073   | 4.8128     | 1.0    |

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
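
When reproducing the run, it may help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the helper name `compare_versions` is hypothetical, and it assumes the card's "Pytorch" entry corresponds to the `torch` distribution on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from this card; "Pytorch" is installed as the `torch` package.
CARD_VERSIONS = {
    "transformers": "4.49.0",
    "torch": "2.6.0+cu124",
    "datasets": "3.3.2",
    "tokenizers": "0.21.0",
}


def compare_versions(expected: dict) -> dict:
    """Map each package to (expected, installed or None, exact match?)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

An exact string match is deliberately strict; a different CUDA build suffix such as `+cu121` will be reported as a difference even if the PyTorch release is the same.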