tags:
  - generated_from_trainer
model-index:
  - name: gpt2
    results: []

gpt2

This model was trained from scratch on an unknown dataset. At the last logged checkpoint (epoch 22, step 198), it achieves the following result on the evaluation set:

  • Loss: 0.9972
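For readers who prefer perplexity, the loss above can be converted with a one-line formula. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card. The function name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Evaluation loss reported in this card.
eval_loss = 0.9972

# Assumption: the loss is a natural-log cross-entropy averaged over tokens.
print(f"perplexity ~ {perplexity(eval_loss):.2f}")  # prints "perplexity ~ 2.71"
```

If the loss was computed differently (for example, summed rather than averaged), this conversion does not apply.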

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
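To make the `linear` scheduler setting concrete, the sketch below computes the learning rate it would produce at a given optimizer step. It uses 9 steps per epoch, as implied by the training results table (step 9 at epoch 1.0), and assumes zero warmup steps, which this card does not state. The function `linear_lr` and the constants are illustrative names, not part of any library.

```python
BASE_LR = 5e-5          # learning_rate from the list above
NUM_EPOCHS = 30         # num_epochs from the list above
STEPS_PER_EPOCH = 9     # read from the results table: step 9 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 270 planned optimizer steps
WARMUP_STEPS = 0        # assumption: the card does not list any warmup


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 90, 198, 270):
    print(step, linear_lr(step))
```

Under these assumptions the rate at step 198, the last logged step, is 5e-05 × 72 / 270, roughly 1.33e-05.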

Training results

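| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| No log        | 1.0   | 9    | 1.4794          |
| No log        | 2.0   | 18   | 1.3341          |
| No log        | 3.0   | 27   | 1.2456          |
| No log        | 4.0   | 36   | 1.1770          |
| No log        | 5.0   | 45   | 1.1306          |
| No log        | 6.0   | 54   | 1.0913          |
| No log        | 7.0   | 63   | 1.0663          |
| No log        | 8.0   | 72   | 1.0500          |
| No log        | 9.0   | 81   | 1.0234          |
| No log        | 10.0  | 90   | 1.0236          |
| No log        | 11.0  | 99   | 1.0115          |
| No log        | 12.0  | 108  | 1.0051          |
| No log        | 13.0  | 117  | 1.0088          |
| No log        | 14.0  | 126  | 1.0020          |
| No log        | 15.0  | 135  | 1.0091          |
| No log        | 16.0  | 144  | 0.9978          |
| No log        | 17.0  | 153  | 0.9981          |
| No log        | 18.0  | 162  | 1.0026          |
| No log        | 19.0  | 171  | 0.9921          |
| No log        | 20.0  | 180  | 0.9976          |
| No log        | 21.0  | 189  | 0.9958          |
| No log        | 22.0  | 198  | 0.9972          |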
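The log above ends at epoch 22 although `num_epochs` is 30, and the card does not say why. The sketch below only summarizes the logged numbers: it finds the epoch with the lowest validation loss and counts how many logged epochs followed it. The helper names `best_epoch` and `epochs_since_best` are illustrative.

```python
# Validation loss per epoch, copied from the table above (epochs 1 to 22).
VAL_LOSS = [
    1.4794, 1.3341, 1.2456, 1.1770, 1.1306, 1.0913, 1.0663, 1.0500,
    1.0234, 1.0236, 1.0115, 1.0051, 1.0088, 1.0020, 1.0091, 0.9978,
    0.9981, 1.0026, 0.9921, 0.9976, 0.9958, 0.9972,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def epochs_since_best(losses):
    """Number of logged epochs after the best one."""
    epoch, _ = best_epoch(losses)
    return len(losses) - epoch


epoch, loss = best_epoch(VAL_LOSS)
print(f"best epoch: {epoch}, validation loss: {loss}")    # best epoch: 19, validation loss: 0.9921
print(f"epochs after best: {epochs_since_best(VAL_LOSS)}")  # epochs after best: 3
```

The lowest logged validation loss (0.9921 at epoch 19) is slightly below the final value of 0.9972 reported at the top of this card.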

Framework versions

  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
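When reproducing results, it can help to confirm that the local environment matches the versions listed above. The following is a minimal sketch using only the standard library. The distribution names (in particular `torch` for PyTorch) and the helper names `installed_version` and `find_mismatches` are assumptions made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.41.1",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def find_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each mismatching package to a (wanted, found) tuple."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in find_mismatches().items():
    print(f"{name}: expected {wanted}, found {found}")
```

An exact string match is deliberately strict: a CPU-only PyTorch build, for example, would report a version without the `+cu121` suffix and be flagged.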