Llama-3.1-8B-Instruct-PsyCourse-fold4

This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct on the course-train-fold4 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0344
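
This repository appears to hold a PEFT adapter on top of the base model (see the PEFT version under framework versions). Below is a minimal, hedged loading sketch; the adapter repo id `chchen/Llama-3.1-8B-Instruct-PsyCourse-fold4`, the `torch_dtype`/`device_map` settings, and the example prompt are assumptions for illustration, not part of the original card.

```python
# Minimal usage sketch (assumes this repo is a PEFT adapter for the base model;
# the example prompt is illustrative, not taken from the model card).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B-Instruct"
adapter_id = "chchen/Llama-3.1-8B-Instruct-PsyCourse-fold4"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

messages = [{"role": "user", "content": "Give a brief overview of operant conditioning."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```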

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an approximate TrainingArguments equivalent is sketched after the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
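
For reference, the settings above map roughly onto a Hugging Face TrainingArguments configuration. This is a sketch, not the exact launcher that was used; the output_dir is a placeholder.

```python
# Approximate reconstruction of the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.1-8b-instruct-psycourse-fold4",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,   # effective train batch size of 16
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```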

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5681        | 0.0763 | 50   | 0.3814          |
| 0.122         | 0.1527 | 100  | 0.0796          |
| 0.0541        | 0.2290 | 150  | 0.0592          |
| 0.0502        | 0.3053 | 200  | 0.0522          |
| 0.0666        | 0.3816 | 250  | 0.0539          |
| 0.046         | 0.4580 | 300  | 0.0493          |
| 0.0458        | 0.5343 | 350  | 0.0527          |
| 0.0448        | 0.6106 | 400  | 0.0488          |
| 0.0567        | 0.6870 | 450  | 0.0462          |
| 0.0358        | 0.7633 | 500  | 0.0410          |
| 0.0445        | 0.8396 | 550  | 0.0407          |
| 0.0462        | 0.9159 | 600  | 0.0407          |
| 0.0363        | 0.9923 | 650  | 0.0410          |
| 0.0343        | 1.0686 | 700  | 0.0370          |
| 0.0413        | 1.1449 | 750  | 0.0378          |
| 0.0322        | 1.2213 | 800  | 0.0398          |
| 0.0342        | 1.2976 | 850  | 0.0385          |
| 0.0337        | 1.3739 | 900  | 0.0436          |
| 0.0295        | 1.4502 | 950  | 0.0373          |
| 0.0267        | 1.5266 | 1000 | 0.0386          |
| 0.0287        | 1.6029 | 1050 | 0.0380          |
| 0.0504        | 1.6792 | 1100 | 0.0388          |
| 0.0317        | 1.7556 | 1150 | 0.0391          |
| 0.0448        | 1.8319 | 1200 | 0.0366          |
| 0.0278        | 1.9082 | 1250 | 0.0362          |
| 0.0347        | 1.9845 | 1300 | 0.0344          |
| 0.0201        | 2.0609 | 1350 | 0.0355          |
| 0.0238        | 2.1372 | 1400 | 0.0357          |
| 0.0299        | 2.2135 | 1450 | 0.0371          |
| 0.0155        | 2.2899 | 1500 | 0.0384          |
| 0.0157        | 2.3662 | 1550 | 0.0391          |
| 0.0222        | 2.4425 | 1600 | 0.0370          |
| 0.0245        | 2.5188 | 1650 | 0.0360          |
| 0.0206        | 2.5952 | 1700 | 0.0376          |
| 0.0198        | 2.6715 | 1750 | 0.0363          |
| 0.0209        | 2.7478 | 1800 | 0.0370          |
| 0.026         | 2.8242 | 1850 | 0.0362          |
| 0.0197        | 2.9005 | 1900 | 0.0358          |
| 0.0291        | 2.9768 | 1950 | 0.0355          |
| 0.0091        | 3.0531 | 2000 | 0.0416          |
| 0.0132        | 3.1295 | 2050 | 0.0421          |
| 0.0115        | 3.2058 | 2100 | 0.0443          |
| 0.0131        | 3.2821 | 2150 | 0.0459          |
| 0.0132        | 3.3585 | 2200 | 0.0409          |
| 0.0077        | 3.4348 | 2250 | 0.0445          |
| 0.0156        | 3.5111 | 2300 | 0.0444          |
| 0.0125        | 3.5874 | 2350 | 0.0480          |
| 0.0089        | 3.6638 | 2400 | 0.0499          |
| 0.0125        | 3.7401 | 2450 | 0.0467          |
| 0.0115        | 3.8164 | 2500 | 0.0447          |
| 0.0062        | 3.8928 | 2550 | 0.0449          |
| 0.0112        | 3.9691 | 2600 | 0.0462          |
| 0.005         | 4.0454 | 2650 | 0.0465          |
| 0.0065        | 4.1217 | 2700 | 0.0502          |
| 0.0021        | 4.1981 | 2750 | 0.0543          |
| 0.0033        | 4.2744 | 2800 | 0.0556          |
| 0.0068        | 4.3507 | 2850 | 0.0572          |
| 0.0015        | 4.4271 | 2900 | 0.0599          |
| 0.0036        | 4.5034 | 2950 | 0.0602          |
| 0.0027        | 4.5797 | 3000 | 0.0615          |
| 0.0013        | 4.6560 | 3050 | 0.0615          |
| 0.0056        | 4.7324 | 3100 | 0.0618          |
| 0.0028        | 4.8087 | 3150 | 0.0618          |
| 0.0044        | 4.8850 | 3200 | 0.0620          |
| 0.0061        | 4.9614 | 3250 | 0.0622          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3
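
For reproducibility, the installed packages can be checked against the versions listed above; this is a small sketch, with the expected version strings copied from the list.

```python
# Quick environment check against the framework versions listed above.
import peft, transformers, torch, datasets, tokenizers

expected = {
    "peft": "0.12.0",
    "transformers": "4.46.1",
    "torch": "2.5.1+cu124",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else "MISMATCH"
    print(f"{name}: installed {have}, card lists {want} [{status}]")
```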