Llama-3.1-8B-Instruct-PsyCourse-fold5

This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct on the course-train-fold5 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0336

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
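
For reference, below is a minimal sketch of an equivalent `transformers.TrainingArguments` configuration mirroring the hyperparameters above. The `output_dir` and anything not listed above are placeholders, not values taken from the original run.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the reported hyperparameters; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="Llama-3.1-8B-Instruct-PsyCourse-fold5",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective (total) train batch size: 1 x 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```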

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.6474        | 0.0758 | 50   | 0.4173          |
| 0.0891        | 0.1517 | 100  | 0.0932          |
| 0.0787        | 0.2275 | 150  | 0.0860          |
| 0.059         | 0.3033 | 200  | 0.0593          |
| 0.053         | 0.3791 | 250  | 0.0517          |
| 0.0489        | 0.4550 | 300  | 0.0463          |
| 0.0525        | 0.5308 | 350  | 0.0471          |
| 0.0651        | 0.6066 | 400  | 0.0472          |
| 0.0552        | 0.6825 | 450  | 0.0467          |
| 0.0317        | 0.7583 | 500  | 0.0416          |
| 0.0313        | 0.8341 | 550  | 0.0426          |
| 0.0441        | 0.9100 | 600  | 0.0429          |
| 0.0436        | 0.9858 | 650  | 0.0390          |
| 0.0324        | 1.0616 | 700  | 0.0417          |
| 0.0356        | 1.1374 | 750  | 0.0421          |
| 0.0368        | 1.2133 | 800  | 0.0378          |
| 0.0322        | 1.2891 | 850  | 0.0418          |
| 0.0242        | 1.3649 | 900  | 0.0404          |
| 0.0278        | 1.4408 | 950  | 0.0395          |
| 0.0391        | 1.5166 | 1000 | 0.0356          |
| 0.0294        | 1.5924 | 1050 | 0.0360          |
| 0.023         | 1.6682 | 1100 | 0.0359          |
| 0.0282        | 1.7441 | 1150 | 0.0372          |
| 0.0343        | 1.8199 | 1200 | 0.0359          |
| 0.0335        | 1.8957 | 1250 | 0.0339          |
| 0.0389        | 1.9716 | 1300 | 0.0357          |
| 0.0277        | 2.0474 | 1350 | 0.0351          |
| 0.0193        | 2.1232 | 1400 | 0.0343          |
| 0.0211        | 2.1991 | 1450 | 0.0354          |
| 0.0149        | 2.2749 | 1500 | 0.0352          |
| 0.0258        | 2.3507 | 1550 | 0.0337          |
| 0.0255        | 2.4265 | 1600 | 0.0359          |
| 0.014         | 2.5024 | 1650 | 0.0377          |
| 0.0265        | 2.5782 | 1700 | 0.0336          |
| 0.0211        | 2.6540 | 1750 | 0.0344          |
| 0.0278        | 2.7299 | 1800 | 0.0355          |
| 0.0253        | 2.8057 | 1850 | 0.0363          |
| 0.0178        | 2.8815 | 1900 | 0.0345          |
| 0.0302        | 2.9573 | 1950 | 0.0340          |
| 0.0091        | 3.0332 | 2000 | 0.0358          |
| 0.0072        | 3.1090 | 2050 | 0.0411          |
| 0.0126        | 3.1848 | 2100 | 0.0394          |
| 0.0108        | 3.2607 | 2150 | 0.0404          |
| 0.0101        | 3.3365 | 2200 | 0.0381          |
| 0.0077        | 3.4123 | 2250 | 0.0382          |
| 0.0096        | 3.4882 | 2300 | 0.0379          |
| 0.0077        | 3.5640 | 2350 | 0.0392          |
| 0.015         | 3.6398 | 2400 | 0.0381          |
| 0.0095        | 3.7156 | 2450 | 0.0401          |
| 0.0174        | 3.7915 | 2500 | 0.0395          |
| 0.0105        | 3.8673 | 2550 | 0.0393          |
| 0.014         | 3.9431 | 2600 | 0.0385          |
| 0.0086        | 4.0190 | 2650 | 0.0402          |
| 0.005         | 4.0948 | 2700 | 0.0439          |
| 0.0044        | 4.1706 | 2750 | 0.0487          |
| 0.0032        | 4.2464 | 2800 | 0.0490          |
| 0.0052        | 4.3223 | 2850 | 0.0489          |
| 0.0089        | 4.3981 | 2900 | 0.0493          |
| 0.0095        | 4.4739 | 2950 | 0.0490          |
| 0.0049        | 4.5498 | 3000 | 0.0487          |
| 0.0025        | 4.6256 | 3050 | 0.0492          |
| 0.0066        | 4.7014 | 3100 | 0.0498          |
| 0.0089        | 4.7773 | 3150 | 0.0500          |
| 0.0021        | 4.8531 | 3200 | 0.0500          |
| 0.0056        | 4.9289 | 3250 | 0.0499          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3
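
Because the PEFT version above indicates this repository ships an adapter rather than full model weights, a minimal usage sketch might look like the following. It assumes the adapter is published as chchen/Llama-3.1-8B-Instruct-PsyCourse-fold5 and is applied on top of the base model named above; the prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the fine-tuned adapter with PEFT.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base, "chchen/Llama-3.1-8B-Instruct-PsyCourse-fold5")

# Illustrative generation call.
prompt = "Explain the difference between classical and operant conditioning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```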