Built with Axolotl

f3efe9e5-0483-461a-a1f2-dda804f44bfc

This model is a fine-tuned version of llamafactory/tiny-random-Llama-3 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 11.6989
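
Since the framework versions below include PEFT, the checkpoint is published as an adapter rather than a full model. A minimal loading sketch, assuming the adapter repo id lesso06/f3efe9e5-0483-461a-a1f2-dda804f44bfc taken from this card, might look like:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "llamafactory/tiny-random-Llama-3"
adapter_id = "lesso06/f3efe9e5-0483-461a-a1f2-dda804f44bfc"  # assumed repo id for this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

# Attach the fine-tuned PEFT adapter on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```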

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.000206
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (bitsandbytes, OptimizerNames.ADAMW_BNB) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
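
The original Axolotl config is not included in the card. A rough transformers TrainingArguments equivalent of the settings above (output_dir is an assumption, and all other values are left at their defaults) would be:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs",          # assumed; not stated in the card
    learning_rate=2.06e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    seed=42,
    optim="adamw_bnb_8bit",          # bitsandbytes AdamW; betas/epsilon at defaults
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```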

Training results

Training Loss    Epoch     Step    Validation Loss
No log           0.0000       1    11.7603
11.7235          0.0016      50    11.7388
11.7004          0.0033     100    11.7237
11.6812          0.0049     150    11.7155
11.6724          0.0066     200    11.7087
11.6684          0.0082     250    11.7047
11.6699          0.0098     300    11.7018
11.6698          0.0115     350    11.7002
11.6597          0.0131     400    11.6993
11.6519          0.0148     450    11.6989
11.6638          0.0164     500    11.6989

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1