---
license: other
library_name: peft
tags:
  - llama-factory
  - lora
  - generated_from_trainer
base_model: /data1/model/llama2/meta-llama/Llama2-13b
model-index:
  - name: topical_chat_no_sys
    results: []
---

topical_chat_no_sys

This model is a LoRA adapter for /data1/model/llama2/meta-llama/Llama2-13b, fine-tuned with LLaMA-Factory on the topical_chat_no_sys dataset. It achieves the following result on the evaluation set:

  • Loss: 1.8941
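The adapter can be loaded on top of the base model with PEFT. The sketch below is illustrative only: the base-model and adapter paths are placeholders for your local copies, and the plain prompt (no system prompt, as the dataset name suggests) is an assumption rather than a documented prompt format.

```python
# Minimal sketch: load the base model and apply this LoRA adapter with PEFT.
# The paths below are placeholders; point them at your local checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "path/to/Llama2-13b"        # placeholder: base model directory
adapter_path = "path/to/topical_chat_no_sys"  # placeholder: this adapter's directory

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_path)
model.eval()

# Assumption: a plain conversational prompt with no system message.
prompt = "Let's talk about space exploration."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```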

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 20
  • num_epochs: 5.0
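As a rough guide, these settings map to the Hugging Face TrainingArguments sketched below. The actual run used LLaMA-Factory, so this is a reconstruction under assumptions: the optimizer is taken to be the framework-default AdamW (consistent with the listed betas and epsilon), the output directory is a placeholder, and the 100-step evaluation interval is inferred from the results table that follows.

```python
# Approximate reconstruction of the hyperparameters above as Transformers
# TrainingArguments; the actual run used LLaMA-Factory, so treat this as a sketch.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="topical_chat_no_sys",   # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,      # 2 GPUs -> total train batch size 8
    per_device_eval_batch_size=4,       # 2 GPUs -> total eval batch size 8
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    num_train_epochs=5.0,
    evaluation_strategy="steps",        # assumption: matches the 100-step eval interval below
    eval_steps=100,
    logging_steps=100,
    # Optimizer left at the default AdamW with betas=(0.9, 0.999), eps=1e-8.
)
```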

Training results

Training Loss   Epoch    Step   Validation Loss
2.1904          0.0472    100   2.1137
1.9627          0.0944    200   2.0589
2.0172          0.1416    300   2.0221
1.8965          0.1889    400   1.9968
1.9534          0.2361    500   1.9823
1.8621          0.2833    600   1.9679
1.9777          0.3305    700   1.9611
2.0865          0.3777    800   1.9544
1.9662          0.4249    900   1.9461
1.8352          0.4721   1000   1.9376
1.8973          0.5194   1100   1.9329
1.9688          0.5666   1200   1.9264
1.8383          0.6138   1300   1.9192
1.9032          0.6610   1400   1.9146
1.9295          0.7082   1500   1.9109
1.8207          0.7554   1600   1.9061
1.9119          0.8026   1700   1.9032
1.8392          0.8499   1800   1.9019
1.9610          0.8971   1900   1.8994
1.8913          0.9443   2000   1.8945
1.8187          0.9915   2100   1.8941
1.7296          1.0387   2200   1.9006
1.6184          1.0859   2300   1.9040
1.6973          1.1331   2400   1.9056

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.0
  • PyTorch 2.2.1
  • Datasets 2.18.0
  • Tokenizers 0.19.1