---
license: other
library_name: peft
tags:
- llama-factory
- lora
- generated_from_trainer
base_model: /data1/model/llama2/meta-llama/Llama2-13b
model-index:
- name: topical_chat_no_sys
  results: []
---
# topical_chat_no_sys
This model is a fine-tuned version of /data1/model/llama2/meta-llama/Llama2-13b on the topical_chat_no_sys dataset. It achieves the following results on the evaluation set:
- Loss: 1.8941
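For reference, a minimal sketch of loading this LoRA adapter on top of the base model with PEFT. The base-model path is the local one named in this card; the adapter path is a placeholder for wherever this repository's files live, and the prompt is only illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "/data1/model/llama2/meta-llama/Llama2-13b"  # base model, as listed in this card
adapter_path = "topical_chat_no_sys"                     # placeholder: local dir with this adapter

tokenizer = AutoTokenizer.from_pretrained(base_path)
base = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.float16, device_map="auto"
)
# Attach the LoRA adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_path)
model.eval()

inputs = tokenizer(
    "Tell me something interesting about space.", return_tensors="pt"
).to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```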
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch of this configuration follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 8
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- num_epochs: 5.0
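As referenced above, here is what this configuration looks like expressed as `transformers.TrainingArguments`. The card does not show the actual LLaMA-Factory invocation, so treat this as an approximation: argument names are the standard Trainer ones, the output directory is hypothetical, and the 100-step evaluation interval is inferred from the results table below.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="topical_chat_no_sys",  # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,     # 4 per device x 2 GPUs = total batch size 8
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    num_train_epochs=5.0,
    optim="adamw_torch",               # betas=(0.9, 0.999) and eps=1e-8 are the defaults
    evaluation_strategy="steps",       # inferred: eval every 100 steps, per the table
    eval_steps=100,
)
```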
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.1904        | 0.0472 | 100  | 2.1137          |
| 1.9627        | 0.0944 | 200  | 2.0589          |
| 2.0172        | 0.1416 | 300  | 2.0221          |
| 1.8965        | 0.1889 | 400  | 1.9968          |
| 1.9534        | 0.2361 | 500  | 1.9823          |
| 1.8621        | 0.2833 | 600  | 1.9679          |
| 1.9777        | 0.3305 | 700  | 1.9611          |
| 2.0865        | 0.3777 | 800  | 1.9544          |
| 1.9662        | 0.4249 | 900  | 1.9461          |
| 1.8352        | 0.4721 | 1000 | 1.9376          |
| 1.8973        | 0.5194 | 1100 | 1.9329          |
| 1.9688        | 0.5666 | 1200 | 1.9264          |
| 1.8383        | 0.6138 | 1300 | 1.9192          |
| 1.9032        | 0.6610 | 1400 | 1.9146          |
| 1.9295        | 0.7082 | 1500 | 1.9109          |
| 1.8207        | 0.7554 | 1600 | 1.9061          |
| 1.9119        | 0.8026 | 1700 | 1.9032          |
| 1.8392        | 0.8499 | 1800 | 1.9019          |
| 1.961         | 0.8971 | 1900 | 1.8994          |
| 1.8913        | 0.9443 | 2000 | 1.8945          |
| 1.8187        | 0.9915 | 2100 | 1.8941          |
| 1.7296        | 1.0387 | 2200 | 1.9006          |
| 1.6184        | 1.0859 | 2300 | 1.9040          |
| 1.6973        | 1.1331 | 2400 | 1.9056          |
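Validation loss reaches its minimum (1.8941) at step 2100 and drifts upward afterwards, which matches the evaluation loss reported at the top of this card. Assuming the evaluation loss is a mean token-level cross-entropy (the standard Trainer behavior; not stated explicitly in this card), perplexity follows directly:

```python
import math

# Perplexity is exp(mean cross-entropy loss), assuming the eval loss
# reported above is a mean token-level cross-entropy.
eval_loss = 1.8941
print(f"perplexity ~ {math.exp(eval_loss):.2f}")  # ~ 6.65
```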
### Framework versions
- PEFT 0.10.0
- Transformers 4.40.0
- Pytorch 2.2.1
- Datasets 2.18.0
- Tokenizers 0.19.1