a9f95564-8108-4018-b0a8-c1519cc5a6bb

This model is a fine-tuned version of unsloth/tinyllama-chat on an unspecified dataset (the card records it as None). Its validation loss on the evaluation set is logged per evaluation step in the training results table below.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.000203
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: OptimizerNames.ADAMW_BNB with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: constant
lr_scheduler_warmup_steps: 50
training_steps: 400
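
The relationship between the batch-size entries above can be checked with a little arithmetic. The following is a minimal sketch, not the training script: it assumes a single training device (the card does not state the device count), and the helper name `effective_batch_size` is hypothetical.

```python
# Hyperparameters copied from the list above.
train_batch_size = 32
gradient_accumulation_steps = 4
training_steps = 400

# Assumption: one training device. The card does not state this,
# but it is the value that reproduces total_train_batch_size = 128.
num_devices = 1


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device * accumulation * devices


total_train_batch_size = effective_batch_size(
    train_batch_size, gradient_accumulation_steps, num_devices
)

# Upper bound on examples processed over the whole run
# (partial batches at epoch boundaries would lower it slightly).
examples_seen = total_train_batch_size * training_steps

print(total_train_batch_size)  # 128
print(examples_seen)           # 51200
```

With these values, each optimizer step aggregates gradients from 4 micro-batches of 32 examples, which matches the reported total_train_batch_size of 128.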

Training results

Training Loss	Epoch	Step	Validation Loss
No log	0.0051	1	2.7134
0.2963	0.2528	50	0.1043
0.1056	0.5057	100	0.1043
0.105	0.7585	150	0.1033
0.1054	1.0126	200	0.1043
0.1045	1.2655	250	0.1060
0.1043	1.5183	300	0.1068
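
To read the log above more easily, the sketch below picks the logged evaluation step with the lowest validation loss and estimates the number of optimizer steps per epoch. It only uses the rows shown in this section (which stop at step 300, while training_steps is 400), so it says nothing about later steps. The names `eval_log` and `best_checkpoint` are hypothetical.

```python
# (step, epoch, validation_loss) rows copied from the table above.
eval_log = [
    (1, 0.0051, 2.7134),
    (50, 0.2528, 0.1043),
    (100, 0.5057, 0.1043),
    (150, 0.7585, 0.1033),
    (200, 1.0126, 0.1043),
    (250, 1.2655, 0.1060),
    (300, 1.5183, 0.1068),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


best_step, best_epoch, best_loss = best_checkpoint(eval_log)
print(best_step, best_loss)  # 150 0.1033

# Rough estimate of optimizer steps per epoch from the last logged row.
last_step, last_epoch, _ = eval_log[-1]
steps_per_epoch = last_step / last_epoch
print(round(steps_per_epoch, 1))
```

Among the logged rows, validation loss is lowest at step 150 and rises slightly afterwards, while training loss stays near 0.105. That pattern is consistent with the model having stopped improving on the evaluation set within the first epoch, although the differences are small.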