Built with Axolotl

See axolotl config

axolotl version: 0.4.1

adapter: qlora
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
bf16: auto
dataset_prepared_path: null
datasets:
- path: Taiel26/plm_2500_uniref
  type: alpaca
debug: null
deepspeed: null
early_stopping_patience: null
eval_sample_packing: false
evals_per_epoch: 4
flash_attention: true
fp16: null
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 4
gradient_checkpointing: true
group_by_length: false
learning_rate: 0.0002
load_in_4bit: true
load_in_8bit: false
local_rank: null
logging_steps: 1
lora_alpha: 16
lora_dropout: 0.05
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 32
lora_target_linear: true
lora_target_modules: null
lr_scheduler: cosine
micro_batch_size: 2
model_type: LlamaForCausalLM
num_epochs: 4
optimizer: paged_adamw_32bit
output_dir: ./outputs/qlora-out
pad_to_sequence_len: true
resume_from_checkpoint: null
sample_packing: true
saves_per_epoch: 1
sequence_len: 4096
special_tokens: null
strict: false
tf32: false
tokenizer_type: LlamaTokenizer
train_on_inputs: false
val_set_size: 0.05
wandb_entity: null
wandb_log_model: null
wandb_name: null
wandb_project: null
wandb_watch: null
warmup_steps: 10
weight_decay: 0.0
xformers_attention: null
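
For orientation, here is a minimal sketch of what this QLoRA setup corresponds to in plain transformers + peft. The r=32 / alpha=16 / dropout=0.05 adapter on all linear layers and the 4-bit base load come from the config above; the NF4 quantization type, double quantization, and bf16 compute dtype are assumptions based on common Axolotl defaults, not values stated in the config.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_id = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"

# 4-bit (QLoRA) quantization of the frozen base model.
# NF4 + double quantization + bf16 compute are assumed defaults.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA settings mirroring the config: r=32, alpha=16, dropout=0.05,
# adapters on all linear layers (lora_target_linear: true).
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Axolotl builds this setup itself from the YAML; the sketch only illustrates the adapter and quantization settings the config encodes.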

outputs/qlora-out

This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T on the Taiel26/plm_2500_uniref dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8586
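
A minimal usage sketch follows. It assumes the adapter weights are published as Taiel26/TinyLLama1.1B_PLM and that prompts follow the Alpaca template used during training (type: alpaca in the config); adjust the repository id and prompt to your setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
adapter_id = "Taiel26/TinyLLama1.1B_PLM"  # assumed adapter repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# Alpaca-style prompt, matching the dataset type declared in the config.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nYour instruction here\n\n### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```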

Model description

This is a QLoRA fine-tune of TinyLlama-1.1B: the 4-bit quantized base model is kept frozen and low-rank adapters (r=32, alpha=16, dropout 0.05) are trained on all linear layers, as specified in the Axolotl config above.

Intended uses & limitations

More information needed

Training and evaluation data

Training used the Taiel26/plm_2500_uniref dataset in Alpaca instruction format, with 5% of the examples held out as the evaluation set (val_set_size: 0.05).

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Paged AdamW (32-bit) with betas=(0.9, 0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 4
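
As a sketch only, the same hyperparameters expressed as transformers TrainingArguments (Axolotl constructs its own trainer from the YAML; bf16 is assumed here since the config sets bf16: auto):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs/qlora-out",
    learning_rate=2e-4,
    per_device_train_batch_size=2,   # micro_batch_size
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,   # effective train batch size: 2 * 4 = 8
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    weight_decay=0.0,
    optim="paged_adamw_32bit",
    gradient_checkpointing=True,
    bf16=True,                       # assumed: config has bf16: auto
    logging_steps=1,
    seed=42,
)
```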

Training results

Training Loss   Epoch    Step   Validation Loss
2.0919          0.0198      1   2.0800
1.5479          0.2574     13   1.5341
1.2083          0.5149     26   1.2245
1.0851          0.7723     39   1.0607
0.9432          1.0297     52   0.9755
0.9007          1.2178     65   0.9334
0.8765          1.4752     78   0.9084
0.8789          1.7327     91   0.8891
0.8304          1.9901    104   0.8779
0.8194          2.1782    117   0.8714
0.8480          2.4356    130   0.8665
0.8354          2.6931    143   0.8627
0.8476          2.9505    156   0.8605
0.8110          3.1386    169   0.8590
0.8178          3.3960    182   0.8588
0.8073          3.6535    195   0.8586

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.1.2+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1