
Model Card for Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-2.0

Model Details

Model Description

  • Finetuned from model: Na0s/Llama-3.1-8b-Pruned-4-Layers-1.0

Training Details

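from unsloth import FastLanguageModel

# `model` is the pruned base model, assumed to be already loaded
# (for example with FastLanguageModel.from_pretrained).
model = FastLanguageModel.get_peft_model(
    model,
    r = 4,  # LoRA rank
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 4,  # with use_rslora = False, the update is scaled by lora_alpha / r = 1
    lora_dropout = 0.05,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)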
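The LoRA settings above can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the original training script: it counts the adapter parameters added per transformer layer and computes the scaling factor applied to the LoRA update. The projection shapes are an assumption taken from the public Llama-3.1-8B configuration; the pruned base model may differ, and the helper names are hypothetical.

```python
import math

# Assumed (in_features, out_features) for each targeted projection, based on
# the public Llama-3.1-8B configuration. The pruned base model may differ.
ASSUMED_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}


def lora_params_per_layer(r, shapes=ASSUMED_SHAPES):
    """Each adapted matrix gains A (r x in) and B (out x r): r * (in + out) weights."""
    return sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())


def lora_scaling(lora_alpha, r, use_rslora=False):
    """Standard LoRA scales the update by alpha / r; rsLoRA uses alpha / sqrt(r)."""
    return lora_alpha / math.sqrt(r) if use_rslora else lora_alpha / r


if __name__ == "__main__":
    print("adapter parameters per layer:", lora_params_per_layer(r=4))
    print("update scaling:", lora_scaling(lora_alpha=4, r=4))
```

Multiplying the per-layer count by the number of layers kept after pruning gives the total trainable parameter count, which stays a small fraction of the base model.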

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

# `tokenizer`, `dataset`, and `max_seq_length` are defined earlier in the training script.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "completion",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,
    args = TrainingArguments(
        per_device_train_batch_size = 10,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 5000,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "cosine",
        seed = 3407,
        output_dir = "outputs_4",
        push_to_hub = True,
        hub_always_push = True,
    ),
)
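To make the training arguments above more concrete, here is a rough sketch of what they imply: how many sequences the run processes in total, and how the learning rate evolves under linear warmup followed by cosine decay. It assumes a single device (the card does not state the hardware) and the usual warmup-then-cosine formula; the function names are hypothetical and this is not the trainer's own code.

```python
import math


def sequences_seen(per_device_batch, grad_accum, max_steps, num_devices=1):
    """Total training sequences processed; num_devices=1 is an assumption."""
    return per_device_batch * grad_accum * num_devices * max_steps


def lr_at_step(step, base_lr=2e-4, warmup_steps=5, max_steps=5000):
    """Linear warmup to base_lr, then cosine decay to zero at max_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # 10 sequences per device x 4 accumulation steps = 40 sequences per optimizer step.
    print("sequences seen:", sequences_seen(10, 4, 5000))
    for step in (0, 5, 2500, 5000):
        print(step, lr_at_step(step))
```

With these settings each optimizer step covers 40 sequences on one device, so the 5000-step run processes 200,000 sequences.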

Training Data

meta-math/MetaMathQA

Evaluation

MMLU-Pro, 0-shot accuracy: 0.2872

Evaluation Data

TIGER-AI-Lab/MMLU-Pro
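The card does not say which evaluation harness produced the score, so the following is only an illustrative sketch of how a 0-shot multiple-choice accuracy can be computed. The prompt layout, the helper names, and the sample question are hypothetical and are not taken from the original evaluation.

```python
import string


def format_zero_shot(question, options):
    """Build a 0-shot prompt: the question, lettered options, no worked examples."""
    lines = [f"Question: {question}"]
    lines += [f"{string.ascii_uppercase[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(predictions, references):
    """Fraction of predicted answer letters that match the reference letters."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    # Hypothetical sample question, not drawn from the benchmark.
    print(format_zero_shot("What is 2 + 2?", ["3", "4", "5"]))
    print(accuracy(["A", "B", "C", "D"], ["A", "B", "D", "D"]))
```

A score of 0.2872 under this definition would mean that roughly 29 of every 100 questions were answered with the correct option letter.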

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
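As a rough illustration of the kind of estimate such a calculator produces, the sketch below multiplies the energy drawn by the hardware by the carbon intensity of the local grid. The card reports no hardware, runtime, or region, so every number in the example is a placeholder and the function name is hypothetical.

```python
def estimate_emissions_kg(power_watts, hours, carbon_intensity_kg_per_kwh):
    """Estimated kg CO2eq = energy used (kWh) x grid carbon intensity (kg CO2eq/kWh)."""
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * carbon_intensity_kg_per_kwh


if __name__ == "__main__":
    # Placeholder inputs only; replace with the real hardware power draw,
    # training time, and regional carbon intensity.
    print(estimate_emissions_kg(power_watts=300, hours=10, carbon_intensity_kg_per_kwh=0.4))
```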

Model size: 6.94B params (Safetensors)
Tensor type: BF16