Model Description:

Pruned from meta-llama/Meta-Llama-3-8B-Instruct using LLM-Pruner, from the paper "LLM-Pruner: On the Structural Pruning of Large Language Models".

This was done to test the viability of LLM-Pruner for task-agnostic, low-resource generative AI for commercial and personal use, compared to out-of-the-box models such as meta-llama/Llama-3.2-3B-Instruct.

Our presentation slides may be found here

To replicate:

  1. Clone the official implementation and run:
```bash
python llama3.py --pruning_ratio 0.25 \
                 --device cuda --eval_device cuda \
                 --base_model meta-llama/Meta-Llama-3-8B-Instruct \
                 --block_wise --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
                 --block_attention_layer_start 4 --block_attention_layer_end 30 \
                 --save_ckpt_log_name llama3_prune \
                 --pruner_type taylor --taylor param_first \
                 --max_seq_len 512 \
                 --test_after_train --test_before_train --save_model
```

to get the pruned model.

NOTE:

  • We removed 'ptb' from the datasets in llama3.py, since loading it requires executing remote code.
  • We changed get_examples in llama3.py to use 'c4', since bookcorpus also requires remote code to load (see the sketch below).
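
For reference, here is a minimal sketch of what a C4-based calibration sampler can look like, using the Hugging Face datasets streaming API. The function name and signature are illustrative and do not match the repo's get_examples exactly:

```python
# Illustrative calibration-data sampler (hypothetical helper, not the
# repo's exact get_examples): streams C4 so the corpus is never fully
# downloaded, and returns a (n_samples, seq_len) token batch.
import torch
from datasets import load_dataset

def get_c4_examples(tokenizer, n_samples=10, seq_len=128):
    dataset = load_dataset("allenai/c4", "en", split="train", streaming=True)
    samples = []
    for example in dataset:
        ids = tokenizer(example["text"], return_tensors="pt").input_ids[0]
        if ids.shape[0] >= seq_len:        # skip documents shorter than seq_len
            samples.append(ids[:seq_len])  # truncate to a fixed length
        if len(samples) == n_samples:
            break
    return torch.stack(samples)
```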
  2. Then, to post-train, follow section 2 of the official implementation (a minimal LoRA sketch is shown below).
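
The post-training stage recovers accuracy with LoRA adapters. As a rough orientation, here is a minimal sketch of attaching LoRA with the PEFT library; the rank, alpha, and target modules below are illustrative defaults, not necessarily what the official script uses:

```python
# Minimal LoRA post-training setup via PEFT (sketch only; the official
# repo's post-training script handles checkpointing and data itself).
from peft import LoraConfig, get_peft_model

def attach_lora(model):
    config = LoraConfig(
        r=8,                      # illustrative rank, not the repo's setting
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # only adapter weights are trainable
    return model
```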

Benchmark Results

Benchmark Evaluation: The evaluation follows the original paper's protocol: zero-shot task classification on five commonsense reasoning datasets that do not require remote code to load (a reproduction sketch follows the table):

| Model | BoolQ | HellaSwag | ARC-e | ARC-c | OBQA | Average Accuracy |
|---|---|---|---|---|---|---|
| Llama-3-6.6B-LLM-Pruned | 70.86 | 67.64 | 73.82 | 44.28 | 37.60 | 58.84 |
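
For orientation, here is a minimal sketch of reproducing a zero-shot run with the lm-evaluation-harness Python API (the framework the paper's evaluation is based on). The model path and dtype are assumptions, and the pruned checkpoint must first be exported in a format the harness can load:

```python
# Zero-shot evaluation sketch with lm-evaluation-harness (v0.4+ API).
# The pretrained path is an assumption; point it at your exported model.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=./llama3_pruned_hf,dtype=float16",
    tasks=["boolq", "hellaswag", "arc_easy", "arc_challenge", "openbookqa"],
    num_fewshot=0,  # zero-shot, matching the table above
)
for task, metrics in results["results"].items():
    print(task, metrics)
```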

Usage:

For usage, follow the "Pruned Model with Post-Training" section of the official implementation; a loading sketch is shown below.
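
As a rough guide, here is a minimal sketch of loading a checkpoint saved by LLM-Pruner's --save_model flag and generating from it. It assumes the checkpoint is a torch.save() dict holding the full model and tokenizer objects, and the path reflects the --save_ckpt_log_name above; adjust both to your run:

```python
# Sketch: load a pruned LLM-Pruner checkpoint and generate.
# The checkpoint path is an assumption based on --save_ckpt_log_name above.
import torch

ckpt = torch.load(
    "prune_log/llama3_prune/pytorch_model.bin",
    map_location="cpu",
    weights_only=False,  # the file pickles whole objects, not just tensors
)
model, tokenizer = ckpt["model"], ckpt["tokenizer"]
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```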
