Update README.md
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@ metrics:
 - accuracy
 - perplexity
 base_model:
-- meta-llama/Llama-3.2-
+- meta-llama/Llama-3.2-3B
 ---
 
 # Model Card for oopere/pruned20-llama-1b
@@ -31,20 +31,22 @@ This model is not intended to be used directly, but rather to be fine-tuned for
 
 | Benchmark | Original Model | Pruned Model | Relative Change |
 | ---- | ---- | ---- | ---- |
-| ARC-Easy | 65.19% |
-| BoolQ | 64.16% |
-| LAMBADA-OpenAI | 62.20% |
-| LAMBADA-Standard | 53.46% |
+| ARC-Easy | 65.19% | 60.69% | -6.9% |
+| BoolQ | 64.16% | 51.22% | -20.2% |
+| LAMBADA-OpenAI | 62.20% | 59.64% | -4.1% |
+| LAMBADA-Standard | 53.46% | 54.61% | +2.2% |
 
 ### Key Findings
--
--
--
+- Surprisingly, an improvement is observed on the LAMBADA-Standard benchmark, with a 2.2% relative increase in accuracy.
+- Largest degradation observed on binary classification tasks (BoolQ), with a 20.2% relative decrease in accuracy.
+- Moderate degradation observed on reasoning tasks (ARC-Easy), with a 6.9% relative decrease in accuracy.
+- Minimal impact on long-range comprehension (LAMBADA-OpenAI), with only a 4.1% relative decrease in accuracy.
 
 ### Limitations
-- Reduced performance on tasks requiring complex
--
--
+- Reduced performance on tasks requiring complex reasoning, with moderate degradation observed on benchmarks like ARC-Easy.
+- Noticeable decrease in accuracy on binary classification tasks, as seen in BoolQ.
+- Mixed results on long-range dependencies, with minimal degradation on LAMBADA-OpenAI but variability across benchmarks.
+- May not be suitable for applications requiring consistently high accuracy across diverse language tasks.
 
 ### Implementation Details
 - **Pruning Notebook:** [Detailed implementation and methodology](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_3_pruning_structured_llama3.2-1b_OK.ipynb)
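For clarity, the Relative Change column above is (pruned − original) / original, expressed as a signed percentage. A quick sketch that reproduces the table's numbers (the helper name is illustrative, not part of the model card):

```python
def relative_change(original: float, pruned: float) -> float:
    """Relative change in accuracy, as a signed percentage of the original score."""
    return (pruned - original) / original * 100

# Accuracy values taken from the benchmark table above.
scores = {
    "ARC-Easy": (65.19, 60.69),
    "BoolQ": (64.16, 51.22),
    "LAMBADA-OpenAI": (62.20, 59.64),
    "LAMBADA-Standard": (53.46, 54.61),
}

for name, (orig, pruned) in scores.items():
    print(f"{name}: {relative_change(orig, pruned):+.1f}%")
# Prints -6.9% (ARC-Easy), -20.2% (BoolQ), -4.1% (LAMBADA-OpenAI), +2.2% (LAMBADA-Standard),
# confirming that LAMBADA-Standard is the one benchmark that improves after pruning.
```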
@@ -52,13 +54,13 @@ This model is not intended to be used directly, but rather to be fine-tuned for
 
 ### Pruning Method
 - **Technique:** Structured pruning targeting MLP layers
-- **Pruning Ratio:**
+- **Pruning Ratio:** 10% of neurons removed from MLP layers
 - **Selection Criteria:** Importance scoring based on absolute maximum weights
 - **Architecture Specifics:** Maintained GLU structure during pruning
 
 ### Hardware Requirements
 - Reduced memory footprint compared to original model
-- Can run on hardware with ~
+- Can run on hardware with ~10% less memory than original
 
 ## Acknowledgments
 - Thanks to [Mariusz Kurman](https://huggingface.co/mkurman) for creating [llama-pruning](https://github.com/MedITSolutionsKurman/llama-pruning), a library that extends and improves this pruning methodology.
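The Pruning Method bullets compress a fair amount of mechanics, so here is a minimal sketch of the idea, assuming the standard Hugging Face `LlamaMLP` layout (`gate_proj`/`up_proj`/`down_proj`). The function names are illustrative, not the notebook's API: each intermediate neuron is scored by the absolute maximum of its weights, the lowest-scoring fraction is removed, and the three projections are sliced together so the GLU pairing survives.

```python
import torch
from torch import nn

def prune_glu_mlp(mlp: nn.Module, ratio: float = 0.10) -> None:
    """Structurally prune a LLaMA-style GLU MLP in place.

    Scores each intermediate neuron by the absolute maximum of its
    gate_proj and up_proj weights, then drops the lowest-scoring
    `ratio` fraction from gate_proj, up_proj, and down_proj together,
    preserving the GLU structure.
    """
    # Importance per intermediate neuron: abs-max over its incoming weights.
    importance = torch.maximum(
        mlp.gate_proj.weight.abs().max(dim=1).values,
        mlp.up_proj.weight.abs().max(dim=1).values,
    )
    n_keep = int(importance.numel() * (1.0 - ratio))
    keep = torch.topk(importance, n_keep).indices.sort().values

    def _slice(linear: nn.Linear, idx: torch.Tensor, dim: int) -> nn.Linear:
        # Keep only the selected rows (dim=0) or columns (dim=1) of the weight.
        w = linear.weight.index_select(dim, idx)
        new = nn.Linear(w.shape[1], w.shape[0], bias=linear.bias is not None)
        new.weight = nn.Parameter(w)
        return new

    mlp.gate_proj = _slice(mlp.gate_proj, keep, dim=0)  # rows = output neurons
    mlp.up_proj = _slice(mlp.up_proj, keep, dim=0)
    mlp.down_proj = _slice(mlp.down_proj, keep, dim=1)  # cols = input neurons
```

In practice this would be applied to every `model.model.layers[i].mlp`, with `config.intermediate_size` updated to the new width; other selection criteria slot in by swapping the `importance` line.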
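Finally, a quick way to sanity-check the Hardware Requirements claim: load the pruned checkpoint with the standard transformers API and inspect its parameter count, a rough proxy for the memory footprint. This is a minimal check for the reader, not part of the model card itself.

```python
from transformers import AutoModelForCausalLM

# Load the pruned checkpoint from the Hub (standard transformers API).
model = AutoModelForCausalLM.from_pretrained("oopere/pruned20-llama-1b")

# Parameter count as a rough proxy for the "~10% less memory" claim.
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e9:.2f}B parameters")
```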