Update README.md
README.md CHANGED
@@ -6,39 +6,14 @@ base_model:
 
 # meta-llama/Llama-3.2-1B-Instruct
 - ## Introduction
-
-
-- Postprocess
+This model was prepared using the AMD Quark Quantization tool, followed by necessary post-processing.
+
 - ## Quantization Strategy
 - AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
 - Excluded Layers: None
-
-python3 quantize_quark.py \
---model_dir "$model" \
---output_dir "$output_dir" \
---quant_scheme w_uint4_per_group_asym \
---num_calib_data 128 \
---quant_algo awq \
---dataset pileval_for_awq_benchmark \
---seq_len 512 \
---model_export quark_safetensors \
---data_type float16 \
---exclude_layers [] \
---custom_mode awq
-```
-- ## OGA Model Builder
-```
-python builder.py \
--i <quantized safetensor model dir> \
--o <oga model output dir> \
--p int4 \
--e dml
-```
-- PostProcessed to generate Hybrid Model
--
+
 - ## Quick Start
-For quickstart, refer to
-
+For quickstart, refer to [Ryzen AI documentation](https://ryzenai.docs.amd.com/en/latest/hybrid_oga.html)
 
 #### Evaluation scores
 The perplexity measurement is run on the wikitext-2-raw-v1 (raw data) dataset provided by Hugging Face. Perplexity score measured for prompt length 2k is 15.3878.
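
Note on the evaluation line: perplexity is the exponential of the mean negative log-likelihood per token, so lower is better. The sketch below shows only that definition. It is not the evaluation script behind the 15.3878 figure, the function name `perplexity` is hypothetical, and the log-probabilities are made-up illustrative values.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical input: four tokens, each predicted with probability 0.25.
# A model that is always choosing uniformly among four options has perplexity of about 4.
example_logprobs = [math.log(0.25)] * 4
print(perplexity(example_logprobs))
```

In a real measurement the log-probabilities would come from the model scored over the dataset in windows of the stated prompt length; the windowing and stride used for this model card are not given in the README.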
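
Note on the Quantization Strategy line: "Group 128 / Asymmetric / UINT4 Weights" describes weights stored as unsigned 4-bit codes, with one scale and one zero point per group of 128 values. The following is a minimal sketch of that storage format only, in plain Python. It is not AMD Quark's implementation, it omits the activation-aware scale search that AWQ adds on top, and all function names are hypothetical.

```python
import math


def quantize_group(weights, bits=4):
    """Asymmetric quantization of one group to unsigned codes in [0, 2**bits - 1]."""
    qmax = (1 << bits) - 1                      # 15 for UINT4
    # Assumption: the range is widened to include 0.0 so the zero point fits in [0, qmax].
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = max(w_max - w_min, 1e-5) / qmax     # guard against an all-zero group
    zero_point = min(max(round(-w_min / scale), 0), qmax)
    codes = [min(max(round(w / scale) + zero_point, 0), qmax) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes, scale, zero_point):
    """Map integer codes back to approximate float weights."""
    return [(q - zero_point) * scale for q in codes]


def quantize_per_group(weights, group_size=128):
    """Split a flat weight list into groups and quantize each one independently."""
    return [
        quantize_group(weights[start:start + group_size])
        for start in range(0, len(weights), group_size)
    ]


# Illustrative data only: 256 synthetic weights, giving two groups of 128.
weights = [math.sin(i) for i in range(256)]
groups = quantize_per_group(weights, group_size=128)
codes, scale, zero_point = groups[0]
restored = dequantize_group(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights[:128], restored))
print(len(groups), scale, zero_point, max_error)
```

Because each group carries its own scale and zero point, the rounding error for a weight is bounded by about half of that group's scale, which is why smaller groups trade extra metadata for accuracy.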