satreysa committed · Commit adce840 · verified · 1 Parent(s): 0f6105f

Update README.md

Files changed (1)
  1. README.md +4 -29
README.md CHANGED
@@ -16,39 +16,14 @@ tags:
 
 # meta-llama/Llama-2-7b-chat-hf
 - ## Introduction
-- Quantization Tool: Quark 0.6.0
-- OGA Model Builder: v0.5.1
-- Postprocess
+This model was prepared using the AMD Quark quantization tool, followed by the necessary post-processing.
+
 - ## Quantization Strategy
 - AWQ / Group 128 / Asymmetric / UINT4 weights / FP16 activations
 - Excluded Layers: None
-```
-python3 quantize_quark.py \
-    --model_dir "$model" \
-    --output_dir "$output_dir" \
-    --quant_scheme w_uint4_per_group_asym \
-    --num_calib_data 128 \
-    --quant_algo awq \
-    --dataset pileval_for_awq_benchmark \
-    --seq_len 512 \
-    --model_export quark_safetensors \
-    --data_type float16 \
-    --exclude_layers [] \
-    --custom_mode awq
-```
-- ## OGA Model Builder
-```
-python builder.py \
-    -i <quantized safetensor model dir> \
-    -o <oga model output dir> \
-    -p int4 \
-    -e dml
-```
-- Post-processed to generate the hybrid model
--
+
 - ## Quick Start
-For quickstart, refer to hybrid-llm-artifacts_1.3.0.zip, available in [RyzenAI-SW-EA](https://account.amd.com/en/member/ryzenai-sw-ea.html)
-
+For a quick start, refer to the [Ryzen AI documentation](https://ryzenai.docs.amd.com/en/latest/hybrid_oga.html)
 
 #### Evaluation scores
 The perplexity measurement is run on the wikitext-2-raw-v1 (raw data) dataset provided by Hugging Face. The perplexity score measured for a prompt length of 2k is 7.1518.
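As background on the strategy listed above (AWQ with group-128, asymmetric, UINT4 weights and FP16 activations): each group of 128 weights is stored as 4-bit codes plus one scale and one zero point. Below is a minimal NumPy sketch of just that storage format; the function names are invented for this illustration, and it deliberately omits the AWQ part, where Quark searches per-channel scaling factors on calibration data (here, 128 samples of pileval) before rounding.

```python
import numpy as np

def quantize_group_asym_uint4(w: np.ndarray, group_size: int = 128):
    """Per-group asymmetric UINT4 quantization of a flat weight array.

    Illustrative sketch only -- not Quark's implementation. Each group of
    `group_size` weights gets one FP16 scale and one UINT4 zero point.
    """
    w = w.reshape(-1, group_size)                   # one row per group
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0                  # UINT4 codes span 0..15
    zero = np.clip(np.rint(-w_min / scale), 0, 15)  # asymmetric zero point
    q = np.clip(np.rint(w / scale) + zero, 0, 15).astype(np.uint8)
    return q, scale.astype(np.float16), zero.astype(np.uint8)

def dequantize(q, scale, zero):
    # At inference the FP16 activations multiply these dequantized weights.
    return (q.astype(np.float16) - zero) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float16)
q, s, z = quantize_group_asym_uint4(w)
err = np.abs(dequantize(q, s, z).reshape(-1) - w).max()
print(f"max reconstruction error: {err:.4f}")       # small relative to |w|
```

"Asymmetric" means the zero point is fitted per group rather than pinned to the middle of the range, so fewer of the 16 codes are wasted when a group's weights are skewed to one side.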
 
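On the evaluation score: the README does not spell out the perplexity harness, but the standard fixed-window measurement concatenates the wikitext-2-raw-v1 test split, slices it into 2048-token windows (the stated 2k prompt length), and exponentiates the mean token-level cross-entropy. A sketch of that generic recipe with `transformers` against the FP16 reference model (an assumption; scoring the hybrid OGA model itself would go through the ONNX Runtime GenAI runtime instead):

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Concatenate the raw test split into one long token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tok(text, return_tensors="pt").input_ids

seq_len = 2048                                   # the "prompt length 2k"
nlls = []
for start in range(0, ids.size(1) - seq_len, seq_len):
    chunk = ids[:, start : start + seq_len].to(model.device)
    with torch.no_grad():
        # labels=chunk -> mean next-token cross-entropy over the window
        nlls.append(model(input_ids=chunk, labels=chunk).loss)

print("perplexity:", torch.exp(torch.stack(nlls).mean()).item())
```

Non-overlapping windows slightly overstate perplexity compared with a sliding window, so reproducing the quoted 7.1518 exactly also depends on that choice.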