Update README.md
README.md CHANGED
@@ -11,39 +11,14 @@ base_model: THUDM/chatglm3-6b
# chatglm3-6b-awq-w-int4-asym-gs128-a-fp16-onnx-ryzen-strix-hybrid
-## Introduction
-
-
-Postprocess
-## Quantization Strategy
-AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
-Excluded Layers: None
-```
-python3 quantize_quark.py \
-    --model_dir "$model" \
-    --output_dir "$output_dir" \
-    --quant_scheme w_uint4_per_group_asym \
-    --num_calib_data 128 \
-    --quant_algo awq \
-    --dataset pileval_for_awq_benchmark \
-    --seq_len 512 \
-    --model_export quark_safetensors \
-    --data_type float16 \
-    --exclude_layers [] \
-    --custom_mode awq
-```
-## OGA Model Builder
-```
-python builder.py \
-    -i <quantized safetensor model dir> \
-    -o <oga model output dir> \
-    -p int4 \
-    -e dml
-```
-PostProcessed to generate Hybrid Model
-
-## Quick Start
-For quickstart, refer to hybrid-llm-artifacts_1.3.0.zip available in [RyzenAI-SW-EA](https://account.amd.com/en/member/ryzenai-sw-ea.html)
-
+## Introduction
+This model was prepared using the AMD Quark Quantization tool, followed by necessary post-processing.
+
+## Quantization Strategy
+AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
+Excluded Layers: None
+
+## Quick Start
+For quickstart, refer to the [Ryzen AI documentation](https://ryzenai.docs.amd.com/en/latest/hybrid_oga.html)

#### Evaluation scores
The perplexity measurement is run on the wikitext-2-raw-v1 (raw data) dataset provided by Hugging Face. The perplexity score measured for a prompt length of 2k is 29.7801.
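
The card does not state which harness produced the 29.7801 figure. As a rough illustration of the measurement described above, the sketch below follows the common Hugging Face perplexity recipe: non-overlapping 2048-token windows over the wikitext-2-raw-v1 test split. It runs the PyTorch baseline checkpoint rather than the ONNX hybrid model, so treat it as an approximation of the procedure, not a reproduction of the reported score.

```
import torch
from datasets import load_dataset
from transformers import AutoModel, AutoTokenizer

model_id = "THUDM/chatglm3-6b"  # FP baseline checkpoint; placeholder for this sketch
device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    trust_remote_code=True,
).to(device).eval()

# Concatenate the raw test split, as in the standard Hugging Face perplexity recipe.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tok(text, return_tensors="pt").input_ids.to(device)

window, nlls = 2048, []  # 2k prompt length, non-overlapping windows
for i in range(0, ids.size(1) - window + 1, window):
    chunk = ids[:, i : i + window]
    with torch.no_grad():
        # labels=chunk yields the mean next-token negative log-likelihood
        nlls.append(model(input_ids=chunk, labels=chunk).loss)

print("perplexity:", torch.exp(torch.stack(nlls).mean()).item())
```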
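For readers decoding the scheme name in the Quantization Strategy section, here is a minimal numerical sketch of what per-group (group size 128) asymmetric UINT4 weight quantization with FP16 activations means. The helper functions are invented for illustration; AWQ's activation-aware scale search, which the removed quantize_quark.py command performs during calibration, is deliberately omitted.

```
import numpy as np

def quantize_w_uint4_per_group_asym(w, group_size=128):
    """Per-group asymmetric UINT4 quantization of a 2-D weight matrix.

    Hypothetical helper for illustration; quantize_quark.py is the real tool.
    """
    rows, cols = w.shape
    assert cols % group_size == 0, "columns must split evenly into groups"
    g = w.reshape(rows, cols // group_size, group_size)
    w_min = g.min(axis=-1, keepdims=True)
    w_max = g.max(axis=-1, keepdims=True)
    scale = (w_max - w_min) / 15.0                   # UINT4 codes span 0..15
    zero = np.clip(np.round(-w_min / scale), 0, 15)  # asymmetric zero point
    q = np.clip(np.round(g / scale) + zero, 0, 15).astype(np.uint8)
    return q, scale.astype(np.float16), zero.astype(np.uint8)

def dequantize(q, scale, zero):
    # At run time the kernel rescales UINT4 codes back to FP16 to multiply
    # against FP16 activations (the "a-fp16" half of the scheme).
    return ((q.astype(np.float32) - zero) * scale.astype(np.float32)).astype(np.float16)

w = np.random.randn(8, 256).astype(np.float32)
q, s, z = quantize_w_uint4_per_group_asym(w)
err = np.abs(w - dequantize(q, s, z).reshape(w.shape).astype(np.float32)).max()
print("max abs reconstruction error:", err)
```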
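The removed OGA Model Builder step produces a model consumable through onnxruntime-genai. A minimal generation loop in that API's 0.4+ style might look like the following; the exact surface varies across releases, the model directory is a placeholder, and the hybrid NPU/iGPU execution path needs the additional Ryzen AI configuration covered by the Quick Start link.

```
import onnxruntime_genai as og

# Path to the post-processed hybrid model directory (placeholder).
model_dir = "chatglm3-6b-awq-w-int4-asym-gs128-a-fp16-onnx-ryzen-strix-hybrid"

model = og.Model(model_dir)
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

# Token-by-token decode loop.
generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("What is AWQ quantization?"))
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```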