OPEA/DeepSeek-V2-Lite-int4-sym-inc

4-bit precision

intel/auto-round

Commit ab0561e (verified) by cicdatopea, committed 17 days ago
1 parent: c10c898

Update README.md

Files changed (1)

README.md +5 -2

@@ -7,6 +7,7 @@ base_model:
 ## Model Details
 This model is an int4 model with group_size 64 and symmetric quantization of  [deepseek-ai/DeepSeek-V2-Lite](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite) generated by [intel/auto-round](https://github.com/intel/auto-round).  Please follow the license of the origin model
+Load the model with revision="90bb8ef" to use AutoGPTQ format.
 ### INT4 Inference(CPU/HPU/CUDA)
@@ -22,7 +23,9 @@ tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code
 model = AutoModelForCausalLM.from_pretrained(quantized_model_dir,
                                              trust_remote_code=True,
                                              torch_dtype=torch.float16,
-                                             device_map="auto")
+                                             device_map="auto",
+                                             ##revision="90bb8ef" ## autogptq format
+                                             )
 model.generation_config = GenerationConfig.from_pretrained(quantized_model_dir)
 model.generation_config.pad_token_id = model.generation_config.eos_token_id
 prompt = "There is a girl who likes adventure,"
@@ -134,7 +137,7 @@ Determined to save his people and restore the kingdom to its former glory, Princ
 pip3 install lm-eval==0.4.5.
 ```bash
-auto-round --model "OPEA/internlm2_5-7b-chat-int4-sym-inc" --eval --eval_bs 16  --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
+auto-round --model "OPEA/DeepSeek-V2-Lite-int4-sym-inc" --eval --eval_bs 16  --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
 ```
 | Metric                       |           BF16           |           INT4           |
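
Note on the `revision` change: the commit leaves `revision="90bb8ef"` commented out, so the default branch is loaded unless the reader edits the call. Below is a minimal sketch of one way to make that choice explicit. It assumes the `revision` keyword of `from_pretrained` is all that is needed to select the AutoGPTQ-format checkpoint, as the added README line suggests. The names `build_load_kwargs`, `load_model`, and `AUTOGPTQ_REVISION` are hypothetical and do not appear in the README.

```python
from typing import Any, Dict

# Revision hash copied from the README line added in this commit.
AUTOGPTQ_REVISION = "90bb8ef"


def build_load_kwargs(use_autogptq_format: bool = False) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments for from_pretrained.

    Mirrors the arguments in the README snippet and adds `revision`
    only when the AutoGPTQ-format checkpoint is requested.
    """
    kwargs: Dict[str, Any] = {
        "trust_remote_code": True,
        "device_map": "auto",
    }
    if use_autogptq_format:
        kwargs["revision"] = AUTOGPTQ_REVISION
    return kwargs


def load_model(model_dir: str, use_autogptq_format: bool = False):
    """Hypothetical wrapper around the README's loading call."""
    # Imported lazily so build_load_kwargs works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_dir,
        torch_dtype=torch.float16,
        **build_load_kwargs(use_autogptq_format),
    )
```

Calling `load_model("OPEA/DeepSeek-V2-Lite-int4-sym-inc", use_autogptq_format=True)` would then pass `revision="90bb8ef"`, while the default call matches the uncommented README snippet. Whether the tokenizer and generation config should be pinned to the same revision is not stated in the commit.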