yintongl committed · Commit e27abcd · verified · 1 parent: 10afe5a

Update README.md

Files changed (1)
  1. README.md +0 -16
README.md CHANGED
@@ -13,22 +13,6 @@ This model is an int4 model with group_size 128 of [THUDM/chatglm2-6b](https://h
 
 
 
- ### INT4 Inference with AutoGPTQ's Kernel
-
- ```python
- ##pip install auto-gptq[triton]
- ##pip install triton==2.2.0
- from transformers import AutoModel, AutoTokenizer
- quantized_model_dir = "Intel/chatglm2-6b-int4-inc"
- model = AutoModel.from_pretrained(quantized_model_dir,
- device_map="auto",
- trust_remote_code=False,
- )
-
- tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)
- print(tokenizer.decode(model.generate(**tokenizer("There is a girl who likes adventure,", return_tensors="pt").to(model.device),max_new_tokens=50)[0]))
- ```
-
 
 
  ### Evaluate the model
 