YashJain committed (verified)
Commit dbe97d4 · Parent: b629f8a

Update README.md

Files changed (1)
  1. README.md +40 -11
README.md CHANGED
@@ -4,23 +4,52 @@ language:
  license: apache-2.0
  tags:
  - chat
- - mlx
  pipeline_tag: text-generation
  ---

  # YashJain/GitAI-Qwen2-0.5B-Instruct

- The Model [YashJain/GitAI-Qwen2-0.5B-Instruct](https://huggingface.co/YashJain/GitAI-Qwen2-0.5B-Instruct) was converted to MLX format from [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) using mlx-lm version **0.15.0**.

- ## Use with mlx
-
- ```bash
- pip install mlx-lm
  ```
-
  ```python
- from mlx_lm import load, generate

- model, tokenizer = load("YashJain/GitAI-Qwen2-0.5B-Instruct")
- response = generate(model, tokenizer, prompt="hello", verbose=True)
- ```
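
The removed mlx-lm snippet passes a raw string as the prompt; with an instruct-tuned model, a chat-formatted prompt is usually closer to what the model expects. A minimal sketch of that variation, assuming mlx-lm ~0.15.x, where `load` returns a tokenizer that proxies `apply_chat_template` and `generate` accepts a `max_tokens` argument:

```python
# Sketch only: chat-formatted prompting with mlx-lm (assumes mlx-lm ~0.15.x).
from mlx_lm import load, generate

model, tokenizer = load("YashJain/GitAI-Qwen2-0.5B-Instruct")

# Wrap the user message in the model's chat template before generating.
messages = [{"role": "user", "content": "hello"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```
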
  license: apache-2.0
  tags:
  - chat
+ - code
  pipeline_tag: text-generation
+ datasets:
+ - YashJain/GitAI
+ library_name: transformers
  ---

  # YashJain/GitAI-Qwen2-0.5B-Instruct


+ ## Requirements
+ The code for Qwen2 has been in the latest Hugging Face transformers, and we advise you to install `transformers>=4.37.0`, or you might encounter the following error:
  ```
+ KeyError: 'qwen2'
+ ```
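
The `KeyError: 'qwen2'` above comes from older transformers releases whose auto-config mapping has no `qwen2` entry. A small optional check of the installed version before loading the model, sketched here (not part of the model card; `packaging` ships as a transformers dependency):

```python
# Optional sanity check: Qwen2 support requires transformers >= 4.37.0;
# older releases raise KeyError: 'qwen2' when resolving the model config.
from packaging import version
import transformers

if version.parse(transformers.__version__) < version.parse("4.37.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is too old for Qwen2; "
        "upgrade with: pip install -U 'transformers>=4.37.0'"
    )
```
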
+ ## Quickstart
  ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ device = "cuda"  # the device to load the model onto

+ model = AutoModelForCausalLM.from_pretrained(
+     "Qwen/Qwen2-0.5B-Instruct",
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
+
+ prompt = "Give me a short introduction to large language model."
+ messages = [
+     {"role": "system", "content": "You are a helpful assistant."},
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(device)
+
+ generated_ids = model.generate(
+     model_inputs.input_ids,
+     max_new_tokens=512
+ )
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ ```
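
Two notes on the added Quickstart. It loads the base `Qwen/Qwen2-0.5B-Instruct` checkpoint; to run this repository's fine-tuned weights, presumably `YashJain/GitAI-Qwen2-0.5B-Instruct` would be passed to the same `from_pretrained` calls instead. And for interactive use, the reply can be streamed token by token rather than decoded at the end; a sketch reusing `model`, `tokenizer`, and `model_inputs` from above with `TextStreamer`:

```python
# Sketch: stream the decoded reply to stdout as it is generated,
# reusing the objects built in the Quickstart above.
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
    streamer=streamer,  # prints tokens as they are produced
)
```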