Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


Qwen2-7B-Instruct-Refine - GGUF
- Model creator: https://huggingface.co/alibaba-pai/
- Original model: https://huggingface.co/alibaba-pai/Qwen2-7B-Instruct-Refine/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Qwen2-7B-Instruct-Refine.Q2_K.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q2_K.gguf) | Q2_K | 2.81GB |
| [Qwen2-7B-Instruct-Refine.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q3_K_S.gguf) | Q3_K_S | 3.25GB |
| [Qwen2-7B-Instruct-Refine.Q3_K.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q3_K.gguf) | Q3_K | 3.55GB |
| [Qwen2-7B-Instruct-Refine.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q3_K_M.gguf) | Q3_K_M | 3.55GB |
| [Qwen2-7B-Instruct-Refine.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q3_K_L.gguf) | Q3_K_L | 3.81GB |
| [Qwen2-7B-Instruct-Refine.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.IQ4_XS.gguf) | IQ4_XS | 3.96GB |
| [Qwen2-7B-Instruct-Refine.Q4_0.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q4_0.gguf) | Q4_0 | 4.13GB |
| [Qwen2-7B-Instruct-Refine.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.IQ4_NL.gguf) | IQ4_NL | 4.16GB |
| [Qwen2-7B-Instruct-Refine.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q4_K_S.gguf) | Q4_K_S | 4.15GB |
| [Qwen2-7B-Instruct-Refine.Q4_K.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q4_K.gguf) | Q4_K | 4.36GB |
| [Qwen2-7B-Instruct-Refine.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q4_K_M.gguf) | Q4_K_M | 4.36GB |
| [Qwen2-7B-Instruct-Refine.Q4_1.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q4_1.gguf) | Q4_1 | 4.54GB |
| [Qwen2-7B-Instruct-Refine.Q5_0.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q5_0.gguf) | Q5_0 | 4.95GB |
| [Qwen2-7B-Instruct-Refine.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q5_K_S.gguf) | Q5_K_S | 4.95GB |
| [Qwen2-7B-Instruct-Refine.Q5_K.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q5_K.gguf) | Q5_K | 5.07GB |
| [Qwen2-7B-Instruct-Refine.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q5_K_M.gguf) | Q5_K_M | 5.07GB |
| [Qwen2-7B-Instruct-Refine.Q5_1.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q5_1.gguf) | Q5_1 | 5.36GB |
| [Qwen2-7B-Instruct-Refine.Q6_K.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q6_K.gguf) | Q6_K | 5.82GB |
| [Qwen2-7B-Instruct-Refine.Q8_0.gguf](https://huggingface.co/RichardErkhov/alibaba-pai_-_Qwen2-7B-Instruct-Refine-gguf/blob/main/Qwen2-7B-Instruct-Refine.Q8_0.gguf) | Q8_0 | 7.54GB |

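The sizes in the table can guide which quant fits a given memory budget. Below is a small illustrative sketch (not part of the original model card): the sizes are copied from a subset of the table rows, and the selection helper `largest_quant_fitting` is a hypothetical convenience, not a real tool.

```python
# Quant name -> file size in GB, copied from a subset of the table above.
quant_sizes = {
    "Q2_K": 2.81, "Q3_K_S": 3.25, "Q3_K_M": 3.55, "Q3_K_L": 3.81,
    "IQ4_XS": 3.96, "Q4_0": 4.13, "Q4_K_S": 4.15, "Q4_K_M": 4.36,
    "Q5_K_S": 4.95, "Q5_K_M": 5.07, "Q6_K": 5.82, "Q8_0": 7.54,
}

def largest_quant_fitting(budget_gb):
    """Pick the largest (typically highest-quality) quant whose file fits the budget."""
    fitting = {name: size for name, size in quant_sizes.items() if size <= budget_gb}
    # max over values returns the quant with the biggest file that still fits
    return max(fitting, key=fitting.get) if fitting else None

print(largest_quant_fitting(5.0))  # Q5_K_S
```

Note that actual memory use at inference time exceeds the file size (KV cache, activations), so leave headroom beyond the raw GGUF size.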



Original model description:
<img src="https://cdn-uploads.huggingface.co/production/uploads/62aba5ebab9ed4f63c36b1e2/47PZcc9QTR_okQIvKeOLn.png" alt="image/png" style="transform: scale(1);">


## 📖 Introduction

**Qwen2-7B-Instruct-Refine** and **Qwen2-1.5B-Instruct-Refine** are two powerful large language models that act as proficient prompt engineers. They optimize and refine the prompts users provide, and the resulting refined instructions can significantly improve an LLM's ability to produce better, more informative responses.

We fine-tuned **Qwen2-7B-Instruct** and **Qwen2-1.5B-Instruct** to obtain **Qwen2-7B-Instruct-Refine** and **Qwen2-1.5B-Instruct-Refine**.
We sampled training data from OpenHermes and the LCCD dataset, ensuring a balanced task distribution. For training-set annotations, we used Qwen-max, incorporating our handwritten examples as in-context prompts.

## 🚀 Quick Start

The following code snippet shows how to load the tokenizer and model, build a prompt with `apply_chat_template`, and generate a response.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "alibaba-pai/Qwen2-1.5B-Instruct-Refine",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("alibaba-pai/Qwen2-1.5B-Instruct-Refine")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=2048,
    eos_token_id=151645,
)
# Strip the echoed prompt tokens so only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
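The slicing step in the snippet above removes the echoed prompt from each generated sequence, since `generate` returns prompt tokens followed by new tokens. A minimal self-contained sketch of that pattern, using plain integer lists as hypothetical stand-ins for real token-ID tensors:

```python
# Illustrative only: these integer lists stand in for real token-ID tensors.
input_ids_batch = [[101, 7592, 102]]              # hypothetical prompt token IDs
generated_batch = [[101, 7592, 102, 2000, 3000]]  # prompt + new tokens, as generate() returns

# Same pattern as above: drop the prompt prefix from each sequence.
new_tokens = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(input_ids_batch, generated_batch)
]
print(new_tokens)  # [[2000, 3000]]
```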

## 🔍 Evaluation

We used single-turn instructions from MT-Bench as input for Qwen2-1.5B-Instruct and Qwen2-7B-Instruct. GPT-4-turbo was used to evaluate the change in the level of detail and truthfulness of responses when the models were given instructions revised by our refine models.

| Model | Detail | Truthfulness |
|:----------------------------:|:------:|:------------:|
| Qwen2-1.5B-Instruct | 50.00% | 50.00% |
| + Qwen2-1.5B-Instruct-Refine | 75.63% | 63.75% |
| + Qwen2-7B-Instruct-Refine | 76.56% | 62.19% |
| Qwen2-7B-Instruct | 50.00% | 50.00% |
| + Qwen2-1.5B-Instruct-Refine | 70.94% | 57.19% |
| + Qwen2-7B-Instruct-Refine | 74.69% | 58.44% |


## 📜 Citation

If you find our work helpful, please cite it!

```
@misc{TAPIR,
  title={Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning},
  author={Yuanhao Yue and Chengyu Wang and Jun Huang and Peng Wang},
  year={2024},
  eprint={2405.13448},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2405.13448},
}
```