---
license: apache-2.0
base_model:
- EQUES/TinySwallow-Stratos-1.5B
---

# TinyQwens-Merge-1.5B

This model merges the Qwen-based models among the small 1.5B-parameter LLMs currently available. The merge was produced with [Evolutionary Model Merge](https://sakana.ai/evolutionary-model-merge-jp/), a method developed by Sakana AI.

Genome models:
- [SakanaAI/TinySwallow-1.5B-Instruct](https://huggingface.co/SakanaAI/TinySwallow-1.5B-Instruct)
- [EQUES/TinySwallow-Stratos-1.5B](https://huggingface.co/EQUES/TinySwallow-Stratos-1.5B)
- [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
- [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)

Base model:
- [EQUES/TinySwallow-Stratos-1.5B](https://huggingface.co/EQUES/TinySwallow-Stratos-1.5B)

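Evolutionary model merging searches over merge recipes (e.g. per-layer mixing coefficients) rather than hand-tuning them. As a purely illustrative sketch of the underlying weight-space operation — not the actual recipe used for this model, and with a hypothetical `merge_state_dicts` helper operating on plain Python lists instead of tensors — per-layer linear interpolation between two checkpoints looks like:

```python
# Illustrative only: layer-wise linear interpolation between two model
# state dicts, the kind of operation an evolutionary search optimizes.
# Real merges operate on torch tensors (e.g. via mergekit); plain
# Python floats are used here to keep the sketch self-contained.

def merge_state_dicts(sd_a, sd_b, weights):
    """Blend two state dicts layer by layer.

    `weights` maps each parameter name to a mixing coefficient w,
    producing w * a + (1 - w) * b for that layer.
    """
    merged = {}
    for name, a in sd_a.items():
        b = sd_b[name]
        w = weights[name]
        merged[name] = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
    return merged

sd_a = {"layer0.weight": [1.0, 2.0], "layer1.weight": [4.0, 4.0]}
sd_b = {"layer0.weight": [3.0, 6.0], "layer1.weight": [0.0, 2.0]}
weights = {"layer0.weight": 0.5, "layer1.weight": 0.25}

merged = merge_state_dicts(sd_a, sd_b, weights)
print(merged["layer0.weight"])  # [2.0, 4.0]
```

The evolutionary search then scores many candidate `weights` assignments on benchmark tasks and keeps the best-performing recipes.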
### Uses
Intended uses follow those of the original models.
This model is provided for research and development purposes only and should be considered an experimental prototype. It is not intended for commercial use or deployment in mission-critical environments. Use of this model is at the user's own risk, and its performance and outcomes are not guaranteed. EQUES Inc. shall not be liable for any direct, indirect, special, incidental, or consequential damages, or any loss arising from the use of this model, regardless of the results obtained. Users must fully understand the risks associated with the use of this model and use it at their own discretion.


### Output Examples

<details><summary>Give me a short introduction to large language model.</summary>

```
Large language models are powerful AI systems that can generate human-like text, code, and even code snippets. They are used in a wide range of applications, including:

Chatbots: Large language models can be used to create conversational agents that can interact with users in a natural and engaging way.

Code Generation: These models can generate code in a way that is both readable and functional.

Research: They can be used to generate hypotheses and explore new ideas.

Education: They can be used to create interactive learning experiences.

Marketing: They can be used to create compelling marketing materials.

Product Development: They can be used to generate product descriptions and marketing materials.

Customer Service: They can be used to provide personalized support to customers.

Content Creation: They can be used to generate creative content for various purposes.

Research: They can be used to explore new ideas and hypotheses.

Education: They can be used to create interactive learning experiences.

Product Development: They can be used to generate product descriptions and marketing materials.

Marketing: They can be used to create compelling marketing materials.

Product Development: They can be used to generate product descriptions and marketing materials.

Research: They can be used to explore new ideas and hypotheses.

Education: They can be used to create interactive learning experiences.

Customer Service: They can be used to provide personalized support to customers.

Content Creation: They can be used to generate creative content for various purposes.

Product Development: They can be used to generate product descriptions and marketing materials.

Research: They can be used to explore new ideas and hypotheses.

Education: They can be used to create interactive learning experiences.

Marketing: They can be used to create compelling marketing materials.

Product Development: They can be used to generate product descriptions and marketing materials.

Research: They can be used to explore new ideas and hypotheses.

Education: They can be used to create interactive learning experiences.

Customer Service: They can be used to provide personalized support to customers.

Content Creation: They can be used to generate creative content for various purposes.

Product Development: They can be used to generate product descriptions and marketing materials.

Research: They can be used to explore new ideas and hypotheses.

Education: They can be used to create interactive learning experiences.

Marketing: They can be used to create compelling marketing materials
```

</details>

<details><summary>大規模言語モデルについて教えて。</summary>

```
大規模言語モデルは、大量のデータを学習し、それらを基に高度な言語理解や生成能力を提供するAIシステムです。これらのモデルは、特定のタスクや問題に対する特定の能力を提供します。大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや問題に対する特定の能力を提供します。それらは、特定のタスクや問題に対する特定の能力を提供します。

大規模言語モデルは、特定のタスクや
```

</details>

<details><summary>A regular hexagon can be divided into six equilateral triangles. If the perimeter of one of the triangles is 21 inches, what is the perimeter, in inches, of the regular hexagon?</summary>

```
The perimeter of each equilateral triangle is 21 inches. Since a regular hexagon is composed of six equilateral triangles, the perimeter of the hexagon would be 6 times the perimeter of one triangle. Therefore, the perimeter of the hexagon would be 6 * 21 inches, which is 126 inches.

So, the perimeter of the regular hexagon is 126 inches.

Let me know if I can help with any other questions you might have. I'm here to assist with any math-related queries. If you have any other questions about geometry, algebra, or anything else, feel free to ask. I'm always ready to help!
```

</details>

Note that the model gave a wrong answer in the final example: the hexagon's perimeter counts only its six outer sides, each 21 / 3 = 7 inches, so the correct answer is 42 inches, not 126.

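The model's error in that example is multiplying the full triangle perimeter by six; the inner edges of the triangles are shared and do not contribute to the hexagon's boundary. A quick check of the intended reasoning:

```python
# The hexagon's perimeter counts only its six outer sides, not the
# full perimeter of each triangle (the inner edges are shared).
triangle_perimeter = 21            # inches, one equilateral triangle
side = triangle_perimeter // 3     # each side: 7 inches
hexagon_perimeter = 6 * side       # six outer sides of the hexagon
print(hexagon_perimeter)  # 42
```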
### Sample Usage
```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EQUES/TinyQwens-Merge-1.5B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "大規模言語モデルについて教えて。"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

### License
Apache-2.0


### Acknowledgement
- SakanaAI & the Swallow team: development and release of TinySwallow-1.5B
- SakanaAI: the Evolutionary Model Merge method
- Mergekit