Update README.md
README.md
- ru

---
# DarkAtom-12B-v3
*Something that shouldn't exist*

This is an interesting merge of **18 cool models**, created using [mergekit](https://github.com/arcee-ai/mergekit).

It took quite a bit of my time, mostly due to the limitations of my old hardware, but I think it was definitely worth it.

My thanks to the authors of the original models, your work is incredible.

Enjoy exploring :)

### Configuration

The following YAML configurations were used to produce this model. Some parameters may follow a different pattern, but that is not important for understanding my workflow.

Generation_1 from 18 original models:
```yaml
models:
  - model: Original_Model_M
  - model: Original_Model_K
merge_method: slerp
base_model: Original_Model_M
dtype: bfloat16
parameters:
  t: [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9]
```
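
For readers new to `slerp`, here is a minimal sketch of spherical linear interpolation between two flattened weight vectors. It illustrates the general technique only and is not mergekit's implementation. The function name `slerp` and the toy vectors are hypothetical, chosen for illustration.

```python
import math


def slerp(t, a, b):
    """Spherically interpolate between vectors a and b (hypothetical helper).

    t=0 returns a, t=1 returns b. Falls back to plain linear
    interpolation when the vectors are (nearly) parallel or zero.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

    # Angle between the two vectors, clamped for numerical safety.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    if abs(sin_omega) < 1e-6:
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Toy example: halfway between two orthogonal unit vectors.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In the configuration above, a small `t` keeps the result close to `base_model` and a large `t` moves it toward the other model, so the alternating 0.1 / 0.9 list swings between the two parents.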

Variant_N from Generation_1 and AlphaMerge:
```yaml
models:
  - model: SecretModel_A
    parameters:
      density: [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
      weight: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

  - model: SecretModel_B
    parameters:
      density: [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2]
      weight: [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8]

  - model: SecretModel_C
    parameters:
      density: [0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3]
      weight: [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7]

  - model: SecretModel_D
    parameters:
      density: [0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4]
      weight: [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6]

  - model: SecretModel_E
    parameters:
      density: [0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5]
      weight: [0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5]

  - model: SecretModel_F
    parameters:
      density: [0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
      weight: [0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]

  - model: SecretModel_G
    parameters:
      density: [0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
      weight: [0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]

  - model: SecretModel_H
    parameters:
      density: [0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
      weight: [0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

  - model: SecretModel_I
    parameters:
      density: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
      weight: [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

merge_method: ties
base_model: AlphaMerge
dtype: bfloat16
```
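
To give a rough idea of what `density` and `weight` control in a `ties` merge, here is a small sketch of the TIES idea (trim, elect sign, merge) on flat lists of numbers. It is a simplified illustration under my own assumptions, not mergekit's code. The helpers `trim` and `ties_merge` and the toy tensors are hypothetical, and a single scalar density and weight per model stand in for the per-layer lists used above.

```python
def trim(delta, density):
    """Keep the largest-magnitude `density` fraction of entries, zero the rest."""
    keep_count = int(round(density * len(delta)))
    if keep_count <= 0:
        return [0.0] * len(delta)
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:keep_count])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, densities, weights):
    """Merge `models` onto `base` with a simplified TIES procedure (sketch)."""
    # 1. Task vectors (model minus base), trimmed to the requested density.
    deltas = [
        trim([m - b for m, b in zip(model, base)], density)
        for model, density in zip(models, densities)
    ]
    # 2. Scale each task vector by its model weight.
    weighted = [[w * d for d in delta] for delta, w in zip(deltas, weights)]

    merged = []
    for j, base_value in enumerate(base):
        column = [wd[j] for wd in weighted]
        # 3. Elect the dominant sign for this parameter.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # 4. Average only the contributions that agree with that sign.
        agreeing = [(x, w) for x, w in zip(column, weights) if x * sign > 0]
        if agreeing:
            numerator = sum(x for x, _ in agreeing)
            denominator = sum(w for _, w in agreeing)
            merged.append(base_value + numerator / denominator)
        else:
            merged.append(base_value)
    return merged


# Toy example with two "models" of three parameters each.
result = ties_merge(
    base=[0.0, 0.0, 0.0],
    models=[[1.0, -2.0, 0.0], [3.0, 4.0, -1.0]],
    densities=[1.0, 1.0],
    weights=[1.0, 1.0],
)
```

Lower `density` discards more of a model's small changes before merging, while `weight` scales how strongly the surviving changes count.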

Model stock merges were used to create:
+ Generation_2 from SecretModels
+ Variant_M from Generation_2
+ AlphaMerge from intuitively selected (and since forgotten) models
```yaml
models:
  - model: SecretModel_A
  - model: SecretModel_B
  - model: SecretModel_C
merge_method: model_stock
base_model: SecretModel_A
dtype: bfloat16
```

Final Variant from Variant_N, Variant_M, and one good model from Generation_1:
```yaml
models:
  - model: Variant_N
    parameters:
      density: [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
      weight: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

  - model: Good_G1_Model
    parameters:
      density: [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2]
      weight: [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8]

merge_method: ties
base_model: Variant_M
dtype: bfloat16
```
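
The list-valued parameters in these configs (`t`, `density`, `weight`) are, as far as I understand mergekit's documentation, treated as gradients that are interpolated across the model's layers rather than as one value per layer. The sketch below shows one plausible way such a list could be mapped to a layer. The function `gradient_value` and the exact index arithmetic are assumptions for illustration, not a copy of mergekit's code.

```python
def gradient_value(values, layer_idx, num_layers):
    """Linearly interpolate a gradient list at a given layer (assumed mapping).

    The first list entry applies to the first layer, the last entry to the
    last layer, and layers in between get a blend of their two neighbours.
    """
    if len(values) == 1 or num_layers <= 1:
        return values[0]
    position = layer_idx / (num_layers - 1)   # 0.0 at first layer, 1.0 at last
    scaled = position * (len(values) - 1)     # position along the list
    lower = int(scaled)
    upper = min(len(values) - 1, lower + 1)
    frac = scaled - lower
    return (1 - frac) * values[lower] + frac * values[upper]


# Toy example: a two-point gradient spread over five layers.
per_layer = [gradient_value([0.1, 0.9], i, 5) for i in range(5)]
```

Under this reading, a 25-entry list does not require a 25-layer model: the entries are anchor points, and each layer receives a value interpolated between the nearest two.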

Have a good time 🖤