Hjgugugjhuhjggg committed on
Commit 1f1d69a · verified · 1 Parent(s): 2fa11b3

Update README.md

Files changed (1)
  1. README.md +196 -890
README.md CHANGED
@@ -1,893 +1,199 @@
1
  ---
2
- base_model:
3
- - Hjgugugjhuhjggg/mergekit-ties-qgcitfu
4
- - ValiantLabs/Llama3.2-3B-ShiningValiant2
5
- - CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
6
- - Atharva26/llama-3.2-3b-mathdaily-chatbot
7
- - bunnycore/Llama-3.2-3B-ProdigyPlusPlus
8
- - disi-unibo-nlp/llama3.2-3B-SFT-medqa-triples-cot
9
- - Hjgugugjhuhjggg/mergekit-ties-poovzrh
10
- - bunnycore/Llama-3.2-3B-Long-Think
11
- - noaebbot/llama3.2-3B-insights
12
- - ValiantLabs/Llama3.2-3B-Enigma
13
- - huihui-ai/Llama-3.2-3B-Instruct-abliterated
14
- - meta-llama/Llama-3.2-3B-Instruct
15
- - Hjgugugjhuhjggg/mergekit-ties-pghuyfi
16
- - Diluksha/Llama_3.2_3B_sql_finetuned_full
17
- - bunnycore/Llama-3.2-3B-Mix
18
- - Hjgugugjhuhjggg/mergekit-ties-xflmond
19
- - bunnycore/Llama-3.2-3B-Pure-RP
20
- - chuanli11/Llama-3.2-3B-Instruct-uncensored
21
- - EmTpro01/llama-3.2-Code-Generator
22
- - bunnycore/Llama-3.2-3B-Booval
23
- - bunnycore/Llama-3.2-3B-Prodigy
24
- - BrainWave-ML/llama3.2-3B-codemath-orpo
25
- - bunnycore/Llama-3.2-3B-TitanFusion
26
- - bunnycore/Llama-3.2-3B-CodeReactor
27
- - Hjgugugjhuhjggg/mergekit-ties-kmlzhzo
28
- - Hjgugugjhuhjggg/mergekit-ties-esawwda
29
- - bunnycore/Llama-3.2-3B-TitanFusion-v2
30
- - disi-unibo-nlp/llama3.2-3B-SFT-medmcqa-triples-cot
31
- - bunnycore/Llama-3.2-3B-Mix-Skill
32
- - bunnycore/Llama-3.2-3B-Sci-Think
33
- - AELLM/Llama-3.2-Chibi-3B
34
- - AcademieDuNumerique/Llama-3.2-3B-SQL-Instruct
35
- - roger33303/Best_Model-llama3.2-3b-Instruct-Finetune-website-QnA
36
- - Hjgugugjhuhjggg/mergekit-ties-dkhnzcn
37
- - Isotonic/reasoning-llama3.2-3b
38
- - meta-llama/Llama-3.2-3B
39
- - bunnycore/Llama-3.2-3B-Apex
40
- - TroyDoesAI/BlackSheep-Llama3.2-3B-Context_Obedient
41
- - CK0607/llama3.2-3B-CodeP
42
- - bunnycore/Llama-3.2-3B-Stock
43
  library_name: transformers
44
- tags:
45
- - mergekit
46
- - merge
47
-
48
  ---
49
- # merge
50
-
51
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
52
-
53
- ## Merge Details
54
- ### Merge Method
55
-
56
- This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method using [huihui-ai/Llama-3.2-3B-Instruct-abliterated](https://huggingface.co/huihui-ai/Llama-3.2-3B-Instruct-abliterated) as a base.
57
-
58
- ### Models Merged
59
-
60
- The following models were included in the merge:
61
- * [Hjgugugjhuhjggg/mergekit-ties-qgcitfu](https://huggingface.co/Hjgugugjhuhjggg/mergekit-ties-qgcitfu)
62
- * [ValiantLabs/Llama3.2-3B-ShiningValiant2](https://huggingface.co/ValiantLabs/Llama3.2-3B-ShiningValiant2)
63
- * [CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct](https://huggingface.co/CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct)
64
- * [Atharva26/llama-3.2-3b-mathdaily-chatbot](https://huggingface.co/Atharva26/llama-3.2-3b-mathdaily-chatbot)
65
- * [bunnycore/Llama-3.2-3B-ProdigyPlusPlus](https://huggingface.co/bunnycore/Llama-3.2-3B-ProdigyPlusPlus)
66
- * [disi-unibo-nlp/llama3.2-3B-SFT-medqa-triples-cot](https://huggingface.co/disi-unibo-nlp/llama3.2-3B-SFT-medqa-triples-cot)
67
- * [Hjgugugjhuhjggg/mergekit-ties-poovzrh](https://huggingface.co/Hjgugugjhuhjggg/mergekit-ties-poovzrh)
68
- * [bunnycore/Llama-3.2-3B-Long-Think](https://huggingface.co/bunnycore/Llama-3.2-3B-Long-Think)
69
- * [noaebbot/llama3.2-3B-insights](https://huggingface.co/noaebbot/llama3.2-3B-insights)
70
- * [ValiantLabs/Llama3.2-3B-Enigma](https://huggingface.co/ValiantLabs/Llama3.2-3B-Enigma)
71
- * [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
72
- * [Hjgugugjhuhjggg/mergekit-ties-pghuyfi](https://huggingface.co/Hjgugugjhuhjggg/mergekit-ties-pghuyfi)
73
- * [Diluksha/Llama_3.2_3B_sql_finetuned_full](https://huggingface.co/Diluksha/Llama_3.2_3B_sql_finetuned_full)
74
- * [bunnycore/Llama-3.2-3B-Mix](https://huggingface.co/bunnycore/Llama-3.2-3B-Mix)
75
- * [Hjgugugjhuhjggg/mergekit-ties-xflmond](https://huggingface.co/Hjgugugjhuhjggg/mergekit-ties-xflmond)
76
- * [bunnycore/Llama-3.2-3B-Pure-RP](https://huggingface.co/bunnycore/Llama-3.2-3B-Pure-RP)
77
- * [chuanli11/Llama-3.2-3B-Instruct-uncensored](https://huggingface.co/chuanli11/Llama-3.2-3B-Instruct-uncensored)
78
- * [EmTpro01/llama-3.2-Code-Generator](https://huggingface.co/EmTpro01/llama-3.2-Code-Generator)
79
- * [bunnycore/Llama-3.2-3B-Booval](https://huggingface.co/bunnycore/Llama-3.2-3B-Booval)
80
- * [bunnycore/Llama-3.2-3B-Prodigy](https://huggingface.co/bunnycore/Llama-3.2-3B-Prodigy)
81
- * [BrainWave-ML/llama3.2-3B-codemath-orpo](https://huggingface.co/BrainWave-ML/llama3.2-3B-codemath-orpo)
82
- * [bunnycore/Llama-3.2-3B-TitanFusion](https://huggingface.co/bunnycore/Llama-3.2-3B-TitanFusion)
83
- * [bunnycore/Llama-3.2-3B-CodeReactor](https://huggingface.co/bunnycore/Llama-3.2-3B-CodeReactor)
84
- * [Hjgugugjhuhjggg/mergekit-ties-kmlzhzo](https://huggingface.co/Hjgugugjhuhjggg/mergekit-ties-kmlzhzo)
85
- * [Hjgugugjhuhjggg/mergekit-ties-esawwda](https://huggingface.co/Hjgugugjhuhjggg/mergekit-ties-esawwda)
86
- * [bunnycore/Llama-3.2-3B-TitanFusion-v2](https://huggingface.co/bunnycore/Llama-3.2-3B-TitanFusion-v2)
87
- * [disi-unibo-nlp/llama3.2-3B-SFT-medmcqa-triples-cot](https://huggingface.co/disi-unibo-nlp/llama3.2-3B-SFT-medmcqa-triples-cot)
88
- * [bunnycore/Llama-3.2-3B-Mix-Skill](https://huggingface.co/bunnycore/Llama-3.2-3B-Mix-Skill)
89
- * [bunnycore/Llama-3.2-3B-Sci-Think](https://huggingface.co/bunnycore/Llama-3.2-3B-Sci-Think)
90
- * [AELLM/Llama-3.2-Chibi-3B](https://huggingface.co/AELLM/Llama-3.2-Chibi-3B)
91
- * [AcademieDuNumerique/Llama-3.2-3B-SQL-Instruct](https://huggingface.co/AcademieDuNumerique/Llama-3.2-3B-SQL-Instruct)
92
- * [roger33303/Best_Model-llama3.2-3b-Instruct-Finetune-website-QnA](https://huggingface.co/roger33303/Best_Model-llama3.2-3b-Instruct-Finetune-website-QnA)
93
- * [Hjgugugjhuhjggg/mergekit-ties-dkhnzcn](https://huggingface.co/Hjgugugjhuhjggg/mergekit-ties-dkhnzcn)
94
- * [Isotonic/reasoning-llama3.2-3b](https://huggingface.co/Isotonic/reasoning-llama3.2-3b)
95
- * [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)
96
- * [bunnycore/Llama-3.2-3B-Apex](https://huggingface.co/bunnycore/Llama-3.2-3B-Apex)
97
- * [TroyDoesAI/BlackSheep-Llama3.2-3B-Context_Obedient](https://huggingface.co/TroyDoesAI/BlackSheep-Llama3.2-3B-Context_Obedient)
98
- * [CK0607/llama3.2-3B-CodeP](https://huggingface.co/CK0607/llama3.2-3B-CodeP)
99
- * [bunnycore/Llama-3.2-3B-Stock](https://huggingface.co/bunnycore/Llama-3.2-3B-Stock)
100
-
101
- ### Configuration
102
-
103
- The following YAML configuration was used to produce this model:
104
-
105
- ```yaml
106
- models:
107
- - layer_range: [0, 28]
108
- model: Hjgugugjhuhjggg/mergekit-ties-qgcitfu
109
- parameters:
110
- weight: 1
111
- density: 0.9
112
- gamma: 0.01
113
- normalize: true
114
- int8_mask: true
115
- random_seed: 0
116
- temperature: 0.5
117
- top_p: 0.65
118
- inference: true
119
- max_tokens: 999999999
120
- stream: true
121
- quantization:
122
- - method: int8
123
- value: 100
124
- - method: int4
125
- value: 100
126
-
127
- - layer_range: [0, 28]
128
- model: Hjgugugjhuhjggg/mergekit-ties-esawwda
129
- parameters:
130
- weight: 1
131
- density: 0.9
132
- gamma: 0.01
133
- normalize: true
134
- int8_mask: true
135
- random_seed: 0
136
- temperature: 0.5
137
- top_p: 0.65
138
- inference: true
139
- max_tokens: 999999999
140
- stream: true
141
- quantization:
142
- - method: int8
143
- value: 100
144
- - method: int4
145
- value: 100
146
-
147
- - layer_range: [0, 28]
148
- model: Hjgugugjhuhjggg/mergekit-ties-dkhnzcn
149
- parameters:
150
- weight: 1
151
- density: 0.9
152
- gamma: 0.01
153
- normalize: true
154
- int8_mask: true
155
- random_seed: 0
156
- temperature: 0.5
157
- top_p: 0.65
158
- inference: true
159
- max_tokens: 999999999
160
- stream: true
161
- quantization:
162
- - method: int8
163
- value: 100
164
- - method: int4
165
- value: 100
166
-
167
- - layer_range: [0, 28]
168
- model: Hjgugugjhuhjggg/mergekit-ties-poovzrh
169
- parameters:
170
- weight: 1
171
- density: 0.9
172
- gamma: 0.01
173
- normalize: true
174
- int8_mask: true
175
- random_seed: 0
176
- temperature: 0.5
177
- top_p: 0.65
178
- inference: true
179
- max_tokens: 999999999
180
- stream: true
181
- quantization:
182
- - method: int8
183
- value: 100
184
- - method: int4
185
- value: 100
186
-
187
- - layer_range: [0, 28]
188
- model: Hjgugugjhuhjggg/mergekit-ties-pghuyfi
189
- parameters:
190
- weight: 1
191
- density: 0.9
192
- gamma: 0.01
193
- normalize: true
194
- int8_mask: true
195
- random_seed: 0
196
- temperature: 0.5
197
- top_p: 0.65
198
- inference: true
199
- max_tokens: 999999999
200
- stream: true
201
- quantization:
202
- - method: int8
203
- value: 100
204
- - method: int4
205
- value: 100
206
-
207
- - layer_range: [0, 28]
208
- model: Hjgugugjhuhjggg/mergekit-ties-kmlzhzo
209
- parameters:
210
- weight: 1
211
- density: 0.9
212
- gamma: 0.01
213
- normalize: true
214
- int8_mask: true
215
- random_seed: 0
216
- temperature: 0.5
217
- top_p: 0.65
218
- inference: true
219
- max_tokens: 999999999
220
- stream: true
221
- quantization:
222
- - method: int8
223
- value: 100
224
- - method: int4
225
- value: 100
226
-
227
- - layer_range: [0, 28]
228
- model: Hjgugugjhuhjggg/mergekit-ties-xflmond
229
- parameters:
230
- weight: 1
231
- density: 0.9
232
- gamma: 0.01
233
- normalize: true
234
- int8_mask: true
235
- random_seed: 0
236
- temperature: 0.5
237
- top_p: 0.65
238
- inference: true
239
- max_tokens: 999999999
240
- stream: true
241
- quantization:
242
- - method: int8
243
- value: 100
244
- - method: int4
245
- value: 100
246
-
247
- - layer_range: [0, 28]
248
- model: bunnycore/Llama-3.2-3B-Long-Think
249
- parameters:
250
- weight: 0.5
251
- density: 0.5
252
- gamma: 0.01
253
- normalize: true
254
- int8_mask: true
255
- random_seed: 0
256
- temperature: 0.5
257
- top_p: 0.65
258
- inference: true
259
- max_tokens: 999999999
260
- stream: true
261
- quantization:
262
- - method: int8
263
- value: 100
264
- - method: int4
265
- value: 100
266
-
267
- - layer_range: [0, 28]
268
- model: bunnycore/Llama-3.2-3B-Pure-RP
269
- parameters:
270
- weight: 0.5
271
- density: 0.5
272
- gamma: 0.01
273
- normalize: true
274
- int8_mask: true
275
- random_seed: 0
276
- temperature: 0.5
277
- top_p: 0.65
278
- inference: true
279
- max_tokens: 999999999
280
- stream: true
281
- quantization:
282
- - method: int8
283
- value: 100
284
- - method: int4
285
- value: 100
286
- - layer_range: [0, 28]
287
- model: bunnycore/Llama-3.2-3B-Apex
288
- parameters:
289
- weight: 0.5
290
- density: 0.5
291
- gamma: 0.01
292
- normalize: true
293
- int8_mask: true
294
- random_seed: 0
295
- temperature: 0.5
296
- top_p: 0.65
297
- inference: true
298
- max_tokens: 999999999
299
- stream: true
300
- quantization:
301
- - method: int8
302
- value: 100
303
- - method: int4
304
- value: 100
305
- - layer_range: [0, 28]
306
- model: bunnycore/Llama-3.2-3B-Mix-Skill
307
- parameters:
308
- weight: 0.5
309
- density: 0.5
310
- gamma: 0.01
311
- normalize: true
312
- int8_mask: true
313
- random_seed: 0
314
- temperature: 0.5
315
- top_p: 0.65
316
- inference: true
317
- max_tokens: 999999999
318
- stream: true
319
- quantization:
320
- - method: int8
321
- value: 100
322
- - method: int4
323
- value: 100
324
- - layer_range: [0, 28]
325
- model: bunnycore/Llama-3.2-3B-Booval
326
- parameters:
327
- weight: 0.5
328
- density: 0.5
329
- gamma: 0.01
330
- normalize: true
331
- int8_mask: true
332
- random_seed: 0
333
- temperature: 0.5
334
- top_p: 0.65
335
- inference: true
336
- max_tokens: 999999999
337
- stream: true
338
- quantization:
339
- - method: int8
340
- value: 100
341
- - method: int4
342
- value: 100
343
- - layer_range: [0, 28]
344
- model: bunnycore/Llama-3.2-3B-ProdigyPlusPlus
345
- parameters:
346
- weight: 0.5
347
- density: 0.5
348
- gamma: 0.01
349
- normalize: true
350
- int8_mask: true
351
- random_seed: 0
352
- temperature: 0.5
353
- top_p: 0.65
354
- inference: true
355
- max_tokens: 999999999
356
- stream: true
357
- quantization:
358
- - method: int8
359
- value: 100
360
- - method: int4
361
- value: 100
362
- - layer_range: [0, 28]
363
- model: bunnycore/Llama-3.2-3B-Prodigy
364
- parameters:
365
- weight: 0.5
366
- density: 0.5
367
- gamma: 0.01
368
- normalize: true
369
- int8_mask: true
370
- random_seed: 0
371
- temperature: 0.5
372
- top_p: 0.65
373
- inference: true
374
- max_tokens: 999999999
375
- stream: true
376
- quantization:
377
- - method: int8
378
- value: 100
379
- - method: int4
380
- value: 100
381
- - layer_range: [0, 28]
382
- model: bunnycore/Llama-3.2-3B-Sci-Think
383
- parameters:
384
- weight: 0.5
385
- density: 0.5
386
- gamma: 0.01
387
- normalize: true
388
- int8_mask: true
389
- random_seed: 0
390
- temperature: 0.5
391
- top_p: 0.65
392
- inference: true
393
- max_tokens: 999999999
394
- stream: true
395
- quantization:
396
- - method: int8
397
- value: 100
398
- - method: int4
399
- value: 100
400
- - layer_range: [0, 28]
401
- model: bunnycore/Llama-3.2-3B-Stock
402
- parameters:
403
- weight: 0.5
404
- density: 0.5
405
- gamma: 0.01
406
- normalize: true
407
- int8_mask: true
408
- random_seed: 0
409
- temperature: 0.5
410
- top_p: 0.65
411
- inference: true
412
- max_tokens: 999999999
413
- stream: true
414
- quantization:
415
- - method: int8
416
- value: 100
417
- - method: int4
418
- value: 100
419
- - layer_range: [0, 28]
420
- model: chuanli11/Llama-3.2-3B-Instruct-uncensored
421
- parameters:
422
- weight: 0.5
423
- density: 0.5
424
- gamma: 0.01
425
- normalize: true
426
- int8_mask: true
427
- random_seed: 0
428
- temperature: 0.5
429
- top_p: 0.65
430
- inference: true
431
- max_tokens: 999999999
432
- stream: true
433
- quantization:
434
- - method: int8
435
- value: 100
436
- - method: int4
437
- value: 100
438
- - layer_range: [0, 28]
439
- model: ValiantLabs/Llama3.2-3B-Enigma
440
- parameters:
441
- weight: 0.5
442
- density: 0.5
443
- gamma: 0.01
444
- normalize: true
445
- int8_mask: true
446
- random_seed: 0
447
- temperature: 0.5
448
- top_p: 0.65
449
- inference: true
450
- max_tokens: 999999999
451
- stream: true
452
- quantization:
453
- - method: int8
454
- value: 100
455
- - method: int4
456
- value: 100
457
- - layer_range: [0, 28]
458
- model: CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
459
- parameters:
460
- weight: 0.5
461
- density: 0.5
462
- gamma: 0.01
463
- normalize: true
464
- int8_mask: true
465
- random_seed: 0
466
- temperature: 0.5
467
- top_p: 0.65
468
- inference: true
469
- max_tokens: 999999999
470
- stream: true
471
- quantization:
472
- - method: int8
473
- value: 100
474
- - method: int4
475
- value: 100
476
- - layer_range: [0, 28]
477
- model: AELLM/Llama-3.2-Chibi-3B
478
- parameters:
479
- weight: 0.5
480
- density: 0.5
481
- gamma: 0.01
482
- normalize: true
483
- int8_mask: true
484
- random_seed: 0
485
- temperature: 0.5
486
- top_p: 0.65
487
- inference: true
488
- max_tokens: 999999999
489
- stream: true
490
- quantization:
491
- - method: int8
492
- value: 100
493
- - method: int4
494
- value: 100
495
- - layer_range: [0, 28]
496
- model: EmTpro01/llama-3.2-Code-Generator
497
- parameters:
498
- weight: 0.5
499
- density: 0.5
500
- gamma: 0.01
501
- normalize: true
502
- int8_mask: true
503
- random_seed: 0
504
- temperature: 0.5
505
- top_p: 0.65
506
- inference: true
507
- max_tokens: 999999999
508
- stream: true
509
- quantization:
510
- - method: int8
511
- value: 100
512
- - method: int4
513
- value: 100
514
- - layer_range: [0, 28]
515
- model: disi-unibo-nlp/llama3.2-3B-SFT-medmcqa-triples-cot
516
- parameters:
517
- weight: 0.5
518
- density: 0.5
519
- gamma: 0.01
520
- normalize: true
521
- int8_mask: true
522
- random_seed: 0
523
- temperature: 0.5
524
- top_p: 0.65
525
- inference: true
526
- max_tokens: 999999999
527
- stream: true
528
- quantization:
529
- - method: int8
530
- value: 100
531
- - method: int4
532
- value: 100
533
- - layer_range: [0, 28]
534
- model: Atharva26/llama-3.2-3b-mathdaily-chatbot
535
- parameters:
536
- weight: 0.5
537
- density: 0.5
538
- gamma: 0.01
539
- normalize: true
540
- int8_mask: true
541
- random_seed: 0
542
- temperature: 0.5
543
- top_p: 0.65
544
- inference: true
545
- max_tokens: 999999999
546
- stream: true
547
- quantization:
548
- - method: int8
549
- value: 100
550
- - method: int4
551
- value: 100
552
- - layer_range: [0, 28]
553
- model: Diluksha/Llama_3.2_3B_sql_finetuned_full
554
- parameters:
555
- weight: 0.5
556
- density: 0.5
557
- gamma: 0.01
558
- normalize: true
559
- int8_mask: true
560
- random_seed: 0
561
- temperature: 0.5
562
- top_p: 0.65
563
- inference: true
564
- max_tokens: 999999999
565
- stream: true
566
- quantization:
567
- - method: int8
568
- value: 100
569
- - method: int4
570
- value: 100
571
- - layer_range: [0, 28]
572
- model: bunnycore/Llama-3.2-3B-CodeReactor
573
- parameters:
574
- weight: 0.5
575
- density: 0.5
576
- gamma: 0.01
577
- normalize: true
578
- int8_mask: true
579
- random_seed: 0
580
- temperature: 0.5
581
- top_p: 0.65
582
- inference: true
583
- max_tokens: 999999999
584
- stream: true
585
- quantization:
586
- - method: int8
587
- value: 100
588
- - method: int4
589
- value: 100
590
- - layer_range: [0, 28]
591
- model: AcademieDuNumerique/Llama-3.2-3B-SQL-Instruct
592
- parameters:
593
- weight: 0.5
594
- density: 0.5
595
- gamma: 0.01
596
- normalize: true
597
- int8_mask: true
598
- random_seed: 0
599
- temperature: 0.5
600
- top_p: 0.65
601
- inference: true
602
- max_tokens: 999999999
603
- stream: true
604
- quantization:
605
- - method: int8
606
- value: 100
607
- - method: int4
608
- value: 100
609
- - layer_range: [0, 28]
610
- model: roger33303/Best_Model-llama3.2-3b-Instruct-Finetune-website-QnA
611
- parameters:
612
- weight: 0.5
613
- density: 0.5
614
- gamma: 0.01
615
- normalize: true
616
- int8_mask: true
617
- random_seed: 0
618
- temperature: 0.5
619
- top_p: 0.65
620
- inference: true
621
- max_tokens: 999999999
622
- stream: true
623
- quantization:
624
- - method: int8
625
- value: 100
626
- - method: int4
627
- value: 100
628
- - layer_range: [0, 28]
629
- model: noaebbot/llama3.2-3B-insights
630
- parameters:
631
- weight: 0.5
632
- density: 0.5
633
- gamma: 0.01
634
- normalize: true
635
- int8_mask: true
636
- random_seed: 0
637
- temperature: 0.5
638
- top_p: 0.65
639
- inference: true
640
- max_tokens: 999999999
641
- stream: true
642
- quantization:
643
- - method: int8
644
- value: 100
645
- - method: int4
646
- value: 100
647
- - layer_range: [0, 28]
648
- model: bunnycore/Llama-3.2-3B-TitanFusion-v2
649
- parameters:
650
- weight: 0.5
651
- density: 0.5
652
- gamma: 0.01
653
- normalize: true
654
- int8_mask: true
655
- random_seed: 0
656
- temperature: 0.5
657
- top_p: 0.65
658
- inference: true
659
- max_tokens: 999999999
660
- stream: true
661
- quantization:
662
- - method: int8
663
- value: 100
664
- - method: int4
665
- value: 100
666
- - layer_range: [0, 28]
667
- model: bunnycore/Llama-3.2-3B-TitanFusion
668
- parameters:
669
- weight: 0.5
670
- density: 0.5
671
- gamma: 0.01
672
- normalize: true
673
- int8_mask: true
674
- random_seed: 0
675
- temperature: 0.5
676
- top_p: 0.65
677
- inference: true
678
- max_tokens: 999999999
679
- stream: true
680
- quantization:
681
- - method: int8
682
- value: 100
683
- - method: int4
684
- value: 100
685
- - layer_range: [0, 28]
686
- model: bunnycore/Llama-3.2-3B-Mix
687
- parameters:
688
- weight: 0.5
689
- density: 0.5
690
- gamma: 0.01
691
- normalize: true
692
- int8_mask: true
693
- random_seed: 0
694
- temperature: 0.5
695
- top_p: 0.65
696
- inference: true
697
- max_tokens: 999999999
698
- stream: true
699
- quantization:
700
- - method: int8
701
- value: 100
702
- - method: int4
703
- value: 100
704
- - layer_range: [0, 28]
705
- model: ValiantLabs/Llama3.2-3B-ShiningValiant2
706
- parameters:
707
- weight: 0.5
708
- density: 0.5
709
- gamma: 0.01
710
- normalize: true
711
- int8_mask: true
712
- random_seed: 0
713
- temperature: 0.5
714
- top_p: 0.65
715
- inference: true
716
- max_tokens: 999999999
717
- stream: true
718
- quantization:
719
- - method: int8
720
- value: 100
721
- - method: int4
722
- value: 100
723
- - layer_range: [0, 28]
724
- model: TroyDoesAI/BlackSheep-Llama3.2-3B-Context_Obedient
725
- parameters:
726
- weight: 0.5
727
- density: 0.5
728
- gamma: 0.01
729
- normalize: true
730
- int8_mask: true
731
- random_seed: 0
732
- temperature: 0.5
733
- top_p: 0.65
734
- inference: true
735
- max_tokens: 999999999
736
- stream: true
737
- quantization:
738
- - method: int8
739
- value: 100
740
- - method: int4
741
- value: 100
742
- - layer_range: [0, 28]
743
- model: BrainWave-ML/llama3.2-3B-codemath-orpo
744
- parameters:
745
- weight: 0.5
746
- density: 0.5
747
- gamma: 0.01
748
- normalize: true
749
- int8_mask: true
750
- random_seed: 0
751
- temperature: 0.5
752
- top_p: 0.65
753
- inference: true
754
- max_tokens: 999999999
755
- stream: true
756
- quantization:
757
- - method: int8
758
- value: 100
759
- - method: int4
760
- value: 100
761
- - layer_range: [0, 28]
762
- model: CK0607/llama3.2-3B-CodeP
763
- parameters:
764
- weight: 0.5
765
- density: 0.5
766
- gamma: 0.01
767
- normalize: true
768
- int8_mask: true
769
- random_seed: 0
770
- temperature: 0.5
771
- top_p: 0.65
772
- inference: true
773
- max_tokens: 999999999
774
- stream: true
775
- quantization:
776
- - method: int8
777
- value: 100
778
- - method: int4
779
- value: 100
780
- - layer_range: [0, 28]
781
- model: disi-unibo-nlp/llama3.2-3B-SFT-medqa-triples-cot
782
- parameters:
783
- weight: 0.5
784
- density: 0.5
785
- gamma: 0.01
786
- normalize: true
787
- int8_mask: true
788
- random_seed: 0
789
- temperature: 0.5
790
- top_p: 0.65
791
- inference: true
792
- max_tokens: 999999999
793
- stream: true
794
- quantization:
795
- - method: int8
796
- value: 100
797
- - method: int4
798
- value: 100
799
- - layer_range: [0, 28]
800
- model: Isotonic/reasoning-llama3.2-3b
801
- parameters:
802
- weight: 0.5
803
- density: 0.5
804
- gamma: 0.01
805
- normalize: true
806
- int8_mask: true
807
- random_seed: 0
808
- temperature: 0.5
809
- top_p: 0.65
810
- inference: true
811
- max_tokens: 999999999
812
- stream: true
813
- quantization:
814
- - method: int8
815
- value: 100
816
- - method: int4
817
- value: 100
818
- - layer_range: [0, 28]
819
- model: meta-llama/Llama-3.2-3B-Instruct
820
- parameters:
821
- weight: 0.5
822
- density: 0.5
823
- gamma: 0.01
824
- normalize: true
825
- int8_mask: true
826
- random_seed: 0
827
- temperature: 0.5
828
- top_p: 0.65
829
- inference: true
830
- max_tokens: 999999999
831
- stream: true
832
- quantization:
833
- - method: int8
834
- value: 100
835
- - method: int4
836
- value: 100
837
- - layer_range: [0, 28]
838
- model: meta-llama/Llama-3.2-3B
839
- parameters:
840
- weight: 0.5
841
- density: 0.5
842
- gamma: 0.01
843
- normalize: true
844
- int8_mask: true
845
- random_seed: 0
846
- temperature: 0.5
847
- top_p: 0.65
848
- inference: true
849
- max_tokens: 999999999
850
- stream: true
851
- quantization:
852
- - method: int8
853
- value: 100
854
- - method: int4
855
- value: 100
856
-
857
- merge_method: linear
858
- base_model: huihui-ai/Llama-3.2-3B-Instruct-abliterated
859
- weight: 1
860
- density: 0.9
861
- gamma: 0.01
862
- normalize: true
863
- int8_mask: true
864
- random_seed: 0
865
- temperature: 0.5
866
- top_p: 0.65
867
- inference: true
868
- max_tokens: 999999999
869
- stream: true
870
- quantization:
871
- - method: int8
872
- value: 100
873
- - method: int4
874
- value: 100
875
- parameters:
876
- weight: 1
877
- density: 0.9
878
- gamma: 0.01
879
- normalize: true
880
- int8_mask: true
881
- random_seed: 0
882
- temperature: 0.5
883
- top_p: 0.65
884
- inference: true
885
- max_tokens: 999999999
886
- stream: true
887
- quantization:
888
- - method: int8
889
- value: 100
890
- - method: int4
891
- value: 100
892
- dtype: float16
893
- ```
 
1
  ---
2
  library_name: transformers
3
+ tags: []
4
  ---
5
+
6
+ # Model Card for Model ID
7
+
8
+ <!-- Provide a quick summary of what the model is/does. -->
9
+
10
+
11
+
12
+ ## Model Details
13
+
14
+ ### Model Description
15
+
16
+ <!-- Provide a longer summary of what this model is. -->
17
+
18
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
19
+
20
+ - **Developed by:** [More Information Needed]
21
+ - **Funded by [optional]:** [More Information Needed]
22
+ - **Shared by [optional]:** [More Information Needed]
23
+ - **Model type:** [More Information Needed]
24
+ - **Language(s) (NLP):** [More Information Needed]
25
+ - **License:** [More Information Needed]
26
+ - **Finetuned from model [optional]:** [More Information Needed]
27
+
28
+ ### Model Sources [optional]
29
+
30
+ <!-- Provide the basic links for the model. -->
31
+
32
+ - **Repository:** [More Information Needed]
33
+ - **Paper [optional]:** [More Information Needed]
34
+ - **Demo [optional]:** [More Information Needed]
35
+
36
+ ## Uses
37
+
38
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
+
40
+ ### Direct Use
41
+
42
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
+
44
+ [More Information Needed]
45
+
46
+ ### Downstream Use [optional]
47
+
48
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
+
50
+ [More Information Needed]
51
+
52
+ ### Out-of-Scope Use
53
+
54
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
+
56
+ [More Information Needed]
57
+
58
+ ## Bias, Risks, and Limitations
59
+
60
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
+
62
+ [More Information Needed]
63
+
64
+ ### Recommendations
65
+
66
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
+
68
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
+
70
+ ## How to Get Started with the Model
71
+
72
+ Use the code below to get started with the model.
73
+
74
+ [More Information Needed]
75
+
76
+ ## Training Details
77
+
78
+ ### Training Data
79
+
80
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
+
82
+ [More Information Needed]
83
+
84
+ ### Training Procedure
85
+
86
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
+
88
+ #### Preprocessing [optional]
89
+
90
+ [More Information Needed]
91
+
92
+
93
+ #### Training Hyperparameters
94
+
95
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
+
97
+ #### Speeds, Sizes, Times [optional]
98
+
99
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
+
101
+ [More Information Needed]
102
+
103
+ ## Evaluation
104
+
105
+ <!-- This section describes the evaluation protocols and provides the results. -->
106
+
107
+ ### Testing Data, Factors & Metrics
108
+
109
+ #### Testing Data
110
+
111
+ <!-- This should link to a Dataset Card if possible. -->
112
+
113
+ [More Information Needed]
114
+
115
+ #### Factors
116
+
117
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
+
119
+ [More Information Needed]
120
+
121
+ #### Metrics
122
+
123
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
+
125
+ [More Information Needed]
126
+
127
+ ### Results
128
+
129
+ [More Information Needed]
130
+
131
+ #### Summary
132
+
133
+
134
+
135
+ ## Model Examination [optional]
136
+
137
+ <!-- Relevant interpretability work for the model goes here -->
138
+
139
+ [More Information Needed]
140
+
141
+ ## Environmental Impact
142
+
143
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
+
145
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
+
147
+ - **Hardware Type:** [More Information Needed]
148
+ - **Hours used:** [More Information Needed]
149
+ - **Cloud Provider:** [More Information Needed]
150
+ - **Compute Region:** [More Information Needed]
151
+ - **Carbon Emitted:** [More Information Needed]
152
+
153
+ ## Technical Specifications [optional]
154
+
155
+ ### Model Architecture and Objective
156
+
157
+ [More Information Needed]
158
+
159
+ ### Compute Infrastructure
160
+
161
+ [More Information Needed]
162
+
163
+ #### Hardware
164
+
165
+ [More Information Needed]
166
+
167
+ #### Software
168
+
169
+ [More Information Needed]
170
+
171
+ ## Citation [optional]
172
+
173
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
+
175
+ **BibTeX:**
176
+
177
+ [More Information Needed]
178
+
179
+ **APA:**
180
+
181
+ [More Information Needed]
182
+
183
+ ## Glossary [optional]
184
+
185
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
+
187
+ [More Information Needed]
188
+
189
+ ## More Information [optional]
190
+
191
+ [More Information Needed]
192
+
193
+ ## Model Card Authors [optional]
194
+
195
+ [More Information Needed]
196
+
197
+ ## Model Card Contact
198
+
199
+ [More Information Needed]