---
|
base_model: |
|
- Bacon666/Phenom-12B-0.1 |
|
- benhaotang/nemo-math-science-philosophy-12B |
|
- FallenMerick/MN-Chunky-Lotus-12B |
|
- GalrionSoftworks/Canidori-12B-v1 |
|
- GalrionSoftworks/Pleiades-12B-v1 |
|
- Luni/StarDust-12b-v2 |
|
- Nohobby/InsanityB |
|
- Nohobby/MN-12B-Siskin-v0.2 |
|
- ProdeusUnity/Stellar-Odyssey-12b-v0.0 |
|
- Pyroserenus/Orthrus-12b-v0.8 |
|
- rityak/MN-Maghin-12B |
|
- rityak/MN-RocinanteCelestar-12B |
|
- royallab/MN-LooseCannon-12B-v2 |
|
- spow12/ChatWaifu_12B_v2.0 |
|
- Svak/MN-12B-Inferor-v0.0 |
|
- ThijsL202/MadMix-Unleashed-12B |
|
- Trappu/Abomination-merge-attempt-12B |
|
- VongolaChouko/Starcannon-Unleashed-12B-v1.0 |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
- bfloat16 |
|
- safetensors |
|
- 12b |
|
- chat |
|
- creative |
|
- roleplay |
|
- conversational |
|
- creative-writing |
|
- not-for-all-audiences |
|
language: |
|
- en |
|
- ru |
|
|
|
--- |
|
# DarkAtom-12B-v3 |
|
|
|
>Something that shouldn't exist. |
|
|
|
|
|
|
This is an interesting merge of **18 cool models**, created using [mergekit](https://github.com/arcee-ai/mergekit). |
|
It took quite a bit of my time, mostly due to the limitations of my old hardware, but I think it was definitely worth it. |
|
Enjoy exploring :) |
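Since the merge ships as `bfloat16` safetensors for `transformers`, loading it is nothing special. Here is a minimal sketch, assuming the usual causal-LM setup; the repo id below is a placeholder, so substitute the actual path this card lives under:

```python
# Minimal loading sketch for a standard transformers causal LM.
# "your-namespace/DarkAtom-12B-v3" is a placeholder repo id, not the real path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/DarkAtom-12B-v3"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",
)

messages = [{"role": "user", "content": "Tell me a short story about a dark atom."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```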
|
|
|
## Merge Details |
|
### Method |
|
|
|
This model was merged in a multistep process combining SLERP, Model Stock, and TIES, with several remerges of intermediate model variations to get the best result.
|
|
|
### Models |
|
|
|
The following models were included in the merge: |
|
* [Bacon666/Phenom-12B-0.1](https://huggingface.co/Bacon666/Phenom-12B-0.1) |
|
* [benhaotang/nemo-math-science-philosophy-12B](https://huggingface.co/benhaotang/nemo-math-science-philosophy-12B) |
|
* [FallenMerick/MN-Chunky-Lotus-12B](https://huggingface.co/FallenMerick/MN-Chunky-Lotus-12B) |
|
* [GalrionSoftworks/Canidori-12B-v1](https://huggingface.co/GalrionSoftworks/Canidori-12B-v1) |
|
* [GalrionSoftworks/Pleiades-12B-v1](https://huggingface.co/GalrionSoftworks/Pleiades-12B-v1) |
|
* [Luni/StarDust-12b-v2](https://huggingface.co/Luni/StarDust-12b-v2) |
|
* [Nohobby/InsanityB](https://huggingface.co/Nohobby/InsanityB) |
|
* [Nohobby/MN-12B-Siskin-v0.2](https://huggingface.co/Nohobby/MN-12B-Siskin-v0.2) |
|
* [ProdeusUnity/Stellar-Odyssey-12b-v0.0](https://huggingface.co/ProdeusUnity/Stellar-Odyssey-12b-v0.0) |
|
* [Pyroserenus/Orthrus-12b-v0.8](https://huggingface.co/Pyroserenus/Orthrus-12b-v0.8) |
|
* [rityak/MN-Maghin-12B](https://huggingface.co/rityak/MN-Maghin-12B) |
|
* [rityak/MN-RocinanteCelestar-12B](https://huggingface.co/rityak/MN-RocinanteCelestar-12B) |
|
* [royallab/MN-LooseCannon-12B-v2](https://huggingface.co/royallab/MN-LooseCannon-12B-v2) |
|
* [spow12/ChatWaifu_12B_v2.0](https://huggingface.co/spow12/ChatWaifu_12B_v2.0) |
|
* [Svak/MN-12B-Inferor-v0.0](https://huggingface.co/Svak/MN-12B-Inferor-v0.0) |
|
* [ThijsL202/MadMix-Unleashed-12B](https://huggingface.co/ThijsL202/MadMix-Unleashed-12B) |
|
* [Trappu/Abomination-merge-attempt-12B](https://huggingface.co/Trappu/Abomination-merge-attempt-12B) |
|
* [VongolaChouko/Starcannon-Unleashed-12B-v1.0](https://huggingface.co/VongolaChouko/Starcannon-Unleashed-12B-v1.0) |
|
|
|
### Configuration |
|
|
|
The following YAML configurations were used to produce this model. Some parameters may follow a different pattern in the actual runs, but that isn't important for understanding the workflow.
|
|
|
```yaml
# Generation_1, built pairwise from the 18 original models:
models:
  - model: Original_Model_M
  - model: Original_Model_K
merge_method: slerp
base_model: Original_Model_M
dtype: bfloat16
parameters:
  t: [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9]

---

# Variant_N from Generation_1 and AlphaMerge:
models:
  - model: SecretModel_A
    parameters:
      density: [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
      weight: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
  - model: SecretModel_B
    parameters:
      density: [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2]
      weight: [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8]
  - model: SecretModel_C
    parameters:
      density: [0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3]
      weight: [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7]
  - model: SecretModel_D
    parameters:
      density: [0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4]
      weight: [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6]
  - model: SecretModel_E
    parameters:
      density: [0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5]
      weight: [0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5]
  - model: SecretModel_F
    parameters:
      density: [0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
      weight: [0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
  - model: SecretModel_G
    parameters:
      density: [0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
      weight: [0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
  - model: SecretModel_H
    parameters:
      density: [0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
      weight: [0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
  - model: SecretModel_I
    parameters:
      density: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
      weight: [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
merge_method: ties
base_model: AlphaMerge
dtype: bfloat16

---

# Model Stock merge, used to create:
# + Generation_2 from the SecretModels
# + Variant_M from Generation_2
# + AlphaMerge from intuitively selected (and since forgotten) models
models:
  - model: SecretModel_A
  - model: SecretModel_B
  - model: SecretModel_C
merge_method: model_stock
base_model: SecretModel_A
dtype: bfloat16

---

# Final variant from Variant_N, Variant_M, and one good model from Generation_1:
models:
  - model: Variant_N
    parameters:
      density: [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
      weight: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
  - model: Good_G1_Model
    parameters:
      density: [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2]
      weight: [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8]
merge_method: ties
base_model: Variant_M
dtype: bfloat16
```
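Each block above is a separate mergekit run: save a block to its own file and feed it to mergekit's CLI, e.g. `mergekit-yaml config.yaml ./output-model` (add `--cuda` if your hardware allows). The output of one stage then serves as an input model for the next.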
|
|
|
>My thanks to the authors of the original models; your work is incredible. Have a good time 🖤
|
|