# WestMaid_HermesMonarchv0.1

<img src="https://cdn-uploads.huggingface.co/production/uploads/655a9883cbbaec115c3fd6b3/YJTMJZF80hKaKnPDu_yMV.png" alt="drawing" width="800"/>
This model benchmarks quite well against other 7B models, and posts exceptional [MT-Bench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) and [EQ-Bench v2.1](https://github.com/EQ-bench/EQ-Bench) scores, ranking higher than ChatGPT-3.5-turbo, Claude-1, Goliath-120b, and other 70B models.
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as a base.

Density was chosen empirically for the models in this merge: after testing many densities, I settled on 0.58 for each of the chosen models, as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought I'd try gradients. Conceptually, WestLake and a distilled version of OpenHermes are heavier in the initial layers, before Noromaid and AlphaMonarch come in to guide its reasoning and conversation.
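The recipe described above can be sketched as a mergekit YAML config. Only the DARE-TIES method, the Mistral-7B-v0.1 base, and the 0.58 density come from this card; the component repository IDs and the exact gradient values are illustrative assumptions, not the actual recipe.

```yaml
# Hypothetical sketch of the merge config described above.
# Repo IDs and gradient weights are assumptions; density 0.58,
# the base model, and the merge method follow the card.
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: senseable/WestLake-7B-v2            # assumed repo ID
    parameters:
      density: 0.58
      weight: [0.6, 0.4, 0.2]                  # gradient: heavier in early layers
  - model: teknium/OpenHermes-2.5-Mistral-7B   # assumed; card says a distilled version
    parameters:
      density: 0.58
      weight: [0.6, 0.4, 0.2]
  - model: NeverSleep/Noromaid-7B-0.4-DPO      # assumed repo ID
    parameters:
      density: 0.58
      weight: [0.2, 0.4, 0.6]                  # gradient: heavier in later layers
  - model: mlabonne/AlphaMonarch-7B            # assumed repo ID
    parameters:
      density: 0.58
      weight: [0.2, 0.4, 0.6]
dtype: bfloat16
```

mergekit interpolates list-valued parameters such as `weight` across the layer stack, which is how a gradient like the one described above is expressed.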