# WestMaid_HermesMonarchv0.1

<img src="https://cdn-uploads.huggingface.co/production/uploads/655a9883cbbaec115c3fd6b3/YJTMJZF80hKaKnPDu_yMV.png" alt="drawing" width="800"/>
This model benchmarks quite well against other 7B models, and posts exceptional [MT-Bench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) and [EQ-Bench v2.1](https://github.com/EQ-bench/EQ-Bench) scores, ranking higher than ChatGPT-3.5-turbo, Claude-1, Goliath-120b, and other 70B models.
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as a base.

Density was chosen empirically for the models in this merge: after testing many densities, I settled on 0.58 for each of the chosen models, as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought I'd try gradients. Conceptually, WestLake and a distilled version of OpenHermes are heavier in the initial layers, before Noromaid and AlphaMonarch come in to guide its reasoning and conversation.
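The recipe described above can be sketched as a mergekit YAML config. Only the DARE-TIES method, the Mistral-7B-v0.1 base, and the 0.58 density come from this card; the component repository IDs and the exact gradient values are illustrative assumptions, not the actual recipe.

```yaml
# Hypothetical sketch of the merge config described above.
# Repo IDs and gradient weights are assumptions; density 0.58,
# the base model, and the merge method follow the card.
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: senseable/WestLake-7B-v2            # assumed repo ID
    parameters:
      density: 0.58
      weight: [0.6, 0.4, 0.2]                  # gradient: heavier in early layers
  - model: teknium/OpenHermes-2.5-Mistral-7B   # assumed; card says a distilled version
    parameters:
      density: 0.58
      weight: [0.6, 0.4, 0.2]
  - model: NeverSleep/Noromaid-7B-0.4-DPO      # assumed repo ID
    parameters:
      density: 0.58
      weight: [0.2, 0.4, 0.6]                  # gradient: heavier in later layers
  - model: mlabonne/AlphaMonarch-7B            # assumed repo ID
    parameters:
      density: 0.58
      weight: [0.2, 0.4, 0.6]
dtype: bfloat16
```

mergekit interpolates list-valued parameters such as `weight` across the layer stack, which is how a gradient like the one described above is expressed.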