lizpreciatior committed on
Commit b366c0b · 1 Parent(s): 2e4c32c

Update README.md

Files changed (1)
  1. README.md +7 -9
README.md CHANGED
@@ -4,7 +4,7 @@ license: cc-by-nc-2.0
 
 
 # lzlv_70B
-## A Mythomax/MLewd_13B style merge of selected 70B models
+## A Mythomax/MLewd_13B-style merge of selected 70B models
 
 A multi-model merge of several LLaMA2 70B finetunes for roleplaying and creative work. The goal was to create a model that combines creativity with intelligence for an enhanced experience.
 
@@ -19,14 +19,12 @@ GGUF 4_K_M + 5_K_M can be found here: https://huggingface.co/lizpreciatior/lzlv_
 
 Models used:
 - **NousResearch/Nous-Hermes-Llama2-70b** - A great model for roleplaying, but not the best at following complex instructions.
-- **Xwin-LM/Xwin-LM-70B-V0.1** - Excellent at following instructions and quite creative out of the box, so it seemed like the best available model to act as the base of the merge.
-- **Doctor-Shotgun/Mythospice-70b** - The wildcard of the three. I was looking for a creative, NSFW-oriented model and came across this while digging through hf. I had never heard of it before, and apparently no one had bothered to release a quantized version of this model. So I downloaded it and quantized it myself to test it. It turned out to be more or less what I was looking for as my third component, so I used it here.
+- **Xwin-LM/Xwin-LM-70B-V0.1** - Excellent at following instructions and quite creative out of the box, so it seemed like the best available model to act as the base for the merge.
+- **Doctor-Shotgun/Mythospice-70b** - The wildcard of the three. I was looking for a creative, NSFW-oriented model and came across this while digging through hf. I hadn't heard of it before, and apparently no one had bothered to release a quantized version of this model, so I downloaded it and quantized it myself to test it. It turned out to be more or less what I was looking for as my third component, so I used it here.
 
-A big thank you to the creators of the models above. If you look up Mythospice, you will notice that it also includes Nous-Hermes, so it's technically present twice in this mix. This is common practice in 13B merges, so I didn't bother to correct it here either.
+A big thank you to the creators of the models above. If you look up Mythospice, you will notice that it also includes Nous-Hermes, so it's technically present twice in this mix. This is apparently common practice amongst the cool kids who do 13B models, so I don't think this hurts the model.
 
 
-I chose three of my favourite models that seemed to complement each other, and adjusted the ratios according to my preference.
-
 The merging process was heavily inspired by Undi95's approach in Undi95/MXLewdMini-L2-13B. To be specific, the ratios are:
 
 Component 1: Merge of Mythospice x Xwin with SLERP gradient [0.25, 0.3, 0.5].
@@ -34,10 +32,10 @@ Component 2: Merge Xwin x Hermes with SLERP gradient [0.4, 0.3, 0.25].
 
 Finally, both Component 1 and Component 2 were merged with SLERP using weight 0.5.
 
-## Advantages
+## Performance
 
-I tested this model for a few days before publishing it. It seems to retain the instruction-following capabilities of Xwin-70B, while seeming to have adapted a lot of the creativity of the other two models.
-It handles my more complex scenarios that creative models otherwise tend to struggle with quite well. At the same time, its outputs felt more creative and possibly a bit more NSFW-inclined than Xwin-70B.
+I tested this model for a few days before publishing it. It seems to more or less retain the instruction-following capabilities of Xwin-70B, while having adopted a lot of the creativity of the other two models.
+It handled the more complex scenarios that creative models otherwise tend to struggle with quite well. At the same time, its outputs felt more creative and possibly a bit more NSFW-inclined than Xwin-70B.
 So, is it better? Feels like it to me, subjectively. Is it really better? No clue, test it.
 
 ## Prompt format:
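For readers unfamiliar with the SLERP-gradient notation used in the recipe above, the following is a minimal sketch, not the author's actual merge script (that was likely a mergekit-style tool). `slerp` is the standard spherical interpolation between two flattened weight tensors, and `gradient_t` is a hypothetical helper showing one plausible reading of a three-value gradient like [0.25, 0.3, 0.5]: the interpolation weight is blended across the layer stack from the first anchor to the last.

```python
import numpy as np

def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    Interpolates along the great-circle arc between the directions of v0
    and v1; falls back to plain linear interpolation when the two vectors
    are near-parallel and the arc degenerates.
    """
    a = v0 / (np.linalg.norm(v0) + eps)
    b = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(a, b), -1.0, 1.0)
    theta = np.arccos(dot)              # angle between the two directions
    if theta < eps:                     # near-parallel: SLERP -> LERP
        return (1.0 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

def gradient_t(layer_idx, n_layers, gradient):
    """Map a layer index to an interpolation weight t.

    The anchor values in `gradient` (e.g. [0.25, 0.3, 0.5]) are spread
    evenly over the layer stack and linearly interpolated in between,
    so early layers lean toward the first model and late layers toward
    the second (or vice versa).
    """
    x = layer_idx / max(n_layers - 1, 1)
    return float(np.interp(x, np.linspace(0.0, 1.0, len(gradient)), gradient))
```

Under this reading, "Merge of Mythospice x Xwin with SLERP gradient [0.25, 0.3, 0.5]" would call `slerp(mythospice_layer, xwin_layer, gradient_t(i, 80, [0.25, 0.3, 0.5]))` per layer, and the final step is simply `slerp(component1, component2, 0.5)` everywhere.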