TheBloke committed
Commit df7fecc · 1 Parent(s): cb41262

Update README.md

Files changed (1): README.md (+21 -1)
README.md CHANGED
@@ -296,7 +296,27 @@ And thank you again to a16z for their generous grant.
 <!-- original-model-card start -->
 # Original model card: Ross Ascends's Mistral 7B Dolphin2.1 Lima0.5

+ ehartford's merge of Mistral 7B 0.1 with his Dolphin 2.1 dataset
+
+ https://huggingface.co/ehartford/dolphin-2.1-mistral-7b
+
+ and
+
+ the LIMARP dataset applied as a LoRA at 0.5 weight
+
+ https://huggingface.co/lemonilia/limarp-llama2-v2/
+
+ The purpose of the model is to be RP-focused, smart, fast, and lightweight for users with low VRAM.
+
+ I've already built the exl2 4bpw quant (linked below); it will run 8k ctx in around 6GB of VRAM and respond to a full context at roughly 30 tps (tested on my 3060) when the exl2_hf loader is used with FA2 enabled.
+
+ The model has been tested by several users on the SillyTavern Discord server and run on Horde for a full day, with good results.
+
+ https://huggingface.co/RossAscends/Mistral7B_Dolphin2.1_LIMARP0.5_4bpw_exl2
+
+ Either the Mistral or the ChatML context preset can be used.
+
 exllama v2 4bpw quant: https://huggingface.co/RossAscends/Mistral7B_Dolphin2.1_LIMARP0.5_4bpw_exl2
- further quantization by TheBloke.
+

 <!-- original-model-card end -->
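
The card says LIMARP was "applied as a LoRA at 0.5 weight" but does not show the procedure. Below is a minimal sketch of one way to do this with PEFT's weighted-adapter API; the repo IDs come from the card, while the adapter names and output directory are illustrative, and the author's actual method may have differed.

```python
# Sketch only: apply a LoRA at 0.5 weight with PEFT and merge it in.
# Repo IDs are taken from the card above; adapter names and the output
# directory are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "ehartford/dolphin-2.1-mistral-7b",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load the LIMARP adapter on top of the Dolphin base model.
model = PeftModel.from_pretrained(
    base, "lemonilia/limarp-llama2-v2", adapter_name="limarp"
)

# Build a copy of the adapter scaled to 0.5 weight, activate it,
# then merge the scaled adapter into the base weights.
model.add_weighted_adapter(
    adapters=["limarp"],
    weights=[0.5],
    adapter_name="limarp-0.5",
    combination_type="linear",
)
model.set_adapter("limarp-0.5")
merged = model.merge_and_unload()
merged.save_pretrained("Mistral7B_Dolphin2.1_LIMARP0.5")
```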
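The "exl2_hf" loader the card mentions is text-generation-webui's ExLlamav2_HF loader. For a standalone script, here is a sketch using the exllamav2 library directly; the model directory, prompt, and sampler values are illustrative, and exllamav2 should pick up flash-attn (FA2) automatically when it is installed.

```python
# Sketch only: load the 4bpw exl2 quant with the exllamav2 library and
# generate at the 8k context mentioned in the card. The model_dir path,
# prompt, and sampler settings are illustrative assumptions.
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "Mistral7B_Dolphin2.1_LIMARP0.5_4bpw_exl2"  # local download of the repo above
config.prepare()
config.max_seq_len = 8192  # the 8k context the card refers to

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

prompt = "<|im_start|>user\nWrite a short greeting.<|im_end|>\n<|im_start|>assistant\n"
print(generator.generate_simple(prompt, settings, num_tokens=200))
```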
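For reference, the two context presets the card says are possible correspond to these prompt templates: ChatML is the format documented for dolphin-2.1, and the Mistral template is the standard Mistral-Instruct format.

ChatML:

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Mistral:

```
<s>[INST] {prompt} [/INST]
```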