DavidAU committed
Commit 9e132bc · verified · 1 Parent(s): 208b3d6

Update README.md

Files changed (1):
  1. README.md (+13 -11)
README.md CHANGED
@@ -34,7 +34,7 @@ tags:
 pipeline_tag: text-generation
 ---
 
-<h2>Experimental Deepseek-R1-Llama3.1 with Brainstorm 40x, 16.5B. (72 layers, 643 tensors) </h2>
+<h2>Deepseek-R1-Llama3.1 with Brainstorm 40x, 16.5B. (72 layers, 643 tensors) </h2>
 
 <img src="deepseek.jpg" style="float:right; width:300px; height:300px; padding:10px;">
 
@@ -46,6 +46,8 @@ Keep in mind this model is experimental and may require one or more regens to wo
 
 Brainstorm 40x is by DavidAU, and extends the "decision making" and "creativity" of an LLM/AI.
 
+Higher temps will result in deeper, richer "thoughts"... and, frankly, more interesting ones too.
+
 The "thinking/reasoning" tech (for the model at this repo) is from the original Llama 3.1 "Distill" model from Deepseek:
 
 [ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B ]
@@ -64,16 +66,16 @@ The Grand Horrors retain all of their "horror/creative power" and are augmented
 
 <b>CRITICAL SETTINGS:</b>
 
-1. Set Temp between 0 and .8, higher than this "think" functions will not activate. The most "stable" temp seems to be .6, with a variance of +-0.05. Lower for more "logic" reasoning, raise it for more "creative" reasoning (max .8 or so). Also set context to at least 4096, to account for "thoughts" generation.
-2. Set "repeat penalty" to 1.02 to 1.08 and "repeat penalty range" to 64-128.
-3. This model requires a Llama 3 Instruct and/or Command-R chat template. (see notes on "System Prompt" / "Role" below)
-4. It may take one or more regens for "thinking" to "activate"... depending on your prompt.
-5. If you enter a prompt without implied "step by step" requirements, "thinking" (one or more) will activate AFTER first generation. You will also get a lot of variations - some will continue the generation, others will talk about how to improve it, and some (ie generation of a scene) will cause the characters to "reason" about this situation. In some cases, the model will ask you to continue generation / thoughts too. In some cases the model's "thoughts" may appear in the generation itself.
-6. State the word size length max IN THE PROMPT for best results, especially for activation of "thinking."
-7. I have found opening a "new chat" per prompt works best with "thinking/reasoning activation", with temp .6
-8. Depending on your AI app, "thoughts" may appear with "<THINK>" and "</THINK>" tags AND/OR the AI will generate "thoughts" directly in the main output or later output(s).
-9. Although quant Q4KM was used for testing/examples, higher quants will provide better generation / more sound "reasoning/thinking".
-10. To repeat: If you exceed temp of .8 or so, "thinking" processes may stop or change form or you will get "normal" model generation.
+1. Set Temp between 0 and .8; higher than this, "think" functions MAY not activate. The most "stable" temp seems to be .6, with a variance of ±0.05. Lower it for more "logic" reasoning; raise it for more "creative" reasoning (max .8 or so). Also set context to at least 4096, to account for "thoughts" generation.
+2. Set "repeat penalty" to 1.02-1.08 and "repeat penalty range" to 64-128.
+3. Temps of 1+ and 2+ will deepen thoughts, conclusions, and generation/thinking.
+4. This model requires a Llama 3 Instruct and/or Command-R chat template (see notes on "System Prompt" / "Role" below).
+5. It may take one or more regens for "thinking" to "activate"... depending on your prompt.
+6. If you enter a prompt without implied "step by step" requirements, "thinking" (one or more) will activate AFTER the first generation. You will also get a lot of variations: some will continue the generation, others will talk about how to improve it, and some (i.e. generation of a scene) will cause the characters to "reason" about the situation. In some cases the model will ask you to continue the generation/thoughts, and its "thoughts" may appear in the generation itself.
+7. State the max word count IN THE PROMPT for best results, especially for activation of "thinking."
+8. I have found that opening a "new chat" per prompt works best for "thinking/reasoning" activation, with temp .6.
+9. Depending on your AI app, "thoughts" may appear with "< THINK >" and "</ THINK >" tags AND/OR the AI will generate "thoughts" directly in the main output or later output(s).
+10. Although quant Q4KM was used for testing/examples, higher quants will provide better generation / more sound "reasoning/thinking".
 
 ---
 
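The updated "CRITICAL SETTINGS" list above can be collected into a small helper. This is a minimal sketch, not part of the model card: the function names and the llama.cpp-style parameter keys (`temperature`, `repeat_penalty`, `repeat_last_n`, `n_ctx`) are assumptions, while the values come from the list itself (temp .6 ± .05 with a ceiling near .8, repeat penalty 1.02-1.08 over a 64-128 token window, context ≥ 4096). The second function wraps a prompt in the standard Llama 3 Instruct chat template, which the card says this model requires.

```python
# Hypothetical helper for the card's "CRITICAL SETTINGS" — parameter key
# names are illustrative; only the numeric ranges come from the README.

def reasoning_settings(creative: bool = False) -> dict:
    """Return sampler settings per the card: temp .6 ± .05 (max ~.8),
    repeat penalty 1.02-1.08 over a 64-128 token range, context >= 4096."""
    return {
        # Stay at or below ~0.8; lower = more "logic", higher = more "creative".
        "temperature": 0.65 if creative else 0.55,
        "repeat_penalty": 1.05,   # recommended 1.02-1.08
        "repeat_last_n": 128,     # "repeat penalty range" 64-128
        "n_ctx": 4096,            # headroom for "thoughts" generation
    }

def llama3_instruct_prompt(system: str, user: str) -> str:
    """Wrap system/user text in the Llama 3 Instruct chat template."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )
```

Per points 6 and 7 of the list, the `user` string passed in should state a max word count and any "step by step" requirement explicitly to help "thinking" activate on the first generation.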
81