parameters guide
samplers guide
model generation
role play settings
quant selection
arm quants
iq quants vs q quants
optimal model setting
gibberish fixes
coherence
instructing following
quality generation
chat settings
quality settings
llamacpp server
llamacpp
lmstudio
sillytavern
koboldcpp
backyard
ollama
model generation steering
steering
model generation fixes
text generation webui
ggufs
exl2
full precision
quants
imatrix
neo imatrix
Update README.md
README.md
CHANGED
@@ -591,6 +591,7 @@ Please see sections below this for advanced usage, more details, settings, notes
591 |
592 | </small>
593 |
594 |
595 | ---
596 |
@@ -614,11 +615,11 @@ Some of my newer models repo cards use a limited form of this called a "prose co
614 |
615 | Roughly a pre-prompt / system role is embedded during each prompt and can act as a guide and/or set of directives for processing the prompt and/or containing generation instructions.
616 |
617 | - A prose control is a
618 |
619 | I strongly suggest you research these online, as they are a powerful addition to your generation toolbox.
620 |
621 | - They are especially potent with newer model archs due to newer model types having stronger instruction following
622 |
623 | ---
624 |
@@ -777,6 +778,21 @@ For "text-gen-webui" and "Koboldcpp" these are directly accessible (and via Sill
777 |
778 | Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
779 |
780 | "mirostat-lr"
781 |
782 | Mirostat learning rate, parameter eta (default: 0.1) " mirostat_tau "
@@ -1054,7 +1070,7 @@ for operation of CLASS 4 models for chat / role play and/or "smoother operation"
1054 |
1055 | For Class 3 models, "QUADRATIC" will have a slightly stronger effect than "DRY" relatively speaking.
1056 |
1057 | - If you use
1058 |
1059 | And...
1060 |
591 |
592 | </small>
593 |
594 | + IMPORTANT: Make sure to review the MIROSTAT sampler settings below, due to the behaviour of this specific sampler and its effect on parameters and other samplers.
595 |
596 | ---
597 |
615 |
616 | Roughly a pre-prompt / system role is embedded during each prompt and can act as a guide and/or set of directives for processing the prompt and/or containing generation instructions.
617 |
618 | + A prose control is a simplified version of this, which precedes the main prompt(s), but the idea and effect are much the same (a pre-prompt / system role does have a slightly higher priority, however).
619 |
620 | I strongly suggest you research these online, as they are a powerful addition to your generation toolbox.
621 |
622 | + They are especially potent with newer model archs due to newer model types having stronger instruction following abilities AND increased context too.
623 |
624 | ---
625 |
778 |
779 | Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
780 |
781 | + Paper: https://arxiv.org/abs/2007.14966
782 | +
783 | + CRITICAL:
784 | +
785 | + If you activate Mirostat when using "LLAMAcpp SERVER" and/or some LLAMA_CPP based apps, this will VOID/DISABLE all parameters (excluding "penalties") AND all other SAMPLERS, except the "temp" parameter plus the following:
786 | +
787 | + V1: n_vocab(model) (this is set internally by llamacpp), seed, mirostat_tau, mirostat_eta
788 | +
789 | + V2: seed, mirostat_tau, mirostat_eta
790 | +
791 | + It is unclear at this time if this occurs using Text Generation WebUI (and/or using LLAMACPP_HF loader), KoboldCPP or other customized LLM/AI app(s).
792 | +
793 | + ("seed" is usually a random value.)
794 | +
795 | +
796 | "mirostat-lr"
797 |
798 | Mirostat learning rate, parameter eta (default: 0.1) " mirostat_tau "
1070 |
1071 | For Class 3 models, "QUADRATIC" will have a slightly stronger effect than "DRY" relatively speaking.
1072 |
1073 | + If you use the Mirostat sampler, keep in mind that it will interact with these two advanced samplers too.
1074 |
1075 | And...
1076 |
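
The Mirostat notes added in this commit list `seed`, `mirostat_tau` and `mirostat_eta` as the parameters that remain active for Mirostat 2.0. As a rough illustration of why those three are enough, here is a minimal Python sketch of one Mirostat 2.0 style sampling step, written from the general description in the linked paper. It is not llama.cpp's code: the function name `mirostat_v2_step`, the toy probability list and the starting value `mu = 2 * tau` are assumptions made for this example.

```python
import math
import random


def mirostat_v2_step(probs, mu, tau, eta, rng):
    """One Mirostat 2.0 style step over a list of token probabilities.

    probs: probability per token id (assumed to sum to 1).
    mu:    current surprise cap, in bits.
    tau:   target surprise, in bits (the "mirostat_tau" setting).
    eta:   learning rate for updating mu (the "mirostat_eta" setting).
    rng:   a random.Random instance (this is where "seed" comes in).
    Returns (token_id, new_mu).
    """
    # Keep only tokens whose surprise, -log2(p), does not exceed mu.
    candidates = [
        (i, p) for i, p in enumerate(probs)
        if p > 0 and -math.log2(p) <= mu
    ]
    if not candidates:
        # Assumption for this sketch: always keep the most likely token.
        best = max(range(len(probs)), key=lambda i: probs[i])
        candidates = [(best, probs[best])]

    # Renormalise the surviving tokens and sample one of them.
    total = sum(p for _, p in candidates)
    ids = [i for i, _ in candidates]
    weights = [p / total for _, p in candidates]
    token = rng.choices(ids, weights=weights, k=1)[0]

    # Move mu toward the target: lower it if the pick was more
    # surprising than tau, raise it if the pick was less surprising.
    observed = -math.log2(weights[ids.index(token)])
    new_mu = mu - eta * (observed - tau)
    return token, new_mu


# Toy usage: five steps over a fixed, made-up distribution.
rng = random.Random(42)   # plays the role of "seed"
tau, eta = 5.0, 0.1
mu = 2 * tau              # assumed starting cap
toy_probs = [0.5, 0.25, 0.125, 0.0625, 0.0625]
for _ in range(5):
    token, mu = mirostat_v2_step(toy_probs, mu, tau, eta, rng)
    print(token, round(mu, 3))
```

Because the candidate cut-off is driven entirely by `mu`, a step like this has no use for a separate Top K or Nucleus threshold, which is consistent with the README's statement that those samplers are ignored when Mirostat is active.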