DavidAU committed on
Commit
b3479c7
·
verified ·
1 Parent(s): 129bb14

Update README.md

Files changed (1)
  1. README.md +19 -3
README.md CHANGED
@@ -591,6 +591,7 @@ Please see sections below this for advanced usage, more details, settings, notes
591
 
592
  </small>
593
 
594
 
595
  ---
596
 
@@ -614,11 +615,11 @@ Some of my newer models repo cards use a limited form of this called a "prose co
614
 
615
  Roughly a pre-prompt / system role is embedded during each prompt and can act as a guide and/or set of directives for processing the prompt and/or containing generation instructions.
616
 
617
- A prose control is a simplifed version of this, which preceeds the main prompt(s) - but the idea / effect is relatively the same (pre-prompt/system role does have a slightly higher priority however).
618
 
619
  I strongly suggest you research these online, as they are a powerful addition to your generation toolbox.
620
 
621
- They are especially potent with newer model archs due to newer model types having stronger instruction following ablities AND increase context too.
622
 
623
  ---
624
 
@@ -777,6 +778,21 @@ For "text-gen-webui" and "Koboldcpp" these are directly accessible (and via Sill
777
 
778
  Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
779
 
780
  "mirostat-lr"
781
 
782
  Mirostat learning rate, parameter eta (default: 0.1) " mirostat_tau "
@@ -1054,7 +1070,7 @@ for operation of CLASS 4 models for chat / role play and/or "smoother operation"
1054
 
1055
  For Class 3 models, "QUADRATIC" will have a slightly stronger effect than "DRY" relatively speaking.
1056
 
1057
- If you use Microstat sampler, keep in mind this will interact with these two advanced samplers too.
1058
 
1059
  And...
1060
 
 
591
 
592
  </small>
593
 
594
+ IMPORTANT: Make sure to review the MIROSTAT sampler settings below, due to the behaviour of this specific sampler and its effect on parameters and other samplers.
595
 
596
  ---
597
 
 
615
 
616
  Roughly a pre-prompt / system role is embedded during each prompt and can act as a guide and/or set of directives for processing the prompt and/or containing generation instructions.
617
 
618
+ A prose control is a simplified version of this, which precedes the main prompt(s) - but the idea / effect is relatively the same (pre-prompt/system role does have a slightly higher priority however).
619
 
620
  I strongly suggest you research these online, as they are a powerful addition to your generation toolbox.
621
 
622
+ They are especially potent with newer model archs due to newer model types having stronger instruction following abilities AND increase context too.
623
 
624
  ---
625
 
 
778
 
779
  Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
780
 
781
+ Paper: https://arxiv.org/abs/2007.14966
782
+
783
+ CRITICAL:
784
+
785
+ If you activate Mirostat when using "LLAMAcpp SERVER" and/or some LLAMA_CPP based apps, this will VOID/DISABLE all parameters (excluding "penalties") AND all other SAMPLERS, except the "temp" parameter plus the following:
786
+
787
+ V1: n_vocab(model) (this is set internally by llamacpp), seed, mirostat_tau, mirostat_eta
788
+
789
+ V2: seed, mirostat_tau, mirostat_eta
790
+
791
+ It is unclear at this time if this occurs using Text Generation UI (and/or using LLAMACPP_HF loader), KoboldCPP or other customized LLM/AI app(s).
792
+
793
+ ("seed" is usually a random value.)
794
+
795
+
796
  "mirostat-lr"
797
 
798
  Mirostat learning rate, parameter eta (default: 0.1) " mirostat_tau "
 
1070
 
1071
  For Class 3 models, "QUADRATIC" will have a slightly stronger effect than "DRY" relatively speaking.
1072
 
1073
+ If you use Mirostat sampler, keep in mind this will interact with these two advanced samplers too.
1074
 
1075
  And...
1076
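
As a rough illustration of the "CRITICAL" Mirostat note added in this commit, the sketch below filters a settings dictionary down to the values the note says still apply once Mirostat is active. The key names (`temperature`, `repeat_penalty`, `top_k`, and so on) and both helper functions are hypothetical, chosen to resemble common llama.cpp-style request fields. They are not taken from the README, and the behaviour of any particular app may differ, as the note itself cautions.

```python
# A minimal sketch, assuming llama.cpp-style setting names (hypothetical here).
# Per the note: with Mirostat active, only "temp", seed, mirostat_tau and
# mirostat_eta (plus penalties) are said to remain in effect.
# n_vocab (Mirostat V1) is set internally, so it is not a user setting.

MIROSTAT_CORE = {"mirostat", "mirostat_tau", "mirostat_eta", "seed", "temperature"}
# Assumed names for the "penalties" the note says are NOT voided.
PENALTIES = {"repeat_penalty", "presence_penalty", "frequency_penalty"}


def effective_settings(settings: dict) -> dict:
    """Return the subset of settings that the note says still applies."""
    if settings.get("mirostat", 0) == 0:
        # Mirostat disabled: nothing is voided.
        return dict(settings)
    keep = MIROSTAT_CORE | PENALTIES
    return {key: value for key, value in settings.items() if key in keep}


def ignored_settings(settings: dict) -> list:
    """List the setting names that would be ignored, sorted by name."""
    active = effective_settings(settings)
    return sorted(key for key in settings if key not in active)


if __name__ == "__main__":
    example = {
        "mirostat": 2,
        "mirostat_tau": 5.0,
        "mirostat_eta": 0.1,
        "temperature": 0.8,
        "top_k": 40,
        "top_p": 0.95,
        "repeat_penalty": 1.1,
    }
    print("ignored:", ignored_settings(example))
```

A helper like this could be used to warn when a preset mixes Mirostat with samplers that the note says will be disabled. Whether the same rule holds outside the llama.cpp server is, per the note, unclear.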