DavidAU committed on
Commit 459cd0c · verified · 1 Parent(s): 6bf9d7c

Update README.md

Files changed (1)
  1. README.md +178 -51
README.md CHANGED
@@ -38,7 +38,11 @@ The settings discussed in this document can also fix a number of model issues (<
38
  - General output quality.
39
  - Role play related issues.
40
 
41
- Likewise ALL the setting below can also improve model generation and/or general overall "smoothness" / "quality" of model operation.
42
 
43
  Even if you are not using my models, you may find this document useful for any model (any quant / full source) available online.
44
 
@@ -54,8 +58,7 @@ Every parameter, sampler and advanced sampler here affects per token generation
54
 
55
  This effect is cumulative especially with long output generation and/or multi-turn (chat, role play, COT).
56
 
57
- Likewise because of how modern AIs/LLMs operate the previously generated (quality) of the tokens generated
58
- affect the next tokens generated too.
59
 
60
  You will get higher quality operation overall - stronger prose, better answers, and a higher quality adventure.
61
 
@@ -90,32 +93,45 @@ These parameters/settings are considered both safe and default and in most cases
90
 
91
  ---
92
 
93
- <B>Llama CPP Parameters, Samplers and Advanced Samplers</B>
94
 
95
- Below are all the LLAMA_CPP parameters and samplers.
96
 
97
- I have added notes below each one for adjustment / enhancement(s) for specific use cases.
98
 
99
- Following this section will be additional samplers, which become available when using "llamacpp_HF" loader in https://github.com/oobabooga/text-generation-webui .
100
 
101
- The "llamacpp_HF" only requires the GGUF you want to use plus a few config files from "source repo" of the model.
102
 
103
- (this process is automated with this program, just enter the repo(s) urls -> it will fetch everything for you)
104
 
105
- Source files / Source models of my models are located here (also upper right menu on this page):
106
 
107
- [ https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be ]
108
 
109
- This allows access to very advanced samplers in addition to all the parameters / samplers here.
110
 
111
- For additional details on these samplers settings (including advanced ones) you may also want to check out:
112
 
113
- https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab
114
 
115
- (NOTE: Not all of these "options" are available for GGUFS, including when you use "llamacpp_HF" loader)
116
 
117
  Note that https://github.com/LostRuins/koboldcpp also allows access to all LLAMACPP parameters/samplers as well as additional advanced samplers.
118

119
  Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.
120
 
121
  In most cases all llama_cpp settings are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Ollama" and "lmstudio" (as well as other apps too).
@@ -126,6 +142,24 @@ https://github.com/ggerganov/llama.cpp
126
 
127
  (scroll down on the main page for more apps/programs that use GGUFs and connect to / use the LLAMA-CPP package.)
128

129
  ---
130
 
131
  CRITICAL NOTES:
@@ -168,9 +202,53 @@ Imatrix quants generally improve all quants, and also allow you to use smaller q
168
 
169
  IE: Instead of using a q4KM, you might be able to run an IQ3_M and get close to Q4KM's quality, but at a higher token per second speed and have more VRAM for context.
170

171
 
172
  ------------------------------------------------------------------------------
173
- PRIMARY PARAMETERS:
174
  ------------------------------------------------------------------------------
175
 
176
  These parameters will have a SIGNIFICANT effect on prose, generation, length and content, with temp being the most powerful.
@@ -221,7 +299,7 @@ Similar to top_p, but select instead only the top_k most likely tokens. Higher v
221
 
222
  Bring this up to 80-120 for a lot more word choice, and below 40 for simpler word choices.
223
 
224
- As this parameter operates in conjection with "top-p" and "min-p" all three should be carefully adjusted one at a time.
225
 
226
  <B>NOTE - "CORE" Testing with "TEMP":</B>
227
 
@@ -243,7 +321,7 @@ Then test "at temp" with your prompt(s) to see the MODELS in action. (5-10 gener
243
 
244
 
245
  ------------------------------------------------------------------------------
246
- PENALITY SAMPLERS:
247
  ------------------------------------------------------------------------------
248
 
249
  These samplers "trim" or "prune" output in real time.
@@ -252,6 +330,8 @@ The longer the generation, the stronger overall effect but that all depends on "
252
 
253
  For creative use cases, these samplers can alter prose generation in interesting ways.
254
 
 
 
255
  CLASS 4: For these models it is important to activate / set all samplers as noted for maximum quality and control.
256
 
257
  <B>PRIMARY:</B>
@@ -313,44 +393,32 @@ Generally this is not used.
313
 
314
 
315
  ------------------------------------------------------------------------------
316
- SECONDARY SAMPLERS / FILTERS:
317
  ------------------------------------------------------------------------------
318
 
 
319
 
320
- <B>tfs</B>
321
-
322
- tail free sampling, parameter z (default: 1.0, 1.0 = disabled)
323
-
324
- Tries to detect a tail of low-probability tokens in the distribution and removes those tokens. The closer to 0, the more discarded tokens.
325
- ( https://www.trentonbricken.com/Tail-Free-Sampling/ )
326
-
327
-
328
- <B>typical</B>
329
-
330
- locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)
331
-
332
- If not set to 1, select only tokens that are at least this much more likely to appear than random tokens, given the prior text.
333
 
 
334
 
335
  <B>mirostat</B>
336
 
337
- use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used.
338
- (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
339
 
340
  <B>mirostat-lr</B>
341
 
342
  Mirostat learning rate, parameter eta (default: 0.1) " mirostat_eta "
343
 
 
 
344
  <B>mirostat-ent</B>
345
 
346
  Mirostat target entropy, parameter tau (default: 5.0) " mirostat_tau "
347
 
348
- Activates the Mirostat sampling technique. It aims to control perplexity during sampling. See the paper. (https://arxiv.org/abs/2007.14966)
349
-
350
- mirostat_tau: 5-8 is a good value.
351
-
352
  mirostat_eta: 0.1 is a good value.
353
 
 
354
 
355
  This is the big one ; activating this will help with creative generation. It can also help with stability. Also note which
356
  samplers are disabled/ignored here, and that "mirostat_eta" is a learning rate.
@@ -360,9 +428,9 @@ This is both a sampler (and pruner) and enhancement all in one.
360
  It also has two modes of generation "1" and "2" - test both with 5-10 generations of the same prompt. Make adjustments, and repeat.
361
 
362
 
363
- For Class 3 models it is suggested to use this to assist with generation (min settings).
364
 
365
- For Class 4 models it is highly recommended with Microstat 1 or 2 + mirostat_tau @ 6 to 8 and mirostat_eta at .1 to .5
366
 
367
 
368
  <B>dynatemp-range</B>
@@ -379,11 +447,11 @@ Activates Dynamic Temperature. This modifies temperature to range between "dynat
379
 
380
  This allows the model to CHANGE temp during generation. This can greatly affect creativity, dialog, and other contrasts.
381
 
382
- For Kobold a converter is available and in oobabooga/text-generation-webui you just enter low/high/exp.
383
 
384
  Class 4 only: Suggested this is on, with a high/low of .8 to 1.8 (note the range here of "1" between high and low); with exponent to 1 (however below 0 or above work too)
385
 
386
- To set manually (IE: Api, lmstudio, etc) using "range" and "exp" ; this is a bit more tricky: (example is to set range from .8 to 1.8)
387
 
388
  1 - Set the "temp" to 1.3 (the regular temp parameter)
389
 
@@ -394,6 +462,21 @@ To set manually (IE: Api, lmstudio, etc) using "range" and "exp" ; this is a bit
394
  This is both an enhancement and in some ways fixes issues in a model when too little temp (or too much/too much of the same) affects generation.
395
 
396

397
  <B>xtc-probability</B>
398
 
399
  xtc probability (default: 0.0, 0.0 = disabled)
@@ -409,8 +492,6 @@ If 2 or more tokens have probability above this threshold, consider removing all
409
  XTC is a new sampler that adds an interesting twist in generation.
410
  Suggest you experiment with this one, with other advanced samplers disabled, to see its effects.
411
 
412
-
413
-
414
  <B>l, logit-bias TOKEN_ID(+/-)BIAS </B>
415
 
416
  modifies the likelihood of token appearing in the completion,
@@ -427,17 +508,49 @@ I suggest you get some "bad outputs" ; get the "tokens" (actual number for the "
427
  Careful testing is required, as this can have unclear side effects.
428
 
429
 
430
- ------------------------------------------------------------------------------
431
- ADVANCED SAMPLERS:
432
- ------------------------------------------------------------------------------
433
 
434
- I am not going to touch on all of them ; just the main ones ; for more info see:
435
 
436
  https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab
437
 
 
 
438
  Keep in mind these parameters/samplers become available (for GGUFs) in "oobabooga/text-generation-webui" when you use the llamacpp_HF loader.
439
 
440
- What I will touch on here are special settings for CLASS 3 and CLASS 4 models.
441
 
442
  For CLASS 3 you can use one, two or both.
443
 
@@ -454,6 +567,8 @@ You may therefore want to experiment to with dropping the settings (SLOWLY) for
454
 
455
  <B>DRY:</B>
456
 
 
 
457
  Class 3:
458
 
459
  dry_multiplier: .8
@@ -473,6 +588,7 @@ dry_base: 1.15 to 1.5
473
 
474
  <B>QUADRATIC SAMPLING:</B>
475
 
 
476
 
477
  Class 3:
478
 
@@ -487,14 +603,25 @@ smoothing_factor: 3 to 5 (or higher)
487
  smoothing_curve: 1.5 to 2.
488
 
489

490
  IMPORTANT:
491
 
492
  Keep in mind that these settings/samplers work in conjunction with "penalties" ; which is especially important
493
  for operation of CLASS 4 models for chat / role play and/or "smoother operation".
494
 
495
- For Class 3 models, "QUADRATIC" will have a stronger effect than "DRY" relatively speaking.
496
 
497
- If you use Microstat, keep in mind this will interact with these two advanced samplers too.
498
 
499
  Finally:
500
 
 
38
  - General output quality.
39
  - Role play related issues.
40
 
41
+ Likewise ALL the settings (parameters, samplers and advanced samplers) below can also improve model generation and/or general overall "smoothness" / "quality" of model operation:
42
+
43
+ - all parameters and samplers available via LLAMACPP (and most apps that run / use LLAMACPP)
44
+ - all parameters (including some not in Llamacpp), samplers and advanced samplers ("Dry", "Quadratic", "Mirostat") in oobabooga/text-generation-webui including the llamacpp_HF loader (allowing a lot more samplers)
45
+ - all parameters (including some not in Llamacpp), samplers and advanced samplers ("Dry", "Quadratic", "Mirostat") in KoboldCPP (including Anti-slop filters)
46
 
47
  Even if you are not using my models, you may find this document useful for any model (any quant / full source) available online.
48
 
 
58
 
59
  This effect is cumulative especially with long output generation and/or multi-turn (chat, role play, COT).
60
 
61
+ Likewise, because of how modern AIs/LLMs operate, the quality of previously generated tokens also affects the tokens generated next.
 
62
 
63
  You will get higher quality operation overall - stronger prose, better answers, and a higher quality adventure.
64
 
 
93
 
94
  ---
95
 
96
+ <B>SOURCE FILES for my Models:</B>
97
 
98
+ Source files / Source models of my models are located here (also upper right menu on this page):
99
 
100
+ [ https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be ]
101
 
102
+ You will need the config files to use "llamacpp_HF" loader ("text-generation-webui") [ https://github.com/oobabooga/text-generation-webui ]
103
 
104
+ You can also use the full source in "text-generation-webui" too.
105
 
106
+ As an alternative you can use GGUFs directly in "KOBOLDCPP" without the "config files" and still use almost all the parameters, samplers and advanced samplers.
107
 
108
+ <B>Parameters, Samplers and Advanced Samplers</B>
109
 
110
+ In sections 1a, 1b and 1c below are all the LLAMA_CPP parameters and samplers.
111
 
112
+ I have added notes below each one for adjustment / enhancement(s) for specific use cases.
113
 
114
+ TEXT-GENERATION-WEBUI
115
 
116
+ In section 2 are additional samplers, which become available when using the "llamacpp_HF" loader in https://github.com/oobabooga/text-generation-webui
117
+ AND/OR https://github.com/LostRuins/koboldcpp ("KOBOLDCPP").
118
+
119
+ The "llamacpp_HF" (for "text-generation-webui") only requires the GGUF you want to use plus a few config files from "source repo" of the model.
120
+
121
+ (this process is automated by the program; just enter the repo URL(s) -> it will fetch everything for you)
122
+
123
+ This allows access to very advanced samplers in addition to all the parameters / samplers here.
124
 
125
+ KOBOLDCPP:
126
 
127
  Note that https://github.com/LostRuins/koboldcpp also allows access to all LLAMACPP parameters/samplers as well as additional advanced samplers.
128
 
129
+ You can use almost all parameters, samplers and advanced samplers using "KOBOLDCPP" without the need to get the source config files (the "llamacpp_HF" step).
130
+
131
+ Note: This program has one of the newest samplers called "Anti-slop" which allows phrase/word banning at the generation level.
132
+
133
+ OTHER PROGRAMS:
134
+
135
  Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.
136
 
137
  In most cases all llama_cpp settings are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Ollama" and "lmstudio" (as well as other apps too).
 
142
 
143
  (scroll down on the main page for more apps/programs that use GGUFs and connect to / use the LLAMA-CPP package.)
144
 
145
+ DETAILED NOTES ON PARAMETERS, SAMPLERS and ADVANCED SAMPLERS:
146
+
147
+ For additional details on these sampler settings (including advanced ones) you may also want to check out:
148
+
149
+ https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab
150
+
151
+ (NOTE: Not all of these "options" are available for GGUFs, including when you use the "llamacpp_HF" loader in "text-generation-webui" )
152
+
153
+ Additional Links:
154
+
155
+ => DRY => https://github.com/oobabooga/text-generation-webui/pull/5677
156
+ => DRY => https://www.reddit.com/r/KoboldAI/comments/1e49vpt/dry_sampler_questionsthat_im_sure_most_of_us_are/
157
+ => DRY => https://www.reddit.com/r/KoboldAI/comments/1eo4r6q/dry_settings_questions/
158
+ => Samplers (videos) : https://gist.github.com/kalomaze/4473f3f975ff5e5fade06e632498f73e
159
+ => Creative Writing -> https://www.reddit.com/r/LocalLLaMA/comments/1c36ieb/comparing_sampling_techniques_for_creative/
160
+ => Parameters => https://arxiv.org/html/2408.13586v1
161
+ => Stats on some parameters => https://github.com/ZhouYuxuanYX/Benchmarking-and-Guiding-Adaptive-Sampling-Decoding-for-LLMs
162
+
163
  ---
164
 
165
  CRITICAL NOTES:
 
202
 
203
  IE: Instead of using a q4KM, you might be able to run an IQ3_M and get close to Q4KM's quality, but at a higher token per second speed and have more VRAM for context.
204
 
205
+ ---
206
+
207
+ HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)
208
+
209
+ 1 - Set temp to 0 (zero) and set your basic parameters, and use a prompt to get a "default" generation. A creative prompt will work better here.
210
+
211
+ 2 - If you want to test basic parameter changes, test ONE at a time, then compare output (answer quality, word choice, sentence size/construction, general output qualities) to your "default" generation.
212
+
213
+ 3 - Then start testing TWO parameters at a time, and comparing again. Keep in mind parameters (all) interact with each other.
214
+
215
+ 4 - Samplers -> Reset your basic parameters (temp still at zero) and test each sampler, one at a time. Then adjust settings, and test again.
216
+
217
+ 5 - Once you have an "idea" of how each affects your "test prompt", now test at "temp" (not zero). It may take five to ten generations to get a rough idea.
218
+
219
+ Yes, testing is a lot of work - but once you get all the parameter(s) and/or sampler(s) dialed in - it is worth it.
220
+
221
+ IMPORTANT: Use a "fresh chat" PER TEST (you will contaminate the results otherwise). Never use the same chat for multiple tests -> exception: Regens.
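A minimal sketch of steps 1 and 2 in Python, assuming an OpenAI-compatible completions endpoint (llama.cpp server, koboldcpp, LM Studio and "text-generation-webui" all offer one in API / server mode). The URL, prompt and the extra "top_k" field are placeholders / assumptions - adjust them to whatever your app actually accepts. Note that each call is a fresh, stateless request, which also satisfies the "fresh chat per test" rule above.

```python
# Sketch: compare a temp-0 "default" generation against ONE changed parameter.
import json
import urllib.request

API_URL = "http://localhost:8080/v1/completions"   # placeholder - your app's endpoint
PROMPT = "Write the opening paragraph of a sea-storm scene."

def generate(**overrides):
    payload = {"prompt": PROMPT, "max_tokens": 300, "temperature": 0.0, **overrides}
    req = urllib.request.Request(API_URL, json.dumps(payload).encode(),
                                 {"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["text"]

baseline = generate()              # step 1: temp 0, default settings
test_run = generate(top_k=100)     # step 2: change ONE parameter, then compare
print("--- BASELINE ---\n", baseline)
print("--- TEST RUN ---\n", test_run)
```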
222
+
223
+ Keep in mind that parameters, samplers and advanced samplers can affect the model on a per token generation basis AND/OR on a multi-token / phrase / sentence / paragraph
224
+ and even complete generation basis.
225
+
226
+ Everything is cumulative here, regardless of whether the parameter/sampler operates on a per-token or multi-token basis, because of how models "look back" at what was generated in some cases.
227
+
228
+ And of course... each model will be different too.
229
+
230
+ All that being said, it is a good idea to have specific generation quality "goals" in mind.
231
+
232
+ Likewise, at my repo, I post example generations so you can get an idea (but not complete picture) of a model's generation abilities.
233
+
234
+ The best way to control generation is STILL with your prompt(s) - including pre-prompts/system role. The latest gen models (and archs) have very strong
235
+ instruction following, so better (or simply added!) instructions in your prompts can many times make a world of difference.
236
+
237
+ Not sure if the model understands your prompt(s)?
238
+
239
+ Ask it ->
240
+
241
+ "Check my prompt below and tell me how to make it clearer?" (prompt after this line)
242
+
243
+ "For my prompt below, explain the steps you would take to execute it" (prompt after this line)
244
+
245
+ This will help the model fine tune your prompt so IT understands it.
246
+
247
+ However, sometimes parameters and/or samplers are required to better "wrangle" the model and get it to perform to its maximum potential and/or fine tune it to your use case(s).
248
+
249
 
250
  ------------------------------------------------------------------------------
251
+ Section 1a : PRIMARY PARAMETERS - ALL APPS:
252
  ------------------------------------------------------------------------------
253
 
254
  These parameters will have a SIGNIFICANT effect on prose, generation, length and content, with temp being the most powerful.
 
299
 
300
  Bring this up to 80-120 for a lot more word choice, and below 40 for simpler word choices.
301
 
302
+ As this parameter operates in conjunction with "top-p" and "min-p" all three should be carefully adjusted one at a time.
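As a rough sketch of adjusting these together (llama-cpp-python shown here; the model path, prompt and values are placeholders, and "min_p" needs a reasonably recent build):

```python
from llama_cpp import Llama

llm = Llama(model_path="./your-model.gguf", n_ctx=4096, verbose=False)

# top_k / top_p / min_p all filter the same candidate list, so change ONE at a time.
out = llm(
    "Describe the old lighthouse at dusk.",
    max_tokens=200,
    temperature=0.8,
    top_k=100,    # 80-120 = wider word choice; ~40 or below = simpler word choices
    top_p=0.95,
    min_p=0.05,
)
print(out["choices"][0]["text"])
```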
303
 
304
  <B>NOTE - "CORE" Testing with "TEMP":</B>
305
 
 
321
 
322
 
323
  ------------------------------------------------------------------------------
324
+ Section 1b : PENALTY SAMPLERS - ALL APPS:
325
  ------------------------------------------------------------------------------
326
 
327
  These samplers "trim" or "prune" output in real time.
 
330
 
331
  For creative use cases, these samplers can alter prose generation in interesting ways.
332
 
333
+ Penalty parameters affect both per-token generation and part of (or the entire) generation, depending on settings / output length.
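A quick sketch of the penalty family being set together (llama-cpp-python argument names; other apps expose the same knobs under similar names - the values here are only illustrative, not the CLASS settings below):

```python
from llama_cpp import Llama

llm = Llama(model_path="./your-model.gguf", n_ctx=4096, verbose=False)

out = llm(
    "Continue the tavern argument between the two brothers.",
    max_tokens=300,
    temperature=0.8,
    repeat_penalty=1.1,      # > 1.0 discourages recently used tokens
    presence_penalty=0.05,   # flat penalty once a token has appeared at all
    frequency_penalty=0.25,  # grows with how often a token has appeared
)
print(out["choices"][0]["text"])
```

(How far back the repeat penalty looks - "repeat-last-n" - is set at the CLI / model-load level in some programs rather than per request.)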
334
+
335
  CLASS 4: For these models it is important to activate / set all samplers as noted for maximum quality and control.
336
 
337
  <B>PRIMARY:</B>
 
393
 
394
 
395
  ------------------------------------------------------------------------------
396
+ Section 1c : SECONDARY SAMPLERS / FILTERS - ALL APPS:
397
  ------------------------------------------------------------------------------
398
 
399
+ In some AI/LLM apps, these may only be available via JSON file modification and/or API.
400
 
401
+ For "text-gen-webui" and "Koboldcpp" these are directly accessible.
402
 
403
+ i) OVERALL GENERATION CHANGES (affect per token as well as the overall generation):
404
 
405
  <B>mirostat</B>
406
 
407
+ Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
 
408
 
409
  <B>mirostat-lr</B>
410
 
411
  Mirostat learning rate, parameter eta (default: 0.1) " mirostat_eta "
412
 
413
+ mirostat_eta: 0.1 is a good value.
414
+
415
  <B>mirostat-ent</B>
416
 
417
  Mirostat target entropy, parameter tau (default: 5.0) " mirostat_tau "
418

419
  mirostat_tau: 5-8 is a good value.
420
 
421
+ Activates the Mirostat sampling technique. It aims to control perplexity during sampling. See the paper. ( https://arxiv.org/abs/2007.14966 )
422
 
423
  This is the big one ; activating this will help with creative generation. It can also help with stability. Also note which
424
  samplers are disabled/ignored here, and that "mirostat_eta" is a learning rate.
 
428
  It also has two modes of generation "1" and "2" - test both with 5-10 generations of the same prompt. Make adjustments, and repeat.
429
 
430
 
431
+ CLASS 3 models: it is suggested to use this to assist with generation (minimum settings).
432
 
433
+ CLASS 4 models: it is highly recommended to use Mirostat 1 or 2 with mirostat_tau @ 6 to 8 and mirostat_eta at .1 to .5
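A short sketch of turning Mirostat on with the CLASS 4 style values above (llama-cpp-python names shown; on the llama.cpp CLI the equivalents are --mirostat, --mirostat-ent and --mirostat-lr; model path and prompt are placeholders):

```python
from llama_cpp import Llama

llm = Llama(model_path="./your-model.gguf", n_ctx=4096, verbose=False)

# While Mirostat is active, Top K / Nucleus / TFS / Typical are ignored.
out = llm(
    "Write a tense rooftop chase scene.",
    max_tokens=400,
    temperature=1.0,
    mirostat_mode=2,    # 0 = off, 1 = Mirostat, 2 = Mirostat 2.0
    mirostat_tau=6.0,   # target entropy - 6 to 8 for Class 4
    mirostat_eta=0.1,   # learning rate - .1 to .5
)
print(out["choices"][0]["text"])
```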
434
 
435
 
436
  <B>dynatemp-range</B>
 
447
 
448
  This allows the model to CHANGE temp during generation. This can greatly affect creativity, dialog, and other contrasts.
449
 
450
+ For Koboldcpp a converter is available, and in oobabooga/text-generation-webui you just enter low/high/exp.
451
 
452
  Class 4 only: Suggested this is on, with a high/low of .8 to 1.8 (note the range here of "1" between high and low); with exponent to 1 (however below 0 or above work too)
453
 
454
+ To set this manually (IE: API, lmstudio, Llamacpp, etc) using "range" and "exp" is a bit more tricky (the example below sets the range from .8 to 1.8):
455
 
456
  1 - Set the "temp" to 1.3 (the regular temp parameter)
457
 
 
462
  This is both an enhancement and in some ways fixes issues in a model when too little temp (or too much/too much of the same) affects generation.
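To make the "range" math above concrete, here is a tiny sketch of just the arithmetic (not the sampler itself - the exponent then shapes how the temperature moves inside that window):

```python
# Dynamic temperature varies temp within [temp - range, temp + range].
def dynatemp_bounds(temp: float, dynatemp_range: float) -> tuple:
    return (max(temp - dynatemp_range, 0.0), temp + dynatemp_range)

print(dynatemp_bounds(1.3, 0.5))   # -> (0.8, 1.8), the example used above
```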
463
 
464
 
465
+ ii) PER TOKEN CHANGES:
466
+
467
+ <B>tfs</B>
468
+
469
+ Tail free sampling, parameter z (default: 1.0, 1.0 = disabled)
470
+
471
+ Tries to detect a tail of low-probability tokens in the distribution and removes those tokens. The closer to 0, the more discarded tokens.
472
+ ( https://www.trentonbricken.com/Tail-Free-Sampling/ )
473
+
474
+ <B>typical</B>
475
+
476
+ Locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)
477
+
478
+ If not set to 1, select only tokens that are at least this much more likely to appear than random tokens, given the prior text.
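A small sketch of these two per-token filters with mild (close to disabled) values - argument names as exposed by llama-cpp-python; note that newer llama.cpp builds have dropped TFS, so "tfs_z" may not exist in your version:

```python
from llama_cpp import Llama

llm = Llama(model_path="./your-model.gguf", n_ctx=4096, verbose=False)

out = llm(
    "List three strange things found in the attic.",
    max_tokens=200,
    temperature=0.9,
    tfs_z=0.95,       # < 1.0 trims the low-probability "tail"
    typical_p=0.9,    # < 1.0 keeps only "locally typical" tokens
)
print(out["choices"][0]["text"])
```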
479
+
480
  <B>xtc-probability</B>
481
 
482
  xtc probability (default: 0.0, 0.0 = disabled)
 
492
  XTC is a new sampler that adds an interesting twist in generation.
493
  Suggest you experiment with this one, with other advanced samplers disabled, to see its effects.
494
 
 
 
495
  <B>l, logit-bias TOKEN_ID(+/-)BIAS </B>
496
 
497
  modifies the likelihood of token appearing in the completion,
 
508
  Careful testing is required, as this can have unclear side effects.
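A hedged sketch of how a bias/ban might be sent over an OpenAI-compatible endpoint ("logit_bias" is accepted by llama.cpp's server and several other apps). The token IDs below are PLACEHOLDERS only - look up the real IDs for the words you want to discourage using your model's own tokenizer:

```python
import json
import urllib.request

payload = {
    "prompt": "Continue the story, avoiding the phrases it keeps repeating.",
    "max_tokens": 300,
    "temperature": 0.8,
    "logit_bias": {
        "12345": -100,   # placeholder token id: effectively banned
        "67890": -5,     # placeholder token id: merely discouraged
    },
}
req = urllib.request.Request("http://localhost:8080/v1/completions",
                             json.dumps(payload).encode(),
                             {"Content-Type": "application/json"})
print(json.loads(urllib.request.urlopen(req).read())["choices"][0]["text"])
```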
509
 
510
 
511
+ ------------------------------------------------------------------------------------------------------------------------------------------------------------
512
+ SECTION 2: ADVANCED SAMPLERS - "text-generation-webui" / "KOBOLDCPP":
513
+
514
+ Additional Parameters / Samplers, including "DRY", "QUADRATIC" and "ANTI-SLOP".
515
+ ------------------------------------------------------------------------------------------------------------------------------------------------------------
516
+
517
+ Hopefully these samplers / controls will be added to LLAMACPP and become available to all users via AI/LLM apps soon.
518
 
519
+ For more info on what they do / how they affect generation see:
520
 
521
  https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab
522
 
523
+ (also see the section above "Additional Links" for more info on the parameters/samplers)
524
+
525
  Keep in mind these parameters/samplers become available (for GGUFs) in "oobabooga/text-generation-webui" when you use the llamacpp_HF loader.
526
 
527
+ Most of these are also available in KOBOLDCPP (via settings -> samplers) after start up (no "llamacpp_HF loader" step required).
528
+
529
+ I am not going to touch on all of the samplers / parameters, just the main ones at the moment.
530
+
531
+ However, you should also check / test operation of:
532
+
533
+ a] Affects per token generation:
534
+
535
+ - top_a
536
+ - epsilon_cutoff
537
+ - eta_cutoff
538
+ - no_repeat_ngram_size
539
+
540
+ b] Affects generation including phrase, sentence, paragraph and entire generation:
541
+
542
+ - no_repeat_ngram_size
543
+ - encoder_repetition_penalty
544
+ - guidance_scale (with "Negative prompt" ) => this is like a pre-prompt/system role prompt.
545
+ - BOS TOKEN => disabling this can make the replies more creative.
546
+ - Custom stopping strings
547
+
548
+ Note: "no_repeat_ngram_size" appears in both because it can impact per token OR per phrase depending on settings.
549
+
550
+
551
+ <B>MAIN ADVANCED SAMPLERS (affects per token AND overall generation): </B>
552
+
553
+ What I will touch on here are special settings for CLASS 3 and CLASS 4 models (for the first TWO samplers).
554
 
555
  For CLASS 3 you can use one, two or both.
556
 
 
567
 
568
  <B>DRY:</B>
569
 
570
+ DRY affects repetition (and repeat "penalty") at the word, phrase, sentence and even paragraph level. Read about "DRY" in the "Additional Links" section above.
571
+
572
  Class 3:
573
 
574
  dry_multiplier: .8
 
588
 
589
  <B>QUADRATIC SAMPLING:</B>
590
 
591
+ This sampler alters the "score" of ALL TOKENS at the time of generation. See the "Additional Links" section above for more information.
592
 
593
  Class 3:
594
 
 
603
  smoothing_curve: 1.5 to 2.
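For reference, a sketch of how the DRY / Quadratic values above might look in a single request payload. The field names follow text-generation-webui's parameter tab; whether your front end / API build accepts them per request is an assumption - verify locally:

```python
payload = {
    "prompt": "Your prompt here.",
    "max_tokens": 400,
    "temperature": 0.9,
    # DRY (Class 3 style values from above)
    "dry_multiplier": 0.8,
    "dry_base": 1.15,
    # Quadratic sampling
    "smoothing_factor": 3.0,
    "smoothing_curve": 1.5,
}
```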
604
 
605
 
606
+ <B>ANTI-SLOP - Koboldcpp only</B>
607
+
608
+ Hopefully this powerful sampler will soon appear in all LLM/AI apps.
609
+
610
+ You can access this in the KoboldCPP app, under "context" -> "tokens" on the main page of the app after start up.
611
+
612
+ This sampler allows banning words and phrases DURING generation, forcing the model to "make another choice".
613
+
614
+ This is a game changer in custom real time control of the model.
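To show the idea only (this is NOT KoboldCPP's code or API - "next_token" below is a stand-in for a real model call, and here it simply replays a canned continuation so the sketch runs):

```python
# Conceptual sketch ONLY of an anti-slop / phrase-ban pass during generation.
BANNED_PHRASES = ["a tapestry of", "shivers down her spine"]
CANNED = iter("the night wove a tapestry of shadows over the quiet pier".split())

def next_token(context: str, avoid: set) -> str:
    tok = next(CANNED, "")                     # stand-in for a real model call
    return "veil" if tok in avoid else tok     # pretend the model re-chose

def generate(prompt: str, limit: int = 12) -> str:
    out, avoid = [], set()
    while len(out) < limit:
        tok = next_token(prompt + " " + " ".join(out), avoid)
        if not tok:
            break
        out.append(tok)
        joined = " ".join(out)
        for phrase in BANNED_PHRASES:
            if joined.endswith(phrase):
                out = out[:-len(phrase.split())]   # roll the banned phrase back out
                avoid = {tok}                      # ...and force a different choice here
                break
        else:
            avoid = set()
    return " ".join(out)

print(generate("Describe the harbour at night:"))
# -> "the night wove shadows over the quiet pier"  (the banned phrase never surfaces)
```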
615
+
616
+
617
  IMPORTANT:
618
 
619
  Keep in mind that these settings/samplers work in conjunction with "penalties" ; which is especially important
620
  for operation of CLASS 4 models for chat / role play and/or "smoother operation".
621
 
622
+ For Class 3 models, "QUADRATIC" will have a slightly stronger effect than "DRY" relatively speaking.
623
 
624
+ If you use the Mirostat sampler, keep in mind it will interact with these two advanced samplers too.
625
 
626
  Finally:
627