Update README.md
README.md
CHANGED
@@ -239,6 +239,87 @@ Imatrix quants generally improve all quants, and also allow you to use smaller q
For example, instead of using a Q4KM, you might be able to run an IQ3_M and get close to the Q4KM's quality, but at a higher tokens-per-second speed and with more VRAM left for context.
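
The trade-off above can be put in rough numbers. This is a minimal sketch, not a measurement: the helper names are hypothetical, and the bits-per-weight figures are approximate assumptions used only for illustration. Actual GGUF file sizes vary with the model architecture and the quant mix, so check the real file sizes before relying on an estimate like this.

```python
def estimate_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough weight size in GiB: parameters x bits per weight / 8 bits per byte."""
    return n_params * bits_per_weight / 8 / 2**30


# ASSUMPTION: approximate average bits per weight, for illustration only.
ASSUMED_BPW = {"Q4KM": 4.8, "IQ3_M": 3.7}


def vram_freed_gib(n_params: float, bigger: str = "Q4KM", smaller: str = "IQ3_M") -> float:
    """Estimated memory freed for context by choosing the smaller quant."""
    return (estimate_size_gib(n_params, ASSUMED_BPW[bigger])
            - estimate_size_gib(n_params, ASSUMED_BPW[smaller]))


if __name__ == "__main__":
    n_params = 8e9  # hypothetical 8B-parameter model
    for name, bpw in ASSUMED_BPW.items():
        print(f"{name}: ~{estimate_size_gib(n_params, bpw):.2f} GiB")
    print(f"freed for context: ~{vram_freed_gib(n_params):.2f} GiB")
```

The estimate covers the weights only. The context (KV cache) and compute buffers need memory on top of this, which is why a smaller quant leaves more room for context.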

---

Quick Reference Table

---

Compiled by: "EnragedAntelope"

https://huggingface.co/EnragedAntelope

https://github.com/EnragedAntelope

Please see the sections below for advanced usage, more details, and notes on settings.

<small>
<pre>
# LLM Parameters Reference Table

| Parameter | Description |
|-----------|-------------|

| **Primary Parameters** |

| temperature | Controls randomness of outputs (0 = deterministic, higher = more random). Range: 0-5 |

| top-p | Keeps the smallest set of most likely tokens whose probabilities add up to this number. Higher = more random results. Default: 0.9 |

| min-p | Discards tokens with probability smaller than this value × probability of the most likely token. Default: 0.1 |

| top-k | Keeps only the top K most likely tokens. Higher = more possible results. Default: 40 |


| **Penalty Samplers** |

| repeat-last-n | Number of most recent tokens to consider for penalties. Critical for preventing repetition. Default: 64 |
| repeat-penalty | Penalizes repeated token sequences. Range: 1.0-1.15. Default: 1.0 |
| presence-penalty | Penalizes tokens that already appear in the previous text. Range: 0-0.2 for Class 3, 0.1-0.35 for Class 4 |
| frequency-penalty | Penalizes tokens in proportion to how often they appear in the previous text. Range: 0-0.25 for Class 3, 0.4-0.8 for Class 4 |

| penalize-nl | Penalizes newline tokens. Generally unused. Default: false |


| **Secondary Samplers** |

| mirostat | Controls perplexity during sampling. Modes: 0 (off), 1 (Mirostat), or 2 (Mirostat 2.0) |
| mirostat-lr | Mirostat learning rate (eta). Default: 0.1 |
| mirostat-ent | Mirostat target entropy (tau). Default: 5.0 |

| dynatemp-range | Range for dynamic temperature adjustment. Default: 0.0 (disabled) |
| dynatemp-exp | Exponent for dynamic temperature scaling. Default: 1.0 |

| tfs | Tail free sampling - removes low-probability tokens. Default: 1.0 (disabled) |

| typical | Locally typical sampling - selects tokens more likely than random given prior text. Default: 1.0 (disabled) |

| xtc-probability | Probability of token removal. Range: 0-1 |
| xtc-threshold | Threshold for considering token removal. Default: 0.1 |


| **Advanced Samplers** |

| dry_multiplier | Controls DRY (Don't Repeat Yourself) intensity. Range: 0.8-1.12+ |
| dry_allowed_length | Allowed length for repeated sequences in DRY. Default: 2 |
| dry_base | Base value for DRY calculations. Range: 1.15-1.75+ for Class 4 |

| smoothing_factor | Quadratic sampling intensity. Range: 1-3 for Class 3, 3-5+ for Class 4 |
| smoothing_curve | Quadratic sampling curve. Range: 1 for Class 3, 1.5-2 for Class 4 |


## Notes

- For Class 3 and 4 models, using both DRY and Quadratic sampling is recommended
- Lower quants (Q2K, IQ1s, IQ2s) may require stronger settings due to compression damage
- Parameters interact with each other, so test changes one at a time
- Always test with temperature at 0 first to establish a baseline
</pre>
</small>
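
The four primary parameters in the table can be illustrated on a toy distribution. The sketch below is a minimal, hedged illustration in plain Python. The function names are hypothetical, and this is not how any particular backend implements its sampler chain. Real backends apply samplers one after another in a configurable order and renormalize between steps, whereas this sketch applies each filter to the same distribution so that each rule can be seen in isolation. It also assumes temperature is greater than 0; the temperature = 0 (deterministic) case is not handled here.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]  # assumes temperature > 0
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract peak for stability
    total = sum(exps)
    return [e / total for e in exps]


def ranked_ids(probs):
    """Token ids sorted from most to least likely."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)


def top_k_filter(probs, k):
    """top-k: keep only the k most likely tokens."""
    return set(ranked_ids(probs)[:k])


def top_p_filter(probs, p):
    """top-p: keep the smallest set of top tokens whose probabilities sum to p."""
    kept, cumulative = set(), 0.0
    for i in ranked_ids(probs):
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept


def min_p_filter(probs, min_p):
    """min-p: drop tokens below min_p x the probability of the top token."""
    cutoff = min_p * max(probs)
    return {i for i, q in enumerate(probs) if q >= cutoff}


def candidate_tokens(logits, temperature=1.0, top_k=40, top_p=0.9, min_p=0.1):
    """Token ids that survive all three filters (simplified, see lead-in)."""
    probs = softmax(logits, temperature)
    kept = (top_k_filter(probs, top_k)
            & top_p_filter(probs, top_p)
            & min_p_filter(probs, min_p))
    return sorted(kept)


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -3.0]  # hypothetical 4-token vocabulary
    print(candidate_tokens(toy_logits))
```

Changing one argument at a time in `candidate_tokens` mirrors the advice in the notes: because the filters interact, it is easier to see what each parameter does when the others are held fixed.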
---
HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)