It keeps spamming "assistant" at the end of sentences and then makes long rambling replies to itself

#2
by 6346y9uey - opened

Stopping string issue

I've seen the stray "assistant" and the model repeating itself with Llama 3 quantized exl2 models on tabbyAPI.
For other UIs, wait for downstream bug fixes.
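
Until those fixes land, a client-side workaround might look like the sketch below. It assumes the backend returns the raw text with special tokens rendered as plain strings, and that the stray "assistant" comes from generation running past `<|eot_id|>` into a new assistant header. `truncate_at_stop` and `LLAMA3_STOP_STRINGS` are hypothetical names for illustration, not part of tabbyAPI or any UI.

```python
# Hypothetical helper: cut generated text at the first Llama 3 stop marker.
# Assumption: special tokens show up as literal strings in the output.
LLAMA3_STOP_STRINGS = ("<|eot_id|>", "<|end_of_text|>", "<|start_header_id|>")


def truncate_at_stop(text, stops=LLAMA3_STOP_STRINGS):
    """Return text up to (not including) the earliest stop string, if any."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest match across all stops
    return text[:cut]


raw = "I live.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nI live."
print(truncate_at_stop(raw))
```

If your UI lets you set custom stopping strings, adding the same strings there should have a similar effect without any post-processing.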

latest main results: main: build = 2709 (40f74e4d)

F16

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

<|begin_of_text|>
> repeat this: I live. 
I live.<|eot_id|>

> again.
I live.<|eot_id|>

>

IQ4_XS

<|begin_of_text|>
> repeat this once: I like soccer.
I like soccer.<|eot_id|>

> 

From my experience with some 70B Llama 3 GGUF quants, such issues disappear when using the full prompt format with the user/assistant headers, which Meta states is important for good results:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The presence or absence of newlines is also important. Double newlines are not necessary IME, but that's how Meta specified it.
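
If you are assembling the prompt by hand, the template above can be sketched as a small function. `build_llama3_prompt` is a made-up name, and this only covers a single system + user turn; it is a rough illustration of the layout, not an official implementation.

```python
# Hypothetical helper that fills in the Llama 3 template quoted above.
def build_llama3_prompt(system_prompt, user_message):
    """Build a single-turn Llama 3 prompt ending with an open assistant header."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        # Leave the assistant header open so the model writes the reply next.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


print(build_llama3_prompt("You are a helpful assistant.", "repeat this: I live."))
```

One thing to check: some backends prepend the BOS token on their own, so depending on your setup you may need to drop the leading `<|begin_of_text|>` to avoid sending it twice.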
