Disable Think text

#11
by rjsng0904 - opened

Hi, I am trying to use this model for a special case. Is there any way to disable the "Think" text generation, and would it result in faster inference? Thanks

I don't think it's possible to disable the thinking generation, as that's the way the model works: it outputs its reasoning first, and then from there it produces the actual response that you use.

Include an empty CoT in the prompt for text generation, like so:

<|User|>Hello!<|Assistant|><think>\n\n</think>\n\n
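If you're generating from Python with transformers, the trick looks something like the sketch below. This is a minimal sketch, assuming a DeepSeek-R1-style checkpoint; the model name is just a placeholder, and some newer template versions already append an opening <think> to the generation prompt, so check your rendered prompt first.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # placeholder: substitute your checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Render the conversation up to the assistant's turn marker...
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
)
# ...then append an empty CoT so the model treats its "thinking" as done.
prompt += "<think>\n\n</think>\n\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, i.e. the visible answer.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```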

Hello, I have a question. Why not place <think>\n\n</think> before <|Assistant|>, like this:

<|User|>Hello!<think>\n\n</think>\n\n<|Assistant|>

I'm curious about the difference between these two approaches, but I'm not sure if there are any official recommendations or references on this. Thank you!

The assistant uses the think-tags, not the user. The <|Assistant|> token acts as a label for conversation turns in the generated text, so that the model understands who is currently replying to whom.

According to the chat template, <think> and </think> belong in the assistant's domain. This is because the model was trained using reinforcement learning (RL) and one of the rewarded factors was its use of reasoning between think-tags during its conversation turns.

When an empty CoT is included in the prompt for text generation...

<|User|>Hello!<|Assistant|><think>\n\n</think>\n\n

...you're actually mimicking part of the model's response.

Now the model is tricked into believing that it has already written its own CoT, and it continues from there. This prevents the model from generating a new CoT, because it sees that one is already present, and no extra tokens are wasted.

Note: Wrappers like Ollama and LM Studio may not allow you to generate from the raw chat template, as these tokens are added automatically by the software. For this to work, you may have to write your own Python (examples are available on the model card) or dig into the advanced features of your application of choice.
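If you're unsure what your model's raw template actually looks like (and therefore where the empty CoT belongs), you can render it yourself. A small sketch to complement the one above, again with a placeholder checkpoint name:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")  # placeholder

# Render the template as plain text instead of token IDs, including the
# assistant turn marker, to see exactly what precedes generation.
print(tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
))
```

Printing this shows the special tokens your wrapper would normally inject, so you can tell whether an opening <think> is already part of the generation prompt before appending your own.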
