# Prompt Template

XTuner's prompt templates are designed to be consistent with the LLMs' official chat templates. Below, we elaborate on the template logic using the InternLM-Chat model (`internlm_chat`) as an example.

## Structure

```python
internlm_chat=dict(
    SYSTEM='<|System|>:{system}\n',
    INSTRUCTION='<|User|>:{input}<eoh>\n<|Bot|>:',
    SUFFIX='<eoa>',
    SUFFIX_AS_EOS=True,
    SEP='\n',
    STOP_WORDS=['<eoa>'])
```

- `SYSTEM`: The template for the "system" field during Q&A, where `{system}` represents the "system" text. Note that this field appears only once in a multi-turn dialogue, specifically in the first turn.

- `INSTRUCTION`: The template for the "instruction" field during Q&A, where `{input}` represents the user instruction text.

- `SUFFIX`: The suffix appended to the "response" of each Q&A turn. Typically, this also serves as a special ending symbol (*i.e.*, `eos`). Defaults to `''`.

- `SUFFIX_AS_EOS`: Whether the aforementioned suffix acts as the end-of-sequence symbol. If set to `True`, it replaces the `eos_token` of the `tokenizer`; otherwise, the `eos_token` of the `tokenizer` is still used to denote the end of a sequence. Defaults to `False`.

- `SEP`: The separator for multi-turn dialogues, appended after the `INSTRUCTION` and `SUFFIX`. Defaults to `''`.

- `STOP_WORDS`: The stop words, used during the text-generation stage. Note that the `eos_token` of the `tokenizer` is added to `STOP_WORDS` automatically, so there is no need to set it manually. The sketch after this list shows how these fields combine into a full prompt.
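
To make the assembly concrete, here is a minimal, illustrative sketch of how these fields could be combined into a single-turn prompt. The `build_single_turn` helper is hypothetical and not part of XTuner's API; the real pipeline also handles tokenization, label masking, and truncation.

```python
# Reference template, copied from the structure above.
template = dict(
    SYSTEM='<|System|>:{system}\n',
    INSTRUCTION='<|User|>:{input}<eoh>\n<|Bot|>:',
    SUFFIX='<eoa>',
    SUFFIX_AS_EOS=True,
    SEP='\n',
    STOP_WORDS=['<eoa>'])

def build_single_turn(system, user_input, response):
    # One turn: SYSTEM (first turn only) + INSTRUCTION + response + SUFFIX + SEP.
    prompt = template['SYSTEM'].format(system=system)
    prompt += template['INSTRUCTION'].format(input=user_input)
    prompt += response + template['SUFFIX'] + template['SEP']
    return prompt

print(build_single_turn('You are a helpful assistant.', 'Hi!', 'Hello!'))
# <|System|>:You are a helpful assistant.
# <|User|>:Hi!<eoh>
# <|Bot|>:Hello!<eoa>
```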

## Results

**Single-turn**

```
<|System|>:{system}
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
```

**Multi-turn**

```
<|System|>:{system}
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
```
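
Building on the sketch above (and reusing its `template` dict), multi-turn assembly repeats the `INSTRUCTION`/response/`SUFFIX` pattern, chaining turns with `SEP` while emitting `SYSTEM` only once. `build_multi_turn` is again a hypothetical helper, not part of XTuner's API.

```python
def build_multi_turn(system, turns, template):
    """Render a dialogue; `turns` is a list of (user_input, response) pairs."""
    # SYSTEM appears only once, in the first turn.
    prompt = template['SYSTEM'].format(system=system)
    for user_input, response in turns:
        # Each turn ends with SUFFIX, and SEP separates consecutive turns.
        prompt += template['INSTRUCTION'].format(input=user_input)
        prompt += response + template['SUFFIX'] + template['SEP']
    return prompt

dialog = [('Hi!', 'Hello!'), ('Who are you?', 'I am InternLM.')]
print(build_multi_turn('You are a helpful assistant.', dialog, template))
```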

## Choosing the prompt template

| Model                                    | Prompt Template |
| ---------------------------------------- | --------------- |
| baichuan-inc/Baichuan-7B                 | default\*       |
| baichuan-inc/Baichuan-13B-Base           | default\*       |
| baichuan-inc/Baichuan-13B-Chat           | baichuan_chat   |
| baichuan-inc/Baichuan2-7B-Base           | default\*       |
| baichuan-inc/Baichuan2-7B-Chat           | baichuan2_chat  |
| baichuan-inc/Baichuan2-13B-Base          | default\*       |
| baichuan-inc/Baichuan2-13B-Chat          | baichuan2_chat  |
| THUDM/chatglm2-6b                        | chatglm2        |
| THUDM/chatglm3-6b                        | chatglm3        |
| THUDM/chatglm3-6b-base                   | chatglm3        |
| deepseek-ai/deepseek-coder-6.7b-base     | deepseek_coder  |
| deepseek-ai/deepseek-coder-6.7b-instruct | deepseek_coder  |
| internlm/internlm-7b                     | default\*       |
| internlm/internlm-20b                    | default\*       |
| internlm/internlm-chat-7b                | internlm_chat   |
| internlm/internlm-chat-20b               | internlm_chat   |
| huggyllama/llama-7b                      | default         |
| meta-llama/Llama-2-7b-hf                 | llama2_chat     |
| meta-llama/Llama-2-7b-chat-hf            | llama2_chat     |
| meta-llama/Llama-2-70b-hf                | llama2_chat     |
| lmsys/vicuna-7b-v1.5                     | vicuna          |
| lmsys/vicuna-13b-v1.5                    | vicuna          |
| mistralai/Mistral-7B-v0.1                | mistral         |
| mistralai/Mixtral-8x7B-v0.1              | mixtral         |
| mistralai/Mixtral-8x7B-Instruct-v0.1     | mixtral         |
| Qwen/Qwen-1_8B                           | default\*       |
| Qwen/Qwen-1_8B-Chat                      | qwen_chat       |
| Qwen/Qwen-7B                             | default\*       |
| Qwen/Qwen-7B-Chat                        | qwen_chat       |
| Qwen/Qwen-72B                            | default\*       |
| Qwen/Qwen-72B-Chat                       | qwen_chat       |
| bigcode/starcoder                        | default         |
| 01-ai/Yi-6B                              | default         |
| 01-ai/Yi-34B                             | default         |
| HuggingFaceH4/zephyr-7b-beta             | zephyr          |
| deepseek-ai/deepseek-moe-16b-base        | deepseek_moe    |
| deepseek-ai/deepseek-moe-16b-chat        | deepseek_moe    |
| internlm/internlm2-1_8b                  | default\*       |
| internlm/internlm2-7b                    | default\*       |
| internlm/internlm2-20b                   | default\*       |
| internlm/internlm2-chat-1_8b             | internlm2_chat  |
| internlm/internlm2-chat-7b               | internlm2_chat  |
| internlm/internlm2-chat-20b              | internlm2_chat  |
| Qwen/Qwen1.5-0.5B                        | default\*       |
| Qwen/Qwen1.5-0.5B-Chat                   | qwen_chat       |
| Qwen/Qwen1.5-1.8B                        | default\*       |
| Qwen/Qwen1.5-1.8B-Chat                   | qwen_chat       |
| Qwen/Qwen1.5-4B                          | default\*       |
| Qwen/Qwen1.5-4B-Chat                     | qwen_chat       |
| Qwen/Qwen1.5-7B                          | default\*       |
| Qwen/Qwen1.5-7B-Chat                     | qwen_chat       |
| Qwen/Qwen1.5-14B                         | default\*       |
| Qwen/Qwen1.5-14B-Chat                    | qwen_chat       |
| Qwen/Qwen1.5-72B                         | default\*       |
| Qwen/Qwen1.5-72B-Chat                    | qwen_chat       |
| google/gemma-2b                          | default\*       |
| google/gemma-2b-it                       | gemma\*         |
| google/gemma-7b                          | default\*       |
| google/gemma-7b-it                       | gemma\*         |

\*: The official template has special tokens (like `<|im_start|>`, `<|im_end|>`) that were not trained during the pre-training phase. Therefore, these models utilize the `default` template.
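
Once a template is chosen from the table, it is referenced in an XTuner config via the `PROMPT_TEMPLATE` collection and passed to the dataset pipeline. The snippet below is a minimal sketch adapted from typical XTuner configs; the exact wiring (e.g., the `template_map_fn` field) may vary between config versions, so check the config you are working from.

```python
from xtuner.dataset.map_fns import template_map_fn_factory
from xtuner.utils import PROMPT_TEMPLATE

# Pick the template matching the base model, e.g. internlm/internlm-chat-7b.
prompt_template = PROMPT_TEMPLATE.internlm_chat

# Inside a config, the template is typically wired into the training dataset:
template_map_fn = dict(type=template_map_fn_factory, template=prompt_template)
```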