About the tokenizer files

#17
by evpeople - opened

vLLM recommends using the base model's tokenizer with GGUF models:

We recommend using the tokenizer from the base model to avoid a slow and buggy tokenizer conversion.

But Qwen 7B's tokenizer doesn't have  as a special token. I'm planning to try deepseek-ai/DeepSeek-R1-Distill-Qwen-7B, but I'd still hope deepsex can provide its tokenizer.

We recommend using the tokenizer from the base model instead of the GGUF model, because the tokenizer conversion from GGUF is time-consuming and unstable, especially for models with a large vocab size.
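Concretely, vLLM lets you point the `--tokenizer` flag at a separate Hugging Face repo while loading the GGUF weights. A minimal sketch, assuming the base is `Qwen/Qwen2.5-7B-Instruct` (substitute the actual base repo for this model) and a local GGUF path:

```shell
# Serve the GGUF weights while loading the tokenizer from the base model repo,
# skipping vLLM's GGUF-to-HF tokenizer conversion.
# model.gguf path and base repo name are placeholders, not from this thread.
vllm serve ./model.gguf \
  --tokenizer Qwen/Qwen2.5-7B-Instruct
```

The same idea works in the Python API via the `tokenizer=` argument to `vllm.LLM`.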

In theory the tokenizer is the same as the original model's, but it's best to ask @ValueFX9507 to confirm.
