LM Studio - Failed to Load Model - Unknown Pre-tokenizer Type 'gpt-4o'
@Mungert, hello!
I'm getting the following error in LM Studio when trying to load a model: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'gpt-4o'
Can you please help me understand why this happens and how I can fix it?
These GGUF versions of Phi-4-mini-instruct will only work with this build of llama.cpp: https://github.com/ns3284/llama.cpp/tree/master . Follow the instructions on the GitHub page (Building the project) to build llama.cpp. You will then need to replace the version of llama.cpp that LM Studio is using.
See here: https://www.reddit.com/r/LocalLLaMA/comments/1h5h3lp/can_i_change_the_llamacpp_version_used_by_lm/
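For background: the error means the GGUF file declares a pre-tokenizer (the tokenizer.ggml.pre metadata field is set to 'gpt-4o') that the llama.cpp build bundled with LM Studio does not recognise yet. If you want to confirm what your file declares, a quick check (untested; the filename below is just an example, use whichever quant you downloaded) with the gguf Python package that ships with llama.cpp:
pip install gguf
gguf-dump Phi-4-mini-instruct-Q4_K_M.gguf | grep tokenizer.ggml.pre
(On Windows, pipe through findstr instead of grep.)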
Or...
Here is some guidance from asking ChatGPT (untested!) how to replace the llama.cpp version that LM Studio uses.
To use a specific version of llama.cpp with LM Studio, follow these steps:
1. Identify LM Studio's llama.cpp Location
LM Studio ships with a built-in version of llama.cpp. To override it, you need to replace its binaries.
Mac (macOS)
cd ~/Library/Application\ Support/LM\ Studio/backend/bin
Windows
cd "%AppData%\LM Studio\backend\bin"
2. Back Up the Existing Binary
Before replacing, back up the current llama binary.
Mac/Linux
mv llama llama.bak
Windows (PowerShell)
ren llama.exe llama.bak
3. Build Your Desired Version of llama.cpp
Clone and compile your preferred version:
git clone https://github.com/ns3284/llama.cpp.git
cd llama.cpp
git checkout <branch-or-commit>
make -j$(nproc)
On Windows, compile using mingw32-make or MSVC.
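Note that recent llama.cpp revisions have switched from the Makefile to CMake, so if make does not work on your checkout, something like this (again untested against that fork) should produce the binaries under build/bin:
cmake -B build
cmake --build build --config Release -j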
4. Replace LM Studio's llama Binary
After building, copy your compiled llama binary to LM Studio’s backend:
Mac/Linux
cp ./main ~/Library/Application\ Support/LM\ Studio/backend/bin/llama
chmod +x ~/Library/Application\ Support/LM\ Studio/backend/bin/llama
Windows
copy .\main.exe "%AppData%\LM Studio\backend\bin\llama.exe"
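Before restarting, a quick sanity check that the replaced binary actually runs is worthwhile. Recent llama.cpp builds support --version; older ones may not, in which case running it with no arguments and checking that it prints usage is enough:
Mac/Linux
~/Library/Application\ Support/LM\ Studio/backend/bin/llama --version
Windows
"%AppData%\LM Studio\backend\bin\llama.exe" --version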
5. Restart LM Studio
Close LM Studio completely.
Reopen it, and it should now use your custom llama.cpp version.
Alternative: Use External llama.cpp
If you prefer, you can bypass LM Studio's internal llama.cpp and run your own instance manually:
./llama -m model.gguf --threads 8 --ctx-size 4096
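Depending on how new your llama.cpp checkout is, the binary may be named main or llama-cli rather than llama, and there is also llama-server if you would rather expose an OpenAI-compatible HTTP endpoint than chat in the terminal. For example (untested):
./llama-cli -m model.gguf --threads 8 --ctx-size 4096
./llama-server -m model.gguf --ctx-size 4096 --port 8080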
Phi-4-mini-instruct has now been added to the main branch of llama.cpp. You should be able to use this model with LM Studio if you enable the "Beta" Runtime Extension Pack. See this discussion of the "Beta" Runtime Extension Pack: