LM Studio - Failed to Load Model - Unknown Pre-tokenizer Type 'gpt-4o'
@Mungert, hello!
I'm getting the following error in LM Studio when trying to load a model: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'gpt-4o'
Can you please help me understand why this happens and how I can fix it?
These GGUF versions of Phi-4-mini-instruct will only work with this build of llama.cpp: https://github.com/ns3284/llama.cpp/tree/master . Follow the instructions on the GitHub page (Building the project) to build llama.cpp. You will then need to replace the version of llama.cpp that LM Studio is using.
See here: https://www.reddit.com/r/LocalLLaMA/comments/1h5h3lp/can_i_change_the_llamacpp_version_used_by_lm/
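For background: the error means the GGUF file declares a pre-tokenizer (the tokenizer.ggml.pre metadata field is set to 'gpt-4o') that the llama.cpp build bundled with LM Studio does not recognise yet. If you want to confirm what your file declares, a quick check (untested; the filename below is just an example, use whichever quant you downloaded) with the gguf Python package that ships with llama.cpp:
pip install gguf
gguf-dump Phi-4-mini-instruct-Q4_K_M.gguf | grep tokenizer.ggml.pre
(On Windows, pipe through findstr instead of grep.)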
Or...
Here is some guidance from asking ChatGPT (untested!) how to replace the llama.cpp version that LM Studio uses.
To use a specific version of llama.cpp with LM Studio, follow these steps:
1. Identify LM Studio's llama.cpp Location
LM Studio ships with a built-in version of llama.cpp. To override it, you need to replace its binaries.
Mac (macOS)
cd ~/Library/Application\ Support/LM\ Studio/backend/bin
Windows
cd "%AppData%\LM Studio\backend\bin"
2. Back Up the Existing Binary
Before replacing, back up the current llama binary.
Mac/Linux
mv llama llama.bak
Windows (PowerShell)
ren llama.exe llama.bak
3. Build Your Desired Version of llama.cpp
Clone and compile your preferred version:
git clone https://github.com/ns3284/llama.cpp.git
cd llama.cpp
git checkout <branch-or-commit>
make -j$(nproc)
On Windows, compile using mingw32-make or MSVC.
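Note that recent llama.cpp revisions have switched from the Makefile to CMake, so if make does not work on your checkout, something like this (again untested against that fork) should produce the binaries under build/bin:
cmake -B build
cmake --build build --config Release -j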
4. Replace LM Studio's llama Binary
After building, copy your compiled llama binary to LM Studio’s backend:
Mac/Linux
cp ./main ~/Library/Application\ Support/LM\ Studio/backend/bin/llama
chmod +x ~/Library/Application\ Support/LM\ Studio/backend/bin/llama
Windows
copy .\main.exe "%AppData%\LM Studio\backend\bin\llama.exe"
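Before restarting, a quick sanity check that the replaced binary actually runs is worthwhile. Recent llama.cpp builds support --version; older ones may not, in which case running it with no arguments and checking that it prints usage is enough:
Mac/Linux
~/Library/Application\ Support/LM\ Studio/backend/bin/llama --version
Windows
"%AppData%\LM Studio\backend\bin\llama.exe" --version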
5. Restart LM Studio
Close LM Studio completely.
Reopen it, and it should now use your custom llama.cpp version.
Alternative: Use External llama.cpp
If you prefer, you can bypass LM Studio's internal llama.cpp and run your own instance manually:
./llama -m model.gguf --threads 8 --ctx-size 4096
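Depending on how new your llama.cpp checkout is, the binary may be named main or llama-cli rather than llama, and there is also llama-server if you would rather expose an OpenAI-compatible HTTP endpoint than chat in the terminal. For example (untested):
./llama-cli -m model.gguf --threads 8 --ctx-size 4096
./llama-server -m model.gguf --ctx-size 4096 --port 8080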
Phi-4-mini-instruct has now been added to the main branch of llama.cpp. You should be able to use this model with LM Studio if you enable the "Beta" Runtime Extension Pack. See this discussion of the "Beta" Runtime Extension Pack: