3-bit GPTQ quants for Mistral-Large-Instruct-2411
#1 by dazipe - opened
Do you by any chance have 3-bit quants for the newer Mistral-Large-Instruct-2411?
I have 2 x MI100 with 64 GB of VRAM total, and only GGUF quants are available that would fit my setup. But those are slow with vLLM, especially in batched mode.
Hello @dazipe,
Unfortunately, no. I only converted 2407, since it had better benchmark scores on HumanEval and LiveBench. I also have 2 x MI60, and I converted that model on a rented vast.ai instance for about $20. The conversion takes around 20 hours, but the result and performance are great.
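For anyone who wants to run the conversion themselves, a minimal sketch using the AutoGPTQ library is below. This is an assumed workflow, not the exact script used above: the group size, calibration text, and output directory are placeholders, and a real run needs a large-VRAM GPU, a proper calibration set, and many hours for a model this size.

```python
# Sketch: 3-bit GPTQ conversion with AutoGPTQ (assumed workflow, not the
# poster's exact script). A 123B model needs a big GPU box and many hours.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "mistralai/Mistral-Large-Instruct-2411"  # gated repo; accept the license first

quantize_config = BaseQuantizeConfig(
    bits=3,          # 3-bit weights, as asked for in the question
    group_size=128,  # common default; smaller groups trade file size for quality
    desc_act=True,   # activation-order quantization usually helps accuracy
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Placeholder: a real run should use a few hundred representative samples.
examples = [tokenizer("Placeholder calibration text for the quantizer.")]

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)
model.save_quantized("Mistral-Large-Instruct-2411-GPTQ-3bit")
```

The saved directory can then be loaded by vLLM, which supports GPTQ checkpoints.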