3-bit GPTQ quants for Mistral-Large-Instruct-2411
#1 by dazipe - opened
Do you by any chance have 3-bit quants for the newer Mistral-Large-Instruct-2411?
I have 2 x MI100 with 64 GB of VRAM total, and only GGUF quants are available that would fit my setup. But those are slow with vLLM, especially in batched mode.
Hello @dazipe,
Unfortunately, no. I only converted 2407, since it had better benchmark scores on HumanEval and LiveBench. I also have 2 x MI60, and I converted that model on a rented vast.ai instance for about $20. The conversion takes around 20 hours, but the result and performance are great.
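For anyone who wants to run the conversion themselves, a minimal sketch using the AutoGPTQ library is below. This is an assumed workflow, not the exact script used above: the group size, calibration text, and output directory are placeholders, and a real run needs a large-VRAM GPU, a proper calibration set, and many hours for a model this size.

```python
# Sketch: 3-bit GPTQ conversion with AutoGPTQ (assumed workflow, not the
# poster's exact script). A 123B model needs a big GPU box and many hours.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "mistralai/Mistral-Large-Instruct-2411"  # gated repo; accept the license first

quantize_config = BaseQuantizeConfig(
    bits=3,          # 3-bit weights, as asked for in the question
    group_size=128,  # common default; smaller groups trade file size for quality
    desc_act=True,   # activation-order quantization usually helps accuracy
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Placeholder: a real run should use a few hundred representative samples.
examples = [tokenizer("Placeholder calibration text for the quantizer.")]

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)
model.save_quantized("Mistral-Large-Instruct-2411-GPTQ-3bit")
```

The saved directory can then be loaded by vLLM, which supports GPTQ checkpoints.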