Perplexity

#22
by gsaivinay - opened

Hello Mr. Bloke,

I assume you must be busy, but I'd like to ask whether you have made any perplexity comparisons for the Llama 2 70B chat model vs the GPTQ quants, like you did here. I'm just curious how the performance of the bigger quantized models compares to fp16.

No worries if this is not on your radar.
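In case it helps frame the question, here is a minimal sketch of how such a comparison could be run: compute perplexity on a held-out corpus once with the fp16 repo and once with the GPTQ repo, and compare the two numbers. The dataset, chunk length, and the commented-out model IDs are placeholder assumptions, not TheBloke's actual evaluation setup.

```python
# Rough perplexity comparison sketch (assumes transformers, datasets,
# and the GPTQ runtime deps are installed). Not an official methodology.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model_id: str, max_length: int = 2048, device: str = "cuda") -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    model.eval()

    # WikiText-2 test split, concatenated into one long string.
    text = "\n\n".join(
        load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"]
    )
    input_ids = tokenizer(text, return_tensors="pt").input_ids

    nll_sum, n_tokens = 0.0, 0
    # Non-overlapping chunks; the HF sliding-window variant is more precise.
    for begin in range(0, input_ids.size(1), max_length):
        chunk = input_ids[:, begin : begin + max_length].to(device)
        if chunk.size(1) < 2:
            break
        with torch.no_grad():
            loss = model(chunk, labels=chunk).loss  # mean NLL over chunk
        predicted = chunk.size(1) - 1  # labels are shifted by one position
        nll_sum += loss.item() * predicted
        n_tokens += predicted
    return float(torch.exp(torch.tensor(nll_sum / n_tokens)))

# Hypothetical model IDs -- substitute the fp16 and GPTQ repos you want to compare.
# for mid in ["meta-llama/Llama-2-70b-chat-hf", "TheBloke/Llama-2-70B-chat-GPTQ"]:
#     print(mid, perplexity(mid))
```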
