How to quantize the hunyuan model to fp8

by hz094 - opened Dec 25, 2024

Discussion

hz094

Dec 25, 2024

Hi sir, thanks for the excellent work. I am curious about how you quantized the hunyuan model; could you share more details?

calcuis

Owner Dec 25, 2024

edited Jan 27

you need torch and llama.cpp; you could try converting the safetensors to gguf and testing it first; simply execute: ggc t

actually, if you just want fp8, the updated node has a tool called tensor cutter, which makes it easy to build your own fp8 scaled model (about 50% smaller in file size); you don't need llama.cpp or any extra dependency in that case

calcuis changed discussion status to closed Jan 27

calcuis changed discussion status to open Jan 27
