You can save the quantized weights locally or push them to the Hub.
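As a rough illustration of the two options, the sketch below wraps them in one helper. It assumes `model` and `tokenizer` are objects exposing `save_pretrained` and `push_to_hub`, as Transformers models and tokenizers do. The helper name `save_or_push` and its parameters are hypothetical and exist only for this example.

```python
from typing import Any, Optional


def save_or_push(
    model: Any,
    tokenizer: Any,
    local_dir: str,
    repo_id: Optional[str] = None,
) -> None:
    """Save quantized weights locally and, if repo_id is given, push them to the Hub.

    Hypothetical helper: assumes both objects expose `save_pretrained`
    and `push_to_hub`, as Transformers models and tokenizers do.
    """
    # Always write a local copy of the weights and tokenizer files.
    model.save_pretrained(local_dir)
    tokenizer.save_pretrained(local_dir)

    # Upload only when a target repository is named.
    if repo_id is not None:
        model.push_to_hub(repo_id)
        tokenizer.push_to_hub(repo_id)
```

In practice the model would typically be loaded with a quantization config before calling the helper, and pushing requires being authenticated with the Hub.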
Make sure the package that contains the quantization kernels/primitives is stable (no frequent breaking changes).
Some quantization methods (e.g., AWQ) may require "pre-quantizing" the model through data calibration.
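To give a feel for what data calibration means, here is a minimal toy sketch. It is not AWQ and not any library's API: it is a generic symmetric absmax scheme in which sample data is used to pick a quantization scale before quantizing. All function names are hypothetical.

```python
from typing import Iterable, List


def calibrate_scale(calibration_batches: Iterable[List[float]], num_bits: int = 8) -> float:
    """Derive a symmetric quantization scale from sample data (toy absmax calibration)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    abs_max = 0.0
    for batch in calibration_batches:
        for value in batch:
            abs_max = max(abs_max, abs(value))
    # Fall back to a scale of 1.0 if the calibration data is all zeros.
    return abs_max / qmax if abs_max > 0.0 else 1.0


def quantize(values: List[float], scale: float, num_bits: int = 8) -> List[int]:
    """Map floats to signed integers, clamping values outside the calibrated range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]
```

The point of the sketch is that the scale depends on the calibration data, so such methods need a representative dataset before the quantized weights can be produced and shared.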