You can save the quantized weights locally or push them to the Hub.
Make sure the package that contains the quantization kernels/primitives is stable (no frequent breaking changes).
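
As a rough illustration of the two export options, here is a minimal sketch. It assumes the quantized model object exposes `save_pretrained` and `push_to_hub`, as Transformers models do; the helper name `export_quantized` and its parameters are hypothetical and only serve the example.

```python
from typing import List, Optional


def export_quantized(
    model,
    local_dir: Optional[str] = None,
    repo_id: Optional[str] = None,
) -> List[str]:
    """Save a quantized model locally and/or push it to the Hub.

    `model` is assumed to provide `save_pretrained(path)` and
    `push_to_hub(repo_id)`. Returns the destinations that were written.
    """
    destinations = []
    if local_dir is not None:
        # Write the quantized weights and config to a local directory.
        model.save_pretrained(local_dir)
        destinations.append(local_dir)
    if repo_id is not None:
        # Upload the same artifacts to a Hub repository (needs authentication).
        model.push_to_hub(repo_id)
        destinations.append(repo_id)
    return destinations
```

With a real quantized model, a call could look like `export_quantized(model, local_dir="my-quantized-model", repo_id="your-username/my-quantized-model")`, where both names are placeholders.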

Some quantization methods (e.g., AWQ) require "pre-quantizing" the model through data calibration.
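
To make the idea of data calibration concrete, the sketch below derives a quantization scale from a few calibration batches and then reuses it. This is a deliberately simplified absolute-maximum scheme, not the AWQ algorithm; the function names and the 8-bit symmetric setup are assumptions chosen for illustration.

```python
def calibrate_scale(calibration_batches, num_bits=8):
    """Derive a symmetric scale from the largest magnitude seen in calibration data."""
    max_abs = max(abs(x) for batch in calibration_batches for x in batch)
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    # Fall back to 1.0 so an all-zero calibration set does not divide by zero.
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to signed integers, clamping to the representable range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]
```

The scale is computed once, ahead of time, from representative data, which is why methods of this kind need a calibration pass before the quantized weights can be saved or shared.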