You can save the quantized weights locally or push them to the Hub.
Make sure the package that contains the quantization kernels/primitives is stable (no frequent breaking changes).
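
As a rough illustration of the first point, saving locally and pushing to the Hub could be wrapped as below. This is a minimal sketch that assumes the model object exposes `save_pretrained` and `push_to_hub`, as Transformers `PreTrainedModel` instances do. The helper name `persist_quantized` and its arguments are hypothetical and exist only for this example.

```python
from typing import List, Optional


def persist_quantized(
    model,
    local_dir: Optional[str] = None,
    repo_id: Optional[str] = None,
) -> List[str]:
    """Save a quantized model locally and/or push it to the Hub.

    Assumption: `model` provides `save_pretrained(directory)` and
    `push_to_hub(repo_id)`, like a Transformers `PreTrainedModel`.
    Returns the list of actions that were performed.
    """
    actions = []
    if local_dir is not None:
        # Write the quantized weights and config to a local directory.
        model.save_pretrained(local_dir)
        actions.append("saved")
    if repo_id is not None:
        # Upload the same artifacts to a Hub repository.
        model.push_to_hub(repo_id)
        actions.append("pushed")
    return actions
```

With a real quantized model, a call might look like `persist_quantized(model, local_dir="./my-quantized-model")`. The directory name is a placeholder, and pushing to the Hub generally requires being authenticated first.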

Some quantization methods (e.g., AWQ) may require "pre-quantizing" the model through data calibration.
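
To make the idea of data calibration concrete, here is a simplified sketch in plain Python. It is not the AWQ algorithm. It only shows the general pattern of observing sample values first and then deriving a quantization scale from them, here with a symmetric absolute-maximum rule. All function names are hypothetical.

```python
def calibrate_scale(calibration_batches, num_bits=8):
    """Derive a symmetric quantization scale from calibration samples."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    # Largest magnitude seen across all calibration batches.
    max_abs = max(
        (abs(v) for batch in calibration_batches for v in batch),
        default=0.0,
    )
    # Fall back to 1.0 when no non-zero value was observed.
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to signed integers, clamping to the representable range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Recover approximate floats from the quantized integers."""
    return [q * scale for q in q_values]
```

The scale is fixed once from the calibration data and then reused, which is why such methods need a representative dataset before the quantized weights can be saved. Values outside the calibrated range are clamped, so unrepresentative calibration data increases quantization error.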