inference-optimization's collections (4)

Qwen3-Next-80B-A3B Quantized Models
FP8-dynamic, FP8-block, NVFP4, INT4, and INT8 versions of the Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking models
Mixed Precision Models
Collection of Mixed Precision LLaMA and Qwen Models
KV Cache Quantization
Collection on FP8 Quantization of Weights, Activations and KV Cache