google/gemma-2-9b-it, UQFF quantization

Run with mistral.rs. Documentation: UQFF docs.

  1. Flexible 🌀: Multiple quantization formats in one file format with one framework to run them all.
  2. Reliable 🔒: Compatibility ensured with embedded and checked semantic versioning information from day 1.
  3. Easy 🤗: Download UQFF models easily and quickly from Hugging Face, or use a local file.
  4. Customizable 🛠️: Make and publish your own UQFF files in minutes.
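Item 4 refers to mistral.rs's ability to serialize a UQFF file while quantizing a model in place. A minimal sketch, assuming the `--isq` and `--write-uqff` options described in the UQFF docs (flag placement and names may differ across mistral.rs versions; check `./mistralrs-server --help`):

```shell
# Sketch: quantize google/gemma-2-9b-it to Q4K via in-situ quantization
# and write the result out as a UQFF file for later reuse or publishing.
./mistralrs-server --isq Q4K plain \
  -m google/gemma-2-9b-it \
  --write-uqff gemma2-9b-instruct-q4k.uqff
```

The written file can then be loaded back with `--from-uqff`, as in the examples below.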

Examples

| Quantization type(s) | Example |
| --- | --- |
| FP8 | `./mistralrs-server -i plain -m EricB/gemma-2-9b-it-UQFF --from-uqff gemma2-9b-instruct-f8e4m3.uqff` |
| HQQ4 | `./mistralrs-server -i plain -m EricB/gemma-2-9b-it-UQFF --from-uqff gemma2-9b-instruct-hqq4.uqff` |
| HQQ8 | `./mistralrs-server -i plain -m EricB/gemma-2-9b-it-UQFF --from-uqff gemma2-9b-instruct-hqq8.uqff` |
| Q3K | `./mistralrs-server -i plain -m EricB/gemma-2-9b-it-UQFF --from-uqff gemma2-9b-instruct-q3k.uqff` |
| Q4K | `./mistralrs-server -i plain -m EricB/gemma-2-9b-it-UQFF --from-uqff gemma2-9b-instruct-q4k.uqff` |
| Q5K | `./mistralrs-server -i plain -m EricB/gemma-2-9b-it-UQFF --from-uqff gemma2-9b-instruct-q5k.uqff` |
| Q8_0 | `./mistralrs-server -i plain -m EricB/gemma-2-9b-it-UQFF --from-uqff gemma2-9b-instruct-q8_0.uqff` |
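Once one of the commands above has loaded the model, mistral.rs serves an OpenAI-compatible HTTP API. A hedged example of querying it with curl; the port (1234) and the `model` value are assumptions here, so match them to your server's startup log and configuration:

```shell
# Send a single chat completion request to a locally running
# mistralrs-server instance (assumed to listen on port 1234).
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-2-9b-it",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```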
Model tree for EricB/gemma-2-9b-it-UQFF

Base model: google/gemma-2-9b (quantized in this model)