Active filters: gptq
Model • Task • Downloads • Likes
pentagoniac/SEMIKONG-8b-GPTQ • Text Generation • 768 • 27
ChenMnZ/Llama-2-7b-EfficientQAT-w2g64-GPTQ • Text Generation • 80 • 1
shuyuej/Mixtral-8x22B-Instruct-v0.1-GPTQ • Text Generation • 8 • 1
ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit • Text Generation • 1.45k • 4
shuyuej/Mistral-Nemo-Instruct-2407-GPTQ • Text Generation • 12k • 5
hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4 • Text Generation • 5.35k • 23
hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 • Text Generation • 12.5k • 22
neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16 • Text Generation • 133k • 24
IntelLabs/sqft-qa-sparsepeft-mistral-7b-v0.3-50-gptq-gsm8k-heu • Text Generation • 316 • 2
IntelLabs/sqft-qa-sparsepeft-mistral-7b-v0.3-50-gptq-math-heu • Text Generation • 174 • 3
IntelLabs/sqft-qa-sparsepeft-phi-3-mini-4k-50-gptq-math-heu • Text Generation • 175 • 2
IntelLabs/sqft-qa-sparsepeft-phi-3-mini-4k-50-gptq-cs-heu • Text Generation • 448 • 2
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4 • Image-Text-to-Text • 54.4k • 35
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 3.82k • 29
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4 • Image-Text-to-Text • 6.63k • 24
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 1.5k • 14
alexwww94/glm-4v-9b-gptq-4bit • (untagged) • 245 • 7
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 • Image-Text-to-Text • 78.8k • 27
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 1.88k • 11
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4 • Text Generation • 2.52k • 6
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8 • Text Generation • 2.36k • 9
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4 • Text Generation • 2.31k • 1
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8 • Text Generation • 737 • 3
Qwen/Qwen2.5-3B-Instruct-GPTQ-Int4 • Text Generation • 49.4k • 2
Qwen/Qwen2.5-3B-Instruct-GPTQ-Int8 • Text Generation • 1.09k • 3
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 • Text Generation • 50.1k • 17
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 • Text Generation • 29.4k • 14
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 • Text Generation • 69.3k • 16
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8 • Text Generation • 21.9k • 17
Qwen/Qwen2.5-32B-Instruct-GPTQ-Int8 • Text Generation • 97.2k • 10
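The checkpoints above are GPTQ-quantized repositories, so they can generally be loaded through the standard transformers API once a GPTQ backend is installed. The sketch below is a minimal, non-authoritative example assuming transformers, optimum, and a GPTQ backend (such as gptqmodel or auto-gptq) plus a CUDA GPU; it uses hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 from the list, and any other text-generation entry could be substituted.

```python
# Minimal sketch: load a 4-bit GPTQ checkpoint from the list and generate text.
# Assumes: transformers + optimum + a GPTQ backend installed, and a CUDA GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The GPTQ quantization settings are read from the repo's config.json,
# so no extra quantization arguments are passed here.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is GPTQ quantization?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The Image-Text-to-Text entries (the Qwen2-VL GPTQ repositories) additionally require their multimodal processor and model classes rather than AutoModelForCausalLM; their model cards describe the exact loading code.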