Active filters: gptq
Model • Task • Downloads • Likes

TheBloke/saiga_mistral_7b-GPTQ • Text Generation • 386 • 8
TheBloke/deepseek-llm-67b-chat-GPTQ • Text Generation • 74 • 7
TheBloke/deepseek-llm-7B-chat-GPTQ • Text Generation • 577 • 1
Pi3141/alpaca-7b-native-enhanced-GPTQ • Text Generation • 2
TheBloke/dolphin-2.5-mixtral-8x7b-GPTQ • Text Generation • 161 • 110
TheBloke/GEITje-7B-chat-GPTQ • Text Generation • 36 • 4
astronomer/Llama-3-8B-Instruct-GPTQ-4-Bit • Text Generation • 8.86k • 25
neuralmagic/Mistral-7B-Instruct-v0.3-GPTQ-4bit • Text Generation • 1.1k • 18
allganize/Llama-3-Alpha-Ko-8B-Instruct-marlin • Text Generation • 19 • 5
Qwen/Qwen2-7B-Instruct-GPTQ-Int4 • Text Generation • 1.97k • 24
neuralmagic/Meta-Llama-3-70B-Instruct-quantized.w8a16 • Text Generation • 493 • 4
AI-MO/NuminaMath-7B-TIR-GPTQ • Text Generation • 301 • 7
pentagoniac/SEMIKONG-8b-GPTQ • Text Generation • 859 • 26
ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit • Text Generation • 1.38k • 4
shuyuej/Mistral-Nemo-Instruct-2407-GPTQ • Text Generation • 1.33k • 5
neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16 • Text Generation • 488k • 24
neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w4a16 • Text Generation • 21.6k • 30
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4 • Image-Text-to-Text • 133k • 32
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 6.5k • 26
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 4.05k • 13
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 • Image-Text-to-Text • 173k • 23
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 2.05k • 10
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8 • Text Generation • 37.3k • 8
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 • Text Generation • 11.9k • 14
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 • Text Generation • 11.6k • 12
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 • Text Generation • 12.9k • 14
Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4 • Text Generation • 70.9k • 24
Qwen/Qwen2.5-72B-Instruct-GPTQ-Int4 • Text Generation • 24.3k • 32
Qwen/Qwen2.5-72B-Instruct-GPTQ-Int8 • Text Generation • 4.83k • 18
Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4 • Text Generation • 8.78k • 4
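Any checkpoint in the list above can be pulled from the Hub by its repo id. The snippet below is a minimal sketch, assuming a CUDA-capable GPU and a GPTQ backend (e.g. `auto-gptq` or `gptqmodel`) installed alongside `transformers` and `optimum`; the model id shown is one entry from the list and can be swapped for any other.

```python
# Minimal sketch: load and run one of the GPTQ-quantized checkpoints listed above.
# Assumes `transformers`, `optimum`, and a GPTQ backend (auto-gptq or gptqmodel)
# are installed and a CUDA GPU is available.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4"  # any repo id from the list

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat prompt and generate a short completion.
messages = [{"role": "user", "content": "Explain GPTQ quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```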