Models: 156

Active filters: 2210.17323

malhajar/Platypus2-70B-instruct-4bit-gptq

Text Generation • Updated Jan 29, 2024 • 1.91k

clibrain/Llama-2-7b-ft-instruct-es-gptq-4bit

Text Generation • Updated Sep 1, 2023 • 60 • 9

clibrain/Llama-2-13b-ft-instruct-es-gptq-4bit

Text Generation • Updated Sep 4, 2023 • 21 • 3

daedalus314/Griffin-3B-GPTQ

Text Generation • Updated Sep 8, 2023 • 12

daedalus314/Marx-3B-V2-GPTQ

Text Generation • Updated Oct 12, 2023 • 65

TRAC-MTRY/traclm-v2-7b-instruct-GPTQ

Text Generation • Updated Dec 22, 2023

iproskurina/bloom-1b7-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 11

iproskurina/bloom-3b-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 42

iproskurina/bloom-560m-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 8

iproskurina/bloom-1b1-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 22

iproskurina/bloom-7b1-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 32 • 2

iproskurina/opt-350m-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 14

iproskurina/opt-1.3b-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 20

iproskurina/opt-2.7b-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 56

iproskurina/opt-6.7b-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 37

iproskurina/opt-13b-GPTQ-4bit-g128

Text Generation • Updated Sep 24, 2024 • 19

neuralmagic/zephyr-7b-beta-marlin

Text Generation • Updated Mar 6, 2024 • 199

neuralmagic/TinyLlama-1.1B-Chat-v1.0-marlin

Text Generation • Updated Mar 6, 2024 • 4.59k • 1

neuralmagic/OpenHermes-2.5-Mistral-7B-marlin

Text Generation • Updated Mar 6, 2024 • 787 • 2

neuralmagic/Nous-Hermes-2-Yi-34B-marlin

Text Generation • Updated Mar 6, 2024 • 11 • 5

softmax/Llama-2-70b-chat-hf-marlin

Text Generation • Updated Mar 17, 2024 • 12

softmax/falcon-180B-chat-marlin

Text Generation • Updated Mar 21, 2024 • 8

smpanaro/Llama-2-7b-NuGPTQ

Text Generation • Updated Oct 12, 2024 • 14 • 1

TRAC-MTRY/traclm-v3-7b-instruct-GPTQ

Text Generation • Updated May 2, 2024

astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit

Text Generation • Updated Apr 22, 2024 • 495 • 25

astronomer/Llama-3-8B-GPTQ-8-Bit

Text Generation • Updated Apr 22, 2024 • 64 • 2

astronomer/Llama-3-8B-GPTQ-4-Bit

Text Generation • Updated Apr 22, 2024 • 228 • 6

SwastikM/Llama-2-7B-Chat-text2code

Text Generation • Updated May 19, 2024 • 52 • 4

davidxmle/Llama-3-8B-Instruct-GPTQ-4-Bit-Debug

Text Generation • Updated Apr 30, 2024 • 43

drbh/flash-attention-pre-compile

Updated May 29, 2024