Models (284)

Active filters: llama.cpp

zhhan/Phi-3-mini-4k-instruct_gguf_derived

Summarization • 4B • Updated Jul 2, 2024 • 19

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k-AWQ

Text Generation • 2B • Updated Jul 9, 2024 • 4

mgonzs13/stablelm-zephyr-3B-localmentor-GGUF

Text Generation • 3B • Updated Jul 3, 2024 • 11

google/gemma-2-2b-it-GGUF

3B • Updated Aug 27, 2024 • 97 • 83

google/gemma-2-2b-GGUF

3B • Updated Aug 2, 2024 • 69 • 16

akshathmangudi/llama3.1-8b-gguf

Updated Jul 26, 2024

dahara1/llama-translate-gguf

8B • Updated Aug 14, 2024 • 473 • 15

jhilburn/gemma-inference

Text Generation • Updated Aug 7, 2024

ghost-x/ghost-8b-beta-1608-gguf

Text Generation • 8B • Updated Aug 26, 2024 • 165 • 6

PaulJusst/codegemma-7b-it-GGUF

Text Generation • 9B • Updated Sep 13, 2024

TheCluster/Llama-3.2-3B-Instruct-GGUF

Text Generation • 3B • Updated Sep 25, 2024 • 13

v000000/Typhon-Mixtral-v1-imatrix-v2.Q6_K-GGUF

Updated Sep 26, 2024 • 1 • 1

LPN64/LongCite-llama3.1-8b-GGUF

Text Generation • 8B • Updated Oct 1, 2024 • 57 • 6

cstr/Ministral-8B-Instruct-2410-GGUF

8B • Updated Oct 17, 2024 • 9 • 1

mrcuddle/Lumimaid-v0.2-12B-Q4_K_M-GGUF

Text Generation • 12B • Updated Oct 20, 2024 • 2

Manel/Llama-3.1-8B-Instruct-Q4_K_M-GGUF

8B • Updated Nov 3, 2024 • 2

Manel/Llama-2-13b-chat-hf-Q4_0-GGUF

Text Generation • 13B • Updated Nov 3, 2024 • 360

dumb-dev/flan-t5-xxl-gguf

11B • Updated Oct 29, 2024 • 924 • 17

Manel/gemma-2-9b-Q4_0-GGUF

9B • Updated Nov 3, 2024 • 2

DiYaZeN/aya-sl-biz-8b

Text Generation • 8B • Updated Oct 31, 2024 • 2

shreyasmeher/ConflLlama

Text Classification • 8B • Updated Jul 8 • 50 • 3

dwikitheduck/gen-try1-Q4_K_M-GGUF

15B • Updated Nov 11, 2024

real-jiakai/Arxiver-Llama-GGUF

8B • Updated Nov 15, 2024 • 20

shreyasmeher/ConflLlama-Alt

Text Classification • 8B • Updated Nov 19, 2024 • 21 • 1

XeAI/LLaMa_3.2_3B_Instruct_Text2SQL-Q4_K_M-GGUF

Text Generation • 3B • Updated Nov 17, 2024 • 14

dwikitheduck/gen-sql-1-Q4_K_M-GGUF

8B • Updated Nov 18, 2024 • 1

jsjeon/SummLlama3.2-3B-Q4_K_M-GGUF

Updated Nov 19, 2024

dwikitheduck/gen-inst-1-Q4_K_M-GGUF

15B • Updated Nov 25, 2024

Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct-GGUF

2B • Updated Nov 26, 2024 • 175 • 4

McaTech/Nonet

Text Generation • 0.1B • Updated Jun 30 • 186 • 3