<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <script>
      // Parse an HTML source string into a detached Document (used for the
      // TechPowerUp GPU-search response, which arrives as an HTML page).
      function strToHtml(str) {
        let parser = new DOMParser();
        return parser.parseFromString(str, "text/html");
      }

      // Short, jQuery-independent function to read html tables and write them
      // into an Array of objects. Row 0 supplies the property names; every
      // following row becomes one object.
      // Kudos to RobG at StackOverflow
      function tableToObj(table) {
        var rows = table.rows;
        var propCells = rows[0].cells;
        var propNames = [];
        var results = [];
        var obj, cells;

        // Use the first row for the property names
        // Could use a header section but result is the same if
        // there is only one header row
        for (var i = 0, iLen = propCells.length; i < iLen; i++) {
          propNames.push(
            (propCells[i].textContent || propCells[i].innerText).trim()
          );
        }

        // Use the rows for data
        // Could use tbody rows here to exclude header & footer
        // but starting from 1 gives required result
        for (var j = 1, jLen = rows.length; j < jLen; j++) {
          cells = rows[j].cells;
          obj = {};
          for (var k = 0; k < iLen; k++) {
            obj[propNames[k]] = (
              cells[k].textContent || cells[k].innerText
            ).trim();
          }
          results.push(obj);
        }
        return results;
      }

      // Render TechPowerUp GPU rows as "<Product Name> - <memory>" strings for
      // the datalist; the Memory column looks like "24 GB, 384 bit", so the
      // part before the first comma is kept.
      function formatGpu(gpus) {
        return gpus.map(
          (g) => `${g["Product Name"]} - ${g["Memory"].split(",")[0]}`
        );
      }

      // Bits per weight of each llama.cpp GGUF quantisation type.
      const gguf_quants = {
        "IQ1_S": 1.56,
        "IQ2_XXS": 2.06,
        "IQ2_XS": 2.31,
        "IQ2_S": 2.5,
        "IQ2_M": 2.7,
        "IQ3_XXS": 3.06,
        "IQ3_XS": 3.3,
        "Q2_K": 3.35,
        "Q3_K_S": 3.5,
        "IQ3_S": 3.5,
        "IQ3_M": 3.7,
        "Q3_K_M": 3.91,
        "Q3_K_L": 4.27,
        "IQ4_XS": 4.25,
        "IQ4_NL": 4.5,
        "Q4_0": 4.55,
        "Q4_K_S": 4.58,
        "Q4_K_M": 4.85,
        "Q5_0": 5.54,
        "Q5_K_S": 5.54,
        "Q5_K_M": 5.69,
        "Q6_K": 6.59,
        "Q8_0": 8.5,
      }

      // Fetch `config.json` for a Hugging Face model and attach the model's
      // parameter count as `config.parameters`.
      // Parameter-count sources, in order of preference:
      //   1. model.safetensors.index.json  metadata.total_size / 2
      //   2. pytorch_model.bin.index.json  metadata.total_size / 2
      //      (total_size is bytes; / 2 assumes fp16, i.e. 2 bytes per weight)
      //   3. scraping the parameter total off the public model page via a CORS proxy
      async function modelConfig(hf_model, hf_token) {
        // BUG FIX: `auth` was assigned without a declaration (implicit global).
        const auth = hf_token == ""
          ? {}
          : { headers: { 'Authorization': `Bearer ${hf_token}` } }
        let config = await fetch(
          `https://huggingface.co/${hf_model}/raw/main/config.json`,
          auth
        ).then(r => r.json())
        let model_size = 0
        try {
          model_size = (await fetch(`https://huggingface.co/${hf_model}/resolve/main/model.safetensors.index.json`, auth).then(r => r.json()))["metadata"]["total_size"] / 2
          if (isNaN(model_size)) {
            // BUG FIX: was `new Erorr(...)` — the typo itself threw a
            // ReferenceError; the intent (fall through to the next source) is
            // preserved with a real Error.
            throw new Error("no size in safetensors metadata")
          }
        } catch (e) {
          try {
            model_size = (await fetch(`https://huggingface.co/${hf_model}/resolve/main/pytorch_model.bin.index.json`, auth).then(r => r.json()))["metadata"]["total_size"] / 2
            if (isNaN(model_size)) {
              // BUG FIX: same `Erorr` typo as above.
              throw new Error("no size in pytorch metadata")
            }
          } catch {
            // Last resort: scrape the parameter count from the model page itself.
            let model_page = await fetch(
              "https://corsproxy.io/?" + encodeURIComponent(`https://huggingface.co/${hf_model}`)
            ).then(r => r.text())
            let el = document.createElement( 'html' );
            el.innerHTML = model_page
            let params_el = el.querySelector('div[data-target="ModelSafetensorsParams"]')
            if (params_el !== null) {
              model_size = JSON.parse(params_el.attributes.getNamedItem("data-props").value)["safetensors"]["total"]
            } else {
              params_el = el.querySelector('div[data-target="ModelHeader"]')
              model_size = JSON.parse(params_el.attributes.getNamedItem("data-props").value)["model"]["safetensors"]["total"]
            }
          }
        }
        config.parameters = model_size
        return config
      }

      /* Size of llama.cpp's input buffers for a given context length and batch size.
         Calculation taken from github:ggerganov/llama.cpp/llama.cpp:11248
           ctx->inp_tokens  = ggml_new_tensor_1d(ctx->ctx_input, GGML_TYPE_I32, cparams.n_batch);
           ctx->inp_embd    = ggml_new_tensor_2d(ctx->ctx_input, GGML_TYPE_F32, hparams.n_embd, cparams.n_batch);
           ctx->inp_pos     = ggml_new_tensor_1d(ctx->ctx_input, GGML_TYPE_I32, cparams.n_batch);
           ctx->inp_KQ_mask = ggml_new_tensor_2d(ctx->ctx_input, GGML_TYPE_F32, cparams.n_ctx, cparams.n_batch);
           ctx->inp_K_shift = ggml_new_tensor_1d(ctx->ctx_input, GGML_TYPE_I32, cparams.n_ctx);
           ctx->inp_sum     = ggml_new_tensor_2d(ctx->ctx_input, GGML_TYPE_F32, 1, cparams.n_batch);
         n_embd is hidden size (github:ggeranov/llama.cpp/convert.py:248)
         NOTE(review): this sums tensor *element* counts, and contextSize() adds
         the result to byte quantities — rough approximation; confirm intended units. */
      function inputBuffer(context=8192, model_config, bsz=512) {
        const inp_tokens = bsz
        const inp_embd = model_config["hidden_size"] * bsz
        const inp_pos = bsz
        const inp_KQ_mask = context * bsz
        const inp_K_shift = context
        const inp_sum = bsz
        return inp_tokens + inp_embd + inp_pos + inp_KQ_mask + inp_K_shift + inp_sum
      }

      // llama.cpp compute-graph scratch buffer size in bytes.
      // Empirical formula, only calibrated for batch size 512.
      function computeBuffer(context=8192, model_config, bsz=512) {
        if (bsz != 512) {
          // BUG FIX: message typos ("end result result", "overestimatition").
          alert("batch size other than 512 is currently not supported for the compute buffer, using batchsize 512 for compute buffer calculation, end result will be an overestimation")
        }
        return (context / 1024 * 2 + 0.75) * model_config["num_attention_heads"] * 1024 * 1024
      }

      // KV-cache size in bytes: two tensors (K and V) of
      // n_embd_gqa * num_hidden_layers * context elements at cache_bit bits each.
      function kvCache(context=8192, model_config, cache_bit=16) {
        const n_gqa = model_config["num_attention_heads"] / model_config["num_key_value_heads"]
        const n_embd_gqa = model_config["hidden_size"] / n_gqa
        const n_elements = n_embd_gqa * (model_config["num_hidden_layers"] * context)
        const size = 2 * n_elements
        return size * (cache_bit / 8)
      }

      // Total context-dependent memory: input buffers + KV cache + compute buffer.
      function contextSize(context=8192, model_config, bsz=512, cache_bit=16) {
        return Number.parseFloat((inputBuffer(context, model_config, bsz) + kvCache(context, model_config, cache_bit) + computeBuffer(context, model_config, bsz)).toFixed(2))
      }

      // Model weight size in bytes for a given bits-per-weight.
      function modelSize(model_config, bpw=4.5) {
        return Number.parseFloat((model_config["parameters"] * bpw / 8).toFixed(2))
      }

      // Read the form, compute model / context / total sizes, and write them
      // (in GiB) into the result fields. `format` is "gguf" or "exl2".
      // Any fetch/parse failure surfaces as an alert.
      async function calculateSizes(format) {
        try {
          const model_config = await modelConfig(document.getElementById("modelsearch").value, document.getElementById("hf_token").value)
          const context = parseInt(document.getElementById("contextsize").value)
          let bsz = 512
          let cache_bit = 16
          let bpw = 0
          if (format === "gguf") {
            bsz = parseInt(document.getElementById("batchsize").value)
            bpw = gguf_quants[document.getElementById("quantsize").innerText]
          } else if (format == "exl2") {
            cache_bit = Number.parseInt(document.getElementById("kvCache").value)
            bpw = Number.parseFloat(document.getElementById("bpw").value)
          }
          const model_size = modelSize(model_config, bpw)
          const context_size = contextSize(context, model_config, bsz, cache_bit)
          const total_size = ((model_size + context_size) / 2**30)
          document.getElementById("resultmodel").innerText = (model_size / 2**30).toFixed(2)
          document.getElementById("resultcontext").innerText = (context_size / 2**30).toFixed(2)
          const result_total_el = document.getElementById("resulttotal");
          result_total_el.innerText = total_size.toFixed(2)
          // Colour the total against the selected GPU's VRAM:
          // green = comfortable fit, yellow = tight, red = does not fit.
          const gpu = document.getElementById("gpusearch").value
          if (gpu !== "") {
            // NOTE(review): assumes the GPU string is "<name> - <NN> GB" with no
            // "-" inside the name — verify against formatGpu output.
            const vram = parseFloat(gpu.split("-")[1].replace("GB", "").trim())
            if (vram - total_size > 0.5) {
              result_total_el.style.backgroundColor = "#bef264"
            } else if (vram - total_size > 0) {
              result_total_el.style.backgroundColor = "#facc15"
            } else {
              result_total_el.style.backgroundColor = "#ef4444"
            }
          }
        } catch(e) {
          alert(e);
        }
      }
    </script>
    <link href="./styles.css" rel="stylesheet">
    <title>Can I run it? - LLM VRAM Calculator</title>
  </head>
  <body class="p-8">
    <div x-data="{ format: 'gguf' }" class="flex flex-col max-h-screen items-center mt-16 gap-10">
      <h1 class="text-xl font-semibold leading-6 text-gray-900">
        LLM Model, Can I run it?
      </h1>
      <p>
        To support gated or private repos, you need to
        <a href="https://huggingface.co/settings/tokens" style="color: #4444ff"><b>create an authentication token</b></a>,
        check the box
        <span style="color: #6e1818"><b>"Read access to contents of all public gated repos you can access"</b></span>
        and then enter the token in the field below.
</p> <div class="flex flex-col gap-10"> <div class="w-auto flex flex-col gap-4"> <!-- Hugging Face authentication token (optional; needed for gated/private repos) --> <div class="relative" x-data="{ results: null, query: null }" > <!-- NOTE(review): for="gpusearch" is wrong here — this input's id is "hf_token", so the label is not associated with it --> <label for="gpusearch" class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" >Huggingface Token (optional)</label > <input class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" id="hf_token" /> </div> <!-- GPU Selector: free-text input backed by a datalist filled from the TechPowerUp search --> <div class="relative" x-data="{ results: null, query: null }" > <label for="gpusearch" class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" >GPU (optional)</label > <input class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" placeholder="GeForce RTX 3090 - 24 GB" id="gpusearch" name="gpusearch" list="gpulist" x-model="query" @keypress.debounce.150ms="results = query === '' ? 
[] : formatGpu(tableToObj(strToHtml(await fetch('https://corsproxy.io/?https://www.techpowerup.com/gpu-specs/?ajaxsrch=' + query).then(r => r.text())).querySelector('table')))" /> <datalist id="gpulist"> <template x-for="item in results"> <option :value="item" x-text="item"></option> </template> </datalist> </div> <!-- Model Selector --> <div class="flex flex-row gap-4 relative"> <!-- NOTE(review): for="contextsize" points at the context-size input, not this model field (id="modelsearch") --> <label for="contextsize" class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > Model (unquantized) </label> <!-- NOTE(review): this div carries two class attributes; the second one (class="relative", after x-id) is ignored by the HTML parser --> <div class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" x-data="{ open: false, value: 'Nexusflow/Starling-LM-7B-beta', results: null, toggle() { if (this.open) { return this.close() } this.$refs.input.focus() this.open = true }, close(focusAfter) { if (! this.open) return this.open = false focusAfter && focusAfter.focus() } }" x-on:keydown.escape.prevent.stop="close($refs.input)" x-id="['model-typeahead']" class="relative" > <!-- Input --> <input id="modelsearch" x-ref="input" x-on:click="toggle()" @keypress.debounce.150ms="results = (await fetch('https://huggingface.co/api/quicksearch?type=model&q=' + encodeURIComponent(value)).then(r => r.json())).models.filter(m => !m.id.includes('GGUF') && !m.id.includes('AWQ') && !m.id.includes('GPTQ') && !m.id.includes('exl2'));" :aria-expanded="open" :aria-controls="$id('model-typeahead')" x-model="value" class="flex justify-between items-center gap-2 w-full" /> <!-- Panel --> <div x-ref="panel" x-show="open" x-transition.origin.top.left x-on:click.outside="close($refs.input)" :id="$id('model-typeahead')" style="display: none" class="absolute left-0 mt-4 w-full rounded-md bg-white shadow-sm ring-1 ring-inset ring-gray-300 z-10" > <template x-for="result in results"> <a @click="value = result.id; close($refs.input)" x-text="result.id" class="flex 
cursor-pointer items-center gap-2 w-full first-of-type:rounded-t-md last-of-type:rounded-b-md px-4 py-2.5 text-left text-sm hover:bg-gray-500/5 disabled:text-gray-500" ></a> </template> </div> </div> </div> <!-- Context Size Selector --> <div class="relative"> <label for="contextsize" class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > Context Size </label> <input value="8192" type="number" name="contextsize" id="contextsize" step="1024" class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" /> </div> <!-- Quant Format Selector --> <div class="relative"> <label class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" >Quant Format</label > <!-- Radio group bound to the Alpine "format" state; drives the x-show of the EXL2 / GGUF option panels below --> <fieldset x-model="format" class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" > <legend class="sr-only">Quant format</legend> <div class="space-y-4 sm:flex sm:items-center sm:space-x-10 sm:space-y-0" > <div class="flex items-center"> <input id="gguf-format" name="quant-format" type="radio" value="gguf" checked class="h-4 w-4 border-gray-300 text-indigo-600 focus:ring-indigo-600" /> <label for="gguf-format" class="ml-3 block text-sm font-medium leading-6 text-gray-900" >GGUF</label > </div> <div class="flex items-center"> <input id="exl2-format" name="quant-format" type="radio" value="exl2" class="h-4 w-4 border-gray-300 text-indigo-600 focus:ring-indigo-600" /> <label for="exl2-format" class="ml-3 block text-sm font-medium leading-6 text-gray-900" >EXL2</label > </div> <div class="flex items-center"> <input id="gptq-format" name="quant-format" type="radio" disabled value="gptq" class="h-4 w-4 border-gray-300 text-indigo-600 focus:ring-indigo-600" /> <label 
for="gptq-format" class="ml-3 block text-sm font-medium leading-6 text-gray-900" >GPTQ (coming soon)</label > </div> </div> </fieldset> </div> <!-- EXL2 Options --> <div x-show="format === 'exl2'" class="flex flex-row gap-4"> <div class="relative flex-grow"> <label for="bpw" class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > BPW </label> <input value="4.5" type="number" step="0.01" id="bpw" name="bpw" class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" /> </div> <div class="flex-shrink relative rounded-md" > <div class="w-fit p-3 h-full flex items-center gap-2 justify-center rounded-md border-0 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" > <label for="kvCache" class="inline-block bg-white text-xs font-medium text-gray-900" > KV Cache </label> <select id="kvCache" name="kvCache"> <option value="16">16 bit</option> <option value="8">8 bit</option> <option value="4">4 bit</option> </select> </div> </div> </div> <!-- GGUF Options --> <div x-show="format === 'gguf'" class="relative"> <div class="flex flex-row gap-4"> <!-- NOTE(review): for="contextsize" is wrong here too; this label heads the quant-size dropdown (button id="quantsize") --> <label for="contextsize" class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > Quantization Size </label> <!-- NOTE(review): duplicate class attribute on this div as well — the later class="relative" is ignored by the parser --> <div class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" x-data="{ open: false, value: '', toggle() { if (this.open) { return this.close() } this.$refs.button.focus() this.open = true }, close(focusAfter) { if (! 
this.open) return this.open = false focusAfter && focusAfter.focus() } }" x-on:keydown.escape.prevent.stop="close($refs.button)" x-id="['dropdown-button']" class="relative" > <!-- Button --> <button x-ref="button" x-on:click="toggle()" :aria-expanded="open" :aria-controls="$id('dropdown-button')" type="button" id="quantsize" x-text="value.length === 0 ? 'Q4_K_S' : value" class="flex justify-between items-center gap-2 w-full" > Q4_K_S <!-- Heroicon: chevron-down --> <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-gray-400" viewBox="0 0 20 20" fill="currentColor" > <path fill-rule="evenodd" d="M5.293 7.293a1 1 0 011.414 0L10 10.586l3.293-3.293a1 1 0 111.414 1.414l-4 4a1 1 0 01-1.414 0l-4-4a1 1 0 010-1.414z" clip-rule="evenodd" /> </svg> </button> <!-- Panel: quant list must stay in sync with the gguf_quants keys in the head script --> <div x-data="{ quants: [ 'IQ1_S', 'IQ2_XXS', 'IQ2_XS', 'IQ2_S', 'IQ2_M', 'IQ3_XXS', 'IQ3_XS', 'Q2_K', 'Q3_K_S', 'IQ3_S', 'IQ3_M', 'Q3_K_M', 'Q3_K_L', 'IQ4_XS', 'IQ4_NL', 'Q4_0', 'Q4_K_S', 'Q4_K_M', 'Q5_0', 'Q5_K_S', 'Q5_K_M', 'Q6_K', 'Q8_0' ]}" x-ref="panel" x-show="open" x-transition.origin.top.left x-on:click.outside="close($refs.button)" :id="$id('dropdown-button')" style="display: none" class="absolute left-0 mt-4 w-full rounded-md bg-white shadow-sm ring-1 ring-inset ring-gray-300 z-10" > <template x-for="quant in quants"> <a @click="value = quant; close($refs.button)" x-text="quant" class="flex cursor-pointer items-center gap-2 w-full first-of-type:rounded-t-md last-of-type:rounded-b-md px-4 py-2.5 text-left text-sm hover:bg-gray-500/5 disabled:text-gray-500" ></a> </template> </div> </div> <div class="relative"> <label for="batchsize" class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > Batch Size </label> <input value="512" type="number" step="128" id="batchsize" class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm 
sm:leading-6" /> </div> </div> </div> <button type="button" class="rounded-md bg-slate-800 px-3 py-2 text-sm font-semibold text-white shadow-sm hover:bg-slate-700 focus-visible:outline focus-visible:outline-2 focus-visible:outline-offset-2 focus-visible:outline-indigo-600" @click="calculateSizes(format)" > Submit </button> </div> <!-- Result read-outs (GiB), populated by calculateSizes(); the numbers below are placeholders shown before the first run --> <div class="w-auto flex flex-col gap-4"> <div class="relative"> <label class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > Model Size (GB) </label> <div id="resultmodel" class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" >4.20</div> </div> <div class="relative"> <label class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > Context Size (GB) </label> <div id="resultcontext" class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" >6.90</div> </div> <div class="relative"> <label class="absolute -top-2 left-2 inline-block bg-white px-1 text-xs font-medium text-gray-900" > Total Size (GB) </label> <div id="resulttotal" class="block w-full rounded-md border-0 p-3 text-gray-900 shadow-sm ring-1 ring-inset ring-gray-300 placeholder:text-gray-400 focus:ring-2 focus:ring-inset focus:ring-indigo-600 sm:text-sm sm:leading-6" >420.69</div> </div> </div> </div> </div> <script src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js" ></script> <!-- NOTE(review): "defer" has no effect on inline scripts — this runs immediately at parse time, which only works because calculateSizes is declared in the head above --> <script defer> calculateSizes("gguf") </script> </body> </html>