Models: 31

Active filters: 2405.03594

neuralmagic/Llama-2-7b-pruned50-retrained

Text Generation • Updated May 7, 2024 • 64

neuralmagic/Llama-2-7b-pruned70-retrained

Text Generation • Updated May 7, 2024 • 97

neuralmagic/Llama-2-7b-ultrachat200k

Text Generation • Updated May 7, 2024 • 1.18k

neuralmagic/Llama-2-7b-ultrachat200k-pruned_50

Text Generation • Updated May 15, 2024 • 20

neuralmagic/Llama-2-7b-ultrachat200k-pruned_70

Text Generation • Updated May 15, 2024 • 31

neuralmagic/Llama-2-7b-ultrachat200k-pruned_50-quantized-deepsparse

Text Generation • Updated May 7, 2024 • 17

neuralmagic/Llama-2-7b-ultrachat200k-pruned_70-quantized-deepsparse

Text Generation • Updated May 15, 2024 • 17

neuralmagic/Llama-2-7b-evolcodealpaca

Text Generation • Updated May 7, 2024 • 24 • 1

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_50

Text Generation • Updated May 15, 2024 • 21

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_70

Text Generation • Updated May 15, 2024 • 19

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_50-quantized-deepsparse

Text Generation • Updated May 15, 2024 • 15

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_70-quantized-deepsparse

Text Generation • Updated May 15, 2024 • 14

neuralmagic/Llama-2-7b-dolphin-open_platypus

Text Generation • Updated May 15, 2024 • 12

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_50

Text Generation • Updated May 15, 2024 • 22

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_70

Text Generation • Updated May 15, 2024 • 19

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_50-quantized-deepsparse

Text Generation • Updated May 16, 2024 • 17

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_70-quantized-deepsparse

Text Generation • Updated May 16, 2024 • 11 • 1

RichardErkhov/neuralmagic_-_Llama-2-7b-evolcodealpaca-4bits

Text Generation • Updated May 10, 2024 • 80

RichardErkhov/neuralmagic_-_Llama-2-7b-evolcodealpaca-8bits

Text Generation • Updated May 10, 2024 • 80

RichardErkhov/neuralmagic_-_Llama-2-7b-evolcodealpaca-gguf

Updated May 10, 2024 • 46

neuralmagic/Llama-2-7b-gsm8k-pruned_50

Text Generation • Updated Jun 20, 2024 • 17 • 1

neuralmagic/Llama-2-7b-gsm8k-pruned_70

Text Generation • Updated Jun 20, 2024 • 10

neuralmagic/Llama-2-7b-gsm8k

Text Generation • Updated Jun 20, 2024 • 105 • 3

RichardErkhov/neuralmagic_-_Llama-2-7b-dolphin-open_platypus-pruned_70-gguf

Updated Jul 16, 2024 • 20

RichardErkhov/neuralmagic_-_Llama-2-7b-pruned50-retrained-gguf

Updated Sep 13, 2024 • 62

RichardErkhov/neuralmagic_-_Llama-2-7b-ultrachat200k-gguf

Updated Sep 13, 2024 • 7

RichardErkhov/neuralmagic_-_Llama-2-7b-pruned70-retrained-gguf

Updated Nov 17, 2024

neuralmagic/Sparse-Llama-3.1-8B-ultrachat_200k-2of4-FP8-dynamic

Text Generation • Updated Dec 19, 2024 • 41 • 1

neuralmagic/Sparse-Llama-3.1-8B-ultrachat_200k-2of4-quantized.w4a16

Text Generation • Updated Dec 19, 2024 • 130 • 3

neuralmagic/Sparse-Llama-3.1-8B-ultrachat_200k-2of4

Text Generation • Updated Nov 21, 2024 • 24 • 1