Models: 9,637

Active filters: dpo

TheBloke/CapybaraHermes-2.5-Mistral-7B-GGUF

7B • Updated Jan 31, 2024 • 8.43k • 121

TheBloke/SauerkrautLM-Mixtral-8x7B-GGUF

Text Generation • 47B • Updated Dec 25, 2023 • 940 • 9

argilla/CapybaraHermes-2.5-Mistral-7B

7B • Updated Mar 4, 2024 • 17 • 70

mlabonne/NeuralDaredevil-8B-abliterated

Text Generation • 8B • Updated 16 days ago • 15k • 220

QuantFactory/NeuralDaredevil-8B-abliterated-GGUF

Text Generation • 8B • Updated May 29, 2024 • 4.14k • 70

HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407

Text Generation • 12B • Updated Jan 13 • 99 • 18

SmallDoge/Doge-320M-Instruct

Question Answering • 0.3B • Updated 3 days ago • 82 • 4

emretmrk/smolvlm-trl-dpo

Updated 7 days ago • 1

AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter1-4k

Text Generation • 0.0B • Updated 7 days ago • 15 • 1

mradermacher/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter1-4k-GGUF

15B • Updated 7 days ago • 257 • 1

AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter2-4k

Text Generation • 0.0B • Updated 6 days ago • 6 • 1

mradermacher/Qwen2.5-14B-Instruct-wildfeedback-RPO-DRIFT-iter2-4k-GGUF

15B • Updated 6 days ago • 229 • 1

AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-spin-iter1-RPO

Text Generation • 0.0B • Updated 4 days ago • 13 • 1

mradermacher/Qwen2.5-14B-Instruct-ultrafeedback-spin-iter1-RPO-GGUF

15B • Updated 3 days ago • 259 • 1

AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-iterdpo-iter2-RPO

Text Generation • 0.0B • Updated 3 days ago • 4 • 1

AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter1-4k

Text Generation • 0.0B • Updated 3 days ago • 17 • 1

mradermacher/Qwen2.5-14B-Instruct-ultrafeedback-iterdpo-iter2-RPO-GGUF

15B • Updated 3 days ago • 2.07k • 1

mradermacher/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter1-4k-GGUF

15B • Updated 1 day ago • 1.65k • 1

lyogavin/Anima33B-DPO-Belle-1k

Text Generation • Updated Jul 2, 2023 • 1

lyogavin/Anima33B-DPO-Belle-1k-merged

Text Generation • Updated Jul 2, 2023 • 14 • 12

daekeun-ml/Llama-2-ko-DPO-13B

Text Generation • 13B • Updated Oct 31, 2023 • 825 • 19

lewtun/zephyr-7b-dpo-full

Text Generation • 7B • Updated Jan 5, 2024 • 5

alignment-handbook/zephyr-7b-dpo-full

Text Generation • 7B • Updated Jan 10, 2024 • 69 • 3

alignment-handbook/zephyr-7b-dpo-qlora

Updated Jan 9, 2024 • 20 • 9

argilla/notus-7b-v1-lora

Text Generation • 7B • Updated Dec 4, 2023 • 5 • 7

argilla/notus-7b-v1-lora-adapter

Text Generation • Updated Dec 4, 2023 • 3

argilla/notus-7b-v1

Text Generation • 7B • Updated Dec 5, 2023 • 147 • 122

ContextualAI/archangel_sft_pythia1-4b

Text Generation • 1B • Updated Jan 11, 2024 • 7

ContextualAI/archangel_sft_pythia2-8b

Text Generation • 3B • Updated Jan 11, 2024 • 42 • 1

ContextualAI/archangel_sft_pythia6-9b

Text Generation • 7B • Updated Jan 11, 2024 • 5