Models: 33

Active filters: arxiv: 2405.07863

sfairXC/FsfairX-LLaMA3-RM-v0.1

Text Classification • Updated Oct 14, 2024 • 3.83k • 55

Salesforce/LLaMA-3-8B-SFR-Iterative-DPO-R

Text Generation • Updated Jan 21 • 126 • 78

Salesforce/LLaMA-3-8B-SFR-SFT-R

Text Generation • Updated Jan 21 • 38 • 8

RLHFlow/LLaMA3-SFT

Text Generation • Updated Nov 3, 2024 • 652 • 10

sfairXC/FsfairX-Gemma2-RM-v0.1

Text Classification • Updated Jul 9, 2024 • 39 • 7

RLHFlow/Qwen2.5-7B-PPO-Zero

Updated 25 days ago • 158 • 2

RLHFlow/pair-preference-model-LLaMA3-8B

Text Generation • Updated Oct 14, 2024 • 240 • 38

Salesforce/LLaMA-3-8B-SFR-RM-R

Text Classification • Updated Jan 21 • 31 • 11

qwp4w3hyb/SFR-Iterative-DPO-LLaMA-3-8B-R-iMat-GGUF

Text Generation • Updated May 16, 2024 • 116 • 2

RLHFlow/LLaMA3-iterative-DPO-final

Text Generation • Updated Oct 14, 2024 • 3.2k • 40

TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-8B-R

Text Generation • Updated May 24, 2024 • 4.31k • 1

sirovub/SFR-Iterative-DPO-LLaMA-3-8B-R-GGUF

Text Generation • Updated May 26, 2024 • 97 • 1

Apel-sin/llama-3-8B-iterative-DPO-final-exl2

Updated May 25, 2024 • 6 • 1

QuantFactory/pair-preference-model-LLaMA3-8B-GGUF

Text Generation • Updated May 26, 2024 • 221 • 1

thesven/SFR-Iterative-DPO-LLaMA-3-8B-R-GGUF

Updated Jul 8, 2024 • 284 • 1

sirovub/LLaMA3-iterative-DPO-final-GGUF

Text Generation • Updated May 26, 2024 • 38 • 1

OpenRLHF/Llama-3-8b-sft-mixture

Text Generation • Updated Jun 14, 2024 • 20.5k • 1

QuantFactory/LLaMA-3-8B-SFR-Iterative-DPO-R-GGUF

Text Generation • Updated Jun 19, 2024 • 166 • 1

QuantFactory/LLaMA-3-8B-SFR-SFT-R-GGUF

Text Generation • Updated Jun 19, 2024 • 134 • 1

RichardErkhov/RLHFlow_-_pair-preference-model-LLaMA3-8B-gguf

Updated Aug 19, 2024 • 167

RichardErkhov/Salesforce_-_LLaMA-3-8B-SFR-Iterative-DPO-R-gguf

Updated Aug 21, 2024 • 461

RichardErkhov/TriAiExperiments_-_SFR-Iterative-DPO-LLaMA-3-8B-R-gguf

Updated Aug 21, 2024 • 150

RichardErkhov/OpenRLHF_-_Llama-3-8b-sft-mixture-gguf

Updated Aug 22, 2024 • 89

RLHFlow/LLaMA3-SFT-v2

Text Generation • Updated Nov 3, 2024 • 895 • 2

RichardErkhov/RLHFlow_-_LLaMA3-SFT-gguf

Updated Oct 8, 2024 • 81

RichardErkhov/RLHFlow_-_LLaMA3-iterative-DPO-final-gguf

Updated Oct 8, 2024 • 205

RLHFlow/Llama3-SFT-v2.0-epoch1

Text Generation • Updated Nov 3, 2024 • 10

RLHFlow/Llama3-SFT-v2.0-epoch2

Text Generation • Updated Nov 3, 2024 • 11

RLHFlow/Llama3-SFT-v2.0-epoch3

Text Generation • Updated Nov 3, 2024 • 172

RLHFlow/Qwen2.5-7B-DPO-Zero

Updated 25 days ago • 60