Hugging Face
Models: 33
Active filters: arxiv:2405.07863
sfairXC/FsfairX-LLaMA3-RM-v0.1 • Text Classification • Updated Oct 14, 2024 • 3.83k • 55
Salesforce/LLaMA-3-8B-SFR-Iterative-DPO-R • Text Generation • Updated Jan 21 • 126 • 78
Salesforce/LLaMA-3-8B-SFR-SFT-R • Text Generation • Updated Jan 21 • 38 • 8
RLHFlow/LLaMA3-SFT • Text Generation • Updated Nov 3, 2024 • 652 • 10
sfairXC/FsfairX-Gemma2-RM-v0.1 • Text Classification • Updated Jul 9, 2024 • 39 • 7
RLHFlow/Qwen2.5-7B-PPO-Zero • Updated 25 days ago • 156 • 2
RLHFlow/pair-preference-model-LLaMA3-8B • Text Generation • Updated Oct 14, 2024 • 240 • 38
Salesforce/LLaMA-3-8B-SFR-RM-R • Text Classification • Updated Jan 21 • 31 • 11
qwp4w3hyb/SFR-Iterative-DPO-LLaMA-3-8B-R-iMat-GGUF • Text Generation • Updated May 16, 2024 • 116 • 2
RLHFlow/LLaMA3-iterative-DPO-final • Text Generation • Updated Oct 14, 2024 • 3.2k • 40
TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-8B-R • Text Generation • Updated May 24, 2024 • 4.31k • 1
sirovub/SFR-Iterative-DPO-LLaMA-3-8B-R-GGUF • Text Generation • Updated May 26, 2024 • 97 • 1
Apel-sin/llama-3-8B-iterative-DPO-final-exl2 • Updated May 25, 2024 • 6 • 1
QuantFactory/pair-preference-model-LLaMA3-8B-GGUF • Text Generation • Updated May 26, 2024 • 221 • 1
thesven/SFR-Iterative-DPO-LLaMA-3-8B-R-GGUF • Updated Jul 8, 2024 • 284 • 1
sirovub/LLaMA3-iterative-DPO-final-GGUF • Text Generation • Updated May 26, 2024 • 38 • 1
OpenRLHF/Llama-3-8b-sft-mixture • Text Generation • Updated Jun 14, 2024 • 20.5k • 1
QuantFactory/LLaMA-3-8B-SFR-Iterative-DPO-R-GGUF • Text Generation • Updated Jun 19, 2024 • 166 • 1
QuantFactory/LLaMA-3-8B-SFR-SFT-R-GGUF • Text Generation • Updated Jun 19, 2024 • 134 • 1
RichardErkhov/RLHFlow_-_pair-preference-model-LLaMA3-8B-gguf • Updated Aug 19, 2024 • 186
RichardErkhov/Salesforce_-_LLaMA-3-8B-SFR-Iterative-DPO-R-gguf • Updated Aug 21, 2024 • 477
RichardErkhov/TriAiExperiments_-_SFR-Iterative-DPO-LLaMA-3-8B-R-gguf • Updated Aug 21, 2024 • 166
RichardErkhov/OpenRLHF_-_Llama-3-8b-sft-mixture-gguf • Updated Aug 22, 2024 • 89
RLHFlow/LLaMA3-SFT-v2 • Text Generation • Updated Nov 3, 2024 • 1.05k • 2
RichardErkhov/RLHFlow_-_LLaMA3-SFT-gguf • Updated Oct 8, 2024 • 81
RichardErkhov/RLHFlow_-_LLaMA3-iterative-DPO-final-gguf • Updated Oct 8, 2024 • 227
RLHFlow/Llama3-SFT-v2.0-epoch1 • Text Generation • Updated Nov 3, 2024 • 10
RLHFlow/Llama3-SFT-v2.0-epoch2 • Text Generation • Updated Nov 3, 2024 • 12
RLHFlow/Llama3-SFT-v2.0-epoch3 • Text Generation • Updated Nov 3, 2024 • 229
RLHFlow/Qwen2.5-7B-DPO-Zero • Updated 25 days ago • 60