Hugging Face
Models: 18
Active filters: arxiv:2501.02790
yyqoni/Phi-3-mini-4k-instruct-segment-rm-700k • Text Classification • Updated Jan 8 • 34
yyqoni/Phi-3-mini-4k-instruct-token-rm-700k • Text Classification • Updated Jan 8 • 19
yyqoni/Phi-3-mini-4k-instruct-bandit-rm-700k • Text Classification • Updated Jan 8 • 24
yyqoni/rlhflow-llama-3-sft-8b-v2-segment-rm-700k • Text Classification • Updated Jan 8 • 16
yyqoni/rlhflow-llama-3-sft-8b-v2-token-rm-700k • Text Classification • Updated Jan 8 • 10
yyqoni/rlhflow-llama-3-sft-8b-v2-bandit-rm-700k • Text Classification • Updated Jan 8 • 16
yyqoni/meta-llama-3.1-instruct-8b-token-rm-700k • Text Classification • Updated Jan 8 • 9
yyqoni/meta-llama-3.1-instruct-8b-bandit-rm-700k • Text Classification • Updated Jan 8 • 19
yyqoni/meta-llama-3.1-instruct-8b-segment-rm-700k • Text Classification • Updated Jan 8 • 9
yyqoni/rlhflow-llama-3-sft-8b-v2-segment-ppo-60k • Text Generation • Updated Jan 8 • 19
yyqoni/rlhflow-llama-3-sft-8b-v2-token-ppo-60k • Text Generation • Updated Jan 8 • 19
yyqoni/rlhflow-llama-3-sft-8b-v2-bandit-ppo-60k • Text Generation • Updated Jan 8 • 20
yyqoni/meta-llama-3.1-instruct-8b-segment-ppo-60k • Text Generation • Updated Jan 8 • 18
yyqoni/meta-llama-3.1-instruct-8b-token-ppo-60k • Text Generation • Updated Jan 8 • 25
yyqoni/meta-llama-3.1-instruct-8b-bandit-ppo-60k • Text Generation • Updated Jan 8 • 19
yyqoni/Phi-3-mini-4k-token-ppo-60k • Text Generation • Updated Jan 8 • 18
yyqoni/Phi-3-mini-4k-segment-ppo-60k • Text Generation • Updated Jan 8 • 27
yyqoni/Phi-3-mini-4k-bandit-ppo-60k • Text Generation • Updated Jan 10 • 23