- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 65
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 41
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 48
- Best Practices and Lessons Learned on Synthetic Data for Language Models
  Paper • 2404.07503 • Published • 30
Collections including paper arxiv:2408.07199