Collections
Collections including paper arxiv:2305.14387
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
  Paper • 2403.18421 • Published • 23
- Long-form factuality in large language models
  Paper • 2403.18802 • Published • 25
- stanford-crfm/BioMedLM
  Text Generation • Updated • 4.3k • 412
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 52

- Proximal Policy Optimization Algorithms
  Paper • 1707.06347 • Published • 6
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 52
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 146
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 16