RLAIF

AI & ML interests

None defined yet.

RLAIF's activity

agamemnon

updated a model 1 day ago

RLAIF/llama-3b-open-r1-50k-sft

Updated 1 day ago • 157

agamemnon

published a model 1 day ago

RLAIF/llama-3b-open-r1-50k-sft

Updated 1 day ago • 157

nlile

authored a paper 6 days ago

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

Paper • 2502.17387 • Published 17 days ago • 5

Asap7772

authored a paper 9 days ago

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published 10 days ago • 31

nlile

authored a paper 9 days ago

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published 10 days ago • 31

alon-albalak

authored a paper 15 days ago

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

Paper • 2502.17387 • Published 17 days ago • 5

violetxi

authored 2 papers about 2 months ago

Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models

Paper • 2407.07086 • Published Jul 9, 2024

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 91

LouisCastricato

authored a paper about 2 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 91

alon-albalak

authored a paper 2 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 91

nlile

authored a paper 2 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 91

Asap7772

authored a paper 2 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 91

alon-albalak

authored a paper 3 months ago

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Paper • 2412.02980 • Published Dec 4, 2024 • 13

nlile

authored a paper 5 months ago

Generative Reward Models

Paper • 2410.12832 • Published Oct 2, 2024 • 6

alon-albalak

authored a paper 5 months ago

Generative Reward Models

Paper • 2410.12832 • Published Oct 2, 2024 • 6

Asap7772

authored a paper 5 months ago

Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation

Paper • 2410.02725 • Published Oct 3, 2024 • 1

Asap7772

authored 4 papers 7 months ago

Team members 11
