Sugato Ray PRO

sugatoray

https://linkedin.com/in/sugatoray

sugatoray's activity

upvoted an article about 9 hours ago

Article

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

7 days ago • 6

updated a collection about 12 hours ago

AV LLMs

Collection

A collection of Audio, Video and Visual LLMs. • 91 items • Updated about 12 hours ago • 3

liked a model about 12 hours ago

Zyphra/Zonos-v0.1-transformer

Updated 1 day ago • 2.88k • 170

updated a model about 12 hours ago

Zyphra/Zonos-v0.1-hybrid

Text-to-Speech • Updated about 4 hours ago • 874 • 551

New activity in Zyphra/Zonos-v0.1-hybrid about 12 hours ago

Update README with color formatted Python code block

#11 opened about 12 hours ago by sugatoray

updated a collection about 12 hours ago

AV LLMs

Collection

A collection of Audio, Video and Visual LLMs. • 91 items • Updated about 12 hours ago • 3

liked a model about 12 hours ago

Zyphra/Zonos-v0.1-hybrid

Text-to-Speech • Updated about 4 hours ago • 874 • 551

upvoted an article about 13 hours ago

Article

Open R1: Update #2

and 6 others • 1 day ago • 121

updated 2 collections about 13 hours ago

LLM Training Datasets

Collection

A collection of datasets for training LLMs. • 100 items • Updated about 13 hours ago • 17

LLMs

Collection

Collection of LLMs • 280 items • Updated about 13 hours ago • 1

liked a dataset about 13 hours ago

open-r1/OpenR1-Math-220k

Viewer • Updated about 16 hours ago • 225k • 260 • 144

reacted to lewtun's post with ❤️ about 13 hours ago

Post

2625

Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What’s new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📀 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.

⏳ Automated filtering: We apply Math Verify to retain only problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g., for cases with malformed answers that can’t be verified with a rule-based parser).

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty-gritty details: https://huggingface.co/blog/open-r1/update-2
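
The automated filtering step in the post (keep a problem if at least one generated answer is correct, with an LLM judge as a fallback for answers the rule-based check cannot verify) could be sketched roughly as below. This is only an illustration under stated assumptions, not the Open R1 pipeline: `Problem`, `exact_match`, and `filter_problems` are hypothetical names, and the plain string comparison stands in for Math Verify, while the optional `judge` callable stands in for the Llama3.3-70B-Instruct judge.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Problem:
    """Hypothetical record: one problem, its reference answer, and sampled answers."""
    question: str
    reference: str
    generations: List[str]


def exact_match(answer: str, reference: str) -> bool:
    # Assumption: a trivial stand-in for a rule-based verifier such as Math Verify.
    return answer.strip() == reference.strip()


def filter_problems(
    problems: List[Problem],
    rule_check: Callable[[str, str], bool] = exact_match,
    judge: Optional[Callable[[str, str], bool]] = None,
) -> List[Problem]:
    """Keep problems that have at least one generation judged correct."""
    kept = []
    for problem in problems:
        # First pass: the rule-based check on every generation.
        ok = any(rule_check(g, problem.reference) for g in problem.generations)
        # Second pass: ask the (optional) judge only when the rules found nothing.
        if not ok and judge is not None:
            ok = any(judge(g, problem.reference) for g in problem.generations)
        if ok:
            kept.append(problem)
    return kept
```

With two generations per problem, as in the post, a problem survives when either answer passes the rule-based check, and otherwise only if the judge accepts one of them.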