
Sugato Ray PRO

sugatoray

AI & ML interests

None yet

Recent Activity

Organizations

Spaces-explorers, HugGAN Community, U-Py, Kornia AI, ZeroGPU Explorers, MLX Community, Social Post Explorers, Hugging Face 1Bit LLMs, Hugging Face Discord Community, open/ acc, TBD

sugatoray's activity

upvoted an article about 9 hours ago
Article

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

By NormalUhr
6
New activity in Zyphra/Zonos-v0.1-hybrid about 12 hours ago
upvoted an article about 13 hours ago
reacted to lewtun's post with ❤️ about 13 hours ago
Post
2625
Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What’s new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📀 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.

⏳ Automated filtering: We apply Math Verify to retain only problems with at least one correct answer. We also use Llama3.3-70B-Instruct as a judge to recover more correct examples (e.g. for cases with malformed answers that can’t be verified with a rule-based parser).

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty-gritty details: https://huggingface.co/blog/open-r1/update-2
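
The "at least one correct answer" filter described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the pipeline used for the dataset: `normalize` and `is_correct` are hypothetical stand-ins for a real verifier such as Math Verify, and the record layout (`gold`, `generations`, `answer`, `correctness`) is invented for the example.

```python
from typing import Callable


def normalize(answer: str) -> str:
    """Hypothetical stand-in for a real answer parser: drop whitespace and $ delimiters."""
    return answer.strip().strip("$").replace(" ", "")


def is_correct(candidate: str, gold: str) -> bool:
    """Naive string comparison; a real verifier would compare parsed math expressions."""
    return normalize(candidate) == normalize(gold)


def filter_problems(
    problems: list[dict],
    check: Callable[[str, str], bool] = is_correct,
) -> list[dict]:
    """Keep only problems where at least one generated answer matches the gold answer.

    Each kept problem gets a `correctness` list with one flag per generation.
    """
    kept = []
    for problem in problems:
        flags = [check(gen["answer"], problem["gold"]) for gen in problem["generations"]]
        if any(flags):
            kept.append({**problem, "correctness": flags})
    return kept
```

Answers that such a rule-based check rejects only because they are malformed could then be passed to a second, model-based judge, as the post describes; that step is left out here.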
upvoted an article about 13 hours ago
Article

Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset

By sdiazlor
21
upvoted an article 2 days ago
Article

How to deploy and fine-tune DeepSeek models on AWS

40
upvoted an article 3 days ago
Article

DABStep: Data Agent Benchmark for Multi-step Reasoning

42