Robin Williams PRO

bfuzzy1

AI & ML interests

None yet

Recent Activity

updated a collection about 16 hours ago

Nifty

upvoted a paper about 16 hours ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

upvoted a paper 3 days ago

Great Models Think Alike and this Undermines AI Oversight

Organizations

None yet

bfuzzy1's activity

upvoted a paper about 16 hours ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published 4 days ago • 50

upvoted 3 papers 3 days ago

upvoted 4 papers 4 days ago

Large Language Model Guided Self-Debugging Code Generation

Paper • 2502.02928 • Published 7 days ago • 8

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published 7 days ago • 49

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published 7 days ago • 44

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 7 days ago • 154

upvoted a paper 12 days ago

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published 14 days ago • 51

upvoted a paper 14 days ago

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published 16 days ago • 24

upvoted 2 papers 15 days ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published 19 days ago • 22

SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Paper • 2501.13200 • Published 20 days ago • 62

upvoted a paper 17 days ago

Control LLM: Controlled Evolution for Intelligence Retention in LLM

Paper • 2501.10979 • Published 24 days ago • 6

upvoted 3 papers 19 days ago

O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning

Paper • 2501.12570 • Published 21 days ago • 23

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 21 days ago • 91

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 21 days ago • 315

upvoted a paper 20 days ago

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 23 days ago • 90

upvoted a paper 22 days ago

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 26 days ago • 105

upvoted 2 papers 27 days ago

MangaNinja: Line Art Colorization with Precise Reference Following

Paper • 2501.08332 • Published 28 days ago • 56

ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning

Paper • 2501.06590 • Published Jan 11 • 9