- Towards General-Purpose Model-Free Reinforcement Learning (arXiv:2501.16142)
- Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate (arXiv:2501.17703)
- Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (arXiv:2502.05171)
- Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs (arXiv:2501.18585)
- GuardReasoner: Towards Reasoning-based LLM Safeguards (arXiv:2501.18492)
- The Differences Between Direct Alignment Algorithms are a Blur (arXiv:2502.01237)
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (arXiv:2501.17161)
- Reward-Guided Speculative Decoding for Efficient LLM Reasoning (arXiv:2501.19324)
- OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models (arXiv:2502.01061)
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (arXiv:2502.02737)
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling (arXiv:2502.06703)