Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 4 days ago • 50
Great Models Think Alike and this Undermines AI Oversight Paper • 2502.04313 • Published 5 days ago • 24
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 Paper • 2502.03544 • Published 6 days ago • 37
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published 7 days ago • 53
Large Language Model Guided Self-Debugging Code Generation Paper • 2502.02928 • Published 7 days ago • 8
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 7 days ago • 154
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published 14 days ago • 51
Towards General-Purpose Model-Free Reinforcement Learning Paper • 2501.16142 • Published 16 days ago • 24
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding Paper • 2501.13200 • Published 20 days ago • 62
Post: Pre-training a model on just a single RTX 4090 is really slow, even for small language models! (https://huggingface.co/collections/JingzeShi/doge-slm-677fd879f8c4fd0f43e05458)
Control LLM: Controlled Evolution for Intelligence Retention in LLM Paper • 2501.10979 • Published 24 days ago • 6