Submitted by luojunyu · 85 · RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response · 6 authors · 2
Submitted by AndrewZeng · 46 · B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners · 6 authors · 2
Submitted by fjxmlzn · 34 · Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching · 4 authors · 2
Submitted by luyangl · 30 · Deliberation in Latent Space via Differentiable Cache Augmentation · 5 authors · 5
Submitted by jinheon · 30 · Revisiting In-Context Learning with Long Context Language Models · 7 authors · 2
Submitted by akhaliq · 21 · DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought · 4 authors · 4
Submitted by Vfrz · 12 · PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World · 8 authors · 2
Submitted by sdzy · 9 · OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning · 6 authors · 2
Submitted by ColorfulAI · 9 · Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding · 6 authors · 2