Submitted by zhao1iang 51 Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On · 12 authors 5
Submitted by zwq2018 43 Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model · 11 authors 3
Submitted by Kyriection 31 Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients · 7 authors 3
Submitted by akhaliq 22 SEED-Story: Multimodal Long Story Generation with Large Language Model · 7 authors 5
Submitted by PeterV09 21 Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist · 9 authors 4
Submitted by akhaliq 17 DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception · 6 authors 2
Submitted by akhaliq 10 Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models · 7 authors 2
Submitted by zhenqincn 10 The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective · 8 authors 4
Submitted by iseesaw 9 Towards Building Specialized Generalist AI with System 1 and System 2 Fusion · 3 authors 2
Submitted by akhaliq 9 Generalizable Implicit Motion Modeling for Video Frame Interpolation · 3 authors 2
Submitted by NikV09 8 Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data · 10 authors 4
Submitted by akhaliq 6 OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects · 5 authors 2
Submitted by YeolJoo 3 Scaling Up Personalized Aesthetic Assessment via Task Vector Customization · 2 authors 3