Submitted by dongguanting 77 We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? · 18 authors 9
Submitted by hba123 60 ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning · 22 authors 6
Submitted by SivilTaram 36 RegMix: Data Mixture as Regression for Language Model Pre-training · 8 authors 7
Submitted by leonardPKU 35 MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation · 16 authors 2
Submitted by AJZhou 24 Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning · 7 authors 4
Submitted by Koi953215 23 DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models · 6 authors 5
Submitted by omergoldman 22 Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP · 6 authors 1
Submitted by wanghaofan 22 InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation · 6 authors 5
Submitted by naoyuki82 21 E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS · 13 authors 3
Submitted by yingtai 19 RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network · 10 authors 2
Submitted by zhwang4ai 12 OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents · 10 authors 5
Submitted by LXT 11 Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language · 7 authors 2
Submitted by Neph0s 11 Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs · 6 authors 2
Submitted by Shijie 10 T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge · 7 authors 1
Submitted by wanchichen 10 Towards Robust Speech Representation Learning for Thousands of Languages · 10 authors 1
Submitted by akhaliq 9 SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix · 8 authors 1
Submitted by davanstrien 8 Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER · 5 authors 1
Submitted by gsarti 5 Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs · 4 authors 4
Submitted by BFauber 5 Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models · 1 author 2
Submitted by iliashum 5 UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI · 9 authors 1
Submitted by hank0316 5 DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging · 4 authors 1
Submitted by JRQi 3 The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models · 7 authors 1