Submitted by akhaliq 25 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding · 18 authors 4
Submitted by akhaliq 16 Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance · 8 authors 2
Submitted by akhaliq 15 ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars · 5 authors 1
Submitted by akhaliq 13 SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series · 2 authors 1
Submitted by akhaliq 11 DragAPart: Learning a Part-Level Motion Prior for Articulated Objects · 4 authors 1
Submitted by akhaliq 11 FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions · 8 authors 1
Submitted by akhaliq 10 AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models · 15 authors 2