Submitted by EilamSha 81 GLEE: A Unified Framework and Benchmark for Language-based Economic Environments · 6 authors 2
Submitted by FanqingM 45 Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation · 10 authors 3
Submitted by akhaliq 43 F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching · 8 authors 6
Submitted by comin 42 IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation · 9 authors 2
Submitted by feifeiobama 39 Pyramidal Flow Matching for Efficient Video Generative Modeling · 11 authors 2
Submitted by myownskyW7 38 Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate · 9 authors 2
Submitted by ZedongWangAI 36 Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning · 9 authors 3
Submitted by alielfilali01 33 Falcon Mamba: The First Competitive Attention-free 7B Language Model · 7 authors 2
Submitted by xk-huang 19 Story-Adapter: A Training-free Iterative Framework for Long Story Visualization · 7 authors 2
Submitted by Windy 16 Self-Boosting Large Language Models with Synthetic Preference Data · 5 authors 1
Submitted by paischer101 15 One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation · 6 authors 2
Submitted by akhaliq 14 T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design · 7 authors 2
Submitted by akhaliq 13 TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation · 2 authors 2
Submitted by akhaliq 13 ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler · 3 authors 2
Submitted by seokhyun 12 Response Tuning: Aligning Large Language Models without Instruction · 2 authors 2
Submitted by akhaliq 12 AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs · 10 authors 3
Submitted by zbhpku 11 Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis · 12 authors 3
Submitted by myownskyW7 10 BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way · 9 authors 2
Submitted by Yongxin-Guo 9 TRACE: Temporal Grounding Video LLM via Causal Event Modeling · 6 authors 3
Submitted by thomas-ferraz 8 LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints · 10 authors 2
Submitted by paischer101 7 Retrieval-Augmented Decision Transformer: External Memory for In-context RL · 6 authors 2
Submitted by akhaliq 7 FürElise: Capturing and Physically Synthesizing Hand Motions of Piano Performance · 5 authors 4
Submitted by Minjong 7 Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning · 4 authors 2
Submitted by liuganghuggingface 7 Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning · 5 authors 2
Submitted by dnoever 7 Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders · 2 authors 2
Submitted by tnlin 6 MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering · 12 authors 2
Submitted by CiaraRowles 5 Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control · 6 authors 2
Submitted by jindongwang 5 MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders · 7 authors 2
Submitted by XUANMINGZHANG 5 Seeker: Enhancing Exception Handling in Code with LLM-based Multi-Agent Approach · 4 authors 3
Submitted by wenhu 4 VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks · 6 authors 2
Submitted by ggcristian 4 TinyEmo: Scaling down Emotional Reasoning via Metric Projection · 1 author 2
Submitted by zhoutianyi 4 Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA · 4 authors 2
Submitted by kargaranamir 3 MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment · 6 authors 2
Submitted by chen-yingfa 2 Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling · 6 authors 3