Submitted by ellisbrown 60 Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs · 14 authors 4
Submitted by yuangpeng 55 DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation · 10 authors 4
Submitted by terryyz 46 BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions · 33 authors 8
Submitted by Royir 35 Evaluating D-MERIT of Partial-annotation on Information Retrieval · 7 authors 2
Submitted by zlzheng 26 VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models · 5 authors 2
Submitted by YiDuo1999 20 Efficient Continual Pre-training by Mitigating the Stability Gap · 5 authors 1
Submitted by Kthyeon 20 Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters · 5 authors 3
Submitted by zlzheng 19 Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers · 4 authors 1
Submitted by ShengdingHu 14 Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models · 9 authors 2
Submitted by jlko 13 Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs · 6 authors 1
Submitted by yongzx 11 Preference Tuning For Toxicity Mitigation Generalizes Across Languages · 3 authors 1
Submitted by CCCCCC 10 AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models · 9 authors 2
Submitted by sherzod-hakimov 9 How Many Parameters Does it Take to Change a Light Bulb? Evaluating Performance in Self-Play of Conversational Games as a Function of Model Characteristics · 4 authors 1
Submitted by cydhsieh01 6 Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization · 11 authors 1
Submitted by cattana 5 Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations · 11 authors 1
Submitted by BrianatCambridge 5 video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models · 10 authors 1
Submitted by nicozilber 4 Repulsive Score Distillation for Diverse Sampling of Diffusion Models · 3 authors 2
Submitted by SinclairWang 2 OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far? · 4 authors 2