MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Paper • 2502.09621 • Published about 20 hours ago • 11
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Paper • 2502.09621 • Published about 20 hours ago • 11
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Paper • 2502.09621 • Published about 20 hours ago • 11 • 1
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 7 days ago • 89
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published 22 days ago • 15
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published 22 days ago • 36
VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping Paper • 2412.11279 • Published Dec 15, 2024 • 12
MoVA: Adapting Mixture of Vision Experts to Multimodal Context Paper • 2404.13046 • Published Apr 19, 2024 • 1