ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution. OpenGVLab. Submitted by CuiLong7.
ExpVid: A Benchmark for Experiment Video Understanding & Reasoning. OpenGVLab. Submitted by linghan199.
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints. OpenGVLab. Submitted by Changyao.
Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale. OpenGVLab. Submitted by SongzeLi.