Submitted by haotiz · 55 · MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning · 23 authors · 3
Submitted by Lemoncoke · 28 · Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models · 8 authors · 2
Submitted by SiyuanH · 14 · UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models · 12 authors · 4
Submitted by liruiw · 13 · Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers · 4 authors · 2
Submitted by hyungjoochae · 9 · Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code · 10 authors · 3
Submitted by ShuoChen99 · 8 · Visual Question Decomposition on Multimodal Large Language Models · 8 authors · 2
Submitted by zhangxulong · 1 · IDEAW: Robust Neural Audio Watermarking with Invertible Dual-Embedding · 4 authors · 2