Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks Paper • 2501.08326 • Published 29 days ago • 32
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks Paper • 2501.08326 • Published 29 days ago • 32
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models Paper • 2412.01822 • Published Dec 2, 2024 • 15
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models Paper • 2412.01822 • Published Dec 2, 2024 • 15