Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 16 days ago • 338
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 125
view article Article Docmatix - a huge dataset for Document Visual Question Answering Jul 18, 2024 • 72
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 91
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15, 2024 • 174