laion/CLIP-ViT-H-14-laion2B-s32B-b79K • Zero-Shot Image Classification • Updated 24 days ago • 1.07M • 356
TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space • Paper • 2501.12224 • Published 24 days ago • 46
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding • Paper • 2501.12380 • Published 24 days ago • 82
HuggingFaceM4/siglip-so400m-14-980-flash-attn2-navit • Zero-Shot Image Classification • Updated Mar 7, 2024 • 5.71k • 44
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding • Paper • 2501.05452 • Published Jan 9 • 15