- Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding
  Paper • 2401.04575 • Published • 17
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
  Paper • 2401.09417 • Published • 61
- VMamba: Visual State Space Model
  Paper • 2401.10166 • Published • 40
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 22
Collections
Collections including paper arxiv:2401.10166
- TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing
  Paper • 2312.05605 • Published • 3
- VMamba: Visual State Space Model
  Paper • 2401.10166 • Published • 40
- Rethinking Patch Dependence for Masked Autoencoders
  Paper • 2401.14391 • Published • 25
- Deconstructing Denoising Diffusion Models for Self-Supervised Learning
  Paper • 2401.14404 • Published • 18

- havenhq/mamba-chat
  Updated • 291 • 100
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
  Paper • 2401.04081 • Published • 71
- VMamba: Visual State Space Model
  Paper • 2401.10166 • Published • 40
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 108

- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 16
- HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
  Paper • 2310.14566 • Published • 27
- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 9
- Conditional Diffusion Distillation
  Paper • 2310.01407 • Published • 20

- Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
  Paper • 2308.16582 • Published • 12
- DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation
  Paper • 2310.13119 • Published • 13
- DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
  Paper • 2310.16818 • Published • 32
- Text-to-3D with classifier score distillation
  Paper • 2310.19415 • Published • 5