-
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Paper • 2311.14495 • Published • 1 -
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Paper • 2401.09417 • Published • 61 -
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
Paper • 2401.13560 • Published • 1 -
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces
Paper • 2402.00789 • Published • 2
Collections
Discover the best community collections!
Collections including paper arxiv:2311.18257
-
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Paper • 2311.14495 • Published • 1 -
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Paper • 2401.09417 • Published • 61 -
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
Paper • 2401.13560 • Published • 1 -
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces
Paper • 2402.00789 • Published • 2
-
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Paper • 2401.09417 • Published • 61 -
VMamba: Visual State Space Model
Paper • 2401.10166 • Published • 40 -
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
Paper • 2401.13560 • Published • 1 -
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces
Paper • 2402.00789 • Published • 2
-
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
Paper • 2312.04410 • Published • 15 -
Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling
Paper • 2310.06389 • Published • 1 -
Diffusion Model Alignment Using Direct Preference Optimization
Paper • 2311.12908 • Published • 50 -
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Paper • 2305.13655 • Published • 7
-
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Paper • 2310.16045 • Published • 16 -
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Paper • 2310.14566 • Published • 27 -
SILC: Improving Vision Language Pretraining with Self-Distillation
Paper • 2310.13355 • Published • 9 -
Conditional Diffusion Distillation
Paper • 2310.01407 • Published • 20