Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.12245

Scalable Autoregressive Image Generation with Mamba

hp-l33/aim-base

Unconditional Image Generation • Updated Aug 27, 2024 • 12 • 1
hp-l33/aim-large

Unconditional Image Generation • Updated Aug 27, 2024 • 8 • 1
hp-l33/aim-xlarge

Unconditional Image Generation • Updated Aug 27, 2024 • 22 • 4
Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 26

Stable Diffusion

Zero-shot Image Editing with Reference Imitation

Paper • 2406.07547 • Published Jun 11, 2024 • 32
Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 26
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 29
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4, 2024 • 25

Image-Gen Autoregressive

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 67
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling

Paper • 2405.21048 • Published May 31, 2024 • 14
Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 26
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation

Paper • 2410.08159 • Published Oct 10, 2024 • 25

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 31
Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 16
Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 26
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

Paper • 2408.12570 • Published Aug 22, 2024 • 33

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 26
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 41
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 22

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 17
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 60
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 74

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Paper • 2311.10709 • Published Nov 17, 2023 • 25
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Paper • 2405.12970 • Published May 21, 2024 • 24
FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published May 19, 2024 • 53
stabilityai/stable-diffusion-3-medium

Text-to-Image • Updated Aug 12, 2024 • 29.7k • • 4.7k

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs