Ultravox v0.5 Collection Ultravox is a multimodal Speech LLM built around different pretrained LLMs (frozen) and the whisper-large-v3-turbo (fine-tuned) backbone. • 3 items • Updated 1 day ago • 3
R3GAN Collection R3GAN: A Modern Baseline GAN https://github.com/brownvc/R3GAN/ https://arxiv.org/abs/2501.05441 • 7 items • Updated Jan 10 • 10
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features Paper • 2502.04320 • Published 5 days ago • 30
Material Anything: Generating Materials for Any 3D Object via Diffusion Paper • 2411.15138 • Published Nov 22, 2024 • 44
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control 8 days ago • 92
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Paper • 2412.16112 • Published Dec 20, 2024 • 22
Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion Paper • 2402.03162 • Published Feb 5, 2024 • 19
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling Paper • 2401.15977 • Published Jan 29, 2024 • 38
Open Image Preferences Collection Containing all artifacts for the Stable Diffusion 3.5L vs Flux Dev image preference community sprint. • 14 items • Updated Dec 19, 2024 • 9
Article Crowd-sourced Open Preference Dataset for Text-to-Image Generation By RapidataAI and 4 others • Jan 7 • 18
Lucie LLM Collection Open source LLM for French, English, German, Spanish and Italian • 8 items • Updated 8 days ago • 18
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator Paper • 2411.15466 • Published Nov 23, 2024 • 35
Graph-Aware Isomorphic Attention in Transformers Collection We present an approach to modifying Transformer architectures by integrating graph-aware relational reasoning into the attention mechanism. • 4 items • Updated Jan 9 • 2
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models Paper • 2407.15886 • Published Jul 21, 2024 • 3
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering Paper • 2408.09702 • Published Aug 19, 2024 • 11