Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.12376

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

Paper • 2403.00483 • Published Mar 1, 2024 • 15
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Paper • 2403.01779 • Published Mar 4, 2024 • 30
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21, 2024 • 22
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19, 2024 • 48

All the papers I found interesting

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Paper • 2402.19481 • Published Feb 29, 2024 • 22
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19, 2024 • 48
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

Paper • 2402.17193 • Published Feb 27, 2024 • 25
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Paper • 2403.05135 • Published Mar 8, 2024 • 43

Training-Free Consistent Text-to-Image Generation

Paper • 2402.03286 • Published Feb 5, 2024 • 67
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

Paper • 2402.04324 • Published Feb 6, 2024 • 24
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

Paper • 2402.05195 • Published Feb 7, 2024 • 19
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19, 2024 • 48

Daily paper that is inspiring (abstract is enough)

World Model on Million-Length Video And Language With RingAttention

Paper • 2402.08268 • Published Feb 13, 2024 • 38
Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 80
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 105
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19, 2024 • 48

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Paper • 2401.08417 • Published Jan 16, 2024 • 35
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

Paper • 2401.05252 • Published Jan 10, 2024 • 48
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Paper • 2402.04291 • Published Feb 6, 2024 • 49
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19, 2024 • 48

Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models

Paper • 2312.09608 • Published Dec 15, 2023 • 16
CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 69
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image

Paper • 2310.17994 • Published Oct 27, 2023 • 8
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss

Paper • 2401.02677 • Published Jan 5, 2024 • 23

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

Paper • 2312.04410 • Published Dec 7, 2023 • 15
Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling

Paper • 2310.06389 • Published Oct 10, 2023 • 1
Diffusion Model Alignment Using Direct Preference Optimization

Paper • 2311.12908 • Published Nov 21, 2023 • 50
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Paper • 2305.13655 • Published May 23, 2023 • 7

Diffusion models

FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline

Paper • 2311.13073 • Published Nov 22, 2023 • 58
MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture

Paper • 2311.10123 • Published Nov 16, 2023 • 18
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning

Paper • 2311.12631 • Published Nov 21, 2023 • 15
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Paper • 2312.00845 • Published Dec 1, 2023 • 39

Papers that involves geometric computer vision using kornia

A survey on Kornia: an Open Source Differentiable Computer Vision Library for PyTorch

Paper • 2009.10521 • Published Sep 21, 2020 • 1
Kornia: an Open Source Differentiable Computer Vision Library for PyTorch

Paper • 1910.02190 • Published Oct 5, 2019 • 1
Learning Symmetrization for Equivariance with Orbit Distance Minimization

Paper • 2311.07143 • Published Nov 13, 2023 • 1
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting

Paper • 2311.11700 • Published Nov 20, 2023 • 4

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Paper • 2311.10093 • Published Nov 16, 2023 • 58
NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

Paper • 2311.12229 • Published Nov 20, 2023 • 27
Diffusion Model Alignment Using Direct Preference Optimization

Paper • 2311.12908 • Published Nov 21, 2023 • 50
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Paper • 2312.00845 • Published Dec 1, 2023 • 39

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs