Training Consistency Models with Variational Noise Coupling • Paper • 2502.18197 • Published 17 days ago • 6
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis • Paper • 2412.15322 • Published Dec 19, 2024 • 18
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models • Paper • 2410.06154 • Published Oct 8, 2024 • 16