The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation • Paper • 2503.04606 • Published 7 days ago • 7
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models • Paper • 2403.03100 • Published Mar 5, 2024 • 36
PromptTTS 2: Describing and Generating Voices with Text Prompt • Paper • 2309.02285 • Published Sep 5, 2023 • 13