XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model Paper • 2406.04904 • Published Jun 7, 2024 • 8
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System Paper • 2502.05512 • Published 6 days ago • 1
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis Paper • 2502.04128 • Published 8 days ago • 22
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub 3 days ago • 40
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks Paper • 2502.04465 • Published 8 days ago • 3
Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning Paper • 2502.06533 • Published 4 days ago • 13
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 4 days ago • 116
VideoRoPE: What Makes for Good Video Rotary Position Embedding? Paper • 2502.05173 • Published 7 days ago • 60
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 10 days ago • 161
Reward-Guided Speculative Decoding for Efficient LLM Reasoning Paper • 2501.19324 • Published 14 days ago • 35
GuardReasoner: Towards Reasoning-based LLM Safeguards Paper • 2501.18492 • Published 15 days ago • 81
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published 16 days ago • 53
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 17 days ago • 105
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Paper • 1712.05884 • Published Dec 16, 2017 • 3