Slamming: Training a Speech Language Model on One GPU in a Day • Paper • 2502.15814 • Published Feb 19, 2025 • 69
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM • Paper • 2503.04724 • Published Mar 6, 2025 • 72
Audio-Aware Large Language Models as Judges for Speaking Styles • Paper • 2506.05984 • Published Jun 6, 2025 • 15
Optimizing Multilingual Text-To-Speech with Accents & Emotions • Paper • 2506.16310 • Published Jun 19, 2025 • 26
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing • Paper • 2509.22651 • Published Sep 26, 2025 • 22