TinyWave: Compact & Expressive Speech Language Models
TinyWave is a family of efficient 2B-parameter speech language models distilled from the 7B SPIRIT-LM teacher. The models support speech-to-speech and interleaved speech–text generation and are optimized for real-time use on commodity hardware.
Built through layer-aligned knowledge distillation, TinyWave models retain 93–97% of their teacher's performance while using roughly a third of the parameters, making them well suited to voice agents, assistive technologies, and edge devices.
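For illustration, a layer-aligned distillation objective can be written as a weighted sum of a soft-label KL term on the logits and an MSE term between projected student hidden states and the corresponding teacher layers. The sketch below is illustrative only: the 2:1 layer mapping, the projection modules, and the loss weights are assumptions, not the exact recipe from the paper.

```python
import torch.nn.functional as F

def layer_aligned_kd_loss(student_out, teacher_out, projections, temperature=2.0, alpha=0.5):
    """Illustrative layer-aligned KD loss (assumed setup, not the paper's exact recipe).

    student_out / teacher_out: model outputs with .logits and .hidden_states
    (run both models with output_hidden_states=True).
    projections: one nn.Linear per student layer, mapping the student hidden
    size to the teacher hidden size.
    """
    # Soft-label distillation: KL between temperature-softened distributions.
    kl = F.kl_div(
        F.log_softmax(student_out.logits / temperature, dim=-1),
        F.softmax(teacher_out.logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Layer alignment: match student layer i to teacher layer 2*i
    # (assumed mapping for a teacher with roughly twice as many layers).
    s_hidden, t_hidden = student_out.hidden_states, teacher_out.hidden_states
    hidden = sum(
        F.mse_loss(proj(s_hidden[i]), t_hidden[2 * i])
        for i, proj in enumerate(projections)
    ) / len(projections)

    return alpha * kl + (1 - alpha) * hidden
```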
- Read the paper: Efficient Interleaved Speech Modeling through Knowledge Distillation (arXiv:2506.23670)
- Demo & samples: tinywave-landing
- Code: github.com/mohammadmahdinoori/TinyWave
Model Variants
| Model | Modality | Tokenizer | Description |
|---|---|---|---|
| `tinywave/speech-base-2b` | Speech → Speech | `spiritlm_base` | Base phonetic-only speech generation |
| `tinywave/speech-expressive-2b` | Speech → Expressive Speech | `spiritlm_expressive` | Includes pitch + style tokens |
| `tinywave/interleaved-expressive-2b` | Text → Speech (interleaved) | `spiritlm_expressive` | Multimodal expressive generation |
| `tinywave/expressive-spirit-lm-interleaved-librilight` | Teacher (7B, interleaved) | `spiritlm_expressive` | LoRA-corrected SPIRIT-LM for distillation |
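For reference, the sketch below shows one way to load a variant with Hugging Face `transformers`, assuming the checkpoints behave as standard causal LMs whose vocabulary contains SPIRIT-LM unit tokens. Converting raw audio to and from those units still requires the matching `spiritlm_base` or `spiritlm_expressive` speech tokenizer from the SPIRIT-LM codebase; the prompt shown is a hypothetical placeholder, not real tokenizer output.

```python
# Minimal loading sketch (assumption: TinyWave checkpoints load as standard
# Hugging Face causal LMs). Speech I/O goes through the SPIRIT-LM speech
# tokenizer listed in the table above; the prompt below is a hypothetical
# placeholder for real unit strings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tinywave/speech-base-2b"  # or any variant from the table
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "[Speech][Hu12][Hu34][Hu56]"  # placeholder speech-unit tokens
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0]))
```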