TinyWave: Compact & Expressive Speech Language Models
TinyWave is a family of efficient 2B-parameter speech language models distilled from the 7B SPIRIT-LM teacher. The models support speech-to-speech and interleaved speech–text generation and are optimized for real-time use on commodity hardware.
Built through layer-aligned knowledge distillation, TinyWave models retain 93–97% of their teacher's performance while using roughly a third of the parameters, making them well suited to voice agents, assistive technologies, and edge devices.
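For illustration, a layer-aligned distillation objective can be written as a weighted sum of a soft-label KL term on the logits and an MSE term between projected student hidden states and the corresponding teacher layers. The sketch below is illustrative only: the 2:1 layer mapping, the projection modules, and the loss weights are assumptions, not the exact recipe from the paper.

```python
import torch.nn.functional as F

def layer_aligned_kd_loss(student_out, teacher_out, projections, temperature=2.0, alpha=0.5):
    """Illustrative layer-aligned KD loss (assumed setup, not the paper's exact recipe).

    student_out / teacher_out: model outputs with .logits and .hidden_states
    (run both models with output_hidden_states=True).
    projections: one nn.Linear per student layer, mapping the student hidden
    size to the teacher hidden size.
    """
    # Soft-label distillation: KL between temperature-softened distributions.
    kl = F.kl_div(
        F.log_softmax(student_out.logits / temperature, dim=-1),
        F.softmax(teacher_out.logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Layer alignment: match student layer i to teacher layer 2*i
    # (assumed mapping for a teacher with roughly twice as many layers).
    s_hidden, t_hidden = student_out.hidden_states, teacher_out.hidden_states
    hidden = sum(
        F.mse_loss(proj(s_hidden[i]), t_hidden[2 * i])
        for i, proj in enumerate(projections)
    ) / len(projections)

    return alpha * kl + (1 - alpha) * hidden
```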
- Read the paper: Efficient Interleaved Speech Modeling through Knowledge Distillation (arXiv:2506.23670)
- Demo & samples: tinywave-landing
- Code: github.com/mohammadmahdinoori/TinyWave
Model Variants
| Model | Modality | Tokenizer | Description |
|---|---|---|---|
| `tinywave/speech-base-2b` | Speech → Speech | `spiritlm_base` | Base phonetic-only speech generation |
| `tinywave/speech-expressive-2b` | Speech → Expressive Speech | `spiritlm_expressive` | Includes pitch + style tokens |
| `tinywave/interleaved-expressive-2b` | Text → Speech (interleaved) | `spiritlm_expressive` | Multimodal expressive generation |
| `tinywave/expressive-spirit-lm-interleaved-librilight` | Teacher (7B, interleaved) | `spiritlm_expressive` | LoRA-corrected SPIRIT-LM for distillation |
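For reference, the sketch below shows one way to load a variant with Hugging Face `transformers`, assuming the checkpoints behave as standard causal LMs whose vocabulary contains SPIRIT-LM unit tokens. Converting raw audio to and from those units still requires the matching `spiritlm_base` or `spiritlm_expressive` speech tokenizer from the SPIRIT-LM codebase; the prompt shown is a hypothetical placeholder, not real tokenizer output.

```python
# Minimal loading sketch (assumption: TinyWave checkpoints load as standard
# Hugging Face causal LMs). Speech I/O goes through the SPIRIT-LM speech
# tokenizer listed in the table above; the prompt below is a hypothetical
# placeholder for real unit strings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tinywave/speech-base-2b"  # or any variant from the table
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "[Speech][Hu12][Hu34][Hu56]"  # placeholder speech-unit tokens
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0]))
```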