# 🌊 TinyWave: Compact & Expressive Speech Language Models

TinyWave is a family of efficient 2B-parameter speech language models distilled from the 7B SPIRIT-LM teacher. The models support speech-to-speech and interleaved speech–text generation and are optimized for real-time use on commodity hardware.

Built through layer-aligned knowledge distillation, TinyWave models retain 93–97% of their teacher’s performance while using roughly a third of the parameters (2B vs. 7B), which makes them well suited to voice agents, assistive technologies, and edge devices.

> 📖 Read the paper: [Efficient Interleaved Speech Modeling through Knowledge Distillation (arXiv:2506.23670)](https://arxiv.org/abs/2506.23670)
> 🌐 Demo & samples: [tinywave-landing](https://mohammadmahdinoori.github.io/tinywave-landing/)
> 💻 Code: [github.com/mohammadmahdinoori/TinyWave](https://github.com/mohammadmahdinoori/TinyWave)

---

## 🔧 Model Variants

| Model | Modality | Tokenizer | Description |
|-------|----------|-----------|-------------|
| [`tinywave/speech-base-2b`](https://huggingface.co/tinywave/speech-base-2b) | Speech → Speech | `spiritlm_base` | Base phonetic-only speech generation |
| [`tinywave/speech-expressive-2b`](https://huggingface.co/tinywave/speech-expressive-2b) | Speech → Expressive Speech | `spiritlm_expressive` | Includes pitch + style tokens |
| [`tinywave/interleaved-expressive-2b`](https://huggingface.co/tinywave/interleaved-expressive-2b) | Text ↔ Speech (interleaved) | `spiritlm_expressive` | Multimodal expressive generation |
| [`tinywave/expressive-spirit-lm-interleaved-librilight`](https://huggingface.co/tinywave/expressive-spirit-lm-interleaved-librilight) | Teacher (7B, interleaved) | `spiritlm_expressive` | LoRA-corrected SPIRIT-LM for distillation |
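
The table above can be mirrored in a small lookup so that scripts pick a checkpoint and its matching tokenizer together. This is a minimal sketch, not the project's official loading code: the short keys, the `Variant` class, and the `repo_url` and `fetch` helpers are hypothetical names introduced here, and only the repo IDs and tokenizer names come from the table. `fetch` assumes the `huggingface_hub` package is installed and that the checkpoints can be pulled with its `snapshot_download` function; running the models themselves presumably still requires the code from the TinyWave repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    repo_id: str    # Hugging Face repo ID, copied from the table
    tokenizer: str  # tokenizer name, copied from the table
    role: str       # "student" (2B) or "teacher" (7B)


# Short keys are hypothetical labels chosen for this sketch.
VARIANTS = {
    "speech-base": Variant(
        "tinywave/speech-base-2b", "spiritlm_base", "student"),
    "speech-expressive": Variant(
        "tinywave/speech-expressive-2b", "spiritlm_expressive", "student"),
    "interleaved-expressive": Variant(
        "tinywave/interleaved-expressive-2b", "spiritlm_expressive", "student"),
    "teacher": Variant(
        "tinywave/expressive-spirit-lm-interleaved-librilight",
        "spiritlm_expressive", "teacher"),
}


def repo_url(key: str) -> str:
    """Return the Hugging Face page URL for a variant key."""
    return f"https://huggingface.co/{VARIANTS[key].repo_id}"


def fetch(key: str, local_dir: Optional[str] = None) -> str:
    """Download a variant's files and return the local path.

    Assumption: huggingface_hub is installed and the repo is downloadable
    as a plain snapshot. The import is local so the lookup table works
    without the package.
    """
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=VARIANTS[key].repo_id, local_dir=local_dir)


if __name__ == "__main__":
    # List every variant with its tokenizer and page URL (no network access).
    for key, variant in VARIANTS.items():
        print(f"{key:24s} {variant.tokenizer:20s} {repo_url(key)}")
```

Keeping the tokenizer name next to the repo ID helps avoid pairing an expressive checkpoint with the base tokenizer, since only `speech-base-2b` uses `spiritlm_base`.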
