---
title: NotebookLM-Kokoro TTS Project
sdk: docker
app_file: gradio_app.py
pinned: true
---
# NotebookLM-Kokoro TTS Project
This project uses [Kokoro](https://huggingface.co/hexgrad/Kokoro-82M) – a lightweight, open-weight TTS model with 82 million parameters – to create a Google NotebookLM style Text-to-Speech application.
## Why Kokoro?
- **Non-Proprietary & Open-Source:** Kokoro is an open-weight model, so you are free to deploy it in production environments or personal projects.
- **High Efficiency:** Despite its lightweight architecture, Kokoro delivers quality comparable to larger models while being faster and more cost-efficient.
- **Benchmarks:** According to the rankings on the [TTS-Arena](https://huggingface.co/spaces/TTS-AGI/TTS-Arena) page, Kokoro outperforms many closed-source models, making it a strong choice for open deployments.
- **Easy Integration:** The Python packages install with pip and the espeak-ng dependency installs with Homebrew, so integration into Python projects is straightforward.
## Setup Instructions
### Environment Setup
This project uses the **uv** Python package manager. Follow these steps:
1. **Install uv:**
```bash
pip install uv
```
2. **Create a new virtual environment (uv places it in `.venv` by default):**
```bash
uv venv
```
3. **Activate the environment:**
```bash
source .venv/bin/activate
```
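4. **Install Python dependencies into the environment (a uv-created environment does not include `pip` by default, so use `uv pip`):**
```bash
uv pip install "kokoro>=0.9.2" soundfile torch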
```
5. **Install espeak-ng (macOS, via Homebrew):**
```bash
brew install espeak-ng
```
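Before running anything, it can help to confirm that the steps above actually took effect. The following is a minimal sketch, not part of this project: `check_environment` is a hypothetical helper that reports whether the three Python packages from step 4 are importable and whether an `espeak-ng` executable is on your `PATH`.

```python
import importlib.util
import shutil

# Import names for the packages installed in step 4.
REQUIRED_MODULES = ("kokoro", "soundfile", "torch")


def check_environment(modules=REQUIRED_MODULES, binary="espeak-ng"):
    """Return a mapping of dependency name -> True/False (found or not)."""
    # find_spec looks a module up without importing it.
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # shutil.which returns None when the executable is not on PATH.
    status[binary] = shutil.which(binary) is not None
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the environment activated; any line marked `MISSING` points at the step to repeat.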
### Running the Application
Once the environment is set up, run the main TTS script as follows:
```bash
python notebook_lm_kokoro.py
```
This will process the transcript text using Kokoro and output audio segments as WAV files.
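For orientation, here is a rough sketch of what such a transcript-to-WAV pipeline might look like. It is not the contents of `notebook_lm_kokoro.py`: the helper names `split_transcript` and `synthesize`, the blank-line segmentation, the output file naming, and the `af_heart` voice are all assumptions made for illustration. The `KPipeline` usage and the 24 kHz sample rate follow the usage example on the Kokoro model card.

```python
import re
from pathlib import Path


def split_transcript(text):
    """Split a transcript into non-empty segments on blank lines (assumed format)."""
    parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]


def synthesize(segments, out_dir="segments", voice="af_heart", sample_rate=24000):
    """Write one or more WAV files per segment and return their paths."""
    # Imported lazily so split_transcript works without the TTS stack installed.
    import soundfile as sf
    from kokoro import KPipeline

    pipeline = KPipeline(lang_code="a")  # "a" selects American English
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, segment in enumerate(segments):
        # The pipeline yields (graphemes, phonemes, audio) for each generated chunk.
        for j, (_, _, audio) in enumerate(pipeline(segment, voice=voice)):
            path = out / f"segment_{i:03d}_{j:02d}.wav"  # hypothetical naming scheme
            sf.write(str(path), audio, sample_rate)
            paths.append(path)
    return paths
```

Calling `synthesize(split_transcript(text))` on a transcript string would write the WAV files under `segments/`. The first call downloads the model weights, so it needs network access.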
## Conclusion
Kokoro’s combination of efficiency, quality, and open weights makes it a strong choice among non-proprietary TTS models, as the TTS-Arena rankings cited above suggest. Enjoy exploring and extending this project!