---
title: Tts Api
emoji: π
colorFrom: indigo
colorTo: yellow
sdk: docker
pinned: false
---

# TTS API

A FastAPI-based Text-to-Speech API using XTTS-v2 for voice cloning.

## Features

- Convert text to speech using voice cloning
- Upload reference speaker audio files
- Support for multiple languages
- RESTful API with automatic documentation
- Docker support

## Setup

### Local Development

1. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
2. Run the API:
   ```bash
   python app.py
   ```

The API will be available at `http://localhost:8000`.

### Using Docker

1. Build the Docker image:
   ```bash
   docker build -t tts-api .
   ```
2. Run the container:
   ```bash
   docker run -p 8000:8000 tts-api
   ```

## API Endpoints

### Health Check

- **GET** `/health` - Check API status
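
A client may want to wait for the service to report healthy before sending TTS requests. The following is a minimal sketch, assuming `/health` answers with HTTP 200 once the API is ready; the response body is not documented here, so it is ignored. `wait_until_healthy` and its `probe` parameter are hypothetical names introduced for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(base_url="http://localhost:8000", timeout=120.0,
                       interval=2.0, probe=None):
    """Poll GET /health until it succeeds or `timeout` seconds elapse.

    `probe` is a hypothetical hook: a callable taking a URL and returning
    True when the service is healthy. It defaults to a plain HTTP GET.
    """

    def http_probe(url):
        # Assumption: any HTTP 200 response means the API is ready.
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.status == 200
        except (urllib.error.URLError, OSError):
            return False

    probe = probe or http_probe
    health_url = base_url.rstrip("/") + "/health"
    deadline = time.monotonic() + timeout
    while True:
        if probe(health_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_until_healthy()` right after starting the server or container returns `True` once the endpoint responds and `False` if the timeout passes first.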

### Text-to-Speech

- **POST** `/tts` - Convert text to speech with an uploaded speaker file
  - **Parameters:**
    - `text` (form): Text to convert to speech
    - `language` (form): Language code (default: "en")
    - `speaker_file` (file): Reference speaker audio file
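
Checking the three parameters on the client side can catch obvious mistakes before a request is sent. The sketch below is illustrative only: `check_tts_request` is a hypothetical helper, the language set is the list published for upstream XTTS-v2 (this API may accept a different set), and the suffix list covers only the two formats this README names explicitly.

```python
from pathlib import Path

# Assumption: language codes published for upstream XTTS-v2.
# The model served by this API may support a different set.
XTTS_V2_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}

# Assumption: only the formats named in this README; the server may take more.
AUDIO_SUFFIXES = {".wav", ".mp3"}


def check_tts_request(text, speaker_path, language="en"):
    """Return a list of problems; an empty list means the request looks fine."""
    problems = []
    if not text or not text.strip():
        problems.append("text is empty")
    if language not in XTTS_V2_LANGUAGES:
        problems.append(f"unknown language code: {language!r}")
    if Path(speaker_path).suffix.lower() not in AUDIO_SUFFIXES:
        problems.append(f"unexpected audio format: {speaker_path!r}")
    return problems
```

For example, `check_tts_request("Hello", "speaker.wav")` returns an empty list, while an empty `text` value yields a single "text is empty" entry.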

### API Documentation

- **GET** `/docs` - Interactive API documentation (Swagger UI)
- **GET** `/redoc` - Alternative API documentation (ReDoc)

## Usage Examples

### Using Python requests

```python
import requests

# Prepare the request
url = "http://localhost:8000/tts"
data = {
    "text": "Hello, this is a test of voice cloning!",
    "language": "en",
}

# Open the reference audio in a context manager so the file handle is closed
with open("path/to/speaker.wav", "rb") as speaker:
    # Make the request (form fields plus the uploaded speaker file)
    response = requests.post(url, data=data, files={"speaker_file": speaker})

# Save the generated audio
if response.status_code == 200:
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print("Speech generated successfully!")
else:
    print(f"Request failed with status {response.status_code}")
```

### Using curl

```bash
curl -X POST "http://localhost:8000/tts" \
  -F "text=Hello, this is a test!" \
  -F "language=en" \
  -F "speaker_file=@path/to/speaker.wav" \
  --output generated_speech.wav
```
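
Each `-F` flag in the curl command becomes one part of a `multipart/form-data` body. For readers who cannot install `requests`, the sketch below builds a comparable body with the standard library only. `encode_multipart` is a hypothetical helper, and the `audio/wav` content type is an assumption; it does not attempt to reproduce curl's exact headers.

```python
import uuid


def encode_multipart(fields, file_field, filename, file_bytes,
                     content_type="audio/wav"):
    """Encode form fields plus one file as multipart/form-data.

    Returns (body, headers). The default content_type is an assumption.
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields, e.g. text and language.
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The uploaded file part, e.g. speaker_file.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         f"Content-Type: {content_type}\r\n\r\n").encode("utf-8")
    )
    parts.append(file_bytes)
    # Closing boundary marks the end of the body.
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return b"".join(parts), headers
```

The returned body and headers can be passed to `urllib.request.Request(url, data=body, headers=headers, method="POST")` to send the same three fields as the curl command.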

### Using the provided client example

```bash
python client_example.py
```

## Requirements

- Python 3.8+
- CUDA-compatible GPU (recommended for faster processing)
- Audio file in a supported format (WAV, MP3, etc.) for speaker reference
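
The first two requirements can be checked from Python before starting the server. This is a rough sketch: `preflight` is a hypothetical helper, and it assumes the model runs on PyTorch (as upstream XTTS-v2 does), so CUDA availability is read from `torch.cuda.is_available()` when `torch` is installed.

```python
import importlib.util
import sys


def preflight(min_python=(3, 8)):
    """Report whether the interpreter and (optionally) CUDA look usable."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch  # assumption: the API's model backend is PyTorch
            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken torch install is treated the same as having no GPU.
            report["cuda_available"] = False
    return report
```

A `False` value for `cuda_available` does not prevent the API from running; per the list above, a GPU is only recommended for faster processing.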

## Model

This API uses the XTTS-v2_C3PO model for voice cloning, which is automatically downloaded when building the Docker image.