title: XTTS C3PO Voice Cloning API
emoji: 🤖
colorFrom: indigo
colorTo: yellow
sdk: docker
pinned: false
XTTS C3PO Voice Cloning API
A FastAPI-based Text-to-Speech API using XTTS-v2 with the iconic C3PO voice from Star Wars.
Features
- C3PO Voice: Pre-loaded with the iconic C3PO voice from Star Wars
- Custom Voice Cloning: Upload your own reference audio for voice cloning
- Multilingual Support: 17 languages with C3PO voice
- No Upload Required: Use C3PO voice without any file uploads
- RESTful API: Clean API with automatic documentation
- Docker Support: Optimized for Hugging Face Spaces deployment
- PyTorch 2.6 Compatible: Includes compatibility fixes
About the C3PO Model
This API uses the XTTS-v2 C3PO model from Borcherding/XTTS-v2_C3PO, which provides the iconic voice of C-3PO from Star Wars. The model supports:
- High-quality C3PO voice synthesis
- Multilingual C3PO speech (17 languages)
- Custom voice cloning capabilities
- Real-time speech generation
Quick Start
Using C3PO Voice (No Upload Required)
curl -X POST "http://localhost:7860/tts-c3po" \
-F "text=Hello there! I am C-3PO, human-cyborg relations." \
-F "language=en" \
--output c3po_speech.wav
Using Custom Voice Cloning
curl -X POST "http://localhost:7860/tts" \
-F "text=This will be spoken in your custom voice!" \
-F "language=en" \
-F "speaker_file=@your_reference_voice.wav" \
--output custom_speech.wav
API Endpoints
C3PO Voice Only
- POST /tts-c3po - Generate speech using C3PO voice (no file upload needed)
  - Parameters:
    - text (form): Text to convert to speech (max 500 characters)
    - language (form): Language code (default: "en")
    - no_lang_auto_detect (form): Disable automatic language detection
Voice Cloning with Fallback
- POST /tts - Convert text to speech with optional custom voice
  - Parameters:
    - text (form): Text to convert to speech (max 500 characters)
    - language (form): Language code (default: "en")
    - voice_cleanup (form): Apply audio cleanup to reference voice
    - no_lang_auto_detect (form): Disable automatic language detection
    - speaker_file (file, optional): Reference speaker audio file (uses C3PO if not provided)
JSON API
- POST /tts-json - Convert text to speech using JSON request body
  - Body: JSON object with text, language, voice_cleanup, no_lang_auto_detect
  - File: speaker_file (optional) - Reference speaker audio file
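A request for /tts-json can be assembled with the standard library alone. The sketch below only builds the request object and does not send it. It assumes the endpoint accepts an application/json body containing the four fields listed above; the boolean defaults and the helper name build_tts_json_request are illustrative and not taken from the API source.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumption: local server from Quick Start


def build_tts_json_request(text, language="en", voice_cleanup=False,
                           no_lang_auto_detect=False, base_url=BASE_URL):
    """Build (but do not send) a POST request for /tts-json."""
    payload = {
        "text": text,
        "language": language,
        "voice_cleanup": voice_cleanup,              # assumed default
        "no_lang_auto_detect": no_lang_auto_detect,  # assumed default
    }
    return urllib.request.Request(
        url=f"{base_url}/tts-json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_tts_json_request("Hello there! I am C-3PO.")
# To send it: urllib.request.urlopen(req).read() returns the response body.
```

Keeping request construction separate from sending makes the payload easy to inspect before any audio is generated.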
Information Endpoints
- GET /health - Check API status, device info, and supported languages
- GET /languages - Get list of supported languages
- GET /docs - Interactive API documentation (Swagger UI)
Usage Examples
Python - C3PO Voice
import requests
# Generate C3PO speech
url = "http://localhost:7860/tts-c3po"
data = {
    "text": "Hello there! I am C-3PO, human-cyborg relations.",
    "language": "en"
}
response = requests.post(url, data=data)
if response.status_code == 200:
    with open("c3po_speech.wav", "wb") as f:
        f.write(response.content)
    print("C3PO speech generated!")
Python - Custom Voice with C3PO Fallback
import requests
url = "http://localhost:7860/tts"
data = {
    "text": "This will use C3PO voice if no speaker file is provided.",
    "language": "en"
}
# No speaker_file provided - will use C3PO voice
response = requests.post(url, data=data)
if response.status_code == 200:
    with open("speech_output.wav", "wb") as f:
        f.write(response.content)
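For the custom-voice path, the reference recording goes in the speaker_file field, mirroring the curl command in Quick Start. This is a minimal sketch: the helper name clone_voice, the 300-second timeout, and the injectable post argument (used so the helper can be exercised without a running server) are assumptions made for illustration.

```python
def clone_voice(text, speaker_path, language="en",
                base_url="http://localhost:7860", post=None):
    """POST text plus a reference recording to /tts and return the audio bytes."""
    if post is None:
        import requests  # imported lazily so the helper can be tested offline
        post = requests.post
    with open(speaker_path, "rb") as speaker:
        response = post(
            f"{base_url}/tts",
            data={"text": text, "language": language},
            files={"speaker_file": speaker},  # field name from the /tts parameters
            timeout=300,                      # assumed; synthesis can be slow on CPU
        )
    response.raise_for_status()  # raise instead of silently writing an error body
    return response.content
```

A call such as clone_voice("This will be spoken in your custom voice!", "your_reference_voice.wav") returns the WAV bytes, which can then be written to a file opened in "wb" mode.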
Multilingual C3PO
import requests
# C3PO speaking Spanish
data = {
    "text": "Hola, soy C-3PO. Domino más de seis millones de formas de comunicación.",
    "language": "es"
}
response = requests.post("http://localhost:7860/tts-c3po", data=data)
Supported Languages
The C3PO model supports all XTTS-v2 languages:
- en - English
- es - Spanish
- fr - French
- de - German
- it - Italian
- pt - Portuguese (Brazilian)
- pl - Polish
- tr - Turkish
- ru - Russian
- nl - Dutch
- cs - Czech
- ar - Arabic
- zh-cn - Mandarin Chinese
- ja - Japanese
- ko - Korean
- hu - Hungarian
- hi - Hindi
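A client can check the language code and the 500-character limit before sending anything. The sketch below is a client-side convenience only; the helper name validate_request is hypothetical, and the server may apply its own validation or count characters differently.

```python
# Language codes copied from the list above.
SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "ko", "hu", "hi",
}
MAX_TEXT_CHARS = 500  # limit stated for the text parameter


def validate_request(text, language="en"):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not text.strip():
        problems.append("text is empty")
    if len(text) > MAX_TEXT_CHARS:
        problems.append(f"text has {len(text)} characters (max {MAX_TEXT_CHARS})")
    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language code: {language!r}")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues in a single pass.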
Setup
Hugging Face Spaces Deployment
This API is optimized for Hugging Face Spaces with:
- Automatic C3PO model downloading
- Proper user permissions (user ID 1000)
- PyTorch 2.6 compatibility fixes
- COQUI license agreement handling
Local Development
- Install system dependencies:
# Ubuntu/Debian
sudo apt-get install espeak-ng ffmpeg git git-lfs
# macOS
brew install espeak ffmpeg git git-lfs
- Install Python dependencies:
pip install -r requirements.txt
python -m unidic download
- Clone C3PO model (optional - auto-downloaded on first run):
git clone https://huggingface.co/Borcherding/XTTS-v2_C3PO XTTS-v2_C3PO
- Run the API:
uvicorn app:app --host 0.0.0.0 --port 7860
Using Docker
# Build and run
docker build -t xtts-c3po-api .
docker run -p 7860:7860 xtts-c3po-api
Reference Audio Guidelines
For custom voice cloning:
- Duration: 3-10 seconds of clear speech
- Quality: High-quality audio, minimal background noise
- Format: WAV format recommended (MP3, M4A also supported)
- Content: Natural speech, avoid music or effects
- Speaker: Single speaker, clear pronunciation
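The duration guideline can be checked before uploading. The sketch below uses Python's built-in wave module, so it only handles uncompressed PCM WAV files; MP3 and M4A would need a different tool. The helper names are hypothetical, and make_silent_wav exists only to demonstrate the check without a real recording.

```python
import io
import wave

MIN_SECONDS, MAX_SECONDS = 3.0, 10.0  # range from the guidelines above


def wav_duration_seconds(source):
    """Return the duration of a PCM WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(source):
    """True when the clip falls inside the recommended 3-10 second window."""
    return MIN_SECONDS <= wav_duration_seconds(source) <= MAX_SECONDS


def make_silent_wav(seconds, rate=16000):
    """Build an in-memory mono 16-bit WAV, used here only to demo the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer
```

In practice, pass the path of the recording: is_good_reference_length("your_reference_voice.wav").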
Model Information
- Base Model: XTTS-v2
- Voice: C3PO from Star Wars
- Source: Borcherding/XTTS-v2_C3PO
- Languages: 17 supported
- License: CPML (Coqui Public Model License)
Testing
Run the test suite:
# Test C3PO model functionality
python test.py
# Test API endpoints
python client_example.py
Environment Variables
Automatically configured:
- COQUI_TOS_AGREED=1 - Agrees to CPML license
- NUMBA_DISABLE_JIT=1 - Disables Numba JIT compilation
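When running the code outside the provided container, the same variables can be set from Python before the TTS library is imported. How this project sets them (Dockerfile or application code) is not stated here, so treat the snippet as one possible approach.

```python
import os

# Values taken from the list above; setdefault keeps any value already exported.
os.environ.setdefault("COQUI_TOS_AGREED", "1")   # accept the CPML non-interactively
os.environ.setdefault("NUMBA_DISABLE_JIT", "1")  # turn off Numba JIT compilation

# Import the TTS library only after this point, for example:
# from TTS.api import TTS
```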
API Response Examples
Health Check Response
{
  "status": "healthy",
  "device": "cuda",
  "model": "XTTS-v2 C3PO",
  "default_voice": "C3PO",
  "supported_languages": ["en", "es", "fr", ...]
}
Languages Response
{
  "languages": ["en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru", "nl", "cs", "ar", "zh-cn", "ja", "ko", "hu", "hi"]
}
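A client can use the health check response shown above to decide whether the API is ready and whether it is running on a GPU. This sketch parses a body in that shape; the helper name summarize_health is hypothetical, and the sample string uses a shortened language list.

```python
import json


def summarize_health(body):
    """Extract readiness, GPU use, and the language list from a /health JSON body."""
    info = json.loads(body)
    return {
        "ready": info.get("status") == "healthy",
        "on_gpu": info.get("device") == "cuda",
        "languages": info.get("supported_languages", []),
    }


# Shortened sample in the shape of the health check response.
sample = (
    '{"status": "healthy", "device": "cuda", "model": "XTTS-v2 C3PO", '
    '"default_voice": "C3PO", "supported_languages": ["en", "es", "fr"]}'
)
summary = summarize_health(sample)
```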
Troubleshooting
PyTorch Loading Issues
PyTorch 2.6 changed the default of torch.load to weights_only=True, which can make older checkpoints fail to load. The API includes a fix for this. If you encounter loading issues, ensure the compatibility fix is applied.
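One common way to work around the new default is to wrap torch.load so that weights_only falls back to False unless the caller sets it. The sketch below shows that pattern in isolation and is not necessarily the fix this project applies; the helper name is hypothetical. Loading with weights_only=False unpickles arbitrary objects, so use it only with checkpoints you trust. Allow-listing the required classes with torch.serialization.add_safe_globals is a stricter alternative.

```python
import functools


def with_legacy_weights_default(load_fn):
    """Wrap a torch.load-style function so weights_only defaults to False."""
    @functools.wraps(load_fn)
    def wrapper(*args, **kwargs):
        kwargs.setdefault("weights_only", False)  # caller can still override
        return load_fn(*args, **kwargs)
    return wrapper


# Intended use, before the XTTS checkpoint is loaded (requires PyTorch):
# import torch
# torch.load = with_legacy_weights_default(torch.load)
```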
Model Download Issues
If the C3PO model fails to download:
- Check internet connection
- Verify git and git-lfs are installed
- Manually clone:
git clone https://huggingface.co/Borcherding/XTTS-v2_C3PO XTTS-v2_C3PO
Audio Quality Issues
- Use high-quality reference audio for custom voices
- Enable voice_cleanup for noisy reference audio
- Ensure reference audio is 3-10 seconds long
Memory Issues
- Use CPU mode for lower memory usage: set CUDA_VISIBLE_DEVICES=""
- Reduce text length for batch processing
- Consider using GPU with sufficient VRAM (4GB+ recommended)
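Reducing text length usually means splitting long input into several requests. The sketch below breaks text at sentence ends where possible and keeps each chunk within the 500-character limit stated for the text parameter. The helper name split_text is hypothetical, and joining the resulting audio files is left to the caller.

```python
import re

MAX_TEXT_CHARS = 500  # per-request limit stated in the API Endpoints section


def split_text(text, limit=MAX_TEXT_CHARS):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to /tts-c3po or /tts as a separate request.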
License
This project uses XTTS-v2, which is licensed under the Coqui Public Model License (CPML). The C3PO model is provided by the community. See https://coqui.ai/cpml for license details.
Credits
- XTTS-v2: Coqui AI
- C3PO Model: Borcherding
- Original Character: C-3PO from Star Wars (Lucasfilm/Disney)