Commit 87ff28a by zach (1 parent: 6130461)
Add project README

README.md (added)
@@ -0,0 +1,68 @@
<div align="center">
<img src="https://storage.googleapis.com/hume-public-logos/hume/hume-banner.png">
<h1>Expressive TTS Arena</h1>
<p>
<strong>An interactive platform for comparing and evaluating the expressiveness of different text-to-speech engines</strong>
</p>
</div>

## Overview
Expressive TTS Arena is an open-source web application for comparing text-to-speech outputs, with a focus on expressiveness rather than audio quality alone. Built with Gradio, it provides a single interface for generating and comparing synthesized speech from different providers, including Hume and ElevenLabs.

## Features
- Text generation using Claude AI for creating expressive content
- Direct text input or AI-assisted text generation
- Comparative analysis of different TTS engines
- Simple voting mechanism for preferred outputs
- Random voice selection from multiple providers
- Real-time speech synthesis comparison

## Prerequisites

- Python >=3.11.11
- The ability to create a Python virtual environment (for example, with the built-in `venv` module)
- API keys for Hume AI, Anthropic, and ElevenLabs

## Installation

1. Create and activate the virtual environment:

```sh
python -m venv gradio-env
source gradio-env/bin/activate  # On Windows, use: gradio-env\Scripts\activate
```

2. Install dependencies:

```sh
pip install -r requirements.txt
```

3. Configure environment variables:
   - Create a `.env` file based on `.env.example`
   - Add your API keys:

```sh
HUME_API_KEY=YOUR_HUME_API_KEY
ANTHROPIC_API_KEY=YOUR_ANTHROPIC_API_KEY
ELEVENLABS_API_KEY=YOUR_ELEVENLABS_API_KEY
```

4. Run the application:

```sh
# watchfiles reruns the command whenever files in the current directory change
watchfiles "python -m src.app"
```

## User Flow

1. **Enter or Generate Text:** Type directly in the Text box, or enter a Prompt, click "Generate text", and edit the result if needed.
2. **Synthesize Speech:** Click "Synthesize speech" to generate two audio outputs.
3. **Listen & Compare:** Play back both options (A and B) to hear the differences.
4. **Vote for Your Favorite:** Click "Vote for option A" or "Vote for option B" to record your preference.

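
The flow above amounts to a blind A/B comparison: two providers' outputs are presented in random order and a single vote is recorded. The sketch below models only that bookkeeping and is not the app's actual implementation; `assign_options` and `record_vote` are hypothetical names, and the provider labels are placeholders taken from the Overview.

```python
import random
from collections import Counter


def assign_options(providers: tuple[str, str], rng: random.Random) -> dict[str, str]:
    """Randomly map the two providers to options "A" and "B" (hypothetical helper)."""
    first, second = rng.sample(providers, k=2)
    return {"A": first, "B": second}


def record_vote(tally: Counter, options: dict[str, str], choice: str) -> str:
    """Credit the provider behind the chosen option and return its name."""
    if choice not in options:
        raise ValueError(f"choice must be one of {sorted(options)}, got {choice!r}")
    winner = options[choice]
    tally[winner] += 1
    return winner


tally: Counter = Counter()
options = assign_options(("Hume", "ElevenLabs"), random.Random())
winner = record_vote(tally, options, "A")  # the user clicked "Vote for option A"
print(f"Option A was {winner}; tally so far: {dict(tally)}")
```

Keeping the option-to-provider mapping separate from the tally means the listener only ever sees "A" and "B", while votes are still credited to the provider that produced the chosen audio.
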
## Contributing
We welcome contributions to the Expressive TTS Arena! This project is open source and intended to serve as example code. Feel free to submit issues, fork the repository, and open pull requests with improvements.

## License
[Add your chosen license here]