mrfakename committed
Commit 083a2d9 · verified · 1 parent: fbe6497

Sync from GitHub repo

This Space is synced from the GitHub repo: https://github.com/SWivid/F5-TTS. Please submit contributions for this Space there.

README_REPO.md CHANGED
@@ -114,9 +114,11 @@ Deployment solution with Triton and TensorRT-LLM.
 #### Benchmark Results
 Decoding on a single L20 GPU, using 26 different prompt_audio & target_text pairs.
 
-| Model | Concurrency | Avg Latency | RTF |
-|-------|-------------|----------------|-------|
-| F5-TTS Base (Vocos) | 1 | 253 ms | 0.0394|
+| Model | Concurrency | Avg Latency | RTF | Mode |
+|---------------------|----------------|-------------|--------|-----------------|
+| F5-TTS Base (Vocos) | 2 | 253 ms | 0.0394 | Client-Server |
+| F5-TTS Base (Vocos) | 1 (Batch_size) | - | 0.0402 | Offline TRT-LLM |
+| F5-TTS Base (Vocos) | 1 (Batch_size) | - | 0.1467 | Offline Pytorch |
 
 See [detailed instructions](src/f5_tts/runtime/triton_trtllm/README.md) for more information.
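For context on the RTF column added above: RTF (real-time factor) is wall-clock synthesis time divided by the duration of the generated audio, so values well below 1.0 mean faster-than-real-time synthesis. A minimal sketch of the calculation follows; the helper is illustrative and not part of the repo, and the ~6.4 s audio duration is only back-calculated from the table's 253 ms / 0.0394 figures.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: compute time divided by generated audio duration (lower is better)."""
    return synthesis_seconds / audio_seconds


# Illustrative only: 253 ms of compute for roughly 6.4 s of audio gives an RTF near 0.0394,
# i.e. about 25x faster than real time.
print(round(real_time_factor(0.253, 6.42), 4))
```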
 
pyproject.toml CHANGED
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "f5-tts"
-version = "1.1.0"
+version = "1.1.1"
 description = "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
 readme = "README.md"
 license = {text = "MIT License"}
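The only change in this file is the version bump from 1.1.0 to 1.1.1. As a quick check after upgrading, the installed version can be read back with the standard library; this is a minimal sketch that assumes the f5-tts package named in the [project] table is installed in the current environment.

```python
from importlib.metadata import PackageNotFoundError, version

try:
    # The distribution name matches `name = "f5-tts"` in pyproject.toml.
    print(version("f5-tts"))  # expected to print 1.1.1 after this commit
except PackageNotFoundError:
    print("f5-tts is not installed in this environment")
```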
src/f5_tts/runtime/triton_trtllm/README.md CHANGED
@@ -59,11 +59,11 @@ benchmark.py --output-dir $log_dir \
 ### Benchmark Results
 Decoding on a single L20 GPU, using 26 different prompt_audio/target_text pairs.
 
-| Model | Concurrency | Avg Latency | RTF | Mode |
-|-------|-------------|----------------|-------|------|
-| F5-TTS Base (Vocos) | 2 | 253 ms | 0.0394|Client-Server|
-| F5-TTS Base (Vocos) | 1 (Batch_size) | - | 0.0402|Offline TRT-LLM|
-| F5-TTS Base (Vocos) | 1 (Batch_size) | - | 0.1467|Offline Pytorch|
+| Model | Concurrency | Avg Latency | RTF | Mode |
+|---------------------|----------------|-------------|--------|-----------------|
+| F5-TTS Base (Vocos) | 2 | 253 ms | 0.0394 | Client-Server |
+| F5-TTS Base (Vocos) | 1 (Batch_size) | - | 0.0402 | Offline TRT-LLM |
+| F5-TTS Base (Vocos) | 1 (Batch_size) | - | 0.1467 | Offline Pytorch |
 
 ### Credits
 1. [F5-TTS-TRTLLM](https://github.com/Bigfishering/f5-tts-trtllm)
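The table change above is purely cosmetic (column padding); the figures are unchanged. For reference, the reported numbers are aggregates over the 26 prompt_audio/target_text pairs: an average per-request latency and an RTF over the full set of utterances. Below is a minimal, hypothetical sketch of that kind of aggregation (the actual logic lives in the repo's benchmark.py and may differ; the RTF here is computed as total compute time over total audio time, one common convention).

```python
from statistics import mean

# Each entry: (synthesis_seconds, generated_audio_seconds) for one test pair.
Timing = tuple[float, float]


def summarize(timings: list[Timing]) -> tuple[float, float]:
    """Return (average latency in ms, RTF as total compute time over total audio time)."""
    avg_latency_ms = mean(t for t, _ in timings) * 1000.0
    rtf = sum(t for t, _ in timings) / sum(a for _, a in timings)
    return avg_latency_ms, rtf


# Usage with made-up timings for two requests:
print(summarize([(0.25, 6.3), (0.26, 6.6)]))  # -> (255.0, ~0.0395)
```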