---
license: apache-2.0
pipeline_tag: text-to-speech
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---

This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:

- Library: https://github.com/SesameAILabs/csm

## Installation

First, clone the `add_hf` branch of the fork below and install its dependencies:

```bash
git clone -b add_hf https://github.com/NielsRogge/csm.git
cd csm
pip install -r requirements.txt
```

## Usage

```python
import torchaudio
from generator import load_csm_1b

# Load the CSM-1B generator onto the GPU.
generator = load_csm_1b(device="cuda")

# Synthesize a single utterance with no conversational context.
audio = generator.generate(
    text="Hello from Sesame.",
    speaker=0,
    context=[],
    max_audio_length_ms=10_000,
)

# torchaudio.save expects a (channels, samples) tensor on the CPU.
torchaudio.save("audio.wav", audio.unsqueeze(0).cpu(), generator.sample_rate)
```
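
The `max_audio_length_ms` argument above is hard-coded to 10 seconds. Judging by its name, it acts as an upper bound on the generated audio, so longer inputs may need a larger value. The sketch below is one rough way to pick that bound from the text length. The helper `estimate_max_audio_length_ms` is hypothetical and not part of the `csm` library, and the 150 words-per-minute speaking rate, the 1.5x safety margin, and the 2-second floor are all assumptions to tune by ear.

```python
import math


def estimate_max_audio_length_ms(
    text: str,
    words_per_minute: float = 150.0,  # assumed average speaking rate
    margin: float = 1.5,              # assumed headroom for pauses and slow delivery
    floor_ms: int = 2_000,            # assumed minimum budget for very short inputs
) -> int:
    """Return a rough upper bound, in milliseconds, on the time needed to speak `text`."""
    n_words = len(text.split())
    speaking_ms = n_words * 60_000 / words_per_minute
    return max(floor_ms, math.ceil(speaking_ms * margin))


# Hypothetical usage: compute a budget, then pass it as max_audio_length_ms.
budget_ms = estimate_max_audio_length_ms("Hello from Sesame.")
```

The result could then be passed as `max_audio_length_ms=budget_ms` in the `generator.generate(...)` call. This is only a heuristic: a budget that is too small may cut the audio short, while one that is too large only raises the limit.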