---
license: apache-2.0
pipeline_tag: text-to-speech
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---

This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- Library: https://github.com/SesameAILabs/csm

## Installation

Clone the `add_hf` branch of the fork below and install its dependencies:

```bash
git clone -b add_hf https://github.com/NielsRogge/csm.git
cd csm
pip install -r requirements.txt
```

## Usage

```python
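import torchaudio
from generator import load_csm_1b  # generator.py lives in the cloned csm repo

# Load the model onto the GPU.
generator = load_csm_1b(device="cuda")

# Synthesize speech for a single utterance.
audio = generator.generate(
    text="Hello from Sesame.",
    speaker=0,                    # speaker ID
    context=[],                   # no prior conversation segments
    max_audio_length_ms=10_000,   # cap the output at 10 seconds
)

# Add a leading channel dimension and move to CPU before writing the file.
torchaudio.save("audio.wav", audio.unsqueeze(0).cpu(), generator.sample_rate)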
```
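
The usage example above hard-codes `device="cuda"`. On a machine without a GPU, a small fallback helper may be convenient. The sketch below is not part of the csm repository: `pick_device` is a hypothetical helper, and whether `load_csm_1b` runs acceptably on CPU is an assumption that has not been verified here.

```python
def pick_device(preferred: str = "cuda") -> str:
    """Return `preferred`, unless it is "cuda" and CUDA is not usable.

    Hypothetical helper for illustration; not part of the csm repository.
    """
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"
    if preferred == "cuda" and not torch.cuda.is_available():
        return "cpu"
    return preferred


device = pick_device()
print(f"Selected device: {device}")

# Pass the result to the loader from the usage example:
# generator = load_csm_1b(device=device)
```

Expect generation on CPU to be much slower than on a GPU, if it works at all with this checkpoint.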