---
license: apache-2.0
base_model:
- microsoft/wavlm-large
pipeline_tag: audio-to-audio
library_name: torch
datasets:
- mythicinfinity/libritts
---
# ⚡ FocalCodec
A low-bitrate single-codebook 16 kHz speech codec based on [focal modulation](https://arxiv.org/abs/2203.11926).
This repository contains the **12.5 Hz checkpoint** trained on **LibriTTS 960**, as described in the preprint.
- 📜 **Preprint**: https://arxiv.org/abs/2502.04465
- 🌐 **Project Page**: https://lucadellalib.github.io/focalcodec-web/
- 💾 **GitHub**: https://github.com/lucadellalib/focalcodec
<img src="focalcodec.png" width="700">
---------------------------------------------------------------------------------------------------------
## ▶️ Quickstart
See the README at: https://github.com/lucadellalib/focalcodec
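Before loading the checkpoint, it can help to see what a 12.5 Hz token rate means for 16 kHz audio. The sketch below is plain arithmetic and is not part of the FocalCodec API. The helper names are made up for illustration, and the codebook size in the usage example is a hypothetical placeholder; check the preprint for the actual value.

```python
import math

SAMPLE_RATE_HZ = 16_000  # from this model card
TOKEN_RATE_HZ = 12.5     # from this model card


def samples_per_token(sample_rate_hz=SAMPLE_RATE_HZ, token_rate_hz=TOKEN_RATE_HZ):
    """Number of waveform samples summarized by one token."""
    return sample_rate_hz / token_rate_hz


def approx_num_tokens(duration_s, token_rate_hz=TOKEN_RATE_HZ):
    """Approximate token count for a clip (the real codec may pad or trim edges)."""
    return math.ceil(duration_s * token_rate_hz)


def bitrate_bps(codebook_size, token_rate_hz=TOKEN_RATE_HZ):
    """Bitrate of a single-codebook codec: token rate times bits per token."""
    return token_rate_hz * math.log2(codebook_size)


if __name__ == "__main__":
    print(samples_per_token())     # samples covered by each token
    print(approx_num_tokens(10))   # tokens for a 10-second clip
    print(bitrate_bps(1024))       # 1024 is a hypothetical codebook size
```

Because there is a single codebook, the bitrate depends only on the token rate and the number of bits needed to index one codebook entry.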
---------------------------------------------------------------------------------------------------------
## @ Citing
```
@article{dellalibera2025focalcodec,
title = {{FocalCodec}: Low-Bitrate Speech Coding via Focal Modulation Networks},
author = {Luca {Della Libera} and Francesco Paissan and Cem Subakan and Mirco Ravanelli},
journal = {arXiv preprint arXiv:2502.04465},
year = {2025},
}
```
---------------------------------------------------------------------------------------------------------
## 📧 Contact
[[email protected]](mailto:[email protected])
---------------------------------------------------------------------------------------------------------