README.md · kashif/soundstream_mel_decoder at 5984cba9dc49bc6afbad9a19f8ec0085d88ac1a8

metadata

license: apache-2.0

A SoundStream decoder to reconstruct audio from a mel-spectrogram.

Overview

This model is a SoundStream decoder that inverts mel-spectrograms computed with the specific hyperparameters defined in the example below. It was trained on music data and used in the paper "Multi-instrument Music Synthesis with Spectrogram Diffusion" (ISMIR 2022).

A typical use case is to simplify music generation by predicting mel-spectrograms instead of a raw waveform, and then using this model to reconstruct the audio.
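To see why predicting mel-spectrograms can be simpler than predicting a waveform, the sketch below compares how many time steps a generator would have to produce in each case. The `MelConfig` values (sample rate, hop length, number of mel bins) are hypothetical placeholders chosen for illustration, not the hyperparameters this model expects, and the one-frame-per-hop count is an assumption that ignores padding conventions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MelConfig:
    # Hypothetical placeholder values for illustration only;
    # they are NOT taken from this model's documentation.
    sample_rate: int = 16_000  # audio samples per second
    hop_length: int = 320      # audio samples between consecutive frames
    n_mels: int = 128          # mel bins per frame


def waveform_length(seconds: float, cfg: MelConfig) -> int:
    """Number of raw audio samples covering `seconds` of audio."""
    return round(seconds * cfg.sample_rate)


def mel_shape(seconds: float, cfg: MelConfig) -> Tuple[int, int]:
    """(frames, mel bins) covering `seconds` of audio.

    Assumes one frame per hop, rounded up; real front ends may
    pad differently and produce a slightly different frame count.
    """
    n_samples = waveform_length(seconds, cfg)
    n_frames = -(-n_samples // cfg.hop_length)  # ceiling division
    return n_frames, cfg.n_mels


cfg = MelConfig()
frames, bins = mel_shape(5.0, cfg)
print("waveform time steps:", waveform_length(5.0, cfg))
print("mel-spectrogram time steps:", frames, "with", bins, "bins each")
```

Under these assumed settings the generator predicts a sequence that is `hop_length` times shorter than the waveform, and the decoder is responsible for filling in the sample-level detail.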

If you use it, please consider citing:

@article{zeghidour2021soundstream,
  title={Soundstream: An end-to-end neural audio codec},
  author={Zeghidour, Neil and Luebs, Alejandro and Omran, Ahmed and Skoglund, Jan and Tagliasacchi, Marco},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  volume={30},
  pages={495--507},
  year={2021},
  publisher={IEEE}
}

Example Use