Sampling-frequency-independent (SFI) Conv-TasNet trained with the MUSDB18-HQ dataset for music source separation

This model was proposed in our IEEE/ACM Trans. ASLP paper and works well with untrained sampling frequencies by using sampling-frequency-independent convolutional layers with the time domain filter design. The latent analog filter is a multiphase gammatone filter. It was trained by Tomohiko Nakamura using the codebase). This model was trained with 32 kHz-sampled data but works well with untrained sampling frequencies (e.g., 8, 16 kHz).

License

MIT

Citation

Please cite the following paper.

@article{KSaito2022IEEEACMTASLP,
 author={Saito, Koichi and Nakamura, Tomohiko and Yatabe, Kohei and Saruwatari, Hiroshi},
 journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
 title = {Sampling-frequency-independent convolutional layer and its application to audio source separation},
 year=2022,
 month=sep,
 volume=30,
 pages={2928--2943},
 doi={10.1109/TASLP.2022.3203907},
}

Contents

  • Four trained models (seed=40,42,44,47)
  • Evaluation results (json files obtained with the museval library)
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.