Sampling-frequency-independent (SFI) Conv-TasNet trained with the MUSDB18-HQ dataset for music source separation

This model was proposed in our IEEE/ACM Trans. ASLP paper. It uses sampling-frequency-independent convolutional layers built with the time-domain filter design method, and its latent analog filter is a multiphase gammatone filter. It was trained by Tomohiko Nakamura using the codebase. Although it was trained on 32 kHz-sampled data, it also works well at sampling frequencies not seen during training (e.g., 8 and 16 kHz).
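The idea behind the time-domain filter design can be illustrated with a minimal sketch: discrete-time convolution weights are obtained by sampling a continuous-time (analog) filter at the input sampling frequency, so the same analog filter yields weights for any sampling frequency. The code below is not the released implementation. The function names, the parameter values (center frequency, bandwidth, order, duration), and the assumption that the filter spans a fixed duration in seconds are all hypothetical, and a plain gammatone impulse response with a single phase stands in for the multiphase gammatone filter.

```python
import math


def gammatone(t, center_hz, bandwidth_hz, order=2, phase=0.0):
    """Analog gammatone impulse response evaluated at time t (in seconds)."""
    if t < 0:
        return 0.0
    envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth_hz * t)
    return envelope * math.cos(2.0 * math.pi * center_hz * t + phase)


def sample_filter(sample_rate_hz, duration_s=0.002,
                  center_hz=1000.0, bandwidth_hz=200.0):
    """Sample the analog filter at times n / sample_rate_hz (hypothetical helper)."""
    num_taps = int(round(duration_s * sample_rate_hz))
    return [
        gammatone(n / sample_rate_hz, center_hz, bandwidth_hz)
        for n in range(num_taps)
    ]


# The same analog filter gives weights for the training rate and for unseen rates.
weights_32k = sample_filter(32000)
weights_16k = sample_filter(16000)
weights_8k = sample_filter(8000)
print(len(weights_32k), len(weights_16k), len(weights_8k))
```

In this sketch the number of taps scales with the sampling frequency while the analog parameters stay fixed, which is why no retraining is needed when the sampling frequency changes.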

License

MIT

Citation

Please cite the following paper.

@article{KSaito2022IEEEACMTASLP,
 author={Saito, Koichi and Nakamura, Tomohiko and Yatabe, Kohei and Saruwatari, Hiroshi},
 journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
 title = {Sampling-frequency-independent convolutional layer and its application to audio source separation},
 year=2022,
 month=sep,
 volume=30,
 pages={2928--2943},
 doi={10.1109/TASLP.2022.3203907},
}

Contents

Four trained models (seeds 40, 42, 44, and 47)
Evaluation results (JSON files obtained with the museval library)
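As a rough sketch of how the evaluation results might be summarized, the snippet below computes the median SDR per target from one result in museval's JSON form. The layout it assumes (a top-level "targets" list whose entries hold a "name" and a list of "frames", each with a "metrics" dictionary) should be checked against the actual files. The function name and the inline example values are hypothetical and are not results from this model.

```python
import json
import statistics


def median_sdr(museval_result):
    """Median SDR over frames for each target, skipping missing values."""
    scores = {}
    for target in museval_result["targets"]:
        values = []
        for frame in target["frames"]:
            sdr = frame["metrics"]["SDR"]
            # Skip null entries and NaN (NaN is the only value unequal to itself).
            if sdr is not None and sdr == sdr:
                values.append(sdr)
        scores[target["name"]] = statistics.median(values) if values else None
    return scores


# Made-up example data in the assumed layout, not actual evaluation results.
example = json.loads("""
{
  "targets": [
    {"name": "vocals", "frames": [
      {"time": 0.0, "duration": 1.0, "metrics": {"SDR": 5.0}},
      {"time": 1.0, "duration": 1.0, "metrics": {"SDR": 7.0}},
      {"time": 2.0, "duration": 1.0, "metrics": {"SDR": null}}
    ]},
    {"name": "drums", "frames": [
      {"time": 0.0, "duration": 1.0, "metrics": {"SDR": 4.0}}
    ]}
  ]
}
""")

scores = median_sdr(example)
print(scores)
```

To use it on a real file, load the file with `json.load` and pass the resulting dictionary to `median_sdr`.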
