Open Access

Catalog-based single-channel speech-music separation

Author(s) - Cemil Demir, Ali Taylan Cemgil, Murat Saraclar

Publication year - 2010

Language(s) - English

DOI - 10.5072/zenodo.33770

We propose a new catalog-based speech-music separation method for background music removal. Assuming that a catalog of the background music is known, we develop a generative model for the superposed speech and music spectrograms. We represent the speech spectrogram by a Non-negative Matrix Factorization (NMF) model and the music spectrogram by a conditional Poisson Mixture Model (PMM). By choosing the size of the catalog, i.e., the number of mixture components, we can trade off speed against accuracy. The combined hierarchical model leads to a mixture of multinomial distributions as the joint posterior of music and speech. Separation and hyperparameter adaptation can be achieved via an Expectation Maximization algorithm. Experimental results show that the separation performance of the algorithm is promising. Furthermore, we show that incorporating prior information, such as a volume adjustment parameter, can enhance the separation performance.
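To make the structure of the model concrete, the following is a minimal Python sketch of how a single mixture frame might be split once a speech intensity and a music catalog are available. It is not the authors' implementation: the function name `posterior_split`, the parameters `speech_rate`, `catalog`, and `gain`, the uniform prior over catalog entries, and the per-frame treatment are all assumptions made for illustration, and the EM updates for the NMF factors and hyperparameters are left out. The sketch relies only on a standard property of Poisson models: when two independent Poisson sources are summed, each source's share of the observed total follows a binomial (more generally, multinomial) distribution, so averaging over catalog entries gives a mixture of such allocations.

```python
import math


def posterior_split(x, speech_rate, catalog, gain=1.0, eps=1e-12):
    """Split one mixture frame into expected speech under a Poisson model.

    Hypothetical helper, for illustration only.
    x           : observed non-negative magnitudes, one per frequency bin
    speech_rate : assumed speech intensity per bin (e.g. from an NMF model)
    catalog     : list of candidate music frames, each with one value per bin
    gain        : assumed volume adjustment applied to the catalog entries
    """
    # Poisson log-likelihood of the frame under each catalog entry.
    # The log(x!) term is the same for every entry, so it is dropped.
    log_liks = []
    for entry in catalog:
        ll = 0.0
        for x_f, s_f, c_f in zip(x, speech_rate, entry):
            total = s_f + gain * c_f + eps
            ll += x_f * math.log(total) - total
        log_liks.append(ll)

    # Responsibilities over catalog entries (uniform prior assumed),
    # normalised in a numerically stable way.
    peak = max(log_liks)
    weights = [math.exp(ll - peak) for ll in log_liks]
    norm = sum(weights)
    resp = [w / norm for w in weights]

    # Given an entry, the expected speech share of bin f is
    # speech_rate / (speech_rate + gain * music); average over entries.
    speech_est = []
    for f, (x_f, s_f) in enumerate(zip(x, speech_rate)):
        share = sum(
            r * s_f / (s_f + gain * entry[f] + eps)
            for r, entry in zip(resp, catalog)
        )
        speech_est.append(x_f * share)
    return resp, speech_est
```

In this sketch, a larger catalog means more entries to score per frame, which illustrates the speed versus accuracy trade-off mentioned in the abstract; the `gain` argument stands in for the volume adjustment parameter.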
