Catalog-based single-channel speech-music separation
Author(s) -
Cemil Demir,
Ali Taylan Cemgil,
Murat Saraclar
Publication year - 2010
Language(s) - English
DOI - 10.5072/zenodo.33770
We propose a new catalog-based speech-music separation method for background music removal. Assuming that we know a catalog of the background music, we develop a generative model for the superposed speech and music spectrograms. We represent the speech spectrogram by a Non-negative Matrix Factorization (NMF) model and the music spectrogram by a conditional Poisson Mixture Model (PMM). By choosing the size of the catalog, i.e., the number of mixture components we can tradeoff speed versus accuracy. The combined hierarchical model leads to a mixture of multinomial distributions as the joint posterior of music and speech. Separation and hyperparameter adaptation can be achieved via an Expectation Maximization algorithm. Experimental results show that separation performance of the algorithm is promising. Furthermore, we show that incorporating prior information such as volume adjustment parameter can enhance the separation performance.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom