SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS

Emad M. Grais, Hakan Erdogan. 17th International Conference on Digital Signal Processing, 2011. Jain-De Lee.

Presentation Transcript


  1. SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS • Emad M. Grais, Hakan Erdogan • 17th International Conference on Digital Signal Processing, 2011 • Jain-De Lee

  2. Outline • INTRODUCTION • NON-NEGATIVE MATRIX FACTORIZATION • SIGNAL SEPARATION AND MASKING • EXPERIMENTS AND DISCUSSION • CONCLUSION

  3. Introduction • There are two main stages in this work • Training stage • Separation stage • NMF is used with different types of masks to improve the separation process • The separation process becomes faster • NMF needs fewer iterations

  4. Introduction • Problem formulation • We observe a signal x(t), which is the mixture of two sources, s(t) and m(t) • With X(t, f) the STFT of x(t), assume the sources have the same phase angle as the mixed signal, so the magnitude spectrograms satisfy X = S + M
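
The role of the same-phase assumption can be checked numerically. The sketch below is only an illustration with made-up random magnitudes and phases (none of the values come from the presentation): when both sources share the phase of the mixture in every time-frequency bin, the magnitudes add exactly; with independent phases, the triangle inequality only guarantees that the mixture magnitude does not exceed the sum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical magnitudes for speech (S) and music (M) on a small grid of
# time-frequency bins; the shape (4, 3) is arbitrary.
S_mag = rng.random((4, 3))
M_mag = rng.random((4, 3))

# Case 1: both sources have the same phase angle in every bin.
phase = rng.uniform(-np.pi, np.pi, size=(4, 3))
X_same = S_mag * np.exp(1j * phase) + M_mag * np.exp(1j * phase)
same_phase_ok = np.allclose(np.abs(X_same), S_mag + M_mag)

# Case 2: the music source gets its own, independent phase.
phase_m = rng.uniform(-np.pi, np.pi, size=(4, 3))
X_diff = S_mag * np.exp(1j * phase) + M_mag * np.exp(1j * phase_m)

# By the triangle inequality this gap is never negative; it is the error
# made by treating X = S + M as exact.
gap = (S_mag + M_mag) - np.abs(X_diff)
```

In practice the phases differ, so X = S + M is an approximation whose error is the `gap` above.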

  5. Non-negative Matrix Factorization • Non-negative matrix factorization algorithm • Minimization problem, subject to elements of B, W ≥ 0 • Different cost functions C of NMF • Euclidean distance • KL divergence

  6. Non-negative Matrix Factorization • Euclidean distance cost function • KL divergence cost function • Multiplicative Update Algorithm
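
The equations on this slide did not survive in the transcript. As a hedged illustration, the sketch below uses the standard Lee and Seung multiplicative update rules for the two cost functions named above; it is not necessarily the exact form shown in the presentation. The matrix name `V`, the function names, and all parameter defaults are assumptions made for the example.

```python
import numpy as np

def nmf(V, n_basis, n_iter=200, cost="kl", seed=0, eps=1e-12):
    """Approximate a non-negative matrix V (F x T) as B @ W with B, W >= 0.

    Uses the standard multiplicative updates; `eps` guards against
    division by zero and is an implementation detail of this sketch.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    B = rng.random((n_rows, n_basis)) + eps   # basis vectors (columns)
    W = rng.random((n_basis, n_cols)) + eps   # gains / weights
    ones = np.ones_like(V)
    for _ in range(n_iter):
        if cost == "euclidean":
            W *= (B.T @ V) / (B.T @ B @ W + eps)
            B *= (V @ W.T) / (B @ W @ W.T + eps)
        elif cost == "kl":
            W *= (B.T @ (V / (B @ W + eps))) / (B.T @ ones + eps)
            B *= ((V / (B @ W + eps)) @ W.T) / (ones @ W.T + eps)
        else:
            raise ValueError("cost must be 'euclidean' or 'kl'")
    return B, W

def euclidean_cost(V, V_hat):
    """Squared Euclidean (Frobenius) distance between V and its approximation."""
    return np.sum((V - V_hat) ** 2)

def kl_cost(V, V_hat, eps=1e-12):
    """Generalized KL divergence between V and its approximation."""
    return np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat)

# Usage on a random non-negative matrix (stand-in for a magnitude spectrogram).
V = np.random.default_rng(1).random((20, 30))
B, W = nmf(V, n_basis=5, n_iter=100, cost="kl")
```

Because the updates are multiplicative and start from positive values, B and W stay non-negative without any explicit projection step.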

  7. Non-negative Matrix Factorization • The magnitude spectrograms S and M are approximated by NMF • A larger number of basis vectors gives: • Lower approximation error • A redundant set of bases • Longer computation time

  8. Signal Separation and Masking • NMF is used to decompose the magnitude spectrogram matrix X • The initial spectrogram estimates for the speech and music signals are respectively calculated as follows, where WS and WM are submatrices of matrix W
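
The formulas for the initial estimates are missing from the transcript. A minimal sketch of one plausible reading follows, assuming the bases learned in the training stage (called `B_speech` and `B_music` here, both hypothetical names) are stacked side by side and held fixed, only the weights W are updated with the KL multiplicative rule, and each initial estimate is the product of a source's bases with its submatrix of W.

```python
import numpy as np

def initial_estimates(X, B_speech, B_music, n_iter=100, seed=0, eps=1e-12):
    """Decompose a mixture magnitude spectrogram X with fixed, pre-trained bases.

    Returns initial magnitude estimates for speech and music.
    Assumption: only the weights W are updated; the bases stay fixed.
    """
    B = np.hstack([B_speech, B_music])            # [B_speech  B_music]
    rng = np.random.default_rng(seed)
    W = rng.random((B.shape[1], X.shape[1])) + eps
    ones = np.ones_like(X)
    for _ in range(n_iter):
        # KL-divergence multiplicative update for W only.
        W *= (B.T @ (X / (B @ W + eps))) / (B.T @ ones + eps)
    k = B_speech.shape[1]
    W_S, W_M = W[:k], W[k:]                       # submatrices of W
    return B_speech @ W_S, B_music @ W_M
```

Since only W is updated, each iteration costs less than a full NMF step, and the estimates can be passed to a mask before the factorization has fully converged.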

  9. Signal Separation and Masking • Use the initial estimated spectrograms of speech and music to build a mask as follows • Source signal reconstruction, where 1 is a matrix of ones and the product is element-wise multiplication

  10. Signal Separation and Masking • Two specific values of p correspond to special masks • Wiener filter (soft mask) • Hard mask
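
The mask equation itself is not preserved in the transcript. The sketch below assumes the common ratio form H = S0^p / (S0^p + M0^p), where `S0` and `M0` are the initial speech and music estimates (names chosen for this example). Under that assumption, p = 2 gives a Wiener-type soft mask and the limit p → ∞ gives a hard (binary) mask; the tie-breaking rule for the hard mask is also an assumption.

```python
import numpy as np

def spectral_mask(S0, M0, p, eps=1e-12):
    """Mask for the speech source built from initial magnitude estimates.

    Assumed form: H = S0**p / (S0**p + M0**p). For p = inf the mask is
    binary: 1 where the speech estimate dominates, 0 elsewhere
    (ties go to speech in this sketch).
    """
    if np.isinf(p):
        return (S0 >= M0).astype(float)
    Sp = S0 ** p
    Mp = M0 ** p
    return Sp / (Sp + Mp + eps)

def apply_mask(X, H):
    """Split the mixture magnitude X into speech and music estimates."""
    S_hat = H * X             # element-wise multiplication
    M_hat = (1.0 - H) * X     # (1 - H) uses a matrix of ones via broadcasting
    return S_hat, M_hat

# Usage with small hand-made estimates.
S0 = np.array([[3.0, 1.0], [2.0, 0.5]])
M0 = np.array([[1.0, 3.0], [1.0, 2.0]])
X = S0 + M0
H_soft = spectral_mask(S0, M0, p=2)
H_hard = spectral_mask(S0, M0, p=np.inf)
S_hat, M_hat = apply_mask(X, H_soft)
```

Whatever the value of p, the two masked outputs always sum back to the mixture magnitude, so the mask redistributes energy between the sources instead of discarding it. To return to the time domain, the estimated magnitudes would be combined with the phase of the mixed signal, consistent with the same-phase assumption in the problem formulation.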

  11. Signal Separation and Masking The value of the mask versus the linear ratio for different values of p

  12. Experiments and Discussion • Simulation • 16 kHz sampling rate • Speech: 540 short utterances for training, 20 utterances for testing • Music: 38 pieces for training, 1 piece for testing • Hamming window: 512 points • FFT size: 512 points
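
For orientation, a magnitude spectrogram with the analysis settings listed above could be computed roughly as follows. This is a minimal NumPy sketch, not the authors' code; the hop size is not stated on the slide, so `HOP = 256` (50% overlap) is purely an assumption, as are the function and constant names.

```python
import numpy as np

FS = 16000     # sampling rate in Hz (from the slide)
WIN = 512      # Hamming window length in samples (from the slide)
N_FFT = 512    # FFT size (from the slide)
HOP = 256      # hop size: NOT given on the slide, assumed here

def stft(x, win=WIN, n_fft=N_FFT, hop=HOP):
    """Short-time Fourier transform of a 1-D signal with len(x) >= win.

    Returns a complex array of shape (n_fft // 2 + 1, n_frames).
    """
    window = np.hamming(win)
    n_frames = 1 + (len(x) - win) // hop
    frames = np.stack(
        [x[i * hop:i * hop + win] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, n=n_fft, axis=0)

# Usage: one second of a 1 kHz tone as a stand-in for real audio.
t = np.arange(FS) / FS
x = np.sin(2 * np.pi * 1000.0 * t)
X_complex = stft(x)
X_mag = np.abs(X_complex)        # magnitude spectrogram passed to NMF
X_phase = np.angle(X_complex)    # mixture phase, kept for reconstruction
```

With these settings each frequency bin is 16000 / 512 = 31.25 Hz wide, and the spectrogram has 257 frequency bins per frame.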

  13. Experiments and Discussion • Performance measurement of the separation

  14. Experiments and Discussion

  15. Experiments and Discussion

  16. Experiments and Discussion

  17. Conclusion • The family of masks has a parameter that controls the saturation level • The proposed algorithm gives better results and helps speed up the separation process
