Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone

Joseph Tabrikian
Dept. of Electrical and Computer Engineering, Ben-Gurion University of the Negev (BGU)
Workshop on Speech Enhancement and Multichannel Audio Processing, Technion, 22.2.2007

Outline
• Motivation
• Single source pitch estimation and tracking
• Multiple source pitch estimation and tracking
• Experiments
• Conclusions

Motivation
• Speech enhancement
• Sensitivity of many audio processing algorithms to interference. For example:
  • Automatic speech/speaker recognition
  • Speech/music compression
• Single microphone blind source separation (BSS)
• Karaoke

Single Source – Modeling
• Voiced frames – harmonic model with additive Gaussian noise:
• In matrix notation:
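The equations on this slide did not survive extraction. As a rough illustration of what a harmonic model in matrix notation can look like, the sketch below assumes a frame of the form x = A(f0) b + n, where the columns of A(f0) are cosines and sines at multiples of the pitch f0 and n is white Gaussian noise. The function names, the ordering of the columns, and the white-noise choice are assumptions made for illustration, not the notation of the talk.

```python
import numpy as np

def harmonic_matrix(f0, fs, n_samples, n_harmonics):
    """Return the N x 2K matrix with columns cos/sin at k*f0, k = 1..K."""
    n = np.arange(n_samples)[:, None]            # sample index (column vector)
    k = np.arange(1, n_harmonics + 1)[None, :]   # harmonic index (row vector)
    phase = 2.0 * np.pi * f0 * k * n / fs
    return np.hstack([np.cos(phase), np.sin(phase)])

def synth_voiced_frame(f0, fs, n_samples, amplitudes, noise_std, rng):
    """x = A(f0) b + n, with white Gaussian n (an assumed noise model)."""
    A = harmonic_matrix(f0, fs, n_samples, len(amplitudes) // 2)
    return A @ amplitudes + noise_std * rng.standard_normal(n_samples)

rng = np.random.default_rng(0)
# K = 3 harmonics: three cosine weights followed by three sine weights.
b = np.array([1.0, 0.5, 0.25, 0.0, 0.3, 0.1])
x = synth_voiced_frame(f0=150.0, fs=8000.0, n_samples=256,
                       amplitudes=b, noise_std=0.1, rng=rng)
```

The unknowns in such a model are the pitch f0, which enters nonlinearly through A, and the amplitude vector b, which enters linearly.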

Single Source – Pitch Tracking
• Maximum Likelihood (ML) estimator:
• Pitch tracking:
  • The data vector at the mth frame:
  • The pitch is modeled as a first-order Markov process:
• Maximum A-posteriori Probability (MAP) pitch tracking via the Viterbi algorithm (Tabrikian-Dubnov-Dickalov 2004)
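For a harmonic model with unknown linear amplitudes and white Gaussian noise, the ML pitch estimate is the candidate whose harmonic subspace captures the most energy of the frame. The sketch below is a minimal grid search under that assumption; the white-noise model, the grid, and all function names are illustrative and are not taken from the slides.

```python
import numpy as np

def harmonic_matrix(f0, fs, n_samples, n_harmonics):
    """N x 2K matrix with columns cos/sin at k*f0, k = 1..K."""
    n = np.arange(n_samples)[:, None]
    k = np.arange(1, n_harmonics + 1)[None, :]
    phase = 2.0 * np.pi * f0 * k * n / fs
    return np.hstack([np.cos(phase), np.sin(phase)])

def projected_energy(x, fs, f0_grid, n_harmonics):
    """||P_A(f0) x||^2 for each candidate pitch on the grid."""
    scores = np.empty(len(f0_grid))
    for i, f0 in enumerate(f0_grid):
        A = harmonic_matrix(f0, fs, len(x), n_harmonics)
        Q, _ = np.linalg.qr(A)              # orthonormal basis of span(A)
        scores[i] = np.sum((Q.T @ x) ** 2)  # energy inside the subspace
    return scores

def ml_pitch(x, fs, f0_grid, n_harmonics):
    """Grid-search ML pitch estimate (white-noise assumption)."""
    return f0_grid[np.argmax(projected_energy(x, fs, f0_grid, n_harmonics))]

# Synthetic check: a 140 Hz frame with 4 harmonics and mild noise.
rng = np.random.default_rng(0)
fs, n_samples, n_harm = 8000.0, 512, 4
b = np.array([1.0, 0.6, 0.4, 0.3, 0.2, 0.1, 0.1, 0.05])
x = harmonic_matrix(140.0, fs, n_samples, n_harm) @ b
x = x + 0.05 * rng.standard_normal(n_samples)
f0_grid = np.arange(80.0, 301.0, 1.0)
f0_hat = ml_pitch(x, fs, f0_grid, n_harm)
```

The per-frame scores returned by `projected_energy` are proportional to a log-likelihood over pitch candidates, which is the kind of quantity a tracker can combine across frames.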

Single Source – Voicing Decision
• Unvoiced model
• Colored Gaussian noise model:
• Voiced/unvoiced decision by the Generalized Likelihood Ratio Test (GLRT): (Fisher-Tabrikian-Dubnov 2006)
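To make the idea of a GLRT voicing decision concrete, here is a simplified sketch. It assumes white Gaussian noise of unknown variance rather than the colored noise model named on the slide, so it is a stand-in and not the detector of the cited work. Under that simplification the log generalized likelihood ratio is (N/2) log(||x||^2 / ||x - P x||^2), maximized over the pitch grid. The threshold value is hypothetical.

```python
import numpy as np

def harmonic_matrix(f0, fs, n_samples, n_harmonics):
    """N x 2K matrix with columns cos/sin at k*f0, k = 1..K."""
    n = np.arange(n_samples)[:, None]
    k = np.arange(1, n_harmonics + 1)[None, :]
    phase = 2.0 * np.pi * f0 * k * n / fs
    return np.hstack([np.cos(phase), np.sin(phase)])

def glrt_statistic(x, fs, f0_grid, n_harmonics):
    """Log-GLR of 'harmonic + white noise' against 'white noise only'."""
    total = np.sum(x ** 2)
    best_proj = 0.0
    for f0 in f0_grid:
        Q, _ = np.linalg.qr(harmonic_matrix(f0, fs, len(x), n_harmonics))
        best_proj = max(best_proj, float(np.sum((Q.T @ x) ** 2)))
    # Guard against a zero residual for a perfectly harmonic frame.
    residual = max(total - best_proj, np.finfo(float).tiny)
    return 0.5 * len(x) * np.log(total / residual)

def is_voiced(x, fs, f0_grid, n_harmonics, threshold):
    """Declare the frame voiced when the statistic exceeds the threshold."""
    return glrt_statistic(x, fs, f0_grid, n_harmonics) > threshold

rng = np.random.default_rng(1)
fs, n_samples, n_harm = 8000.0, 512, 4
f0_grid = np.arange(80.0, 301.0, 1.0)
b = np.array([1.0, 0.6, 0.4, 0.3, 0.2, 0.1, 0.1, 0.05])
noise = 0.1 * rng.standard_normal(n_samples)
voiced = harmonic_matrix(140.0, fs, n_samples, n_harm) @ b + noise
stat_voiced = glrt_statistic(voiced, fs, f0_grid, n_harm)
stat_noise = glrt_statistic(noise, fs, f0_grid, n_harm)
```

In practice the threshold would be set from a target false-alarm rate; the value used to separate the two synthetic frames here carries no special meaning.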

Multiple Sources
• ML estimator under a model with unknown signal and unknown (Gaussian) noise covariance: (Harmanci-Tabrikian-Krolik 2000)

Multiple Sources
• Voiced model: v includes other interferences; its covariance is unknown.
• Using J overlapping subframes of size Ls (2K+1 < J < Ls), the jth column of the data matrix:
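The subframe construction can be sketched as follows. The code arranges J overlapping subframes of length Ls as the columns of an Ls x J matrix and checks the size condition 2K+1 < J < Ls quoted on the slide. The hop of one sample between consecutive subframes and the function names are assumptions, since the slide's formula for the jth column was lost.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sizes_ok(n_harmonics, n_subframes, subframe_len):
    """Check the condition 2K+1 < J < Ls from the slide."""
    return 2 * n_harmonics + 1 < n_subframes < subframe_len

def subframe_matrix(frame, n_subframes, subframe_len):
    """Ls x J matrix whose column j is frame[j : j + Ls] (hop of 1, assumed)."""
    frame = np.asarray(frame, dtype=float)
    needed = subframe_len + n_subframes - 1
    if len(frame) < needed:
        raise ValueError(f"frame needs at least {needed} samples")
    windows = sliding_window_view(frame, subframe_len)  # (len - Ls + 1, Ls)
    return windows[:n_subframes].T.copy()

# Toy example: K = 1 harmonic, J = 4 subframes of length Ls = 6.
frame = np.arange(10.0)
X = subframe_matrix(frame, n_subframes=4, subframe_len=6)
```

Each column is the previous one shifted by one sample, so a harmonic component contributes a low-rank structure across the columns of the matrix.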

Multiple Sources
• Pitch tracking:
  • The data vector at the mth frame:
  • The pitch is modeled as a first-order Markov process
⇒ Maximum A-posteriori Probability (MAP) pitch tracking via the Viterbi algorithm
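A minimal sketch of MAP pitch tracking with the Viterbi algorithm is given below. It takes a table of per-frame log-likelihoods over a pitch grid and a first-order Markov prior on the pitch. The Gaussian random-walk transition, the uniform prior on the first frame, and the parameter `sigma_hz` are assumptions chosen for illustration; the slides do not specify the transition density.

```python
import numpy as np

def viterbi_pitch_track(log_lik, f0_grid, sigma_hz):
    """MAP pitch path for log_lik of shape (frames, candidates)."""
    n_frames, n_cand = log_lik.shape
    # log p(f_m | f_{m-1}): Gaussian random walk in Hz, rows index the previous pitch.
    diff = f0_grid[:, None] - f0_grid[None, :]
    log_trans = -0.5 * (diff / sigma_hz) ** 2
    log_trans -= np.log(np.sum(np.exp(log_trans), axis=1, keepdims=True))
    score = log_lik[0].copy()                 # uniform prior on the first frame
    backptr = np.zeros((n_frames, n_cand), dtype=int)
    for m in range(1, n_frames):
        cand = score[:, None] + log_trans     # indexed [previous, current]
        backptr[m] = np.argmax(cand, axis=0)  # best predecessor per current state
        score = np.max(cand, axis=0) + log_lik[m]
    # Backtrack from the best final state.
    path = np.empty(n_frames, dtype=int)
    path[-1] = np.argmax(score)
    for m in range(n_frames - 1, 0, -1):
        path[m - 1] = backptr[m, path[m]]
    return f0_grid[path]

# Toy likelihoods: a smooth track plus one spurious, slightly higher peak.
f0_grid = np.arange(100.0, 200.0, 5.0)
true_idx = [5, 5, 6, 6, 7]
log_lik = np.zeros((5, len(f0_grid)))
for m, i in enumerate(true_idx):
    log_lik[m, i] = 10.0
log_lik[2, 18] = 12.0                         # outlier at 190 Hz in frame 2
track = viterbi_pitch_track(log_lik, f0_grid, sigma_hz=10.0)
framewise = f0_grid[np.argmax(log_lik, axis=1)]
```

In the toy example the frame-by-frame maximum jumps to the spurious peak, while the tracked path stays on the smooth contour because the transition prior penalizes large pitch jumps.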

Multiple Sources – Voicing Decision
• Unvoiced model
• Colored Gaussian noise model:
• Voiced/unvoiced decision by the GLRT: (Fisher-Tabrikian-Dubnov 2007)

Multiple Source Models
• Exact ML for the strongest voiced signal, and “locally ML” for other voiced signals
Likelihood function

Experiments – Single Source

Experiments – Two Sources

Experiments – Voicing Decision

Experiments – Voicing Decision (cont.)

Conclusions
• ML pitch estimators for single and multiple sources have been developed under the harmonic model for voiced frames.
• The derived likelihood functions under the two models allow implementation of the Viterbi algorithm for MAP pitch tracking.
• The GLRT for voicing decision is derived under the two models.
• Future work:
  • Development of multiple hypothesis tracking methods for single microphone BSS.
  • Adaptive estimation of the number of harmonics.
