Report of Work on Formant Tracking LP Models and Plans on Integration with Harmonic Plus Noise Model. Qin Yan Communication & Multimedia Signal Processing Group Dept of Electronic & Computer Engineering, Brunel University 14 Feb, 2005.
Outline
• Parallel formant synthesizer vs cascade formant synthesizer
• MMSE based pre-cleaning vs LPSS based pre-cleaning for formant tracking
• Plan of integration with Harmonic Noise Model (HNM)
Parallel Formant Synthesiser I
Figure - Klatt synthesizer
• Weakness: zeros (troughs) appear in the overall response of the synthesizer, and it is hard to tune and control.
• Strength: individual gain Mi for each formant Fi.
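Parallel Formant Synthesiser II
• An iterative optimization process is employed to control the magnitudes of the formants.
• Note: Mi is different from Moi. As labelled in the figure, Mi is the gain of the individual filter Hi, Moi is the magnitude of formant i in the original overall response H, and Mmodoi is the magnitude of formant i in the optimized response Hmod.
• Threshold: |Mmodoi – Moi| < 0.5 dB.
Figure - Original Freq Response H, Iterative Optimized Freq Response Hmod, and Individual Filter Freq Response Hi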
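The iterative gain adjustment described above can be sketched as follows. This is a minimal illustration, not the procedure used in the report: the second-order resonator form, the 8 kHz sampling rate, the multiplicative update rule, and all function names are assumptions; only the 0.5 dB stopping threshold comes from the slide.

```python
import numpy as np

FS = 8000.0  # sampling rate in Hz (assumed, not stated in the slides)


def resonator_response(freq_hz, formant_hz, bandwidth_hz, fs=FS):
    """Complex response of a two-pole resonator with unity gain at DC."""
    r = np.exp(-np.pi * bandwidth_hz / fs)      # pole radius from bandwidth
    theta = 2.0 * np.pi * formant_hz / fs       # pole angle from formant frequency
    a1, a2 = -2.0 * r * np.cos(theta), r * r
    b0 = 1.0 + a1 + a2                          # normalises the gain at 0 Hz to 1
    z1 = np.exp(-1j * 2.0 * np.pi * np.asarray(freq_hz, dtype=float) / fs)
    return b0 / (1.0 + a1 * z1 + a2 * z1 ** 2)


def parallel_response(freq_hz, formants, bandwidths, gains):
    """Overall response of the parallel bank: sum of Mi * Hi."""
    return sum(m * resonator_response(freq_hz, f, b)
               for f, b, m in zip(formants, bandwidths, gains))


def fit_parallel_gains(formants, bandwidths, target_db, tol_db=0.5, max_iter=50):
    """Adjust the individual gains Mi until the overall magnitude at each
    formant is within tol_db of the target magnitude Moi."""
    formants = np.asarray(formants, dtype=float)
    target_db = np.asarray(target_db, dtype=float)
    gains = np.ones_like(formants)

    def error_db():
        overall = parallel_response(formants, formants, bandwidths, gains)
        return target_db - 20.0 * np.log10(np.abs(overall))

    for _ in range(max_iter):
        err = error_db()
        if np.max(np.abs(err)) < tol_db:
            break
        gains *= 10.0 ** (err / 20.0)           # scale each Mi by its own dB error
    return gains, error_db()


# Illustrative values only (not taken from the report).
gains, final_error_db = fit_parallel_gains(
    formants=[500.0, 1500.0, 2500.0],
    bandwidths=[60.0, 90.0, 120.0],
    target_db=[20.0, 14.0, 8.0],
)
```

Because each resonator also contributes at the other formant frequencies, the fitted Mi generally differ from the overall magnitudes Moi, which is the distinction the slide draws.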
Cascade Formant Synthesizer with Adjusted Formant Magnitudes
• Weakness: only one gain term M for all formants, so it is hard to adjust the magnitude of an individual formant.
• Strength: the overall response is always an all-pole filter, even after modifications. No zeros or troughs.
• The magnitude of an individual formant can only be adjusted by modifying its bandwidth, so an iterative optimization is required to find the filter parameters that give the required changes. E.g. decreasing Bi increases Mi; increasing Bi decreases Mi.
Figure - Performance of cascade formant synthesizer with adjusted formant magnitude
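The bandwidth-based adjustment can be illustrated with a short sketch. It is an assumption-laden example rather than the report's algorithm: the resonator form, the 8 kHz sampling rate, the update rule (scale Bi by the dB error at formant i), the bandwidth limits, and the function names are all hypothetical. The 0.5 dB tolerance is borrowed from the parallel-synthesizer slide.

```python
import numpy as np

FS = 8000.0  # sampling rate in Hz (assumed)


def resonator_db(freq_hz, formant_hz, bandwidth_hz, fs=FS):
    """Magnitude in dB of a two-pole resonator with unity gain at DC."""
    r = np.exp(-np.pi * bandwidth_hz / fs)
    theta = 2.0 * np.pi * formant_hz / fs
    a1, a2 = -2.0 * r * np.cos(theta), r * r
    z1 = np.exp(-1j * 2.0 * np.pi * np.asarray(freq_hz, dtype=float) / fs)
    return 20.0 * np.log10(np.abs((1.0 + a1 + a2) / (1.0 + a1 * z1 + a2 * z1 ** 2)))


def cascade_db(freq_hz, formants, bandwidths):
    """Cascade response in dB: the resonator magnitudes add in the log domain."""
    return sum(resonator_db(freq_hz, f, b) for f, b in zip(formants, bandwidths))


def fit_cascade_bandwidths(formants, bandwidths, target_db, tol_db=0.5, max_iter=50):
    """Narrow or widen each Bi until the cascade magnitude at Fi is within
    tol_db of the target. A positive error (too quiet) narrows the bandwidth."""
    formants = np.asarray(formants, dtype=float)
    bandwidths = np.array(bandwidths, dtype=float)
    target_db = np.asarray(target_db, dtype=float)
    for _ in range(max_iter):
        err = target_db - cascade_db(formants, formants, bandwidths)
        if np.max(np.abs(err)) < tol_db:
            break
        # Peak gain is roughly inversely proportional to bandwidth.
        bandwidths = np.clip(bandwidths * 10.0 ** (-err / 20.0), 10.0, 1000.0)
    return bandwidths, target_db - cascade_db(formants, formants, bandwidths)


# Illustrative values only: raise F1 by 3 dB, lower F2 by 2 dB, keep F3.
F = np.array([500.0, 1500.0, 2500.0])
B = np.array([60.0, 90.0, 120.0])
target = cascade_db(F, F, B) + np.array([3.0, -2.0, 0.0])
new_B, final_error_db = fit_cascade_bandwidths(F, B, target)
```

The response stays all-pole throughout, since only the pole radii change; the cost is that a magnitude change is always coupled to a bandwidth change.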
MMSE based Pre-cleaning I
Figure - Performance comparison of LPSS and MMSE on speech in car noise.
• MMSE gives better performance in both segmental and global SNR compared with LPSS.
• NOTE: In both cases SNR is calculated in the FFT domain rather than the LP domain.
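For reference, global and segmental SNR computed from FFT bins can be sketched as below. The frame length, the non-overlapping rectangular frames, and the clamping of per-frame SNR to [-10, 35] dB are common conventions assumed here; the slides do not state which settings were used, and the function names are hypothetical.

```python
import numpy as np


def frame_spectra(x, frame_len=256):
    """Split x into non-overlapping frames and return their one-sided FFTs."""
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.fft.rfft(frames, axis=1)


def global_and_segmental_snr(clean, processed, frame_len=256,
                             floor_db=-10.0, ceil_db=35.0):
    """Return (global SNR, segmental SNR) in dB, measured on FFT bins."""
    clean_spec = frame_spectra(clean, frame_len)
    error_spec = clean_spec - frame_spectra(processed, frame_len)
    signal_energy = np.sum(np.abs(clean_spec) ** 2, axis=1)   # per frame
    error_energy = np.sum(np.abs(error_spec) ** 2, axis=1)    # per frame
    eps = 1e-12                                               # avoids log(0)
    global_snr = 10.0 * np.log10(signal_energy.sum() / (error_energy.sum() + eps))
    per_frame = 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))
    segmental_snr = np.clip(per_frame, floor_db, ceil_db).mean()
    return global_snr, segmental_snr


# Synthetic stand-in for speech: a sine wave plus white noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2.0 * np.pi * 440.0 * t)
noisy = clean + 0.1 * rng.standard_normal(len(clean))
global_snr_db, segmental_snr_db = global_and_segmental_snr(clean, noisy)
```

Global SNR is dominated by high-energy frames, while segmental SNR averages per-frame values and so weights quiet frames equally, which is why both are reported.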
MMSE based Pre-cleaning II
Figure - Average % error of formant tracks of speech in train noise and of cleaned speech using spectral subtraction and Kalman filters. Results were averaged over five male speakers.
• MMSE performs better than LPSS for all formants.
• MMSE+Kalman performs better than LPSS+Kalman for the lower formants but not for the higher formants.
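One plausible way to compute an average percentage error per formant track is sketched below. The exact definition used for the figure is not given in the slides, so the formula (mean absolute deviation relative to the reference track) and the function name are assumptions, and the numbers are made up for illustration.

```python
import numpy as np


def average_percent_error(reference_hz, estimated_hz):
    """Mean of |estimate - reference| / reference over frames, in percent.

    Both inputs have shape (n_frames, n_formants); one value is returned
    per formant.
    """
    reference_hz = np.asarray(reference_hz, dtype=float)
    estimated_hz = np.asarray(estimated_hz, dtype=float)
    relative_error = np.abs(estimated_hz - reference_hz) / reference_hz
    return 100.0 * relative_error.mean(axis=0)


# Made-up tracks: two frames, three formants (F1, F2, F3) in Hz.
reference = np.array([[500.0, 1500.0, 2500.0],
                      [600.0, 1400.0, 2400.0]])
estimated = reference * np.array([1.10, 1.00, 0.95])
percent_error = average_percent_error(reference, estimated)
```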
Future Work
• Cleaning of the speech excitation: using the harmonic and noise model (HNM) to model the speech excitation.
• HNM based clean speech synthesizer.
• Pitch tracking in noise conditions.
• Maximum voiced frequency estimation.
• HNM based speech/excitation enhancement.