Refinement in FTLP-HNM system for Speech Enhancement. Qin Yan Communication & Multimedia Signal Processing Group School of Engineering and Design, Brunel University 23 Nov, 2005. Review of FTLP-HNM system; Parameters estimation of HNM (incl. pitch/harmonic tracking in noise)
Refinement in FTLP-HNM system for Speech Enhancement
Qin Yan
Communication & Multimedia Signal Processing Group
School of Engineering and Design, Brunel University
23 Nov, 2005
Outline
• Review of FTLP-HNM system
• Parameter estimation of HNM (incl. pitch/harmonic tracking in noise)
• Objective results of pitch tracking, harmonic tracking and the FTLP-HNM system
• Demo of enhanced speech from old archive recordings
Harmonic plus Noise Model
In HNM, speech is decomposed into two parts: a harmonic part and a noise part.
Harmonic: [equation], where L(t) denotes the number of harmonics included in the harmonic part and ω0 denotes the pitch frequency.
Noise: [equation], where h is a time-varying autoregressive (AR) model and b is white Gaussian noise.
Synthesized speech: [equation]
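As a rough illustration of the decomposition above, the following Python sketch synthesizes one frame as a sum of harmonics of the pitch frequency plus white Gaussian noise shaped by an all-pole (AR) filter. It is a minimal sketch, not the presenter's implementation: the function names, the cosine form of the harmonic sum, and the simplification that all parameters stay constant within the frame (the slide describes them as time-varying) are assumptions made for illustration.

```python
import math
import random


def synth_harmonic(amps, phases, f0, fs, n_samples):
    """Harmonic part: sum over l of A_l * cos(2*pi*l*f0*n/fs + phi_l).

    amps[i] and phases[i] belong to harmonic number i + 1 (assumed layout).
    """
    out = []
    for n in range(n_samples):
        t = n / fs
        out.append(sum(a * math.cos(2.0 * math.pi * (i + 1) * f0 * t + p)
                       for i, (a, p) in enumerate(zip(amps, phases))))
    return out


def synth_noise(ar_coeffs, gain, n_samples, rng):
    """Noise part: white Gaussian noise b(n) passed through an AR filter.

    y[n] = gain * b[n] - sum_k a_k * y[n - k], with ar_coeffs = [a_1, ..., a_p].
    """
    y = []
    for n in range(n_samples):
        acc = gain * rng.gauss(0.0, 1.0)
        for k, a in enumerate(ar_coeffs, start=1):
            if n - k >= 0:
                acc -= a * y[n - k]
        y.append(acc)
    return y


def synth_hnm(amps, phases, f0, ar_coeffs, gain, fs, n_samples, seed=0):
    """Synthesized frame = harmonic part + noise part."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    harmonic = synth_harmonic(amps, phases, f0, fs, n_samples)
    noise = synth_noise(ar_coeffs, gain, n_samples, rng)
    return [h + v for h, v in zip(harmonic, noise)]


# Example: two harmonics of a 100 Hz pitch at 8 kHz, with a mild noise floor.
frame = synth_hnm([1.0, 0.5], [0.0, 0.0], 100.0, [-0.5], 0.05, 8000.0, 80)
```

Setting `gain` to zero leaves only the harmonic part, which is a convenient way to check the two components separately.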
HNM - Pitch Tracking
• Error function in the frequency domain: [equation]
• In noisy conditions the error function is modified to include SNR-dependent weights: [equation]
• The weighting function W(l) is SNR-dependent and is given by: [equation]
NOTE:
• The input speech frame is bandpass filtered to eliminate the parts which do not contain explicit harmonics.
• For each speech frame, the tracker outputs several pitch candidates (N=3), and the Viterbi algorithm then generates the final pitch track.
• It might be useful to combine candidates from this method with those from the traditional autocorrelation method.
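The candidate-selection step in the notes above can be sketched as a small Viterbi search. This is a minimal sketch under stated assumptions: each frame supplies a few pitch candidates with a local cost (for instance the value of the error function), and the transition cost, the absolute log2 ratio between consecutive pitch values scaled by a hypothetical `jump_weight`, is an illustrative choice rather than the cost used in the presented system.

```python
import math


def viterbi_pitch(candidates, costs, jump_weight=1.0):
    """Pick one pitch candidate per frame by minimizing the total path cost.

    candidates[t][i]: i-th pitch candidate (Hz) in frame t.
    costs[t][i]:      local cost of that candidate (lower is better).
    """
    total = [list(costs[0])]               # best accumulated cost per candidate
    back = [[-1] * len(candidates[0])]     # back-pointers to the previous frame
    for t in range(1, len(candidates)):
        row_total, row_back = [], []
        for j, fj in enumerate(candidates[t]):
            best_i, best = 0, float("inf")
            for i, fi in enumerate(candidates[t - 1]):
                # Penalize large pitch jumps, measured in octaves.
                c = total[t - 1][i] + jump_weight * abs(math.log2(fj / fi))
                if c < best:
                    best, best_i = c, i
            row_total.append(best + costs[t][j])
            row_back.append(best_i)
        total.append(row_total)
        back.append(row_back)

    # Backtrack from the cheapest candidate in the last frame.
    j = min(range(len(total[-1])), key=total[-1].__getitem__)
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][j])
        j = back[t][j]
    path.reverse()
    return path


# Three frames, N=3 candidates each. In the middle frame the locally cheapest
# candidate (204 Hz) is an octave error relative to its neighbours.
cands = [[100.0, 200.0, 50.0], [102.0, 204.0, 51.0], [101.0, 202.0, 300.0]]
local = [[0.1, 0.4, 0.5], [0.5, 0.2, 0.6], [0.1, 0.4, 0.5]]
track = viterbi_pitch(cands, local)
```

Picking the cheapest candidate frame by frame would select 204 Hz in the middle frame; the path cost makes the smoother track through 102 Hz cheaper overall.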
Results of Pitch Tracking
Figure - Comparison of the performance of different pitch tracking methods for speech in (a) train noise and (b) car noise, from 0 dB SNR to clean.
HNM - Harmonic Tracking
[Block diagram; block labels: Noise model, VAD, Smoothed Harmonic Magnitude by Kalman filter, Pitch Tracking, Noise Speech, Harmonic Track Candidates, Harmonic Frequency bin tracks, Peak picking, Tracking, FFT]
• The data structure of the harmonic track candidates has been improved, which speeds up the whole system.
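Two of the blocks named in the diagram, peak picking and Kalman smoothing of harmonic magnitudes, can be sketched as follows. This is a minimal sketch, not the presented system: the function names, the search window around the expected harmonic bin, and the scalar random-walk state model with hypothetical variances `process_var` and `obs_var` are all assumptions made for illustration.

```python
def pick_harmonic_peak(magnitudes, center_bin, half_width):
    """Return the FFT bin with the largest magnitude near an expected harmonic.

    The search window is [center_bin - half_width, center_bin + half_width],
    clipped to the valid bin range.
    """
    lo = max(0, center_bin - half_width)
    hi = min(len(magnitudes) - 1, center_bin + half_width)
    return max(range(lo, hi + 1), key=magnitudes.__getitem__)


def kalman_filter_track(observations, process_var=1e-3, obs_var=1e-1):
    """Smooth one harmonic magnitude track with a scalar Kalman filter.

    Assumed model: the true magnitude follows a random walk with variance
    process_var, and each frame observes it with noise variance obs_var.
    """
    x = observations[0]   # state estimate
    p = obs_var           # estimate variance
    smoothed = [x]
    for z in observations[1:]:
        p = p + process_var          # predict: uncertainty grows
        k = p / (p + obs_var)        # Kalman gain
        x = x + k * (z - x)          # update toward the new observation
        p = (1.0 - k) * p
        smoothed.append(x)
    return smoothed


# Example: find the peak near bin 6, then smooth a jittery magnitude track.
peak_bin = pick_harmonic_peak([0.0, 1.0, 5.0, 2.0, 0.0, 0.0, 3.0, 9.0, 4.0], 6, 2)
track = kalman_filter_track([1.0, 3.0, 1.0, 3.0])
```

A larger `obs_var` relative to `process_var` gives heavier smoothing, at the price of a slower response to genuine changes in harmonic magnitude.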
Results of Harmonic Tracking in Clean Speech
Figure - An illustration of pitch tracks of a speech segment at a sampling frequency of 8 kHz.
Results of Harmonic Tracking in Noisy Speech
[Figure panels: Harmonic Recovery; Pitch recovery]
Synthesis of Excitation by HNM
Voiced excitation: [equation]
Unvoiced excitation: [equation]
where b(m) is unit-variance white Gaussian noise, e(m) is the original excitation and a denotes the phases of the original excitation.
Results of Speech Enhancement
Enhanced speech is synthesized by inverse filtering the HNM residual with the cleaned LP shape.
Figure - Comparison of the harmonicity of the MMSE and FTLP-HNM systems on speech in train noise at different SNRs.
Figure - Performance of MMSE and FTLP-HNM on speech in train noise at different SNR levels.
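The synthesis step stated above can be illustrated with a pair of linear prediction (LP) filters. This is a minimal sketch under the assumption that the step amounts to passing an excitation (residual) signal through the all-pole filter 1/A(z) defined by the cleaned LP coefficients; the function names and the coefficient convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p are illustrative, not taken from the slides.

```python
def lp_residual(signal, lp_coeffs):
    """Analysis filter A(z): e[n] = s[n] + sum_k a_k * s[n - k]."""
    residual = []
    for n, s in enumerate(signal):
        acc = s
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc += a * signal[n - k]
        residual.append(acc)
    return residual


def lp_synthesize(excitation, lp_coeffs):
    """Synthesis filter 1/A(z): s[n] = e[n] - sum_k a_k * s[n - k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


# Round trip: filtering a signal with A(z) and then 1/A(z) recovers it.
speech = [1.0, 0.5, -0.25, 0.0, 0.75]
coeffs = [-0.9, 0.2]          # hypothetical LP coefficients a_1, a_2
excitation = lp_residual(speech, coeffs)
rebuilt = lp_synthesize(excitation, coeffs)
```

In an enhancement setting the excitation would come from the HNM stage and the coefficients from the cleaned LP model, so the output would differ from the noisy input rather than reproduce it.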
Demo (1)
Persian speech for Iranian King Mozaffareddin Shah
• Original speech
• Enhanced speech
Demo (2)
Florence Nightingale, 1890
• Original speech
• Enhanced speech