Formant Track Restoration in Train Noisy Speech. Qin Yan Communication & Multimedia Signal Processing Group Dept of Electronic & Computer Engineering, Brunel University 25 May, 2004. Restore the formant tracks from the noisy speech. Initial progress of the speech enhancement system.
Main Progress
• Restore the formant tracks from the noisy speech.
• Initial progress of the speech enhancement system.
Formant Tracking by 2D HMM in Noise Conditions

Table: Average errors (%) of formant tracks in train noisy speech by 2D HMM at different SNR conditions

SNR (dB)   F1     F2     F3    F4    F5
0          51.3   12.5   6.3   3.7   2.6
5          42     9.7    4.6   2.7   1.8
10         32.3   7.4    3.4   2     1.4
15         23.1   5.8    2.6   1.5   1.1
20         15.6   4.6    2.1   1.2   1

• 2D HMM formant tracking is not robust in noisy conditions: the average F1 error rises from 15.6% at 20 dB SNR to 51.3% at 0 dB.
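The slides do not say how the average error is computed. One plausible reading, assumed here, is the mean absolute deviation of a tracked formant from the clean reference track, relative to the reference and expressed in percent. The function name and the example values below are hypothetical.

```python
def average_track_error_percent(reference, estimate):
    """Mean absolute relative error (%) between two formant tracks.

    Assumption: this definition is a guess at the metric behind the
    table; the slides do not state the formula.
    """
    if len(reference) != len(estimate):
        raise ValueError("tracks must have the same number of frames")
    total = 0.0
    for ref, est in zip(reference, estimate):
        total += abs(est - ref) / ref
    return 100.0 * total / len(reference)


# Hypothetical F1 values in Hz for three frames.
clean_f1 = [500.0, 520.0, 510.0]
noisy_f1 = [550.0, 494.0, 561.0]
error_f1 = average_track_error_percent(clean_f1, noisy_f1)
print(round(error_f1, 2))
```

Averaging the relative error per frame, rather than the absolute error in Hz, is what would make the figures for F1 to F5 comparable in a single table.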
LP Based Formant Tracking

Figure: Procedure of LP formant tracking. Blocks shown: Kalman Filter based Formant Tracker, LP-based Spectral Subtraction, LP Pole Analysis, Formant Candidates Selection, Noisy Speech, Formant tracks, Reclassifier, Noise Model, VAD.

• A high LP order is used to over-model the spectrum, so that the poles due to formants are separated from the poles due to noise.
• Formant candidate selection rejects spurious candidates.
• A Kalman filter smooths the formant tracks.
• The formant tracks are fed back to the reclassifier, which reclassifies the candidates according to their distance from the initial tracks.
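The LP pole analysis and candidate selection steps can be sketched as follows. This is a minimal illustration, not the system described in the slides: the autocorrelation method, the bandwidth and frequency thresholds, and all function names are assumptions. The demo uses order 4 because the synthetic signal has exactly two resonances; the slides call for a deliberately high order on real speech.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """LP coefficients [1, a1, ..., ap] via Levinson-Durbin recursion."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a, err


def formant_candidates(a, fs, max_bw=400.0, min_freq=90.0):
    """Pole frequencies and bandwidths (Hz), sorted by frequency.

    Assumption: candidates wider than max_bw or below min_freq are
    treated as spurious; both thresholds are illustrative.
    """
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]       # one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    keep = (freqs > min_freq) & (bws < max_bw)
    idx = np.argsort(freqs[keep])
    return freqs[keep][idx], bws[keep][idx]


# Synthetic test signal: impulse response of an all-pole filter with
# resonances at 700 Hz and 1800 Hz (hypothetical values).
fs = 8000.0
poles = []
for f, radius in [(700.0, 0.97), (1800.0, 0.96)]:
    p = radius * np.exp(2j * np.pi * f / fs)
    poles += [p, np.conj(p)]
a_true = np.real(np.poly(poles))

h = np.zeros(400)
for n in range(len(h)):
    acc = 1.0 if n == 0 else 0.0
    for k in range(1, len(a_true)):
        if n - k >= 0:
            acc -= a_true[k] * h[n - k]
    h[n] = acc

a_est, _ = lpc_autocorr(h, order=4)
freqs, bws = formant_candidates(a_est, fs)
print(np.round(freqs), np.round(bws))
```

In a real front end each frame would normally be pre-emphasised and windowed before the LP analysis; those steps are omitted here so that the synthetic resonances are recovered almost exactly.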
LP Spectral Subtraction

[Equation not preserved in this transcript; only the fragments "If", ">" and "other" of a piecewise rule survive.]

• Noise is modelled with a low LP order, whereas speech is modelled with a high LP order.
• Computationally efficient.
• Disadvantages:
  • The noise variance is not taken into account.
  • A hard decision is needed to keep the subtracted values from falling below a noise floor.
  • The spectral trajectory across time is neither modelled nor used in the denoising process.
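Since the subtraction rule itself is lost, the following is only a sketch of one common form of spectral subtraction applied to LP model spectra. The floor defined as a fraction of the noisy spectrum, the parameter `floor`, the function names and the example coefficients are all assumptions rather than the rule from the slides.

```python
import numpy as np


def lp_power_spectrum(a, gain, n_fft=512):
    """Power spectrum gain / |A(e^{jw})|^2 of an all-pole LP model."""
    A = np.fft.rfft(a, n_fft)               # zero-padded polynomial A(z)
    return gain / np.abs(A) ** 2


def lp_spectral_subtraction(p_noisy, p_noise, floor=0.01):
    """Subtract the noise model spectrum with a hard-decision floor.

    Assumption: the floor is floor * p_noisy; the slides only say that
    a hard decision keeps values from dropping below a noise floor.
    """
    diff = p_noisy - p_noise
    limit = floor * p_noisy
    return np.where(diff > limit, diff, limit)


# Hypothetical models: order-4 all-pole model for the noisy speech,
# order-1 model for the noise.
fs = 8000.0
poles = []
for f, radius in [(700.0, 0.97), (1800.0, 0.96)]:
    p = radius * np.exp(2j * np.pi * f / fs)
    poles += [p, np.conj(p)]
a_noisy = np.real(np.poly(poles))
a_noise = np.array([1.0, -0.9])

p_noisy = lp_power_spectrum(a_noisy, gain=1.0)
p_noise = lp_power_spectrum(a_noise, gain=0.05)
cleaned = lp_spectral_subtraction(p_noisy, p_noise)
print(cleaned.shape)
```

Because both spectra come from all-pole models, each one costs a single zero-padded FFT of a short coefficient vector, which is consistent with the efficiency claim above.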
Performance of LP Spectral Subtraction

Figure: Improvement by LP spectral subtraction

Note: Improvement is calculated between average frame SNRs (the equation is not preserved in this transcript).
Performance I

Figure: LPC spectrogram of speech in train noise (SNR = 0 dB)

Figure: LPC spectrogram of speech in train noise after spectral subtraction
Kalman Filter

Figure: Predict-correct cycle of the Kalman filter, with the Time Update Equations ("PREDICT") and the Measurement Update Equations ("CORRECT").

• R is the measurement noise covariance matrix; it is updated from the variance of the differences between the noisy observations and the estimated tracks.
• The process noise covariance Q is set to 0.16, a value chosen experimentally.
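The predict and correct steps can be illustrated with a scalar filter applied to one formant track. This is a sketch under stated assumptions: a random-walk state model (state transition and observation gain both equal to 1), a two-pass scheme for re-estimating R, and example values in kHz are all hypothetical. Only Q = 0.16 and the idea of deriving R from the observation-minus-estimate variance come from the slides, which do not state the units of Q.

```python
def kalman_track(observations, q=0.16, r=1.0):
    """Scalar Kalman filter over one formant track (random-walk model)."""
    x = observations[0]                     # initial state estimate
    p = 1.0                                 # initial error variance (assumed)
    estimates = []
    for z in observations:
        # PREDICT (time update): state unchanged, uncertainty grows by Q.
        p_pred = p + q
        # CORRECT (measurement update): blend prediction and observation.
        k = p_pred / (p_pred + r)           # Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p_pred
        estimates.append(x)
    return estimates


def smooth_with_adaptive_r(observations, q=0.16, r_init=1.0, passes=2):
    """Re-estimate R from the variance of (observation - estimate).

    Assumption: R is recomputed once per pass over the whole track.
    """
    r = r_init
    estimates = list(observations)
    for _ in range(passes):
        estimates = kalman_track(observations, q, r)
        diffs = [z - e for z, e in zip(observations, estimates)]
        mean = sum(diffs) / len(diffs)
        variance = sum((d - mean) ** 2 for d in diffs) / len(diffs)
        r = max(variance, 1e-6)             # keep R strictly positive
    return estimates, r


# Hypothetical noisy F1 track in kHz with one outlier frame.
noisy_track = [0.50, 0.52, 0.49, 0.80, 0.51, 0.50, 0.48, 0.52]
smoothed, r_final = smooth_with_adaptive_r(noisy_track)
print([round(v, 3) for v in smoothed])
```

The ratio of Q to R sets how closely the estimate follows the observations: a large Q relative to R lets the track follow each frame, while a small Q yields heavier smoothing.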
Performance II

Table: Average errors (%) of formant tracks in train noisy speech and cleaned speech.

Figure: Comparison of clean formant tracks (solid), cleaned formant tracks (dash-dot) and noisy formant tracks (dotted).
Initial Speech Enhancement System

Figure: Initial speech enhancement system. Blocks shown: Enhanced Speech, Speech Reconstruction, Wiener Filter, Kalman Filter based Formant Tracker, LP-based Spectral Subtraction, LP Pole Analysis, Formant Candidates Selection, Noisy Speech, Formant tracks, Reclassifier, Noise Model, VAD.
Future Work

Speech enhancement with restored formant trajectories

Figure: Speech enhancement system, which contains the Formant Tracks Restoration System. Blocks shown: Enhanced Speech, Speech Reconstruction, Wiener Filter, Pitch Track Restoration, Residual, Kalman Filter based Formant Tracker, LP-based Spectral Subtraction, LP Pole Analysis, Formant Candidates Selection, Noisy Speech, Formant tracks, Reclassifier, Noise Model, VAD.