Nancy Meeting – 6-7 July 2006 Advances in WP1 www.loquendo.com
WP1: Environment & Sensor Robustness
T1.2 Noise Independence
Noise Reduction:
• Spectral Subtraction (Year 1) and Spectral Attenuation (Year 2): “Automatic Speech Recognition With a Modified Ephraim-Malah Rule”, Roberto Gemello, Franco Mana and Renato De Mori, IEEE Signal Processing Letters, vol. 13, no. 1, January 2006
• Evaluation of HEQ for feature normalization (HEQ study + Revision 2)
Denoising Techniques for Y2 evaluations (1)
Spectral Attenuation (or spectral weighting) is a form of audio signal enhancement in which noise suppression can be viewed as the application of a suppression rule, or non-negative real-valued gain Gk, to each bin k of the observed signal magnitude spectrum, in order to form an estimate of the original signal magnitude spectrum.
Ephraim–Malah MMSE log estimator rule:
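For reference, the textbook form of the Ephraim–Malah log-spectral-amplitude gain can be sketched as follows. This is the standard rule, G = ξ/(1+ξ) · exp(½ · E1(v)) with v = ξ/(1+ξ) · γ, where ξ is the a priori SNR and γ the a posteriori SNR of a bin. It is not necessarily the exact variant shown on the slide, and the function names `exp1` and `log_mmse_gain` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x, eps=1e-12, max_iter=200):
    """Exponential integral E1(x) = integral_x^inf exp(-t)/t dt, for x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 is only defined here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) + sum_{k>=1} (-1)^(k+1) x^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, max_iter + 1):
            term *= -x / k          # term is now (-x)^k / k!
            delta = -term / k       # (-1)^(k+1) x^k / (k * k!)
            total += delta
            if abs(delta) < eps * abs(total):
                break
        return total
    # Continued fraction (modified Lentz method), accurate for x > 1.
    b = x + 1.0
    c = 1.0 / 1e-300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x)


def log_mmse_gain(xi, gamma):
    """Standard log-MMSE gain for one spectral bin (xi > 0, gamma > 0)."""
    v = xi / (1.0 + xi) * gamma
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


if __name__ == "__main__":
    # Low a priori SNR gives strong attenuation; high SNR leaves the bin nearly untouched.
    for xi in (0.1, 1.0, 10.0):
        print(xi, log_mmse_gain(xi, gamma=1.0))
```

The gain multiplies the observed magnitude of bin k to give the estimated clean magnitude. The exponential integral is implemented with the standard library only, so the sketch has no external dependencies.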
Denoising Techniques for Y2 evaluations (2)
We propose to make the estimation of the a priori and the a posteriori SNR dependent on the noise overestimation factor α(m) and the spectral floor β(m) as follows:
Modified Ephraim–Malah MMSE log estimator rule:
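One plausible way to make both SNR estimates depend on an overestimation factor and a spectral floor is sketched below. This is an illustrative assumption, not the formula from the slide or the cited paper: it uses a decision-directed a priori SNR estimate in which the noise power is scaled by the overestimation factor and both estimates are floored. The names `snr_estimates`, `alpha`, `beta` and the smoothing constant `eta` are hypothetical.

```python
def snr_estimates(y_pow, noise_pow, prev_clean_pow, alpha, beta, eta=0.98):
    """Return (a priori SNR, a posteriori SNR) for one bin of one frame.

    y_pow          : observed noisy power of the bin
    noise_pow      : current noise power estimate of the bin
    prev_clean_pow : clean power estimated for this bin in the previous frame
    alpha          : noise overestimation factor (assumed >= 1)
    beta           : spectral floor (assumed small and positive)
    eta            : decision-directed smoothing constant (assumed value)
    """
    scaled_noise = alpha * noise_pow  # overestimated noise power
    # A posteriori SNR, floored so that (gamma - 1) never drops below beta.
    gamma = max(y_pow / scaled_noise, 1.0 + beta)
    # Decision-directed a priori SNR, floored at beta.
    xi = max(eta * prev_clean_pow / scaled_noise + (1.0 - eta) * (gamma - 1.0), beta)
    return xi, gamma


if __name__ == "__main__":
    print(snr_estimates(10.0, 1.0, 4.0, alpha=2.0, beta=0.01))
```

A larger overestimation factor lowers both SNR estimates and so attenuates more, while the floor keeps the estimates away from zero and limits musical-noise artifacts.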
Denoising Techniques for Y2 evaluations (3)
The noise spectrum amplitude is obtained by a first-order recursion in conjunction with an energy-based Voice Activity Detector (VAD) as follows:
Where: one parameter controls the update speed of the recursion (0.9), a second controls the allowed dynamics of the noise (4.0), and the noise standard deviation σ(m) is estimated as:
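A minimal sketch of a VAD-gated first-order noise tracker consistent with this description. Only the values 0.9 and 4.0 come from the slide; the exact recursion, the outlier test and the variance estimator are assumptions, and all names (`update_noise`, `lam`, `dyn`) are hypothetical.

```python
import math


def update_noise(noise_amp, noise_var, y_amp, is_speech, lam=0.9, dyn=4.0):
    """Update the noise amplitude and variance estimates of one spectral bin.

    noise_amp : previous noise amplitude estimate
    noise_var : previous noise variance estimate (sigma squared)
    y_amp     : observed amplitude of the bin in the current frame
    is_speech : decision of an energy-based VAD for the current frame
    lam       : update speed of the recursion (0.9 on the slide)
    dyn       : allowed noise dynamics, in standard deviations (4.0 on the slide)
    """
    sigma = math.sqrt(noise_var)
    deviation = y_amp - noise_amp
    # Freeze the estimate during speech, or when the bin jumps by more than
    # dyn standard deviations above the current noise level (assumed rule).
    if is_speech or deviation > dyn * sigma:
        return noise_amp, noise_var
    new_amp = lam * noise_amp + (1.0 - lam) * y_amp
    new_var = lam * noise_var + (1.0 - lam) * deviation * deviation
    return new_amp, new_var


if __name__ == "__main__":
    amp, var = 1.0, 0.04
    for y, speech in [(1.1, False), (5.0, True), (0.9, False)]:
        amp, var = update_noise(amp, var, y, speech)
        print(round(amp, 4), round(var, 4))
```

With `lam` close to 1 the estimate changes slowly, which suits stationary noise; the gating keeps speech energy from leaking into the noise estimate.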
Baseline evaluations of Loquendo ASR on Aurora2 speech databases
Year 1+2 Performance evaluations
The testing conditions used in the experiments are the following:
1) No Denoising (ND): Rasta PLP features (RPLP) are used without any preliminary noise reduction.
2) Wiener modified (WM): RPLP with Wiener filtering dependent on global SNR.
3) Ephraim-Malah modified (EMM): RPLP with noise reduction based on the modified Ephraim-Malah spectral attenuation rule.
Baseline evaluations of Loquendo ASR on Aurora3 speech databases
Baseline evaluations of Loquendo ASR on Aurora4 speech databases
HEQ Evaluation: Revision 1 (1) (Loquendo & UGR)
Diagram: E+12CEP, ΔE+12ΔCEP, ΔΔE+12ΔΔCEP (39 coefficients), HEQ (121)
Problems:
(1) Context dependency (whole-utterance CDF estimation is the best)
(2) High variability in background noise segments
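To make the idea concrete, here is a minimal sketch of per-coefficient histogram equalization over a sliding window. It assumes that "HEQ (121)" denotes a 121-frame window and that the reference distribution is a standard Gaussian; neither is stated on the slide, and `windowed_heq` is a hypothetical name.

```python
from statistics import NormalDist


def windowed_heq(frames, half_window=60, reference=NormalDist(0.0, 1.0)):
    """Equalize each coefficient independently over a sliding window.

    frames      : list of feature vectors (one list of floats per frame)
    half_window : frames on each side of the current one (60 gives 121 frames)
    reference   : target distribution whose inverse CDF defines the output
    """
    n_frames = len(frames)
    n_coeff = len(frames[0]) if frames else 0
    out = [[0.0] * n_coeff for _ in range(n_frames)]
    for t in range(n_frames):
        lo = max(0, t - half_window)
        hi = min(n_frames, t + half_window + 1)
        for c in range(n_coeff):
            x = frames[t][c]
            # Empirical CDF of x among the values of coefficient c in the window.
            rank = sum(1 for frame in frames[lo:hi] if frame[c] < x)
            cdf = (rank + 0.5) / (hi - lo)
            out[t][c] = reference.inv_cdf(cdf)
    return out


if __name__ == "__main__":
    feats = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
    for row in windowed_heq(feats):
        print([round(v, 3) for v in row])
```

Because only ranks within the window matter, the output is unchanged when the input is scaled by a positive gain. The same property makes the CDF estimate sensitive to what the window happens to contain, which is the context dependency listed above.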
HEQ Integration: Revision 1 (2) (Loquendo & UGR)
Diagram blocks: Loquendo FE, UGR HEQ, Loquendo ASR
Diagram labels: Denoise (power spectrum level); Feature Normalization (frame level, 39 coefficients); Phoneme-based Models
HEQ Evaluation: Revision 2 (3) (Loquendo & UGR)
Diagram: E+12CEP, ΔE+12ΔCEP, ΔΔE+12ΔΔCEP (39 coefficients), each with HEQ (1573)
Benefits:
(1) Relations in magnitude and dynamics among coefficients are preserved
(2) More stable CDF estimation, similar to extending the HEQ temporal window
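Note that 1573 = 121 × 13, which suggests that Revision 2 pools the 13 coefficients of a stream over a 121-frame window into a single histogram. The sketch below illustrates that reading; the pooling scheme, the Gaussian reference and the name `pooled_heq` are assumptions, not details confirmed by the slide.

```python
from bisect import bisect_left
from statistics import NormalDist


def pooled_heq(frames, half_window=60, reference=NormalDist(0.0, 1.0)):
    """Equalize all coefficients of a stream with one shared, pooled CDF.

    frames      : list of feature vectors for one stream (for example 13 values)
    half_window : frames on each side of the current one (60 gives 121 frames)
    reference   : target distribution whose inverse CDF defines the output
    """
    out = []
    for t in range(len(frames)):
        lo = max(0, t - half_window)
        hi = min(len(frames), t + half_window + 1)
        # One sorted pool with every coefficient of every frame in the window.
        pool = sorted(v for frame in frames[lo:hi] for v in frame)
        n = len(pool)
        row = []
        for x in frames[t]:
            rank = bisect_left(pool, x)  # number of pooled values below x
            row.append(reference.inv_cdf((rank + 0.5) / n))
        out.append(row)
    return out


if __name__ == "__main__":
    feats = [[5.0, 1.0], [6.0, 2.0], [7.0, 0.5]]
    for row in pooled_heq(feats):
        print([round(v, 3) for v in row])
```

Since every coefficient passes through the same monotonic mapping, a coefficient that is larger than another in the same frame stays larger after equalization, matching benefit (1). The pool also holds 13 times more samples than a per-coefficient window of the same length, which is in line with benefit (2).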
HEQ for denoising (5) (Loquendo & UGR)
Comparing RPLP / HEQrev1 / HEQrev2 using the same clean and noisy signal
HEQ for signal level equalization (6) (Loquendo & UGR)
Comparing RPLP / HEQrev1 / HEQrev2 using the same clean signal at normal gain level and at low gain level
WP1: Workplan
• Selection of suitable benchmark databases (m6)
• Completion of LASR baseline experimentation of Spectral Subtraction (Wiener SNR dependent) (m12)
• Discriminative VAD (training + AURORA3 testing) (m16)
• Experimentation of Spectral Attenuation rule (Ephraim-Malah SNR dependent) (m21)
• Preliminary results on spectral subtraction and HEQ techniques (m24)
• Integration of denoising and normalization techniques (m33)
• Noise estimation and reduction for non-stationary noises (m33)