
Advances in WP1

Nancy Meeting – 6-7 July 2006. Advances in WP1. www.loquendo.com. WP1: Environment & Sensor Robustness T1.2 Noise Independence. Noise Reduction: Spectral Subtraction (YEAR 1) and Spectral Attenuation (YEAR2) “Automatic Speech Recognition With a Modified Ephraim-Malah Rule”,

Presentation Transcript


  1. Nancy Meeting – 6-7 July 2006. Advances in WP1. www.loquendo.com

  2. WP1: Environment & Sensor Robustness. T1.2 Noise Independence. Noise Reduction:
     • Spectral Subtraction (Year 1) and Spectral Attenuation (Year 2): “Automatic Speech Recognition With a Modified Ephraim-Malah Rule”, Roberto Gemello, Franco Mana and Renato De Mori, IEEE Signal Processing Letters, vol. 13, no. 1, January 2006
     • Evaluation of HEQ for feature normalization (HEQ study + Revision 2)

  3. Denoising Techniques for Y2 evaluations (1). Spectral attenuation (or spectral weighting) is a form of audio signal enhancement in which noise suppression is viewed as the application of a suppression rule, a non-negative real-valued gain G_k, to each bin k of the observed signal magnitude spectrum, in order to form an estimate of the original signal magnitude spectrum. Suppression rule: Ephraim–Malah MMSE log estimator.
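As a rough illustration of the suppression-rule view, the sketch below computes the textbook Ephraim–Malah log-spectral amplitude gain, G = ξ/(1+ξ)·exp(½·E1(v)) with v = ξγ/(1+ξ), and applies it bin by bin. Here ξ is the a priori SNR, γ the a posteriori SNR and E1 the exponential integral. This is the standard published form and not necessarily the exact formula shown on the slide; the function names and the pure-Python exponential integral are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x):
    """Exponential integral E1(x) = int_x^inf exp(-t)/t dt, for x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 is only defined here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{n>=1} (-x)^n / (n * n!)
        total, term = 0.0, 1.0
        for n in range(1, 40):
            term *= -x / n      # term is now (-x)^n / n!
            total += term / n
        return -EULER_GAMMA - math.log(x) - total
    # Continued fraction, evaluated backwards from a fixed depth:
    # E1(x) = exp(-x) / (x + 1/(1 + 1/(x + 2/(1 + 2/(x + ...)))))
    f = x
    for k in range(60, 0, -1):
        f = x + k / (1.0 + k / f)
    return math.exp(-x) / f


def log_mmse_gain(xi, gamma):
    """Gain for one bin from a priori SNR xi and a posteriori SNR gamma."""
    v = max(xi * gamma / (1.0 + xi), 1e-10)  # guard against v == 0
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


def attenuate(magnitudes, xis, gammas):
    """Apply the per-bin gain G_k to the observed magnitude spectrum."""
    return [log_mmse_gain(xi, g) * y for y, xi, g in zip(magnitudes, xis, gammas)]
```

At high SNR the exponential term tends to 1 and the gain approaches the Wiener-like factor ξ/(1+ξ), so strong bins pass almost unchanged.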

  4. Denoising Techniques for Y2 evaluations (2). Modified Ephraim–Malah MMSE log estimator rule: we propose to make the estimation of the a priori and the a posteriori SNR dependent on the noise overestimation factor α(m) and the spectral floor β(m).
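The slide's equations are not part of this transcript, so the following is only one plausible reading of how an overestimation factor and a spectral floor could enter the SNR estimates. It starts from the standard decision-directed a priori SNR estimator, scales the noise power by `alpha`, and floors the results using `beta`. The placement of `alpha` and `beta`, the smoothing constant `eta`, and all function names are assumptions, not the authors' formulas.

```python
def a_posteriori_snr(noisy_pow, noise_pow, alpha, beta):
    """gamma = |Y|^2 / (alpha * |D|^2), kept above 1 + beta (assumed form).

    noise_pow must be strictly positive.
    """
    return max(noisy_pow / (alpha * noise_pow), 1.0 + beta)


def a_priori_snr(prev_clean_pow, prev_noise_pow, gamma, alpha, beta, eta=0.98):
    """Decision-directed xi with overestimated noise, floored at beta (assumed form)."""
    xi = (eta * prev_clean_pow / (alpha * prev_noise_pow)
          + (1.0 - eta) * max(gamma - 1.0, 0.0))
    return max(xi, beta)
```

With `alpha` greater than 1 the noise is deliberately overestimated, which lowers both SNR estimates and therefore the gain; the floor keeps the estimates from collapsing to zero in noise-only bins.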

  5. Denoising Techniques for Y2 evaluations (3). The noise spectrum amplitude is obtained by a first-order recursion in conjunction with an energy-based Voice Activity Detector (VAD), where λ controls the update speed of the recursion (0.9), β controls the allowed dynamics of noise (4.0), and σ(m) is the estimated noise standard deviation.
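A minimal sketch of such a noise tracker follows, using the two constants quoted on the slide (0.9 for the update speed, 4.0 for the allowed dynamics). The gating condition (update a bin only in non-speech frames and only when it stays within β·σ of the previous estimate), the toy energy VAD with its `threshold`, and the function names are assumptions; the slide's own recursion and its estimate of σ(m) are not reproduced in the transcript, so σ is passed in as a given.

```python
def energy_vad(magnitudes, noise, threshold=2.0):
    """Hypothetical energy VAD: speech if frame energy exceeds threshold * noise energy."""
    frame_energy = sum(y * y for y in magnitudes)
    noise_energy = sum(d * d for d in noise)
    return frame_energy > threshold * noise_energy


def update_noise(prev_noise, magnitudes, sigma, is_speech, lam=0.9, beta=4.0):
    """First-order recursive update of the per-bin noise amplitude estimate."""
    if is_speech:
        # Freeze the estimate while the VAD reports speech.
        return list(prev_noise)
    updated = []
    for d_prev, y in zip(prev_noise, magnitudes):
        if abs(y - d_prev) <= beta * sigma:
            # Smooth update: lam close to 1 means slow adaptation.
            updated.append(lam * d_prev + (1.0 - lam) * y)
        else:
            # Jump larger than the allowed noise dynamics: keep the old value.
            updated.append(d_prev)
    return updated
```

For example, with a previous estimate of 1.0 and σ = 1.0, an observed magnitude of 2.0 moves the estimate to about 1.1, while an observed magnitude of 10.0 is rejected as an outlier.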

  6. Baseline evaluations of Loquendo ASR on Aurora2 speech databases

  7. Year 1+2 Performance evaluations The testing conditions used in the experiments are the following: 1) No Denoising (ND): Rasta PLP features (RPLP) are used without any preliminary noise reduction. 2) Wiener modified (WM): RPLP with Wiener filtering dependent on global SNR. 3) Ephraim-Malah modified (EMM): RPLP with noise reduction based on the modified Ephraim-Malah spectral attenuation rule.

  8. Baseline evaluations of Loquendo ASR on Aurora3 speech databases

  9. Year 1+2 Performance evaluations. Testing conditions as in slide 7: ND, WM and EMM.

  10. Baseline evaluations of Loquendo ASR on Aurora4 speech databases

  11. Year 1+2 Performance evaluations. Testing conditions as in slide 7: ND, WM and EMM.

  12. Year 1+2 Performance evaluations. Testing conditions as in slide 7: ND, WM and EMM.

  13. HEQ + Denoising techniques

  14. HEQ Evaluation: Revision 1 (1) (Loquendo & UGR). Features: E+12CEP, ΔE+12ΔCEP, ΔΔE+12ΔΔCEP (39 coefficients); HEQ (121). Problems: (1) Context dependency (whole-utterance CDF estimation works best) (2) High variability in background noise segments
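Assuming HEQ here denotes histogram equalization of the feature vectors, the sketch below shows the basic per-coefficient form: each coefficient's values over the utterance are ranked to estimate a CDF and then mapped through the inverse CDF of a reference distribution. The standard Gaussian reference, the mid-rank CDF estimate and the function names are assumptions for illustration, not the UGR implementation.

```python
from statistics import NormalDist


def heq_track(values):
    """Equalize one coefficient's values over an utterance to a standard Gaussian."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ref = NormalDist()  # zero mean, unit variance reference
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n is an empirical CDF value strictly inside (0, 1)
        out[i] = ref.inv_cdf((rank + 0.5) / n)
    return out


def heq_utterance(frames):
    """frames: list of equal-length feature vectors; each coefficient is equalized independently."""
    tracks = [heq_track(list(track)) for track in zip(*frames)]
    return [list(frame) for frame in zip(*tracks)]
```

Because only ranks are used, the output does not change under any strictly increasing transformation of a coefficient, such as a gain change. The CDF estimate is only as good as the number of frames it is built from, which is where the context dependency listed on the slide comes in.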

  15. HEQ Integration: Revision 1 (2) (Loquendo & UGR). Pipeline diagram: Loquendo FE (denoising at power-spectrum level), UGR HEQ (feature normalization at frame level, 39 coefficients), Loquendo ASR (phoneme-based models).

  16. HEQ Evaluation: Revision 2 (3) (Loquendo & UGR). Features: E+12CEP, ΔE+12ΔCEP, ΔΔE+12ΔΔCEP (39 coefficients), with HEQ (1573) on each of the three groups. Benefits: (1) Relations in magnitude and dynamics among coefficients are preserved (2) More stable CDF estimation, similar to extending the HEQ temporal window
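One way to read the figure 1573 is as 13 × 121, that is, the 13 coefficients of a group pooled over a 121-frame window into a single CDF estimate. Under that assumption (it is not stated on the slide), a pooled histogram equalization could be sketched as below: a single monotone mapping is shared by all coefficients of the group, so their relative ordering is kept and the CDF is estimated from more samples. The Gaussian reference and the function name are also assumptions.

```python
from bisect import bisect_left, bisect_right
from statistics import NormalDist


def heq_pooled(frames):
    """Equalize all coefficients of a group with one CDF pooled over frames and coefficients."""
    pooled = sorted(v for frame in frames for v in frame)
    n = len(pooled)
    ref = NormalDist()  # zero mean, unit variance reference

    def equalize(v):
        # Mid-rank CDF estimate; strictly inside (0, 1) for any value in the pool.
        p = (bisect_left(pooled, v) + bisect_right(pooled, v)) / (2.0 * n)
        return ref.inv_cdf(p)

    return [[equalize(v) for v in frame] for frame in frames]
```

Since every value goes through the same increasing function, a coefficient that was larger than another in the input is still larger in the output, which matches the first benefit listed on the slide.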

  17. HEQ Evaluation: Revision 2 (4) (Loquendo & UGR)

  18. HEQ for denoising (5) (Loquendo & UGR). Comparing RPLP / HEQrev1 / HEQrev2 using the same clean and noisy signal

  19. HEQ for signal level equalization (6) (Loquendo & UGR). Comparing RPLP / HEQrev1 / HEQrev2 using the same clean signal at normal gain level and at low gain level

  20. WP1: Workplan
     • Selection of suitable benchmark databases (m6)
     • Completion of LASR baseline experimentation of Spectral Subtraction (Wiener SNR dependent) (m12)
     • Discriminative VAD (training + AURORA3 testing) (m16)
     • Experimentation of Spectral Attenuation rule (Ephraim-Malah SNR dependent) (m21)
     • Preliminary results on spectral subtraction and HEQ techniques (m24)
     • Integration of denoising and normalization techniques (m33)
     • Noise estimation and reduction for non-stationary noises (m33)
