Turin Meeting – 9-10 March 2006
Advances in WP1
www.loquendo.com
WP1: Environment & Sensor Robustness
T1.2 Noise Independence
• Voice Activity Detection:
  • Model based approach using NN
    “Non-linear estimation of voice activity to improve automatic recognition of noisy speech”, Roberto Gemello, Franco Mana and Renato De Mori, Eurospeech 2005, Lisboa, September 2005
• Noise Reduction:
  • Spectral Subtraction (standard, Wiener and SNR dependent) and Spectral Attenuation (Ephraim-Malah SA standard and SNR dependent)
    “Automatic Speech Recognition With a Modified Ephraim-Malah Rule”, Roberto Gemello, Franco Mana and Renato De Mori, IEEE Signal Processing Letters, Vol. 13, No. 1, January 2006
• Evaluation of HEQ for feature normalization
• New techniques for non-stationary noises
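The standard spectral subtraction and Wiener rules named above can be illustrated with a minimal per-bin gain sketch. This is a textbook formulation, not the Loquendo front-end: the function names, the over-subtraction factor `alpha`, and the spectral `floor` are assumptions introduced for illustration, and the SNR-dependent variants and the Ephraim-Malah rule are not reproduced here.

```python
import math

def spectral_subtraction_gain(noisy_power, noise_power, alpha=1.0, floor=0.01):
    """Amplitude gain per frequency bin for power spectral subtraction.

    alpha (over-subtraction factor) and floor (lower bound on the power
    gain) are illustrative defaults, not values taken from the slides.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        # Fraction of the noisy power left after subtracting the noise estimate.
        power_gain = 1.0 - alpha * n / y if y > 0 else 0.0
        # Floor the power gain to avoid negative values, then convert to amplitude.
        gains.append(math.sqrt(max(power_gain, floor)))
    return gains

def wiener_gain(noisy_power, noise_power, floor=0.01):
    """Wiener gain per bin, with the SNR estimated as noisy/noise - 1."""
    gains = []
    for y, n in zip(noisy_power, noise_power):
        if n <= 0:
            gains.append(1.0)  # no noise estimated: leave the bin untouched
            continue
        snr = max(y / n - 1.0, 0.0)
        gains.append(max(snr / (1.0 + snr), floor))
    return gains

# Hypothetical two-bin power spectra: one bin dominated by speech, one by noise.
noisy = [4.0, 1.0]
noise = [1.0, 1.0]
print(spectral_subtraction_gain(noisy, noise))
print(wiener_gain(noisy, noise))
```

In both rules the speech-dominated bin is only mildly attenuated, while the noise-dominated bin is pushed down to the floor.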
Baseline evaluations of Loquendo ASR on Aurora2 speech databases
Baseline performance evaluations: performance in terms of Word Accuracy and (Error Reduction)
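The slides do not define the two metrics, so the sketch below assumes the conventions commonly used with the Aurora benchmarks: word accuracy counts substitutions, deletions and insertions as errors, and error reduction is the relative decrease in word error with respect to the baseline. The function names and all numbers are hypothetical and are not results from this evaluation.

```python
def word_accuracy(n_ref, substitutions, deletions, insertions):
    """Word accuracy in percent: 100 * (N - S - D - I) / N."""
    return 100.0 * (n_ref - substitutions - deletions - insertions) / n_ref

def error_reduction(baseline_acc, new_acc):
    """Relative error reduction in percent, given two word accuracies."""
    baseline_err = 100.0 - baseline_acc
    return 100.0 * (new_acc - baseline_acc) / baseline_err

# Made-up counts and accuracies, for illustration only.
print(word_accuracy(1000, 60, 30, 10))
print(error_reduction(80.0, 90.0))
```

Under this definition, raising accuracy from 80% to 90% halves the word error, which is reported as a 50% error reduction.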
Baseline evaluations of Loquendo ASR on Aurora3 speech databases
Baseline performance evaluations: performance in terms of Word Accuracy and (Error Reduction)
Baseline evaluations of Loquendo ASR on Aurora4 speech databases (to be done)
HEQ Evaluation (1) (Loquendo & UGR)
The HEQ algorithm amplifies the coefficient (energy, in this case) in audio segments that contain only background noise.
HEQ Evaluation (2) (Loquendo & UGR)
The HEQ algorithm introduces a context-dependent normalization. This could be a drawback for open-vocabulary recognizers, where phoneme-based acoustic models are used.
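Both HEQ observations can be reproduced with a generic histogram equalization sketch. The UGR implementation is not described in these slides, so this assumes a simple rank-based variant that maps one coefficient track onto a standard Gaussian reference; the function name `heq` and all sample values are hypothetical.

```python
from statistics import NormalDist

def heq(values, reference=NormalDist(0.0, 1.0)):
    """Rank-based histogram equalization of one coefficient track.

    Each value is replaced by the reference quantile of its empirical
    CDF rank. Tied values receive distinct ranks in order of appearance.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        cdf = (rank + 0.5) / n  # kept strictly inside (0, 1)
        out[i] = reference.inv_cdf(cdf)
    return out

# A nearly flat energy track, as in a background-noise-only segment.
noise_only = [0.10, 0.11, 0.09, 0.12, 0.08]
equalized = heq(noise_only)
print(max(noise_only) - min(noise_only), max(equalized) - min(equalized))

# The same input value (5.0) in two different contexts.
print(heq([1.0, 2.0, 5.0])[2], heq([5.0, 8.0, 9.0])[0])
```

The small fluctuations of the noise-only track are stretched over the full range of the reference distribution, and the value 5.0 is mapped to a positive output in one context and a negative one in the other, because the mapping depends on the statistics of the surrounding segment.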
HEQ Integration (3) (Loquendo & UGR)
[Diagram: processing chain]
• Loquendo FE: Denoise (power spectrum level)
• UGR HEQ: Feature Normalization (frame level, 39 coefficients)
• Loquendo ASR: Phoneme-based Models
WP1: Workplan
• Selection of suitable benchmark databases (m6)
• Completion of LASR baseline experimentation of Spectral Subtraction (Wiener SNR dependent) (m12)
• Discriminative VAD (training + AURORA3 testing) (m16)
• Experimentation of Spectral Attenuation rule (Ephraim-Malah SNR dependent) (m21)
• Integration of denoising and normalization techniques (m33)
• Noise estimation and reduction for non-stationary noises (m33)