IMPROVING RECOGNITION PERFORMANCE IN NOISY ENVIRONMENTS
CLSP SUMMER PLANNING WORKSHOP
• Joseph Picone¹, Inst. for Signal and Info. Processing, Dept. Electrical and Computer Eng., Mississippi State University
• Contact Information: Box 9571, Mississippi State University, Mississippi State, Mississippi 39762. Tel: 662-325-3149. Fax: 662-325-2298. Email: picone@isip.msstate.edu
¹ Three-time workshop survivor (’97-’99)!
OVERVIEW
• Client/server applications
• Evaluate robustness in noisy environments
• Propose a standard for LVCSR applications

AURORA LVCSR EVALUATION
• WSJ 5K (closed task) with seven digitally added noise conditions
• Common ASR system
• Two participants: QIO (Qualcomm, ICSI, OGI) and MFA (Motorola, France Telecom, Alcatel)
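The "digitally added" noise conditions can be illustrated with a minimal sketch that mixes a noise recording into clean speech at a chosen signal-to-noise ratio. The function names, the plain-list signal representation, and the 10 dB target are assumptions for illustration; the slides do not describe the exact Aurora mixing procedure, which may include additional filtering and level-normalization steps.

```python
import math
import random


def mean_power(samples):
    """Average power (mean squared amplitude) of a list of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the result has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = mean_power(speech)
    noise_power = mean_power(noise)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must both have non-zero power")
    # Choose gain g so that speech_power / (g^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB, treating (noisy - speech) as the added noise."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10.0 * math.log10(mean_power(speech) / mean_power(residual))


if __name__ == "__main__":
    random.seed(0)
    clean = [math.sin(0.05 * i) for i in range(1000)]    # stand-in for speech
    babble = [random.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in for noise
    noisy = mix_at_snr(clean, babble, snr_db=10.0)
    print(round(measured_snr_db(clean, noisy), 3))
```

Adding noise digitally, rather than recording in noisy rooms, keeps the underlying speech identical across conditions, so differences in error rate can be attributed to the noise and the front end.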
STATE OF THE ART
• Commercial front ends use adaptive noise compensation.
• Advanced front ends use a variety of techniques, including subspace methods, normalization, and multiple time scales.

ADAPTIVE SIGNAL PROCESSING
• The Aurora LVCSR evaluation did not address acoustic modeling issues or speaker/channel adaptation (by design).
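As one concrete instance of the "normalization" mentioned above, the sketch below applies per-utterance mean and variance normalization to a sequence of feature vectors (often called CMVN when applied to cepstra). This is an assumed, generic example: the slides do not say which normalization the evaluated front ends used, and the function name and `eps` parameter are hypothetical.

```python
import math


def cmvn(frames, eps=1e-8):
    """Normalize each feature dimension to zero mean and unit variance.

    `frames` is a list of equal-length feature vectors from one utterance.
    `eps` guards against division by zero for constant dimensions.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / num_frames for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / num_frames)
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


if __name__ == "__main__":
    utterance = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
    # A fixed offset per dimension, standing in for a stationary channel effect.
    shifted = [[a + 5.0, b - 3.0] for a, b in utterance]
    print(cmvn(utterance))
    print(cmvn(shifted))  # same values: the constant offset is removed
```

Because a stationary channel shows up as an additive offset in the cepstral domain, subtracting the utterance mean removes it without any change to the acoustic models.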
PROPOSAL SUMMARY
SIGNAL PROCESSING VS. ACOUSTIC MODELS
• Focus on Aurora task (TS2):
  • multiple microphones; representative noise conditions
  • adaptation/multipass processing within a single utterance
  • establish benchmarks prior to workshop (incl. adaptation)
• Parallel research tracks:
  • noise robust front end processing
  • phone/state-specific features and/or noise models
• Some possible themes:
  • knowledge vs. statistics
  • phone-dependent spectral models of speech and noise
  • multi-time scale analysis
  • subspace methods to separate speech and noise
  • iterative refinement
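The idea of multipass processing within a single utterance can be sketched with a simple two-pass scheme: the first pass estimates a noise spectrum from the quietest frames of the utterance, and the second pass subtracts that estimate from every frame. Power spectral subtraction is used here only as a familiar stand-in; it is not stated to be the proposal's method, and the function name, `noise_fraction`, and `floor` parameters are assumptions.

```python
def two_pass_spectral_subtraction(power_frames, noise_fraction=0.2, floor=0.01):
    """Two-pass noise compensation over one utterance.

    `power_frames` is a list of power spectra (one list of bins per frame).
    Pass 1 averages the lowest-energy frames to estimate the noise spectrum.
    Pass 2 subtracts that estimate, keeping at least `floor` times the input
    power in each bin so the output never goes negative.
    """
    if not power_frames:
        return []
    num_bins = len(power_frames[0])

    # Pass 1: treat the quietest fraction of frames as noise-only.
    num_noise = max(1, int(noise_fraction * len(power_frames)))
    quietest = sorted(power_frames, key=sum)[:num_noise]
    noise = [sum(f[b] for f in quietest) / num_noise for b in range(num_bins)]

    # Pass 2: subtract the noise estimate with a spectral floor.
    return [
        [max(f[b] - noise[b], floor * f[b]) for b in range(num_bins)]
        for f in power_frames
    ]


if __name__ == "__main__":
    noise_only = [[1.0, 1.0, 1.0]] * 4
    speech_plus_noise = [[5.0, 9.0, 1.0]] * 6
    cleaned = two_pass_spectral_subtraction(noise_only + speech_plus_noise)
    print(cleaned[0])   # a noise-only frame after compensation
    print(cleaned[-1])  # a speech frame after compensation
```

A phone- or state-specific variant, as suggested in the themes above, would replace the single `noise` vector with estimates conditioned on a first-pass recognition hypothesis; that extension is not shown here.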
REFERENCES
AURORA AND ICSLP’2002
• J. Picone, “Improving Speech Recognition Performance in Noisy Environments,” Mississippi State University, November 8, 2002 (http://www.isip.msstate.edu/publications/seminars/2002/clsp_pm/).
• N. Parihar and J. Picone, “DSR Front End LVCSR Evaluation – Baseline Recognition System Description,” Aurora Working Group, European Telecommunications Standards Institute, November 1, 2001 (http://www.isip.msstate.edu/publications/reports/aurora_frontend/2001).
• D. Macho, et al., “Evaluation of a Noise-Robust DSR Front End on Aurora Databases,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp. 17-20, September 2002.
• A. Adami, et al., “Qualcomm-ICSI-OGI Features For ASR,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp. 21-24, September 2002.
• C.P. Chen, et al., “Front End Post-Processing and Back End Model Enhancement on the Aurora 2.0/3.0 Databases,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp. 241-244, September 2002.
• P. Motlíček and L. Burget, “Noise Estimation For Efficient Speech Enhancement and Robust Speech Recognition,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp. 1033-1036, September 2002.
• J. Chen, et al., “Recognition of Noisy Speech Using Normalized Moments,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp. 2441-2444, September 2002.
• J. Wu and Q. Huo, “An Environment Compensated Minimum Classification Error Training Approach and Its Evaluation in Aurora 2 Database,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp. 453-456, September 2002.
• G. Saon and J.M. Huerta, “Improvements to the IBM Aurora 2 Multi-Condition System,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp. 469-472, September 2002.