Liverpool Keele Contribution
Task 1.4. Envelope information and binaural processing (KEELE, RUB, PATRAS): • KEELE will implement the use of envelope information (within & between channels) within an artificial system (CTK) and model its effects with respect to human listeners' data. They will study the effect of envelope information in other conditions and consider the respective contributions of envelope and other cues (pitch, ILD, ITD) to binaural processing, in conjunction with RUB and PATRAS.
Filter into bands • Present individual bands and combinations to listeners, measure intelligibility • Delay individual channels
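The band-splitting and channel-delay manipulation above can be sketched roughly as follows. This is only an illustration, not the stimulus-generation code used in the experiments: the second-order band-pass design, the `q` value and the function names (`bandpass_biquad`, `delay_band`, `recombine`) are all assumptions made for the example.

```python
import math

def bandpass_biquad(signal, fs, f_center, q=2.0):
    """Second-order band-pass filter (unity gain at f_center).

    Assumption: a simple biquad stands in for whatever filter bank
    the experiments actually used.
    """
    w0 = 2.0 * math.pi * f_center / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b1, b2 = alpha, 0.0, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def delay_band(band, n_samples):
    """Delay one band by n_samples, padding with zeros and keeping the length."""
    n = max(0, min(n_samples, len(band)))
    return [0.0] * n + list(band[:len(band) - n])

def recombine(bands):
    """Sum equally long bands sample by sample."""
    return [sum(samples) for samples in zip(*bands)]
```

A stimulus with one desynchronised channel would then be `recombine([band_a, delay_band(band_b, n)])`, where `n` is the delay expressed in samples.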
Greenberg • Combinations of bands carry more information than the linear sum of the individual bands • Band 2 or band 3 alone: << 10 % intelligibility • Band 2 + band 3 together: >> 60 % intelligibility
Next steps • Modulation spectrum / phase as input • Modulation spectrum to get time invariance • LF phase info to model delay data • Expts running – more later…
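As a rough sketch of what "modulation spectrum / phase as input" could look like, the example below extracts a band envelope and takes its low-frequency DFT. Rectification plus a moving average as the envelope detector, the 10 ms smoothing window, the 32 Hz upper limit and both function names are assumptions for illustration, not the project's method.

```python
import cmath
import math

def envelope(band, fs, smooth_ms=10.0):
    """Full-wave rectify the band, then smooth with a causal moving average."""
    win = max(1, int(fs * smooth_ms / 1000.0))
    rect = [abs(x) for x in band]
    env, acc = [], 0.0
    for i, v in enumerate(rect):
        acc += v
        if i >= win:
            acc -= rect[i - win]
        env.append(acc / min(i + 1, win))
    return env

def modulation_spectrum(env, fs, max_mod_hz=32.0):
    """DFT of the mean-removed envelope for modulation rates up to max_mod_hz.

    Returns a list of (frequency_hz, magnitude, phase_rad) tuples.
    """
    n = len(env)
    mean = sum(env) / n
    centered = [v - mean for v in env]
    result = []
    k = 1
    while k * fs / n <= max_mod_hz and k <= n // 2:
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        result.append((k * fs / n, 2.0 * abs(coeff) / n, cmath.phase(coeff)))
        k += 1
    return result
```

The magnitudes do not change when the band is shifted in time, while a delay of τ seconds shifts the phase at modulation rate f by 2πfτ. That is one possible reading of why low-frequency phase might help model the delay data.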
Task 4.2 Informing Speech Recognition (KEELE, DCAG, IDIAP) • The aim is to combine classical and new noise estimation methods with a predictive element to allow the prediction and removal of time varying background noise. Novel noise estimation techniques will also be used to inform missing data techniques to obtain better recognition. In Blind Source Separation, the intention is to develop semi-blind algorithms which address the problems of echo compensation, noise reduction and de-reverberation.
Task 4.2 • Our Interest • CASA / AMaps etc. are good techniques for noise estimation (and then spectral subtraction) • BUT: significant processing delays • Interference with lip reading (100 ms) • Interference with self monitoring (40 ms) • Aim is to use prediction to compensate for processing delays • Prediction can also help bridge segments where CASA fails (fricatives etc.)
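For reference, the spectral subtraction step mentioned above can be sketched in a few lines. This is the generic magnitude-domain form with a spectral floor, not necessarily the variant used in the project. The function name and the `floor` parameter are assumptions, and the noise estimate is taken as given (for example from CASA, or from a predictor that compensates for the processing delay).

```python
def spectral_subtract(noisy_mag, noise_est, floor=0.05):
    """Subtract a noise magnitude estimate from one noisy magnitude spectrum.

    Each bin is clamped to floor * noisy magnitude so that an overestimated
    noise level cannot drive the result negative.
    """
    if len(noisy_mag) != len(noise_est):
        raise ValueError("spectrum and noise estimate must have the same length")
    return [max(m - n, floor * m) for m, n in zip(noisy_mag, noise_est)]
```

If the estimate arrives late, the subtraction is applied with noise values that describe an earlier frame. Prediction is meant to close that gap.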
Key Questions • 1) Can / do we use a predictive element in speech perception? (Elvira’s talk) • 2) Is ‘noise’ predictable? • Matched training / testing in ASR • Noise adaptation in human listeners • First results:
Predicting noise • Use linear prediction to estimate the noise spectrum 12.5 ms into the future (single channel) • Noise attenuation: difference between subtracting the estimate and subtracting the long-term average, by noise type and ‘history size’

    noise type         history size
                       63 ms    125 ms   250 ms   625 ms
    1  car              3.03     3.16     3.20     3.62
    2  station          3.04     3.78     4.07     5.13
    3  street           3.86     4.60     5.28     7.15
    4  exhibition       4.35     5.34     6.33     7.82
    5  airport *        5.61     7.10     8.52    10.06
    6  station *        6.77     9.06    11.26    12.48
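A minimal sketch of the single-channel idea follows: fit a linear predictor to the recent noise level in one channel and extrapolate one frame ahead. The autocorrelation method with Levinson-Durbin recursion, the predictor order and the function names are assumptions for illustration, and so is the reading that one frame step corresponds to the 12.5 ms look-ahead. The slides do not say how the prediction was implemented.

```python
def autocorr(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag (no normalisation)."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations.

    Returns coefficients a[0..order-1] such that
    x[t] is approximated by sum(a[k] * x[t - 1 - k]).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # no remaining energy to model (e.g. constant history)
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]

def predict_next(history, order=2):
    """Predict the next noise level in one channel from its recent history."""
    mean = sum(history) / len(history)
    order = min(order, len(history) - 1)
    if order < 1:
        return mean
    x = [v - mean for v in history]
    a = levinson_durbin(autocorr(x, order), order)
    return mean + sum(a[k] * x[-1 - k] for k in range(order))
```

The long-term average over the same history, `sum(history) / len(history)`, is the baseline the prediction is compared against. The length of `history` plays the role of the ‘history size’ in the table.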
Tasks • Collect database of environmental noises • Sennheiser in-ear microphones / DAT tape (extract circuit diagrams from Bochum again...) • Develop prediction algorithms • Multichannel Linear Prediction • Neural Network
Crouzet, O. & Ainsworth, W.A. (2001). Envelope information in speech processing: Acoustic-phonetic analysis vs. auditory figure-ground segregation. Proceedings of Eurospeech 2001 (ESCA 8th European Conference on Speech Communication and Technology), 3rd-7th September 2001, Aalborg, Denmark.
Crouzet, O. & Ainsworth, W.A. (2001). On the various influences of envelope information on the perception of speech in adverse conditions: An analysis of between-channel envelope correlation. CRAC Workshop (Consistent and Robust Acoustic Cues for sound analysis), 2nd September 2001, Aalborg, Denmark.