Audiovisual Display and Audiovisual Recognition in Free Field Environments: Caves, Cars, and Critical Bands. Mark Hasegawa-Johnson, January 29, 2003. Collaborators: Bowon Lee, Camille Goudeseune, Zhinian Jing, Danfeng Li, Thomas Huang, Stephen Levinson.
What is “free-field audio”?
Problems of free-field audio • CONTROL: acoustics of the room • CONTROL: where the user stands/sits • CONTROL: background noise
Topic #1: Audio Display for a Six-Walled Virtual Reality Theater (Beckman CUBE)
Solution: Regularized Semi-Inversion of Matrix Frequency Response
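The slide title does not spell out the inversion, so the following is only a minimal sketch of one standard way to regularize the inverse of a matrix frequency response: a Tikhonov-regularized pseudo-inverse, computed independently at each frequency bin. The function name `regularized_inverse_2col`, the two-loudspeaker restriction, and the weight `lam` are assumptions made for illustration and are not taken from the talk.

```python
def regularized_inverse_2col(H, lam):
    """Tikhonov-regularized pseudo-inverse G = (H^H H + lam*I)^-1 H^H.

    H is the room response at ONE frequency bin, given as a list of rows
    (one row per control point), each row holding two complex gains
    (one per loudspeaker). lam > 0 is an assumed regularization weight.
    Returns G as two rows (one per loudspeaker), one column per control point.
    """
    # Gram matrix A = H^H H + lam*I, a 2x2 Hermitian matrix [[a, b], [c, d]].
    a = sum(abs(row[0]) ** 2 for row in H) + lam
    d = sum(abs(row[1]) ** 2 for row in H) + lam
    b = sum(row[0].conjugate() * row[1] for row in H)
    c = b.conjugate()

    # Closed-form 2x2 inverse; det is strictly positive whenever lam > 0,
    # which is what keeps the inverse filter gains bounded.
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]

    # Multiply by H^H, whose entry (k, m) is conj(H[m][k]).
    return [
        [inv[k][0] * row[0].conjugate() + inv[k][1] * row[1].conjugate()
         for row in H]
        for k in range(2)
    ]


# Nearly singular response: the two loudspeakers look almost identical.
H_bin = [[1 + 0j, 1 + 0j], [1 + 0j, 1.0001 + 0j]]
G_bin = regularized_inverse_2col(H_bin, lam=0.01)
```

Increasing `lam` trades inversion accuracy for smaller filter gains; with `lam` close to zero the nearly singular example above would demand very large gains.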
One More Problem: Image Source Method only accurate for t<100ms
Heuristic solution: Window the simulation with a decaying window
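As a rough illustration of the windowing heuristic, the sketch below leaves a simulated impulse response untouched up to 100 ms and attenuates it exponentially afterwards. The window shape, the function name `window_impulse_response`, and the time constant `tau` are assumptions; the slide only says that a decaying window is applied.

```python
import math


def window_impulse_response(h, fs, t_valid=0.1, tau=0.05):
    """Apply a decaying window to a simulated impulse response.

    h       : list of impulse-response samples
    fs      : sampling rate in Hz
    t_valid : time (s) up to which the simulation is trusted (100 ms here)
    tau     : assumed exponential decay time constant (s) after t_valid
    """
    windowed = []
    for n, sample in enumerate(h):
        t = n / fs
        # Unit gain while the simulation is accurate, exponential decay after.
        gain = 1.0 if t <= t_valid else math.exp(-(t - t_valid) / tau)
        windowed.append(gain * sample)
    return windowed


# Example: a flat 300 ms "response" sampled at 10 kHz.
h_windowed = window_impulse_response([1.0] * 3000, fs=10000)
```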
Audiovisual Speech Recognition: Audio and Visual Integration (Chu and Huang, 2002)
Recognition Features based on Auditory Scene Analysis: 1. Bandpass Filters on a Semilog Freq Scale (Simulate the Inner Ear)
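The exact frequency warping is not given on the slide. As a stand-in, the sketch below places filter center frequencies at equal steps on the mel scale, which is roughly linear at low frequencies and logarithmic at high frequencies. The choice of the mel formula, the function names, and the band edges are all assumptions for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale formula: near-linear below ~700 Hz, log above."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def center_frequencies(n_bands, f_lo, f_hi):
    """Return n_bands (>= 2) center frequencies in Hz, equally spaced in mel."""
    m_lo, m_hi = hz_to_mel(f_lo), hz_to_mel(f_hi)
    step = (m_hi - m_lo) / (n_bands - 1)
    return [mel_to_hz(m_lo + k * step) for k in range(n_bands)]


# Example: 8 bands between 100 Hz and 4 kHz (values chosen for illustration).
centers = center_frequencies(8, 100.0, 4000.0)
```

Each center frequency would then drive one bandpass filter; the spacing between neighboring centers widens with frequency, as it does along the cochlea.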
Recognition Features based on Auditory Scene Analysis: 2. Correlogram
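A correlogram is commonly built by computing a short-time autocorrelation in every sub-band and stacking the results by band. The sketch below assumes the bandpass filtering has already been done and that `band_signals` holds one frame per band; the function names and the simple peak-picking rule are illustrative assumptions, not the talk's implementation.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of one frame for lags 0..max_lag."""
    n = len(x)
    return [
        sum(x[i] * x[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def correlogram(band_signals, max_lag):
    """One autocorrelation row per sub-band (rows = bands, columns = lags)."""
    return [autocorrelation(band, max_lag) for band in band_signals]


def pitch_period(acf, min_lag):
    """Lag (in samples) of the largest autocorrelation value at or above min_lag."""
    return max(range(min_lag, len(acf)), key=lambda lag: acf[lag])


# Example: a single band containing a sinusoid with a 50-sample period.
band = [math.sin(2 * math.pi * i / 50) for i in range(400)]
rows = correlogram([band], max_lag=80)
period = pitch_period(rows[0], min_lag=20)
```

The `min_lag` bound keeps the search away from the trivial maximum at lag 0.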
Recognition Features based on Auditory Scene Analysis: Correlogram (figure axes: band center frequency, sub-band pitch period)
Recognition Features based on Auditory Scene Analysis: 3. Periodicity-Weighted Spectrum (Jing and Hasegawa-Johnson, 2001)
Conclusions • Image source simulated room response • 12 dB dereverberation, 15 dB early echo suppression • Speech recognition in a car: two cameras, two microphones • Speech recognition with Auditory Scene Analysis: error rate halved at 0 dB