Van den Bogaert T., Doclo S., Moonen M. and Wouters J. ICASSP 2007

Combining noise reduction and binaural cue preservation for hearing aids: MWF-ITF
Multichannel Wiener Filter with Interaural Transfer Function
Van den Bogaert T., Doclo S., Moonen M. and Wouters J.
ICASSP 2007
Download at https://gilbert.med.kuleuven.be/~u0041407/2007ICASSP.pdf
Contact: Tim.vandenbogaert@med.kuleuven.ac.be

Overview
• Problem statement: binaural hearing aids, noise reduction and preservation of binaural cues
• Multichannel Wiener Filter approaches:
  • MWF: a standard N-microphone multichannel Wiener filter approach
  • MWF-ITF: extension of the MWF with an integrated Interaural Transfer Function
• Experimental results: SRT and localization
  • Objective measures
  • Perceptual measures (N=5)
ICASSP 2007 Van den Bogaert et al.

Problem statement
• Hearing impairment → reduced speech intelligibility in background noise (even with amplification)
• Signal processing to selectively enhance the useful speech signal
• Multiple microphones available: spectral + spatial processing
• Most hearing-impaired listeners are fitted with hearing aids at both ears
• Binaural hearing: everything relating to hearing simultaneously with two ears (vs. bilateral)
• Binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localization, so it is important to preserve them
• Current bilateral noise reduction systems have been reported to have a negative impact on binaural hearing [Van den Bogaert et al. 2005, Van den Bogaert et al. 2006]

Problem statement
Bilateral system:
  - If adaptive, typically no control over left/right processing
  - Hard to preserve interaural cues (microphone mismatch, imperfections, ...)
Binaural system:
  + More microphones = more performance?
  + More control over left/right processing?
  - Need for a binaural link

Problem statement
[Figure: signal source and listener, showing ILD and IPD/ITD]
• Main binaural cues
  • Interaural phase differences (IPD) or interaural time differences (ITD)
    • ITD range: from 10 µs to 700 µs
    • Carried by the signal for f < 1300 Hz and/or by the low-frequency envelope for complex sounds
  • Interaural level differences (ILD)
    • ILD range: from 1 dB up to more than 30 dB
    • Physically significant for f > 2000 Hz
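The two cues above can be illustrated with a minimal sketch that estimates a broadband ITD and ILD from a pair of ear signals. This is not the procedure used in the talk: the function name `estimate_itd_ild`, the cross-correlation peak as ITD estimate and the energy ratio as ILD estimate are all assumptions chosen for illustration.

```python
import numpy as np

def estimate_itd_ild(left, right, fs):
    """Return (itd_seconds, ild_db) for two equal-length ear signals.

    Illustrative assumption: ITD is the lag of the cross-correlation peak
    (positive when the right-ear signal lags the left-ear signal) and ILD
    is the left/right energy ratio in dB.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Full cross-correlation; index 0 corresponds to lag -(len(left) - 1).
    xcorr = np.correlate(right, left, mode="full")
    lag = int(np.argmax(xcorr)) - (len(left) - 1)
    itd = lag / fs
    ild = 10.0 * np.log10(np.sum(left ** 2) / np.sum(right ** 2))
    return itd, ild

# Toy source on the left: the right ear gets the signal 8 samples later
# (500 microseconds at 16 kHz) and at half the amplitude (about 6 dB lower).
rng = np.random.default_rng(0)
fs = 16000
left = rng.standard_normal(4000)
right = 0.5 * np.concatenate([np.zeros(8), left[:-8]])
itd, ild = estimate_itd_ild(left, right, fs)
print(f"ITD = {itd * 1e6:.0f} us, ILD = {ild:.1f} dB")
```

A broadband estimate like this ignores the frequency dependence noted on the slide (ITD dominant below about 1300 Hz, ILD above about 2000 Hz); a per-band version would filter both signals first.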

Problem statement
• Design criteria for binaural noise reduction in hearing aids (HAs):
  • Maximize noise reduction by using all available microphone signals (binaural link assumed); 2 outputs needed
  • Preserve the binaural cues
  • Limit the amount of speech distortion
  • (HA constraints: robustness of the system, low complexity, …)

Overview of binaural noise reduction techniques
• Different approaches
  • BSS methods (Robert Aichner, ICASSP, 3 days ago)
  • Fixed beamforming [e.g. Desloge 1997]
    • Low complexity
    • Limited performance; only speech cues may be preserved (in ideal situations)
  • CASA-based techniques [e.g. Wittkop 2003]
    • Perfect preservation of speech/noise cues
    • Mostly for 2 microphones; "spectral subtraction"-like problems
  • Adaptive beamforming based on the GSC structure, passing the low-frequency part of the signal unprocessed [e.g. Welker 1997]
    • Preserves parts of the binaural cues
    • Substantial drop in noise reduction
  • Binaural multichannel Wiener filter [e.g. Doclo 2002, Spriet 2004]
    • Speech cues are preserved
    • No assumptions about positions of sources and microphones
    • Noise cues may be distorted
Extension of MWF: preservation of binaural speech and noise cues without substantially compromising noise reduction performance

Multichannel Wiener Filter
A hearing aid listening scenario
[Figure: speaker and noise source around a listener wearing two hearing aids]
• 2M microphones: input signals Y0(ω), Y1(ω), each the sum of a speech component and a noise component
• Filters W0(ω), W1(ω) give the filtered outputs Z0(ω), Z1(ω)
• Goal: to estimate the speech component at the reference microphone of each hearing aid (r0, r1), typically the front omnidirectional one: the standard multichannel Wiener filter

Multichannel Wiener Filter
• To control or reduce speech distortion, rewrite the cost function and introduce a parameter that trades off noise reduction against speech distortion
• [Equation not preserved in this transcript; its labelled parts are: speech distortion term, trade-off parameter, noise reduction term]
• Result: the speech distortion weighted multichannel Wiener filter
• In standard hearing aid beamforming, avoiding speech distortion is typically done by calibrating the speech reference path and removing the speech component in the noise reference path
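Since the slide's equation did not survive, the following is a minimal sketch of the speech distortion weighted MWF in its commonly published form, for a single frequency bin. The notation is an assumption, not taken from the slide: `R_x` and `R_v` are the speech and noise correlation matrices, `mu` is the trade-off parameter, `ref` is the reference microphone index, the output is Z = W^H Y, and the filter minimizes E{|X_ref - W^H X|^2} + mu E{|W^H V|^2}, which gives W = (R_x + mu R_v)^{-1} R_x e_ref.

```python
import numpy as np

def sdw_mwf(R_x, R_v, ref, mu=1.0):
    """Speech distortion weighted MWF for one frequency bin (assumed notation).

    Solves (R_x + mu * R_v) W = R_x[:, ref], i.e. the minimizer of
    E|X_ref - W^H X|^2 + mu * E|W^H V|^2.
    """
    return np.linalg.solve(R_x + mu * R_v, R_x[:, ref])

def noise_output_power(W, R_v):
    """Residual noise power W^H R_v W at the filter output."""
    return float(np.real(np.vdot(W, R_v @ W)))

# Toy example: one speech source with a hypothetical steering vector `a`
# and spatially white noise on 4 microphones.
a = np.array([1.0, 0.5 + 0.5j, -0.3j, 0.2])
R_x = 2.0 * np.outer(a, a.conj())   # rank-1 speech correlation matrix
R_v = np.eye(4)                     # noise correlation matrix

W_1 = sdw_mwf(R_x, R_v, ref=0, mu=1.0)
W_5 = sdw_mwf(R_x, R_v, ref=0, mu=5.0)
print(noise_output_power(W_1, R_v), noise_output_power(W_5, R_v))
```

In this toy case a larger `mu` lowers the residual noise power and scales the speech estimate down further, which is the noise reduction versus speech distortion trade-off the slide describes.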

Multichannel Wiener Filter
• Estimate: f(VAD)
• Depends on the second-order statistics of speech and noise; no assumptions about the speech and noise sources (can be integrated in VAD)
• Perfectly preserves the interaural cues of the speech component, since the left and the right hearing aid each estimate the speech component in their own front microphone
• Shifts the interaural cues of the noise component to the cues of the speech component!
ITF-MWF: add a term related to the binaural cues of the noise component to the MWF cost function
• Possible cues: ITD, ILD, Interaural Transfer Function (ITF)

Multichannel Wiener Filter – ITF extension
• Goal: the ITF of the noise component at the output equals its ITF at the input
• Derived under the assumption of a single noise source
• The same can be done for both the speech and the noise component
• Question: how do beta and alpha influence localization and SNR improvement performance?
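The equations for this slide are not preserved. As a hedged sketch, the ITF extension can be written in the form commonly used in the MWF-ITF literature. The symbols are assumptions: X_0, X_1 and V_0, V_1 are the speech and noise components at the two reference microphones, **X** and **V** are the stacked speech and noise vectors, **W**_0 and **W**_1 are the left and right filters, and alpha and beta weight the speech and noise ITF terms.

```latex
% Input ITFs (single-source assumption), estimated from second-order statistics
\mathrm{ITF}_x^{\mathrm{in}} = \frac{E\{X_0 X_1^*\}}{E\{X_1 X_1^*\}}, \qquad
\mathrm{ITF}_v^{\mathrm{in}} = \frac{E\{V_0 V_1^*\}}{E\{V_1 V_1^*\}}

% Output ITF of the noise component
\mathrm{ITF}_v^{\mathrm{out}} = \frac{\mathbf{W}_0^H \mathbf{V}}{\mathbf{W}_1^H \mathbf{V}}

% Total cost: MWF cost plus weighted ITF terms for speech (alpha) and noise (beta)
J(\mathbf{W}) = J_{\mathrm{MWF}}(\mathbf{W})
  + \alpha \, E\{ | \mathbf{W}_0^H \mathbf{X} - \mathrm{ITF}_x^{\mathrm{in}} \, \mathbf{W}_1^H \mathbf{X} |^2 \}
  + \beta  \, E\{ | \mathbf{W}_0^H \mathbf{V} - \mathrm{ITF}_v^{\mathrm{in}} \, \mathbf{W}_1^H \mathbf{V} |^2 \}
```

With alpha = beta = 0 this reduces to the plain MWF cost; the noise term is zero exactly when the output ITF of the noise equals its input ITF.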

Experimental results
• Identification of HRTFs:
  • Binaural recordings on a CORTEX MK2 artificial head
  • 2 omnidirectional microphones on each hearing aid (d = 1 cm)
  • HRTFs measured at angles -90:15:90 and 90:30:270 (degrees), 1 m from the head
  • Conditions: T60 = 140 ms (T60 = 590 ms added), fs = 16 kHz, and one further parameter set to 1 (its symbol is missing in this transcript)
• Objective evaluation:
  • AI-weighted SNR improvement
  • ITD and ILD error
• Perceptual evaluation:
  • Headphone experiments with the recorded HRTFs
  • SRT measurements (50% speech intelligibility)
  • Localization using the prerecorded HRTFs: the S and N components are sent separately through the fixed filter, and subjects localize S and N in the room where the HRTFs were recorded

Experimental results
[Block diagram: left and right input signals → VAD and FFT → off-line computation of statistics → calculation of the filters for this specific scenario → frequency-domain filtering → IFFT → left and right output]
• Stored filters are converged for a condition SxNy, with Sx = speech-weighted noise from angle x and Ny = babble noise from angle y
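The frequency-domain filtering step of the block diagram can be sketched for a single frame as follows. This is an illustration under assumptions, not the authors' implementation: the function name `filter_frame`, the array shapes and the convention Z = W^H Y per bin are hypothetical, and windowing and overlap-add are left out.

```python
import numpy as np

def filter_frame(frame, W):
    """Apply fixed per-bin multichannel filters to one frame.

    frame: real array of shape (n_mics, n_samples)
    W:     complex array of shape (n_mics, n_samples // 2 + 1),
           one filter coefficient per microphone and frequency bin
    Returns the real output frame of length n_samples (Z = W^H Y per bin).
    """
    Y = np.fft.rfft(frame, axis=1)        # FFT of every microphone signal
    Z = np.sum(np.conj(W) * Y, axis=0)    # combine the microphones per bin
    return np.fft.irfft(Z, n=frame.shape[1])

# Sanity check with a trivial filter that just passes microphone 0.
rng = np.random.default_rng(0)
frame = rng.standard_normal((4, 256))
W = np.zeros((4, 129), dtype=complex)
W[0, :] = 1.0
out = filter_frame(frame, W)
print(np.allclose(out, frame[0]))
```

In the setup described on the slide, one such set of stored filters would exist per output (left and right), so the function would be called twice per frame.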

Experimental results: objective evaluation
• Error measures correlated with the design criteria:
  • Maximize speech intelligibility: intelligibility-weighted SNR improvement (left/right), where each frequency bin is weighted by the importance of the i-th frequency bin for speech intelligibility
  • Minimize interaural cue distortion
    • ILD of the speech and noise component
    • ITD of the speech and noise component (low-pass filter, 1500 Hz)
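The intelligibility-weighted SNR improvement can be sketched as a weighted sum of per-band SNR gains. The function name, the toy numbers and the normalization of the weights to sum to one are assumptions for illustration; the slide only states that band i is weighted by its importance for speech intelligibility.

```python
import numpy as np

def intelligibility_weighted_snr_gain(snr_in_db, snr_out_db, importance):
    """Weighted sum over bands of (SNR_out - SNR_in) in dB.

    importance: band importance values I_i; normalized here to sum to 1
    (an assumption made for this sketch).
    """
    snr_in_db = np.asarray(snr_in_db, dtype=float)
    snr_out_db = np.asarray(snr_out_db, dtype=float)
    weights = np.asarray(importance, dtype=float)
    weights = weights / weights.sum()
    return float(np.sum(weights * (snr_out_db - snr_in_db)))

# Three hypothetical bands: the middle band matters most and improves by 6 dB.
gain = intelligibility_weighted_snr_gain(
    snr_in_db=[0.0, 0.0, 0.0],
    snr_out_db=[10.0, 6.0, 2.0],
    importance=[1.0, 2.0, 1.0],
)
print(gain)  # 0.25 * 10 + 0.5 * 6 + 0.25 * 2 = 6.0
```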

Objective evaluation: localization, S0N60

Objective evaluation: SRT, S0N60
[Figure: noise reduction (SNR AI, weighted by the importance of the i-th frequency for speech intelligibility) as a function of α and β, T60 = 0.14 s]
Additions:
• For T60 = 0.59 s, left performance for S0N60 drops to 5 dB SNR AI and right performance drops to 7 dB SNR AI
• Going from 2 to 4 microphones gives a gain of about 3 dB SNR AI to 9 dB SNR AI compared to 2-microphone performance

SRT: perceptual evaluation
• Adaptive SRT procedure to find the 50% Speech Reception Threshold
• S0N60, Dutch VU sentences, T60 = 140 ms
• Average SRT without processing = -9.2 dB
• SRT improvements in the range 11-13 dB
• The binaural speech intelligibility advantage due to spatial separation of the speech and noise components does not seem to compensate for the loss in SNR improvement
• Addition: performance drops to around 6 dB SRT gain if T60 = 590 ms (S0N90 tested for N=2)
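The slide does not specify the adaptive procedure, so the following is only a generic sketch of a 1-up/1-down staircase, which converges on the 50% point. The step size, trial count, averaging window and the deterministic simulated listener are all assumptions made for illustration.

```python
def adaptive_srt(respond, start_snr_db=0.0, step_db=2.0, n_trials=20, n_average=10):
    """Generic 1-up/1-down staircase (an assumed procedure, not the talk's).

    respond(snr_db) returns True when the sentence is repeated correctly.
    The SNR is lowered after a correct response and raised after an error;
    the SRT estimate is the mean SNR over the last n_average trials.
    """
    snr = start_snr_db
    track = []
    for _ in range(n_trials):
        track.append(snr)
        snr += -step_db if respond(snr) else step_db
    tail = track[-n_average:]
    return sum(tail) / len(tail)

# Hypothetical, perfectly consistent listener with a true threshold of -9.2 dB.
estimate = adaptive_srt(lambda snr_db: snr_db >= -9.2)
print(estimate)
```

With this idealized listener the track ends up alternating between the two SNR levels on either side of the true threshold, so the estimate lands within one step of -9.2 dB; a real listener responds probabilistically and needs more trials.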

Localization: perceptual evaluation
• Condition SxN0: speech arrives from angle x, with x from -90° to +90° in steps of 30°; noise arrives from 0°.
• Perceptual procedure: calculate the MWF filters offline, trained on spatial condition SxNy. Then pass a telephone ring arriving from angle x and from angle y separately through the filters and store the results. Play these wav files over headphones to the subject and ask the subject to localize the telephone signal.

Localization: perceptual evaluation SxN0
[Figure: localization of Sx and localization of N0, for alpha = 0, beta = 0 and for alpha = 0, beta = 10]

Localization: perceptual evaluation SxN0
[Figures: localization error (°) of Sx and of N0 in condition SxN0, 5 subjects, plotted versus angle x (°) from -90 to 90 and versus beta (0, 0.1, 0.3, 1, 10, 100) for alpha = 0 and alpha = 0.5]

Localization: perceptual evaluation
• Sum of localization errors of Sx and N0
• Parameters can be tuned to achieve better overall localization performance, at the cost of some noise reduction
• There is a correlation between physical and perceptual evaluation, even for localization. However, the error measures are far from perfect (they do not include diffuseness, …)

Conclusions
• The (speech distortion weighted) MWF preserves the speech cues, not the noise cues
• MWF-ITF, by constraining the filters W to a region where the noise ITF is preserved, enables a trade-off between preservation of speech and noise cues and noise reduction performance (a solution, but not the perfect solution: multiple spectrally overlapping noise sources, ...)
• Preserving localization cues showed neither a large benefit in SRT score (expected from the spatial separation of speech and noise) nor a large reduction (expected from the extra constraints set on W)

Acknowledgements
Download at https://gilbert.med.kuleuven.be/~u0041407/2007ICASSP.pdf
Contact: Tim.vandenbogaert@med.kuleuven.ac.be

Objective evaluation: SRT (additions)
• The gain of going from 2 mics on one HA to 3 or 4 mics (low reverb):
  • Single noise scenario:
    • 2-mic performance: 5 to 19 dB SNR AI improvement
    • 2 to 3 mics: +2/+5 dB SNR AI improvement
    • 3 to 4 mics: +1/+4 dB SNR AI improvement
    • (maximum if the noise source is at the position of the 2 mics, so that the 3rd or 4th microphone adds a signal with a good SNR)
  • Multiple noise sources (3 noise sources):
    • 2-mic performance: around 7 dB SNR AI improvement
    • 2 to 3 mics: +2 dB SNR AI improvement
    • 3 to 4 mics: +2 dB SNR AI improvement

Localization: objective evaluation SxN0
[Figures: ITD error of the speech component and ITD error of the noise component, in %, versus angle of the speech source (-80 to 80), for beta = 0, 0.1, 0.3, 1, 10 and 100 (VU_man + auditec 0deg, α = 0, SNR = 0 dB)]
• Large β shifts the direction of the speech component towards that of the noise component → increase the weight α (cf. physical and perceptual evaluation)
