A unification of adaptive multi-microphone noise reduction systems

Ann Spriet (1,2), Simon Doclo (1), Marc Moonen (1), Jan Wouters (2)
(1) Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium
(2) ExpORL, Dept. Neurosciences, KU Leuven, Belgium
IWAENC-2006, 12.09.2006

Overview
• Signal-dependent multi-microphone noise reduction: LCMV, TF-LCMV, SDW-MWF, soft-constrained beamformer
• Definition of general cost function
• Derivation of existing + novel algorithms
• Theoretical work, no focus on implementation issues and simulation results

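Multi-microphone noise reduction
• Signal model: each microphone signal is the sum of a speech component and a noise component; the speech component is a single speech source filtered by the acoustic transfer functions (TFs), or equivalently the speech component in a reference microphone filtered by the TF-ratios
• Output signal: filtered and summed microphone signals
• Objectives:
  1. noise reduction
  2. limit speech distortion
• Reference signal for the speech distortion: speech source, speech component in a microphone, or speech component at the output of the fixed beamformer (FB)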
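The signal model and the two objectives above can be sketched for a single frequency bin as follows. This is a minimal illustration under assumptions that are not in the slides: the names (`h`, `S`, `V`, `w`), the array sizes, the random data, and the choice of microphone 0 as reference are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 4, 256  # hypothetical: 4 microphones, 256 frames of one frequency bin

# Single speech source S, acoustic transfer functions h, additive noise V.
h = rng.standard_normal(M) + 1j * rng.standard_normal(M)
S = rng.standard_normal(K) + 1j * rng.standard_normal(K)
V = 0.5 * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K)))

X = np.outer(h, S)   # speech component in each microphone
Y = X + V            # microphone signals

# TF-ratios: transfer functions relative to the reference microphone (index 0).
h_ratio = h / h[0]

# Output signal: filter-and-sum, z = w^H y (w is an arbitrary filter here).
w = rng.standard_normal(M) + 1j * rng.standard_normal(M)
Z = w.conj() @ Y
Z_speech = w.conj() @ X
Z_noise = w.conj() @ V

# The two quantities the objectives refer to, with the reference signal
# chosen as the speech component in microphone 0.
noise_energy = np.mean(np.abs(Z_noise) ** 2)
distortion_energy = np.mean(np.abs(X[0] - Z_speech) ** 2)
```

Choosing a different reference signal (the source itself or the fixed-beamformer output) only changes the first term inside `distortion_energy`.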

Multi-microphone noise reduction
• Second-order statistics: speech and noise correlation matrix
• TFs sometimes approximated by a steering vector using a free-field propagation model (delay, near-field, microphone characteristics), with p the position of the source with respect to the microphone array
• Distributed speech source (region P) [Yermeche 2004]
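As an illustration of the free-field approximation, the sketch below builds a steering vector and the resulting rank-1 speech correlation matrix. It assumes a far-field, delay-only model on a linear array, which is a simplification of the delay, near-field and microphone-characteristics model named above; the function name, array geometry and speed of sound are hypothetical choices.

```python
import numpy as np

def steering_vector(freq_hz, mic_positions_m, angle_rad, c=343.0):
    """Far-field, delay-only steering vector for a linear array.

    angle_rad is the angle between the array axis and the propagation
    direction of the plane wave; delays are relative to microphone 0.
    """
    delays = (mic_positions_m - mic_positions_m[0]) * np.cos(angle_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

mics = np.array([0.0, 0.02, 0.04, 0.06])       # hypothetical 2 cm spacing
d_broadside = steering_vector(1000.0, mics, np.deg2rad(90.0))
d_oblique = steering_vector(1000.0, mics, np.deg2rad(30.0))

# Under this model a single source gives a rank-1 speech correlation matrix.
source_power = 2.0
R_speech = source_power * np.outer(d_oblique, d_oblique.conj())
```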

Multi-microphone noise reduction
Two design choices: soft vs. hard constraint, and a-priori model vs. on-line estimation
• LCMV [Frost 1972, Griffiths-Jim 1982, Buckley 1986]
  - minimize output noise energy
  - hard constraint: no distortion
  - a-priori model (free-field)
• TF-LCMV [Gannot 2001, Hoshuyama 1999, Herbordt 2003]
  - minimize output noise energy
  - hard constraint: no distortion
  - on-line model: estimate TF-ratios
• Soft-constrained beamforming [Nordholm, Dam, Grbić, Low, 2002-2005]
  - soft constraint: trade-off output noise energy vs. speech distortion
  - a-priori model (speech region)
• SDW-MWF [Doclo 2002, Spriet 2004]
  - soft constraint: trade-off output noise energy vs. speech distortion
  - on-line speech model

General cost function
• Trade-off between output noise energy and speech distortion; the cost function has three terms:
  - output noise energy
  - speech distortion (on-line): based on on-line estimation
  - speech distortion (a-priori): based on a-priori knowledge (model, calibration)
• Different signal-dependent algorithms:
  - on-line estimation vs. a-priori knowledge
  - hard constraint (speech distortion weights 1, 2 = ∞) → signals in speech subspace undistorted, noise suppression in subspace orthogonal to speech subspace
  - soft constraint (speech distortion weights 1, 2 < ∞) → spectral filtering of desired speech
• On-line noise estimation (weight of a-priori noise term = 0) → focus on speech model (weights 1, 2)
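One way to make the three-term trade-off concrete is the quadratic sketch below. It is not the cost function from the slides (whose equation is not reproduced here): the weights `a1` and `a2`, the matrix names, and the quadratic form of the a-priori term are assumptions for illustration. Large weights approximate the hard constraints, and a zero weight switches the corresponding term off.

```python
import numpy as np

def general_cost_minimizer(R_noise, R_speech_online, e_ref, R_model, r_model, a1, a2):
    """Minimize, over the filter w (hypothetical formulation):

        w^H R_noise w                                          output noise energy
      + a1 * (e_ref - w)^H R_speech_online (e_ref - w)         on-line distortion
      + a2 * (w^H R_model w - 2 Re(w^H r_model) + const)       a-priori distortion

    Setting the gradient to zero gives a linear system in w.
    """
    A = R_noise + a1 * R_speech_online + a2 * R_model
    b = a1 * (R_speech_online @ e_ref) + a2 * r_model
    return np.linalg.solve(A, b)

rng = np.random.default_rng(1)
M = 3
G = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_noise = G @ G.conj().T + np.eye(M)            # Hermitian positive definite

h = rng.standard_normal(M) + 1j * rng.standard_normal(M)
R_speech_online = np.outer(h, h.conj())         # on-line estimate, unit source power
e_ref = np.zeros(M); e_ref[0] = 1.0             # reference = microphone 0

d = np.ones(M, dtype=complex)                   # a-priori steering vector (assumed)
R_model = np.outer(d, d.conj())
r_model = d.copy()                              # a-priori reference response of 1

zero = np.zeros((M, M))
# Large on-line weight, no a-priori term: approaches a TF-LCMV-like solution.
w_online_hard = general_cost_minimizer(R_noise, R_speech_online, e_ref, zero, 0 * d, 1e6, 0.0)
# Large a-priori weight, no on-line term: approaches an LCMV-like solution.
w_apriori_hard = general_cost_minimizer(R_noise, zero, e_ref, R_model, r_model, 0.0, 1e6)
```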


A-priori speech model (weight 1 = 0)
• assumptions about speaker location, acoustics, microphones → performance affected when assumptions are violated
• LCMV (weight 2 = ∞): hard constraint
  - speech source using free-field propagation model
  - reference signal:
  - solution:
• Soft-constrained beamforming (weight 2 < ∞):
  - soft constraint on (partially) modelled speech distortion term: a-priori model for spatial characteristics, on-line estimation of spectrum
  - speech source in region P
  - reference signal:
  - solution:
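For the hard-constrained case, the standard single-constraint LCMV solution can be sketched as below. The slide's own reference signal and solution are not reproduced here, so the constraint `w^H d = 1`, the function name and the test data are assumptions.

```python
import numpy as np

def lcmv_single_constraint(R_noise, d, ref_response=1.0):
    """Minimize w^H R_noise w subject to w^H d = ref_response."""
    Rinv_d = np.linalg.solve(R_noise, d)
    # np.vdot conjugates its first argument: d^H R_noise^{-1} d
    return Rinv_d * np.conj(ref_response) / np.vdot(d, Rinv_d)

rng = np.random.default_rng(2)
M = 4
G = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_noise = G @ G.conj().T + np.eye(M)

# A-priori steering vector (hypothetical delay-only model, unit-modulus entries).
d = np.exp(-2j * np.pi * 0.1 * np.arange(M))

w_lcmv = lcmv_single_constraint(R_noise, d)
w_das = d / M  # delay-and-sum also satisfies the constraint, for comparison

noise_lcmv = np.real(np.vdot(w_lcmv, R_noise @ w_lcmv))
noise_das = np.real(np.vdot(w_das, R_noise @ w_das))
```

Both filters pass the modelled source undistorted; the LCMV filter is the one with the lowest output noise energy among them, which is what `noise_lcmv <= noise_das` reflects. If the true transfer functions differ from `d`, the constraint no longer protects the speech, which is the sensitivity noted in the first bullet.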

On-line speech model (weight 2 = 0)
• reference signal = microphone signal
• typically requires VAD + noise more stationary than speech → performance affected by VAD errors and non-stationary noise
• TF-LCMV (weight 1 = ∞): hard constraint
  - output speech component = speech component in microphone signal
  - on-line estimate of TF-ratio using non-stationarity of speech
  - adaptive implementation: TF-GSC, adaptive blocking matrix [Gannot 2001]
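The TF-LCMV idea can be sketched with exact second-order statistics. Note the swap: instead of the non-stationarity-based TF-ratio estimator of [Gannot 2001], this sketch uses a simpler covariance-subtraction estimate (microphone correlation matrix minus a noise correlation matrix that a VAD would provide). Function names and data are hypothetical.

```python
import numpy as np

def tf_ratio_from_covariances(R_mics, R_noise, ref=0):
    """TF-ratio estimate from the reference column of R_speech = R_mics - R_noise.

    Assumes a single speech source, so that R_speech is (close to) rank 1.
    """
    R_speech = R_mics - R_noise
    return R_speech[:, ref] / R_speech[ref, ref]

def tf_lcmv(R_noise, tf_ratio):
    """Minimize output noise energy subject to w^H tf_ratio = 1."""
    Rinv_r = np.linalg.solve(R_noise, tf_ratio)
    return Rinv_r / np.vdot(tf_ratio, Rinv_r)

rng = np.random.default_rng(3)
M = 4
G = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_noise = G @ G.conj().T + np.eye(M)

h = rng.standard_normal(M) + 1j * rng.standard_normal(M)  # unknown true TFs
source_power = 1.5
R_mics = source_power * np.outer(h, h.conj()) + R_noise

ratio = tf_ratio_from_covariances(R_mics, R_noise)
w = tf_lcmv(R_noise, ratio)

# Response to the true source: equals h[0], so the output speech component
# matches the speech component in the reference microphone.
speech_response = np.vdot(w, h)
```

With estimated rather than exact correlation matrices, VAD errors leak speech into `R_noise` and bias `ratio`, which is the sensitivity listed in the bullets above.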

On-line speech model (2=0) • reference signal = microphone signal • typically requires VAD + noise more stationary than speech performance affected by VAD errors and non-stationary noise • SDW-MWF (1): • soft constraint on speech distortion • on-line estimate of speech correlation matrix using VAD • SDW-MWF = TF-LCMV + single-channel postfilter  spectral filtering • Multi-microphonenoise reduction • General cost function - a-priori model - on-line model - combined model • Conclusions

Combined speech model
• Combination of a-priori knowledge and on-line estimation → increase robustness against estimation errors
• SDR-GSC (weight 2 = ∞) [Spriet 2004]
  - combination of LCMV beamformer and SDW-MWF
  - hard constraint imposed through GSC structure (fixed beamformer FB + blocking matrix BM)
  - soft constraint: on-line estimated speech distortion between speech component in speech reference and output signal
• Soft-constrained SDW-MWF (weight 2 < ∞):
  - combination of SDW-MWF and soft-constrained beamformer
  - speech model: partially updated based on incoming data, partially computed a-priori using model or calibration data
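A rough sketch of the SDR-GSC structure under stated assumptions: the fixed beamformer and blocking matrix are built from an a-priori steering vector, and the adaptive filter minimizes residual noise plus a weighted speech-leakage term. The weight name `inv_mu`, the mismatch model and all data are hypothetical; exact correlation matrices stand in for on-line estimates.

```python
import numpy as np

def blocking_matrix(d):
    """Columns span the subspace orthogonal to d, so B^H d = 0."""
    U, _, _ = np.linalg.svd(d.reshape(-1, 1), full_matrices=True)
    return U[:, 1:]

def sdr_gsc_filter(R_speech, R_noise, f, B, inv_mu):
    """Adaptive filter w_a for the output z = f^H y - w_a^H B^H y, minimizing
    E|f^H v - w_a^H B^H v|^2 + inv_mu * E|w_a^H B^H x|^2."""
    Rv_b = B.conj().T @ R_noise @ B
    Rx_b = B.conj().T @ R_speech @ B
    return np.linalg.solve(Rv_b + inv_mu * Rx_b, B.conj().T @ R_noise @ f)

rng = np.random.default_rng(5)
M = 4
G = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_noise = G @ G.conj().T + np.eye(M)

d = np.ones(M, dtype=complex)                   # assumed a-priori steering vector
h = d + 0.2 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))  # true TFs
R_speech = np.outer(h, h.conj())

f = d / M                                       # fixed beamformer (delay-and-sum)
B = blocking_matrix(d)
Rx_b = B.conj().T @ R_speech @ B                # speech leakage in noise references

w_gsc = sdr_gsc_filter(R_speech, R_noise, f, B, inv_mu=0.0)   # plain GSC
w_sdr = sdr_gsc_filter(R_speech, R_noise, f, B, inv_mu=10.0)  # leakage penalized

leak_gsc = np.real(np.vdot(w_gsc, Rx_b @ w_gsc))
leak_sdr = np.real(np.vdot(w_sdr, Rx_b @ w_sdr))
```

When the a-priori model is exact (`h` proportional to `d`), `Rx_b` vanishes and the penalty has no effect; the soft term only matters when speech leaks through the blocking matrix.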

Conclusions
• General cost function:
  - output noise energy + speech distortion
  - on-line estimated vs. a-priori model
  - speech distortion: hard vs. soft constraint
• Derivation of signal-dependent multi-microphone noise reduction algorithms: LCMV, TF-LCMV, SDW-MWF, soft-constrained beamformer, SDR-GSC, soft-constrained SDW-MWF
• Extensions and combinations:
  - a-priori noise model: fixed beamforming
  - combination of on-line and a-priori noise model: e.g. sensitivity-constrained GSC
  - several other combinations are possible
• Combining a-priori knowledge and on-line estimation of both the speech and noise terms is anticipated to enhance robustness
