06-07-2010, University of Oldenburg, MEDI-AKU-SIGNAL Kolloquium

Signal Processing Algorithms for Wireless Acoustic Sensor Networks
Alexander Bertrand
Electrical Engineering Department (ESAT), Katholieke Universiteit Leuven

Outline • Introduction • Multi-channel Wiener filter (MWF) • Example: distributed MWF in binaural hearing aids • DANSE in fully connected WASN • Tree-DANSE • Multi-speaker VAD [Side labels: Noise reduction; Tracking of speech power]


Traditional sensor array DSP • Known / fixed sensor positions • Sharp angle • Centralized processing • #microphones is limited • Long distance (SNR drops 6 dB for each doubling of distance)
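The 6 dB figure can be checked with a short calculation. This is a minimal sketch, assuming free-field spherical spreading (sound pressure proportional to 1/distance) and a noise level that does not depend on distance; the function name `snr_drop_db` is hypothetical.

```python
import math

def snr_drop_db(distance_ratio):
    """SNR loss in dB when the source-microphone distance grows by distance_ratio.

    Assumes pressure amplitude ~ 1/distance and distance-independent noise.
    """
    return 20 * math.log10(distance_ratio)

print(f"doubling the distance costs {snr_drop_db(2):.2f} dB")
print(f"quadrupling the distance costs {snr_drop_db(4):.2f} dB")
```

Under these assumptions each doubling costs 20·log10(2) ≈ 6.02 dB, which is why a few nearby microphones can be worth more than many distant ones.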

Distributed sensor arrays: wireless acoustic sensor network (WASN) • More spatial information • More sensors • Subset: high SNR recordings

Challenges: distributed sensor arrays 1) Unknown/changing positions, link failure → ADAPTIVE 2) Bandwidth efficiency 3) Distributed processing 4) Subset selection


Multi-channel Wiener Filtering (MWF) • Goal: estimate the speech component in 1 of the N microphones • Output = sum of filtered microphone signals • Needs: - N x N noise+speech correlation matrix Ryy - N x 1 clean speech correlation (column of Rdd) • Rdd can be estimated as Rdd = Ryy - Rnn, where Rnn is estimated in noise-only frames identified by a voice activity detection (VAD) mechanism [Figure: filters W1 to W4 applied to the microphone signals and summed to give the clean speech estimate]
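The recipe on this slide can be sketched numerically. The following is a minimal illustration, not the speaker's implementation: it assumes real-valued instantaneous mixing (no frequency-domain processing), an oracle VAD, white sensor noise, and made-up gains `a`; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 4, 20000                          # microphones, samples (assumed sizes)
vad = np.arange(T) < T // 2              # oracle VAD: speech active in the first half
a = np.array([1.0, 0.8, 0.6, 0.9])       # hypothetical source-to-microphone gains

s = rng.standard_normal(T) * vad         # speech source, silent when VAD is off
d = np.outer(a, s)                       # clean speech component per microphone
y = d + rng.standard_normal((N, T))      # microphone signals = speech + noise

# Correlation matrices estimated from speech+noise frames and noise-only frames.
Ryy = y[:, vad] @ y[:, vad].T / vad.sum()
Rnn = y[:, ~vad] @ y[:, ~vad].T / (~vad).sum()
Rdd = Ryy - Rnn                          # clean speech correlation estimate

ref = 0                                  # arbitrary reference microphone
w = np.linalg.solve(Ryy, Rdd[:, ref])    # MWF: w = Ryy^{-1} * (column of Rdd)
d_hat = w @ y                            # output = sum of filtered microphone signals

mse_mwf = np.mean((d[ref, vad] - d_hat[vad]) ** 2)
mse_unfiltered = np.mean((d[ref, vad] - y[ref, vad]) ** 2)
print(f"MSE unfiltered: {mse_unfiltered:.3f}, MSE after MWF: {mse_mwf:.3f}")
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of Ryy explicitly; in a real system the statistics would be tracked adaptively and per frequency bin.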

Multi-channel Wiener Filtering (MWF) • RECAP • Given: N microphone signals • Choose one (arbitrary) reference microphone • MWF computes optimal filters such that the sum of the filter outputs is as close as possible to the speech component in the reference microphone

Noise frame: destructive interference • Noise = electro music [Figure: filters F1 to F4, outputs summed]

Speech frame: constructive interference • Noise = electro music [Figure: filters F1 to F4, outputs summed]

Outline • Introduction • Multi-channel Wiener filter (MWF) • Example: distributed MWF in binaural hearing aids • DANSE in fully connected WASN • Tree-DANSE • Multi-speaker VAD • Subset selection • Conclusions

Example: binaural hearing aids (= 2-node WASN) • Large bandwidth needed • Full matrix inversion [Figure: MWF left and MWF right connected by a binaural link]

Example: binaural hearing aids • Converges to optimum if single desired source (Doclo et al., 2007) [Figure: binaural link between the two hearing aids, with filters w11, g12, g21, w22]

Motivation for DANSE • > 2 nodes? E.g. supporting external sensor nodes or multiple hearing aid users.

Motivation for DANSE • > 2 nodes • Multiple desired sources, e.g. conversation monitoring.


DANSE • The previous cases require a more general framework: distributed adaptive node-specific signal estimation (DANSE) • Allows for multiple nodes (fully connected topology) • Allows for multiple target sources: estimating K sources requires communication of K-channel signals (DANSE_K)

DANSE • Considered here: - Fully connected WSN - Multi-channel sensor signal observations • Goal: each node estimates a node-specific signal, but all desired signals share a common latent signal subspace (dimension = # targets)

3 nodes, fully connected

Binaural hearing aids (revisited) [Figure: binaural link between the two hearing aids, with filters w11, g12, g21, w22]

Binaural hearing aids (revisited) • J=2, DANSE_2 (K=2) • Converges to optimum if #desired sources ≤ 2 • Auxiliary channels (capture signal space) [Figure: binaural link between the two hearing aids, with filters w11(1), g12(1), g21(1), w22(1) and w11(2), g12(2), g21(2), w22(2)]

Binaural hearing aids (revisited) • J=2, DANSE_K • Converges to optimum if K = # desired sources [Figure: binaural link between the two hearing aids]

Sequential updating Sequential round-robin update
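A round-robin update can be sketched for the simplest case, DANSE_1 with a single desired source. This is only an illustration under loud assumptions: real-valued instantaneous mixing, batch least-squares filters computed from an oracle clean-speech signal (instead of VAD-based estimates of Rdd), and made-up sizes and gains. All names (`ls_filter`, `node_inputs`, `w_kk`, ...) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
J, M, T = 3, 2, 5000                      # nodes, mics per node, samples (assumed sizes)

s = rng.standard_normal(T)                # single desired source, so K = 1
a = rng.uniform(0.5, 1.5, size=J * M)     # hypothetical source-to-microphone gains
d = np.outer(a, s)                        # speech component in every microphone
mix = np.eye(J * M) + 0.3 * rng.standard_normal((J * M, J * M))
y = d + mix @ rng.standard_normal((J * M, T))   # add spatially correlated noise

def ls_filter(x, target):
    """Least-squares filter w minimising ||target - w^T x||^2 (sample MWF)."""
    return np.linalg.solve(x @ x.T, x @ target)

def mse(x, w, target):
    return float(np.mean((target - w @ x) ** 2))

blocks = [slice(k * M, (k + 1) * M) for k in range(J)]   # microphones of node k
refs = [d[k * M] for k in range(J)]       # node-specific target: speech in its first mic

w_kk = [rng.standard_normal(M) for _ in range(J)]        # random initial local filters
z = np.stack([w_kk[k] @ y[blocks[k]] for k in range(J)]) # 1-channel broadcast signals

def node_inputs(k):
    """Own microphone signals stacked with the broadcast signals of the other nodes."""
    others = [q for q in range(J) if q != k]
    return np.vstack([y[blocks[k]], z[others]])

for _ in range(10):                       # sequential round-robin: one node at a time
    for k in range(J):
        w = ls_filter(node_inputs(k), refs[k])
        w_kk[k] = w[:M]                   # part applied to the node's own microphones
        z[k] = w_kk[k] @ y[blocks[k]]     # node k broadcasts its updated signal

danse_mse, central_mse, local_mse = [], [], []
for k in range(J):                        # final filter of each node on its current inputs
    x_k = node_inputs(k)
    danse_mse.append(mse(x_k, ls_filter(x_k, refs[k]), refs[k]))
    central_mse.append(mse(y, ls_filter(y, refs[k]), refs[k]))
    local_mse.append(mse(y[blocks[k]], ls_filter(y[blocks[k]], refs[k]), refs[k]))
    print(f"node {k}: local {local_mse[k]:.4f}, DANSE {danse_mse[k]:.4f}, "
          f"centralized {central_mse[k]:.4f}")
```

Each node only ever sees its own M channels plus J-1 broadcast channels, so its error is bounded below by the centralized MWF and above by a purely local filter; the printed values let one check how close the distributed scheme gets to the centralized one in this toy setup.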

DANSE with simultaneous updating • Simultaneous updating: parallel computing • Sometimes converges to the optimal solution, but not always • Solution: relaxation yields convergence and optimality
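The effect of relaxation can be illustrated on a toy fixed-point iteration. This is not DANSE itself: the scalar map `best_response` is a hypothetical stand-in for a simultaneous update that overshoots, and the convex-combination form x ← (1 − α)·x + α·update(x) is an assumption about what "relaxation" means here, since the slide's formula did not survive extraction.

```python
def iterate(update, x0, alpha, steps=50):
    """Repeatedly apply the relaxed update x <- (1 - alpha) * x + alpha * update(x)."""
    x = x0
    for _ in range(steps):
        x = (1 - alpha) * x + alpha * update(x)
    return x

def best_response(x):
    """Toy update with fixed point x* = 1.0 that overshoots (slope -1.2)."""
    return -1.2 * x + 2.2

unrelaxed = iterate(best_response, x0=0.0, alpha=1.0)  # alpha = 1: plain update
relaxed = iterate(best_response, x0=0.0, alpha=0.5)    # relaxed update

print(f"without relaxation: {unrelaxed:.1f}")
print(f"with relaxation:    {relaxed:.6f}")
```

With α = 1 the error is multiplied by −1.2 at every step and grows; with α = 0.5 it is multiplied by −0.1 and vanishes, so the iterate settles at the fixed point 1.0.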

DANSE with simultaneous updating: without relaxation (S-DANSE) • 4 nodes, 3-6 sensors/node

DANSE with simultaneous updating: with relaxation (rS-DANSE) • 4 nodes, 3-6 sensors/node

DANSE audio demo (tracking omitted) • Unfiltered • Centralized MWF • rS-DANSE

Robust DANSE • Theory: DANSE == centralized MWF, but…

Robust DANSE • Numerical errors due to: - Estimation errors in Rdd (especially at low SNR nodes) → ripple effect - Reference microphones are close to each other → ill-conditioned basis for signal subspace • Solution: estimate the speech component in the communicated signals, preferably from high SNR nodes (= Robust DANSE or R-DANSE) • Convergence is proven under certain dependency conditions

What if not fully connected?

What if not fully connected? Nodes must pass on information from other nodes 1) Nodes act as relays (virtually fully connected): - huge increase in bandwidth if limited connections - routing problem 2) Nodes broadcast the sum of all filtered inputs: - no increase in bandwidth - no routing problem (?)


What if not fully connected? FEEDBACK !!

What if not fully connected? • Intuition • Theoretical analysis • Conclusion: feedback causes major problems • Direct feedback (one edge) vs. indirect feedback (loops)

Direct feedback cancellation • Transmitter feedback cancellation

Direct feedback cancellation • Receiver feedback cancellation

What if not fully connected? • Intuition • Theoretical analysis • Conclusion: feedback causes major problems • Direct feedback (one edge) vs. indirect feedback (loops) • Prune to tree topology → T-DANSE (= still optimal output!!)
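Pruning a connected network to a tree removes all loops, and with them the indirect feedback paths. The slides do not say how the tree is chosen; the sketch below assumes a plain breadth-first spanning tree from an arbitrary root, with a hypothetical adjacency-list representation of the network.

```python
from collections import deque

def spanning_tree(adjacency, root):
    """Return the edges (parent, child) of a breadth-first spanning tree."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:          # first time v is reached: keep this edge
                parent[v] = u
                queue.append(v)
    return [(p, c) for c, p in parent.items() if p is not None]

# Hypothetical 4-node network containing the loop 0-1-2.
network = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
tree_edges = spanning_tree(network, root=0)
print(tree_edges)
```

A tree on n nodes has exactly n − 1 edges, so here the link between nodes 1 and 2 is dropped and only direct feedback over single edges remains to be handled.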

Multi-speaker VAD • Goal: track the individual speech power of multiple simultaneous speakers or other non-stationary sources (VAD) • Exploit spatial diversity from WASN [Figure legend: speaker, microphone]

Multi-speaker VAD → WASNs! • Ad-hoc microphone array • Assumptions: - Speakers in near-field - Speakers are independent - Limited noise/reverberation - Sources to track are well-grounded (= they attain zero values) • Advantages: - Array geometry unknown - Speaker positions unknown - Energy-based, low data rate → synchronization not crucial
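The "energy-based, low data rate" point can be made concrete: each microphone only needs to share one short-term power value per frame rather than the raw samples. The sketch below assumes non-overlapping frames and a hypothetical frame length; neither is specified on the slide.

```python
import numpy as np

def frame_power(x, frame_len):
    """Mean power of x in consecutive non-overlapping frames of frame_len samples."""
    n_frames = len(x) // frame_len               # an incomplete last frame is dropped
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.mean(frames ** 2, axis=1)

x = 2.0 * np.ones(1000)                          # toy microphone signal
p = frame_power(x, frame_len=256)
print(len(x), "samples reduced to", len(p), "power values")
```

Because only slowly varying power values are exchanged, small timing offsets between nodes shift a few samples across frame boundaries at most, which is consistent with synchronization not being crucial.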

Data model
