Digital Audio Signal Processing, Lecture 6: Reverberation & Dereverberation • Toon van Waterschoot / Marc Moonen • Dept. E.E./ESAT-STADIUS, KU Leuven • toon.vanwaterschoot@esat.kuleuven.be • marc.moonen@esat.kuleuven.be
Outline • Introduction • Problem statement • Application scenarios • Room acoustics • Dereverberation • Method 1: Beamforming • Method 2: Speech enhancement • Method 3: Blind system identification & inversion • Conclusion & open issues
Introduction: Problem statement • Clean sound > Room acoustics > Reverberant sound • desired: music example [clean] [reverberant] • undesired: speech example [clean] [reverberant] [very reverberant] • Reverberation has desired/undesired impact on sound quality and speech intelligibility • Research problems: • artificial reverberation synthesis • reverberation control/enhancement • dereverberation
Introduction: Application scenarios • Scenario-1: Sound reproduction • goal: sound control in acoustic environment (improved listening comfort/experience for audience) • preprocessing strategy • single-point > multiple-point > area (increasingly difficult) • applications: public address, home/automotive audio systems • Note: in a sound reproduction scenario, dereverberation is often referred to as equalization
Introduction: Application scenarios • Scenario-2: Sound acquisition • goal: sound control in electric environment (improved sound quality of microphone recordings) • postprocessing strategy • single-microphone > multi-microphone • applications: speech recognition, hearing aids, recording, … • Note: in contrast to AEC/AFC problems, the (de)reverberation problem is not related to the concurrent use of loudspeakers and microphones in the same acoustic environment
Outline • Introduction • Room acoustics • Dereverberation • Method 1: Beamforming • Method 2: Speech enhancement • Method 3: Blind system identification & inversion • Conclusion & open issues
Room acoustics: Overview • Acoustic waves • Key characteristics • Non-parametric models • Finite difference method • Finite/boundary element method • Image source method • Ray tracing method • Parametric models • (Digital waveguide mesh) • Impulse response • Room transfer function • Pole-zero model
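Room acoustics: Acoustic waves • Acoustic wave equation: a valid sound field always satisfies $\nabla^2 p(\mathbf{r},t) - \frac{1}{c^2}\frac{\partial^2 p(\mathbf{r},t)}{\partial t^2} = 0$ • $p(\mathbf{r},t)$ = sound pressure (function of space and time) • $c$ = speed of sound • $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ is the ‘Laplacian’ operator (Cartesian coordinates) • subject to boundary conditions, example ‘rigid wall’: $\frac{\partial p}{\partial \mathbf{n}} = 0$ (zero normal pressure gradient at the wall) • single point source at $\mathbf{r}_s$: $\nabla^2 p - \frac{1}{c^2}\frac{\partial^2 p}{\partial t^2} = -s(t)\,\delta(\mathbf{r}-\mathbf{r}_s)$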
Room acoustics: Acoustic waves • Acoustic wave equation > Helmholtz equation: $\nabla^2 P(\mathbf{r},\omega) + k^2 P(\mathbf{r},\omega) = 0$ • obtained from acoustic wave equation by applying a Fourier transform over the time variable • $k = \omega/c$ is the wave number • compose sound field as sum of “room modes” • Example: 2-D room, 6 x 10 m, rigid walls • mode 1: 17.1 Hz = 0.5*(343 m/s)/(10 m) • mode 2: 28.5 Hz = 0.5*(343 m/s)/(6 m) • mode 3 (1&2): 33.3 Hz = sqrt((17.1)^2+(28.5)^2) • mode 4: 34.3 Hz = (343 m/s)/(10 m) • mode 5 (2&4): 44.6 Hz = sqrt((28.5)^2+(34.3)^2) • [figure: pressure patterns of modes 1 to 5]
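The mode frequencies in the example can be checked numerically. The sketch below assumes the standard rigid-wall formula $f = \frac{c}{2}\sqrt{(n_x/L_x)^2 + (n_y/L_y)^2}$ for a 2-D rectangular room; the function name `mode_frequency` and the convention that `nx` counts along the 10 m side are choices made here for illustration, not part of the slides.

```python
import math

def mode_frequency(nx, ny, lx, ly, c=343.0):
    """Resonance frequency (Hz) of mode (nx, ny) in a 2-D rigid-walled room."""
    return 0.5 * c * math.hypot(nx / lx, ny / ly)

# 6 m x 10 m room from the example; nx counts along the 10 m side.
# Modes 1..5 of the slide correspond to (1,0), (0,1), (1,1), (2,0), (2,1).
modes = [(1, 0), (0, 1), (1, 1), (2, 0), (2, 1)]
freqs = [mode_frequency(nx, ny, lx=10.0, ly=6.0) for nx, ny in modes]

for (nx, ny), f in zip(modes, freqs):
    print(f"mode ({nx},{ny}): {f:.2f} Hz")
```

Mode 5 combines mode 2 (along the 6 m side) with mode 4 (second mode along the 10 m side), which is why its frequency is the root sum of squares of those two.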
Room acoustics: Key characteristics • Reverberation time (Sabine’s formula): $T_{60} \approx 0.161\,V/(\alpha S)$ (SI units) • $V$ = room volume, $S$ = total surface area of room • $\alpha$ = average absorption coefficient of surfaces ($0 \le \alpha \le 1$: 0 for ‘rigid wall’ (‘mirror’), 1 for ‘open window’) • time needed for 60 dB squared sound pressure decay • Critical distance: $d_c = \sqrt{QR/(16\pi)}$ • $Q$ = source directivity • $R = \alpha S/(1-\alpha)$ = room constant • distance at which direct = reverberant sound energy • Direct-to-reverberant ratio: $\mathrm{DRR} = QR/(16\pi d^2) = (d_c/d)^2$ • $d$ = source-observer distance • ratio of direct vs. reverberant sound energy
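As a rough numerical illustration of these three quantities, the sketch below uses the textbook statistical-room-acoustics expressions $T_{60} \approx 0.161\,V/(\alpha S)$, $d_c = \sqrt{QR/(16\pi)}$ with room constant $R = \alpha S/(1-\alpha)$, and $\mathrm{DRR} = (d_c/d)^2$. These formulas and all function names are assumptions made here; the exact constants on the original slide are not recoverable.

```python
import math

def sabine_rt60(volume, surface, alpha):
    """Reverberation time (s) from Sabine's formula, SI units."""
    return 0.161 * volume / (alpha * surface)

def critical_distance(surface, alpha, q=1.0):
    """Distance (m) at which direct and reverberant energy are equal."""
    room_constant = alpha * surface / (1.0 - alpha)
    return math.sqrt(q * room_constant / (16.0 * math.pi))

def drr_db(distance, surface, alpha, q=1.0):
    """Direct-to-reverberant ratio (dB) at a given source-observer distance."""
    dc = critical_distance(surface, alpha, q)
    return 10.0 * math.log10((dc / distance) ** 2)

# Hypothetical 6 m x 10 m x 3 m room with average absorption 0.2.
V = 6.0 * 10.0 * 3.0
S = 2.0 * (6.0 * 10.0 + 6.0 * 3.0 + 10.0 * 3.0)
rt60 = sabine_rt60(V, S, 0.2)
dc = critical_distance(S, 0.2)
```

By construction the DRR is 0 dB at the critical distance and drops by 6 dB per doubling of distance beyond it.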
Room acoustics: Non-parametric models (1) • Finite difference time domain (FDTD) method • spatio-temporal sampling on regular grid: • partial derivatives (spatial & temporal) in wave equation approximated by finite difference operator • FDTD wave equation • with boundary conditions…
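To make the finite-difference idea concrete, here is a minimal 1-D sketch (the slide most likely treats the 3-D case). It assumes the standard second-order central differences in time and space, which give the update $p_i^{n+1} = 2p_i^n - p_i^{n-1} + \lambda^2 (p_{i+1}^n - 2p_i^n + p_{i-1}^n)$ with Courant number $\lambda = c\,\Delta t/\Delta x \le 1$. The function name, the Gaussian initial condition, and the simple first-order rigid-wall boundary are illustrative choices.

```python
import numpy as np

def fdtd_1d(n_cells=200, n_steps=300, courant=0.9):
    """Propagate a Gaussian pressure pulse in a 1-D 'room' with rigid ends."""
    x = np.arange(n_cells)
    p = np.exp(-0.5 * ((x - n_cells / 2) / 5.0) ** 2)  # initial pressure
    p_prev = p.copy()                                   # approx. zero initial velocity
    lam2 = courant ** 2
    for _ in range(n_steps):
        p_next = np.empty_like(p)
        # interior update: central differences in time and space
        p_next[1:-1] = (2.0 * p[1:-1] - p_prev[1:-1]
                        + lam2 * (p[2:] - 2.0 * p[1:-1] + p[:-2]))
        # rigid walls: zero pressure gradient (first-order approximation)
        p_next[0] = p_next[1]
        p_next[-1] = p_next[-2]
        p_prev, p = p, p_next
    return p

field = fdtd_1d()
```

Choosing the Courant number above 1 makes the scheme unstable, which is the usual constraint linking the temporal and spatial sampling steps.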
Room acoustics: Non-parametric models (2) • Finite element method (FEM) • 4-step procedure to discretize boundary value problem • weak formulation of boundary value problem • integration by parts to relax differentiability requirements • subspace approximation of field and source functions • enforce orthogonality of approximation error to subspace • subspace approximation relies on FEM basis functions: • defined on arbitrarily constructed tetrahedral mesh • having small spatial support • FEM wave equation: • Boundary element method (BEM) • numerical approximation of Green’s function • (slide annotation: skip this part)
Room acoustics: Non-parametric models (3) • Ray tracing method • sound waves represented by “rays” • assumption of specular reflections (no diffraction), i.e. mirror-like reflection in which ray from a single incoming direction is reflected into a single outgoing direction • rays can be traced from sound source to observer
Room acoustics: Non-parametric models (4) • Image source method • reflections modeled as direct rays from “image source” • image sources = virtual sources located outside room • multiple reflections modeled as high-order image sources
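The image source construction can be sketched for first-order reflections in a 2-D rectangular room with walls at $x = 0$, $x = L_x$, $y = 0$, $y = L_y$: each wall mirrors the source to a virtual position outside the room, and the reflection then behaves like a direct ray from that image. The function names and the example geometry below are hypothetical.

```python
import math

def first_order_images(src, room):
    """Mirror the source across each of the four walls of a rectangular room."""
    xs, ys = src
    lx, ly = room
    return [(-xs, ys), (2 * lx - xs, ys), (xs, -ys), (xs, 2 * ly - ys)]

def arrival_times(src, mic, room, c=343.0):
    """Sorted arrival times (s) of the direct path and first-order reflections."""
    paths = [src] + first_order_images(src, room)
    return sorted(math.dist(p, mic) / c for p in paths)

# Hypothetical geometry: 10 m x 6 m room, source at (2, 3), observer at (8, 3).
times = arrival_times((2.0, 3.0), (8.0, 3.0), (10.0, 6.0))
```

Higher-order reflections follow by mirroring the image sources again, which is what makes the number of images grow quickly with reflection order.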
Room acoustics: Parametric models (1) • Impulse response • room response to “gunshot” source (impulse function) • conceptually simple model, straightforward interpretation • poor modeling efficiency (~10^3 parameters), high spatial variation • [figure: impulse response showing direct coupling, early reflections, and diffuse sound field]
Room acoustics: Parametric models (2) • Room transfer function (RTF) • assumptions: shoe-box shaped room / rigid walls • “assumed modes” solution of Helmholtz equation: • = set of (non-negligible) room modes • resonance frequency of m-th mode • damping factor of m-th mode • eigenfunction of m-th mode • normalization constant of m-th mode
Room acoustics: Parametric models (3) • Pole-zero model • RTF suggests use of pole-zero model • RTF denominator independent of source/observer positions • gain factor • minimum-phase zeros • non-minimum-phase zeros • “common acoustical poles” • special cases: • all-zero model = impulse response • all-pole model: represents room resonances only
Outline • Introduction • Room acoustics • Dereverberation • Problem statement • Overview of dereverberation methods • Method 1: Beamforming • Method 2: Speech enhancement • Method 3: Blind system identification & inversion • Conclusion & open issues
Dereverberation: problem & overview • PS: measurement noise not considered: • Reverberation as an additive signal degradation • Method 1: beamforming approach to dereverberation spatial separation of clean and reverberant sound • Method 2: speech enhancement approach to dereverberation transform-domain separation of clean and reverberant sound • Reverberation as a convolutive signal degradation • Method 3: blind system identification and inversion approach to dereverberation: deconvolution of reverberant sound
Outline • Introduction • Room acoustics • Dereverberation • Method 1: Beamforming • fixed beamforming • adaptive beamforming • Method 2: Speech enhancement • Method 3: Blind system identification & inversion • Conclusion & open issues
Method 1: Introduction • concept: spatial separation of direct and reverberant sound (cf. multi-microphone noise reduction) • difficulties compared to noise reduction: • spatial separation of direct sound and room reflections requires knowledge of reflection DOAs (~ room acoustics model) • reverberant sound is diffuse (comes from "all possible" directions, including source direction) • two distinct approaches: • fixed delay-and-sum beamformer • adaptive filter-and-sum beamformer
Method 1: Fixed DSB (cf. Lecture-2) • fixed DSB structure (cf. Topic-2): • fixed DSB = matched filter (maximizing white noise gain, WNG) in the case of • spatially white noise (not entirely true for reverberation!) • known sound source position • ideal omni-directional microphones
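A delay-and-sum beamformer can be sketched in a few lines under simplifying assumptions: the direct-path delay from the source to each microphone is known and is an integer number of samples. The function name `delay_and_sum` and the array layout are choices made here for illustration.

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Time-align and average microphone signals.

    mics   : (M, N) array, one row per microphone
    delays : integer direct-path delay (in samples) at each microphone
    """
    m, n = mics.shape
    out = np.zeros(n)
    for x, d in zip(mics, delays):
        out[: n - d] += x[d:]  # advance channel by its delay
    return out / m
```

After alignment the direct sound adds coherently across microphones, while reflections (arriving with different relative delays) add less coherently, which is the source of the DRR improvement.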
Method 1: Fixed DSB • expected DRR improvement of fixed DSB: • source to m-th microphone distance, • wave number • m-th microphone position vector • computed using “statistical room acoustics (SRA)” (with assumption that direct & (diffuse) reverberant component are uncorrelated, etc.) • depends on source-array distance + microphone separation • independent of reverberation time (!) (cfr ‘improvement’ of DRR)
Method 1: Adaptive FSB (cf. Lecture-2) • adaptive FSB structure (cf. Topic-2): • optimal solution (matched filter) depends on room model: ~ blind system identification & inversion (cf. below)
Outline • Introduction • Room acoustics • Dereverberation • Method 1: Beamforming • Method 2: Speech enhancement • cepstrum-based • LPC-based • spectrum-based • Method 3: Blind system identification & inversion • Conclusion & open issues
Method 2: Introduction • concept: enhancement of reverberant speech by modeling & reducing reverberant sound in transform domain • applicable to single- & multi-microphone sound acquisition • choice of transform domain results in three approaches: • cepstrum-based • LPC-based • spectrum-based
Method 2: Cepstrum-based • concept: • convolution in time domain ~ addition in cepstral(*) domain • reverberation can be subtracted in cepstral domain • cepstral subtraction: • speech = low-quefrency • room acoustics = high-quefrency • processing chain: cepstral analysis > cepstral subtraction > cepstral synthesis • (*) use complex cepstrum (= invertible)
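The "convolution becomes addition" property can be verified numerically. The sketch below uses the real cepstrum (inverse FFT of the log magnitude spectrum) as a simplification; the slide calls for the complex cepstrum, which additionally needs phase unwrapping to be invertible. The signals and the toy "room" response `h` are synthetic assumptions.

```python
import numpy as np

def real_cepstrum(x, nfft):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x, nfft)
    return np.fft.ifft(np.log(np.abs(spectrum))).real

rng = np.random.default_rng(0)
s = rng.standard_normal(64)                      # stand-in for clean signal
h = np.array([1.0, 0.0, 0.0, 0.5, 0.0, 0.25])    # toy room response
x = np.convolve(s, h)                            # reverberant signal

nfft = 256  # long enough to hold the full linear convolution
c_x = real_cepstrum(x, nfft)
c_sum = real_cepstrum(s, nfft) + real_cepstrum(h, nfft)
```

Here `c_x` and `c_sum` agree to numerical precision, so subtracting an estimate of the room cepstrum from `c_x` leaves the cepstrum of the clean signal.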
Method 2: LPC-based • linear predictive coding of reverberant speech: • reverberation hardly affects speech LPC coefficients • reverberation largely affects LPC residual • dereverberation reduces to LPC residual enhancement • based on knowledge of speech production process + spatial averaging (using multiple microphones) • processing chain: LPC analysis > LPC residual enhancement > LPC synthesis (using the LPC coefficients from the analysis stage)
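The analysis stage of this chain can be sketched with the autocorrelation method: solve the normal equations for the predictor coefficients, then filter the signal with the prediction-error filter to obtain the residual. The function names are hypothetical, and the residual enhancement step itself is not shown because the slide does not specify it.

```python
import numpy as np

def lpc(x, order):
    """Autocorrelation-method LPC.

    Returns a = [1, a1, ..., ap] such that the prediction error is
    e[n] = sum_k a[k] * x[n - k].
    """
    n = len(x)
    r = np.array([x[: n - k] @ x[k:] for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def lpc_residual(x, a):
    """Filter x with the FIR prediction-error filter A(z)."""
    return np.convolve(x, a)[: len(x)]
```

In the scheme on the slide, the residual would be enhanced and then passed through the synthesis filter 1/A(z) built from the same coefficients.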
Method 2: Spectrum-based • concept: late reverberation ~ (broadband) additive noise • spectral subtraction: • estimate “noise” energy & compute subtractive gain function • spectral subtraction assumes noise stationarity (cf. Lecture-3): not valid for reverberation! • estimation of “noise” energy based on statistical model for late reverberation • processing chain: TF analysis > spectral subtraction (driven by late reverberation energy estimator) > TF synthesis • Note: straightforwardly extendable to combined dereverberation & noise suppression
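One way to realize the late reverberation energy estimator is sketched below. It assumes an exponentially decaying statistical model, under which the late reverberant power in a frame is a scaled, delayed copy of the reverberant signal power, with decay rate $\delta = 3\ln(10)/T_{60}$. This particular model, the function name, and the gain floor are assumptions made here; the slide does not say which statistical model is used.

```python
import numpy as np

def late_reverb_gain(power, t60, frame_shift, delay_frames, gain_floor=0.1):
    """Power-domain spectral subtraction gain for late reverberation.

    power        : (frames, bins) STFT power of the reverberant signal
    t60          : reverberation time in seconds (assumed known)
    frame_shift  : STFT frame shift in seconds
    delay_frames : frames after which reverberation counts as 'late' (>= 1)
    """
    decay = 3.0 * np.log(10.0) / t60       # 60 dB energy decay over t60
    t_late = delay_frames * frame_shift
    late = np.zeros_like(power)
    late[delay_frames:] = np.exp(-2.0 * decay * t_late) * power[:-delay_frames]
    gain = 1.0 - late / np.maximum(power, 1e-12)
    return np.maximum(gain, gain_floor)    # floor limits musical noise
```

The gain is applied to the power spectrum (take its square root for an amplitude gain) before TF synthesis. Because the estimate tracks the signal frame by frame, it does not rely on the stationarity assumption of classical spectral subtraction.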
Outline • Introduction • Room acoustics • Dereverberation • Method 1: Beamforming • Method 2: Speech enhancement • Method 3: Blind system identification & inversion • all-zero model identification & inversion • all-pole model identification & inversion • Conclusion & open issues
Method 3: Introduction • concept: two-step procedure • step 1: identify room model (source > multiple microphones) • step 2: invert room model • highly non-trivial – difficulties: • source signal unknown > blind identification • (non-) invertibility of room model • model inversion sensitive to identification & numerical errors • two approaches based on different room models: • all-zero model • all-pole model
Method 3: Blind system identification • starting point: cross-relation error / nullifying filters • batch identification using EVD/SVD • vector of stacked & filtered RIRs lies in null space of microphone array covariance matrix • filters denote “erroneous zeros” (which can be removed) • zeros common to all RIRs cannot be identified • high & unknown RIR order / poor conditioning
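The cross-relation idea can be sketched for two microphones in the noise-free case. Since $x_1 = s * h_1$ and $x_2 = s * h_2$, it holds that $x_2 * h_1 = x_1 * h_2$, so the stacked vector $[h_1; h_2]$ lies in the null space of a matrix built from the microphone signals. The sketch assumes the RIR length is known and the two RIRs share no common zeros; function names are hypothetical.

```python
import numpy as np

def conv_matrix(x, length):
    """Rows [x[n], x[n-1], ..., x[n-length+1]] for all fully valid n."""
    rows = len(x) - length + 1
    return np.array([x[r : r + length][::-1] for r in range(rows)])

def cross_relation_identify(x1, x2, length):
    """Estimate (h1, h2) up to a common scale factor from x1, x2 alone."""
    a = np.hstack([conv_matrix(x2, length), -conv_matrix(x1, length)])
    _, _, vt = np.linalg.svd(a, full_matrices=False)
    v = vt[-1]  # right singular vector of the smallest singular value
    return v[:length], v[length:]
```

The estimate is only defined up to a scalar, and with an overestimated `length` the null space grows, which corresponds to the "erroneous zeros" mentioned above.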
Method 3: Blind system identification • PS: “vector of stacked & filtered RIRs lies in null space of microphone array covariance matrix”
Method 3: Blind system identification • PS: “zeros common to all RIRs, C(z), cannot be identified”: the common factor cannot be distinguished from the source signal, i.e. S’(z) = S(z) C(z)
Method 3: Inversion • Multiple-input/output inverse theorem (MINT): • exact solution exists if the RIRs have no common zeros and the inverse filters are sufficiently long • poor conditioning for near-common zeros • Inversion sensitive to system identification errors
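For two channels, MINT amounts to solving a linear system: find FIR filters $g_1, g_2$ such that $h_1 * g_1 + h_2 * g_2 = \delta$. The sketch below assumes two known RIRs of equal length $L$ without common zeros, and uses inverse filters of length $L-1$, which makes the system square. Function names are hypothetical.

```python
import numpy as np

def full_conv_matrix(h, n_cols):
    """Matrix C such that C @ g equals np.convolve(h, g) for len(g) == n_cols."""
    m = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        m[j : j + len(h), j] = h
    return m

def mint_inverse(h1, h2):
    """Two-channel MINT: g1, g2 with h1*g1 + h2*g2 = unit impulse."""
    lg = len(h1) - 1
    a = np.hstack([full_conv_matrix(h1, lg), full_conv_matrix(h2, lg)])
    d = np.zeros(a.shape[0])
    d[0] = 1.0
    g, *_ = np.linalg.lstsq(a, d, rcond=None)
    return g[:lg], g[lg:]
```

When the two RIRs have nearly common zeros, the stacked matrix becomes ill-conditioned and small errors in the RIR estimates are strongly amplified in the inverse filters.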
Method 3: Inversion • matched filtering: • can be interpreted as multiple-beam beamformers, having beams in direction of direct sound and 1st order reflections (note that the overall response has a peak at time = 0, corresponding to a constructive addition of all multi-path components) • matched filter = non-causal filter > pre-echo effect (can be alleviated by filter truncation)
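The peak and the pre-echo can both be seen in a small sketch. Assuming the RIRs are known, the matched filter for each channel is the time-reversed RIR, and the overall response is the sum of the RIR autocorrelations. The function name is hypothetical, and the filter is shifted by $L-1$ samples so that it can be stored as a causal array.

```python
import numpy as np

def matched_filter_response(rirs):
    """Overall source-to-output response when each channel is filtered
    with its time-reversed RIR (delayed by L-1 samples to be causal)."""
    return sum(np.convolve(h, h[::-1]) for h in rirs)

rng = np.random.default_rng(3)
rirs = [rng.standard_normal(32) for _ in range(4)]
response = matched_filter_response(rirs)
peak = int(np.argmax(response))
```

The peak sits at index $L-1$ (time 0 before the causality shift) with height equal to the total RIR energy. The nonzero samples before it are the pre-echo.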
Method 3: All-pole model • starting point: all-pole model with common acoustical poles • a priori identification of all-pole model • multi-channel LPC of estimated RIRs • spatial averaging of single-channel LPC coefficients • model inversion > fixed FIR filter (!)
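The last point, that inverting an all-pole model yields a fixed FIR filter, can be illustrated directly: if the room is modeled as $1/A(z)$, the inverse is simply $A(z)$. The coefficients below are arbitrary example values, not measured acoustical poles, and the function name is hypothetical.

```python
import numpy as np

def all_pole_impulse_response(a, n):
    """First n samples of the impulse response of 1/A(z), with a[0] == 1."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, min(i, len(a) - 1) + 1):
            acc -= a[k] * h[i - k]
        h[i] = acc
    return h

a = np.array([1.0, -1.3, 0.6])        # example (stable) all-pole denominator
h = all_pole_impulse_response(a, 50)  # modeled 'room' response
y = np.convolve(h, a)[:50]            # apply FIR inverse A(z)
```

The result `y` is a unit impulse, and the inverse filter has only as many taps as the model has poles plus one. This also means the inverse does not need to change with source or microphone position as long as the poles are common.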
Conclusion • reverberation is a complex physical phenomenon that can be modeled in a variety of ways • research problems related to reverberation: • artificial reverberation synthesis • reverberation control/enhancement • dereverberation • dereverberation is still a challenging problem! • Method 1: beamforming • Method 2: speech enhancement • Method 3: blind system identification & inversion