Exploring Acoustic Waves in Small and Large Spaces: A Journey through Speech Technology

Next talk focuses on the nature of the signal: • Acoustic waves in small spaces (sources) • Acoustic waves in large spaces (rooms) • So far: • Historical overview of speech technology - basic components/goals for systems • Quick review of DSP fundamentals • Quick overview of pattern recognition basics

Acoustic waves - a brief intro • A way to bridge from thinking about EE to thinking about acoustics: • Acoustic signals are like electrical ones, only much slower … • Pressure is like voltage • Volume velocity is like current (and impedance = pressure / volume velocity) • For wave solutions, c is a lot smaller • To analyze, look at constrained models of common structures: strings and tubes

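[Figure: segment of a stretched string between x and x + dx]

So $\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}$ is the wave equation for transverse vibration on a string, where c can be derived from the properties of the medium and is the wave propagation speed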

Solutions dependent on boundary conditions • Assume form f(t - x/c) for positive x direction • Then f(t + x/c) for negative x direction • Sum is A f(t - x/c) + B f(t + x/c)
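The travelling-wave form can be checked numerically. The following is a minimal sketch, assuming a Gaussian pulse for f and c = 340 m/s (both chosen only for illustration; the helper names are hypothetical). It estimates the two second derivatives by central differences and confirms that A f(t - x/c) + B f(t + x/c) satisfies the wave equation to within discretization error.

```python
import numpy as np

c = 340.0  # wave speed in m/s (assumed value for illustration)

def f(s):
    """Arbitrary smooth pulse shape (hypothetical choice: a 1 ms Gaussian)."""
    return np.exp(-(s / 1e-3) ** 2)

def y(x, t, A=1.0, B=0.5):
    """Sum of a forward-travelling and a backward-travelling wave."""
    return A * f(t - x / c) + B * f(t + x / c)

def wave_residual(x, t, dx=1e-3, dt=1e-6):
    """Return (y_tt - c^2 * y_xx, y_tt) using central differences at (x, t)."""
    y_tt = (y(x, t + dt) - 2 * y(x, t) + y(x, t - dt)) / dt**2
    y_xx = (y(x + dx, t) - 2 * y(x, t) + y(x - dx, t)) / dx**2
    return y_tt - c**2 * y_xx, y_tt

residual, y_tt = wave_residual(x=0.1, t=0.0)
# The residual should be tiny compared with either term of the equation.
print(abs(residual) / abs(y_tt))
```

Any twice-differentiable f would do here; the amplitudes A and B are what the boundary conditions pin down.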

[Figure: uniform tube along x, with the excitation at x = 0 and the open end at x = L] Uniform tube, source on one end, open on the other

Plane wave propagation for frequencies below ~4000 Hz (c = fλ)

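The volume velocity u(x, t) in the tube obeys the wave equation $\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}$

By looking at the solutions to this equation, we can show that c is the speed of sound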

+ - e jt - + + + + - Let u+(t - x/c) = A e j(t - x/c) and u-(t + x/c) = B e j(t + x/c) u(0,t) = ej t = A e j(t - 0/c) - B e j(t + 0/c)

u(0,t) = ej t = A e j(t - 0/c) - B e j(t + 0/c) Problem: Find A and B to match boundary conditions Solve for A and B (eliminate t) • Now you can get equation 10.24 in text, for excitation U() ej t : p(L,t) = 0 = A e j(t - L/c) + B e j(t + L/c) (upcoming homework problem) u(x,t) = cos [(L-x)/c] U() ej t cos [(L)/c] Poles occur when: f = (2n + 1)c/4L  = (2n + 1)πc/2L

c = 340 m/s, L = 17 cm, 4L = 0.68 m, so f1 = c/4L = 500 Hz

First 3 modes of an acoustic tube open at one end
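The mode frequencies follow directly from f = (2n + 1)c/4L. A small sketch, assuming the lossless uniform tube with c = 340 m/s and L = 17 cm (the function name is hypothetical):

```python
def tube_resonances(L, c=340.0, n_modes=3):
    """Resonances of a uniform lossless tube, driven at one end and open at the other."""
    return [(2 * n + 1) * c / (4 * L) for n in range(n_modes)]

freqs = tube_resonances(0.17)
# Round for display; the values come out near 500, 1500 and 2500 Hz.
print([round(f) for f in freqs])
```

These are odd multiples of the quarter-wavelength frequency, so the spacing between neighbouring modes is twice the lowest mode.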

Effect of losses in the tube • Upward shift in lower resonances • Poles no longer on unit circle - peak values in frequency response are finite

Effect of nonuniformities in the tube • Impedance mismatches cause reflections • Can be modeled as a succession of smaller tubes • Resonances move around - hence the different formants for different speech sounds
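To illustrate how a succession of smaller tubes produces reflections, here is a sketch that computes a reflection coefficient at each junction from the cross-sectional areas. It assumes one common convention for volume-velocity waves, r_k = (A_{k+1} - A_k)/(A_{k+1} + A_k); other texts use the opposite sign (for pressure waves), and the area values below are made up for the example.

```python
def reflection_coefficients(areas):
    """r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k) at each junction between sections."""
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]

# Hypothetical cross-sectional areas (cm^2) of four concatenated sections.
areas_cm2 = [1.0, 1.0, 3.0, 0.5]
r = reflection_coefficients(areas_cm2)
print(r)
```

Where neighbouring sections have equal area the coefficient is zero (no impedance mismatch, no reflection); the larger the area change, the closer its magnitude gets to 1.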

Acoustic reverberation • Reflection vs absorption at room surfaces • Effects tend to be more important than room modes for speech intelligibility • Also very important for musical clarity, tone

Sound energy in the room (assumed uniformly distributed and diffuse) [Equations: energy balance for the room with source power W]

Decay of intensity when source is shut off (W = 0) [Equations: decay of intensity over time and the resulting reverberation time]
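The decay after shut-off is usually summarized by the reverberation time RT60, the time for the intensity to fall by 60 dB. The sketch below assumes the classical Sabine diffuse-field model, I(t) = I0 exp(-cAt/4V), with V the room volume and A the total absorption area; this may not be the exact form used on the slides, and the room values are hypothetical.

```python
import math

def sabine_rt60(volume_m3, absorption_m2, c=340.0):
    """Time for a 60 dB drop under I(t) = I0 * exp(-c*A*t / (4*V))."""
    return 4.0 * volume_m3 * math.log(1e6) / (c * absorption_m2)

def intensity_db(t, volume_m3, absorption_m2, c=340.0):
    """Intensity in dB relative to the level at the moment of shut-off."""
    return 10.0 * math.log10(math.exp(-c * absorption_m2 * t / (4.0 * volume_m3)))

# Hypothetical room: 100 m^3 with 20 m^2 of total absorption.
rt60 = sabine_rt60(100.0, 20.0)
print(rt60, intensity_db(rt60, 100.0, 20.0))
```

With c = 340 m/s the constant works out to roughly 0.16 V/A in metric units, which is the familiar form of Sabine's formula.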

The phrase “two oh six” convolved with the impulse response of a room with a 0.5-second RT60
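The same operation can be sketched with synthetic data. The recording and the measured room response are not available here, so this example assumes a stand-in impulse response (exponentially decaying noise whose envelope falls 60 dB in 0.5 s) and a short tone burst in place of the speech; all names and values are illustrative.

```python
import numpy as np

def synthetic_rir(rt60=0.5, fs=8000, seed=0):
    """Stand-in room impulse response: noise under a 60 dB-per-rt60 envelope."""
    rng = np.random.default_rng(seed)
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    envelope = 10.0 ** (-3.0 * t / rt60)  # amplitude falls by 1000x (60 dB) at t = rt60
    h = rng.standard_normal(n) * envelope
    h[0] = 1.0  # direct sound
    return h

def reverberate(x, h):
    """Linear convolution of a dry signal with the impulse response."""
    return np.convolve(x, h)

fs = 8000
dry = np.sin(2 * np.pi * 440.0 * np.arange(fs // 10) / fs)  # 0.1 s tone burst
wet = reverberate(dry, synthetic_rir(0.5, fs))
print(len(dry), len(wet))
```

The output is longer than the input by the length of the impulse response minus one sample; that tail is the reverberant decay.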

Initial time delay gap = t0

Measuring room responses • Impulsive sounds • Correlation of mic input with random signal source (since R(x,y) = R(x,x) * h(t)) • Chirp input • Measured response also includes mic and speaker responses • No single room response (and the system is not really linear)
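The correlation method relies on R(x,y) = R(x,x) * h(t): if the source is white noise, R(x,x) is close to an impulse, so the cross-correlation of source and mic signal is approximately a scaled copy of h. A minimal sketch, assuming an idealized linear, noise-free "room" with a made-up four-tap response:

```python
import numpy as np

def estimate_impulse_response(x, y, n_taps):
    """Estimate h[k] from the cross-correlation of white-noise input x and output y."""
    n = len(x)
    power = np.dot(x, x) / n  # input variance, used to normalize the correlation
    r_xy = np.array([np.dot(x[: n - k], y[k:n]) for k in range(n_taps)])
    return r_xy / (n * power)

rng = np.random.default_rng(0)
h_true = np.array([1.0, 0.5, -0.3, 0.1])  # hypothetical room response
x = rng.standard_normal(100_000)          # random signal source
y = np.convolve(x, h_true)[: len(x)]      # simulated mic input
h_est = estimate_impulse_response(x, y, len(h_true))
print(np.round(h_est, 2))
```

The estimate improves as the noise record gets longer. In a real measurement the result also contains the microphone and loudspeaker responses.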

Effects of reverb • Increases loudness • “Early” loudness increase helps intelligibility • “Late” loudness increase hurts intelligibility • When noise is present, ill effects compounded • Even worse for machine algorithms

Dealing with reverb • Microphone arrays - beamforming • Reducing effects by subtraction/filtering • Stereo mic transfer function • Using robust features (for ASR especially) • Statistical adaptation

Artificial reverberation • Physical devices (springs, plate, etc.) • Simple electronic delay with feedback • FIR for early delays (think of “initial time delay gap” in concert halls), IIR for later decay • Explicit convolution with stored response
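The "simple electronic delay with feedback" can be sketched as a feedback comb filter, y[n] = x[n] + g y[n - D]. This is an illustrative assumption about the structure meant, not a specific device; the gain helper uses the usual relation that the recirculating signal should fall by 60 dB after RT60 seconds.

```python
import numpy as np

def feedback_comb(x, delay, gain):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay]."""
    y = np.array(x, dtype=float)
    for n in range(delay, len(y)):
        y[n] += gain * y[n - delay]
    return y

def gain_for_rt60(delay, fs, rt60):
    """Loop gain so that the echo train decays by 60 dB in rt60 seconds."""
    return 10.0 ** (-3.0 * delay / (fs * rt60))

impulse = np.zeros(10)
impulse[0] = 1.0
print(feedback_comb(impulse, delay=3, gain=0.5))
```

For an impulse input the output is a train of echoes spaced by the delay, each scaled by the gain relative to the previous one, which is the IIR "later decay" part of the scheme; the early reflections would be added with a separate FIR section.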
