Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology. Mark Hasegawa-Johnson jhasegaw@uiuc.edu University of Illinois at Urbana-Champaign, USA. Lecture 2: Acoustics of Vowel and Glide Production.
Lecture 2: Acoustics of Vowel and Glide Production
• One-Dimensional Linear Acoustics
• The Acoustic Wave Equation
• Transmission Lines
• Standing Wave Patterns
• One-Tube Models
• Schwa
• Front cavity resonance of fricatives
• Two-Tube Models
• The vowel /a/
• Helmholtz Resonator
• The vowels /u,i,e/
• Perturbation Theory
• The vowels /u/, /o/ revisited
• Glides
Standing Wave Patterns: Quarter-Wave Resonators Tube Closed at the Left End, Open at the Right End
Standing Wave Patterns: Half-Wave Resonators Tube Closed at Both Ends Tube Open at Both Ends
Schwa and Invv (the vowels in “a tug”)
F1 = 500 Hz = c/4L
F2 = 1500 Hz = 3c/4L
F3 = 2500 Hz = 5c/4L
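The quarter-wave pattern on this slide, F_n = (2n - 1)c/4L, can be checked numerically. The sketch below is only an illustration: the slide gives neither c nor L, so the speed of sound (35400 cm/s) and the tube length (17.7 cm) are assumed values, and the function name is hypothetical.

```python
# Quarter-wave resonator: tube closed at one end (glottis), open at the other (lips).
# ASSUMPTIONS (not stated on the slide): c = 35400 cm/s, L = 17.7 cm.

def quarter_wave_formants(length_cm, n_formants=3, c_cm_per_s=35400.0):
    """Return the first n resonances F_n = (2n - 1) * c / (4 * L), in Hz."""
    return [(2 * n - 1) * c_cm_per_s / (4.0 * length_cm)
            for n in range(1, n_formants + 1)]

schwa = quarter_wave_formants(17.7)
print([round(f) for f in schwa])  # prints [500, 1500, 2500]
```

With these assumed values the three resonances match the slide's F1, F2, and F3.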
Front Cavity Resonances of a Fricative
/s/: Front Cavity Resonance = 4500 Hz. 4500 Hz = c/4L if Front Cavity Length is L = 1.9 cm
/sh/: Front Cavity Resonance = 2200 Hz. 2200 Hz = c/4L if Front Cavity Length is L = 4.0 cm
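The same quarter-wave relation can be inverted to estimate the front cavity length from a measured resonance, L = c/4F. This is a rough sketch; the speed of sound (35400 cm/s) is an assumption, not a value given on the slide, so the results land close to, but not exactly on, the slide's rounded lengths.

```python
# Invert the quarter-wave relation F = c / (4 L) to estimate cavity length.
# ASSUMPTION: c = 35400 cm/s (the slide does not state the value it used).

def front_cavity_length_cm(resonance_hz, c_cm_per_s=35400.0):
    """Return the cavity length L = c / (4 F), in cm."""
    return c_cm_per_s / (4.0 * resonance_hz)

for label, resonance_hz in [("/s/", 4500.0), ("/sh/", 2200.0)]:
    print(label, round(front_cavity_length_cm(resonance_hz), 2), "cm")
```

Both estimates fall within about a millimeter of the slide's 1.9 cm and 4.0 cm; the small difference depends on the assumed c.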
Conservation of Mass at the Juncture of Two Tubes
Total liters/second transmitted = (velocity) × (tube area)
If A2 = A1/2, then U2(x,t) = 2 U1(x,t)
Two-Tube Model: Two Different Sets of Waves
Tube 1: Incident Wave P1+, Reflected Wave P1-
Tube 2: Incident Wave P2-, Reflected Wave P2+
Approximate Solution of the Two-Tube Model, A1 >> A2
Approximate solution: Assume that the two tubes, of lengths LBACK and LFRONT, are completely decoupled, so that the formants include
- F(BACK CAVITY) = c/4LBACK
- F(FRONT CAVITY) = c/4LFRONT
The Vowels /AA/, /AH/
LBACK = 8.8 cm: F2 = c/4LBACK = 1000 Hz
LFRONT = 12.6 cm: F1 = c/4LFRONT = 700 Hz
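The decoupled approximation treats each cavity as its own quarter-wave resonator, which is easy to sketch in code. Only the lowest resonance of each cavity is computed here; the speed of sound (35400 cm/s) and the function name are assumptions made for illustration.

```python
# Decoupled two-tube approximation: each cavity resonates on its own at c / (4 L).
# ASSUMPTION: c = 35400 cm/s (not stated on the slides).

def decoupled_two_tube_formants(l_back_cm, l_front_cm, c_cm_per_s=35400.0):
    """Return the lowest back- and front-cavity resonances, sorted ascending (Hz)."""
    f_back = c_cm_per_s / (4.0 * l_back_cm)
    f_front = c_cm_per_s / (4.0 * l_front_cm)
    return sorted([f_back, f_front])

# Cavity lengths taken from the /AA/, /AH/ slide.
f1, f2 = decoupled_two_tube_formants(l_back_cm=8.8, l_front_cm=12.6)
print(round(f1), round(f2))
```

With the slide's lengths, the longer front cavity supplies F1 near 700 Hz and the shorter back cavity supplies F2 near 1000 Hz.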
Acoustic Impedance
[Figure: two panels showing the acoustic impedance Z(x,jΩ) along the tube axis x]
Helmholtz Resonator
Resonance condition: -Z1(x,jΩ) = Z2(x,jΩ)
The Vowel /i/
Back Cavity = Pharynx. Resonances: 0 Hz, 2000 Hz, 4000 Hz
Front Cavity = Palatal Constriction. Resonances: 0 Hz, 2500 Hz, 5000 Hz
Back Cavity Volume = 70 cm³
Front Cavity Length/Area = 7 cm⁻¹
1/(2π√(MC)) = 250 Hz
Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
Formants (figure labels): 250 Hz, 2000 Hz, 2500 Hz
The Vowel /u/: A Two-Tube Model
Back Cavity = Mouth + Pharynx. Resonances: 0 Hz, 1000 Hz, 2000 Hz
Front Cavity = Lips. Resonances: 0 Hz, 18000 Hz, …
Back Cavity Volume = 200 cm³
Front Cavity Length/Area = 2 cm⁻¹
1/(2π√(MC)) = 250 Hz
Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
Formants (figure labels): 250 Hz, 1000 Hz, 2000 Hz
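The Helmholtz frequency 1/(2π√(MC)) quoted on the /i/ and /u/ slides can be evaluated from the cavity volume and the constriction length/area ratio. The sketch below assumes the usual lumped-element definitions, acoustic mass M = ρ(l/A) and acoustic compliance C = V/(ρc²), which the slides do not spell out, and an assumed speed of sound of 35400 cm/s. Under those definitions the air density ρ cancels.

```python
import math

# Helmholtz resonance F = 1 / (2*pi*sqrt(M*C)).
# ASSUMPTIONS (not on the slides): M = rho * (l/A), C = V / (rho * c^2),
# and c = 35400 cm/s. The density rho cancels in the product M*C.

def helmholtz_resonance_hz(back_volume_cm3, length_over_area_per_cm,
                           c_cm_per_s=35400.0):
    """Return c / (2*pi*sqrt(V * l/A)), the Helmholtz resonance in Hz."""
    return c_cm_per_s / (
        2.0 * math.pi * math.sqrt(back_volume_cm3 * length_over_area_per_cm)
    )

f_i = helmholtz_resonance_hz(70.0, 7.0)    # /i/: V = 70 cm^3,  l/A = 7 cm^-1
f_u = helmholtz_resonance_hz(200.0, 2.0)   # /u/: V = 200 cm^3, l/A = 2 cm^-1
print(round(f_i), round(f_u))
```

With these assumptions /i/ comes out at roughly 255 Hz and /u/ at roughly 280 Hz, both in the neighborhood of the rounded 250 Hz shown on the slides.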
The Vowel /u/: A Four-Tube Model
Tubes (figure labels): Pharynx, Velar Tongue Body Constriction, Mouth, Lips
Two Helmholtz Resonators = Two Low-Frequency Formants!
F1 = 250 Hz
F2 = 500 Hz
F3 = Pharynx resonance, c/2L = 2000 Hz
Perturbation Theory (Chiba and Kajiyama, The Vowel, 1940)
A(x) is constant everywhere, except for one small perturbation.
Method:
1. Compute formants of the “unperturbed” vocal tract.
2. Perturb the formant frequencies to match the area perturbation.
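Sensitivity Functions for the Quarter-Wave Resonator (Lips Open)
[Figure: sensitivity functions plotted along the tube axis from x = 0 to x = L, with constriction locations marked for /AA/, /ER/, /IY/, /W/]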
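The two-step perturbation method can be sketched for the quarter-wave tube. The slides do not give a formula for the sensitivity function, so the expression used here is an assumption: the standard first-order result for a uniform closed-open tube, S_n(x) = -cos(2 k_n x)/L with k_n = (2n - 1)π/2L and x measured from the closed (glottis) end. The function names and the example perturbations are hypothetical.

```python
import math

# ASSUMPTION: first-order sensitivity of a uniform closed-open tube,
# S_n(x) = -cos(2 * k_n * x) / L, with x measured from the closed (glottis) end.

def quarter_wave_sensitivity(n, x_over_l):
    """Return L * S_n(x): sensitivity of formant n to a relative area increase."""
    k_l = (2 * n - 1) * math.pi / 2.0  # k_n * L for the closed-open tube
    return -math.cos(2.0 * k_l * x_over_l)

def perturbed_formant(f_unperturbed, n, x_over_l, rel_area_change, rel_width):
    """Step 2 of the method: dF/F ~= S_n(x) * (dA/A) * dx, for a short perturbation."""
    shift = quarter_wave_sensitivity(n, x_over_l) * rel_area_change * rel_width
    return f_unperturbed * (1.0 + shift)

# Step 1: unperturbed F1 = 500 Hz (the schwa value).
# Step 2: halve the area over 10% of the tube length, at the lips or at the glottis.
f1_lips = perturbed_formant(500.0, n=1, x_over_l=1.0, rel_area_change=-0.5, rel_width=0.1)
f1_glottis = perturbed_formant(500.0, n=1, x_over_l=0.0, rel_area_change=-0.5, rel_width=0.1)
print(round(f1_lips), round(f1_glottis))
```

Under this approximation a constriction at the lips lowers F1 and a constriction at the glottis raises it, which is the kind of pattern a sensitivity chart summarizes.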
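Sensitivity Functions for the Half-Wave Resonator (Lips Rounded)
[Figure: sensitivity functions plotted along the tube axis from x = 0 to x = L, with constriction locations marked for /L,OW/ and /UW/]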
Formant Frequencies of Vowels From Peterson & Barney, 1952
Summary
• Acoustic wave equation easiest to solve in frequency domain, for example:
  • Solve two boundary condition equations for P+ and P-, or
  • Solve the two-tube model (four equations in four unknowns)
• Quarter-Wave Resonator: Open at one end, Closed at the other
  • Schwa or Invv (“a tug”)
  • Front cavity resonance of a fricative or stop
• Half-Wave Resonator: Closed at the glottis, Nearly closed at the lips
  • /uw/
• Two-Tube Models
  • Exact solution: use reflection coefficient
  • Approximate solution: decouple the tubes, solve separately
• Helmholtz Resonator
  • When the two-tube model seems to have resonances at 0 Hz, use, instead, the Helmholtz Resonance frequency, computed with low-frequency approximations of acoustic impedance
  • /iy/: F1 is a Helmholtz Resonance
  • /uw/ and /ow/: Both F1 and F2 are Helmholtz Resonances
• Perturbation Theory
  • Perturbed area → Perturbed formants
  • Sensitivity function explains most vowels and glides in one simple chart