
Obstruent Acoustics



  1. Obstruent Acoustics Bonus Learning Fun!

  2. Motor Theory, in a nutshell • The big idea: • We perceive speech as abstract “gestures”, not sounds. • Evidence: • The perceptual interpretation of speech differs radically from the acoustic organization of speech sounds • Speech perception is multi-modal • Direct (visual, tactile) information about gestures can influence/override indirect (acoustic) speech cues • Limited top-down access to the primary, acoustic elements of speech

  3. Moving On… • One important lesson to take from the motor theory perspective is: • The dynamics of speech are generally more important to perception than static acoustic cues. • Note: visual chimerism and March Madness.

  4. Auditory Chimeras • Speech waveform + music spectrum: frequency bands 1 2 4 8 16 32 • Music waveform + speech spectrum: frequency bands 1 2 4 8 16 32 • Originals: • Source: http://research.meei.harvard.edu/chimera/chimera_demos.html

  5. Auditory Chimeras • Speech1 waveform + speech2 spectrum: frequency bands 1 2 4 6 8 16 • Speech2 waveform + speech1 spectrum: frequency bands 1 2 4 6 8 16 Originals:

  6. Closure Voicing • The low frequency information that passes through the stop “filter” appears as a “voicing bar” in a spectrogram. • This acoustic information provides hardly any cues for place of articulation. Armenian: [bag]

  7. Stop Transition Cues (again) • With the transition between stop closure and vowel, the perceptual task becomes much easier: • Try the same with Peter’s productions: • stop closures: • with transitions: • The moral of the story (again): • Dynamic changes provide stronger perceptual cues to place than static acoustic information.

  8. Release Bursts • Note: along with transitions, stops have another cue for place at their disposal. • = release bursts • (nasals do not have these) • Here’s a waveform of a [p] release burst: • duration ≈ 5 msec • What do you think the [p] burst spectrum will look like?

  9. Burst Spectrum • [p] bursts tend to have very diffuse spectra, with energy spread across a wide range of frequencies. • Also: [p] bursts are very weak in intensity. • Extremely short duration of bursts requires lots of damping in the waveform. • → broader frequency range

  10. Release Bursts • In a spectrogram: • bilabial release bursts have a very diffuse spectrum, weakly spread across all frequencies. [p] burst • [p] bursts are relatively close to pure transient sounds.

  11. Transients • A transient is: • “a sudden pressure fluctuation that is not sustained or repeated over time.” • An ideal transient waveform:

  12. A Transient Spectrum • An ideal transient spectrum is perfectly flat:
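
The flat spectrum of an ideal transient can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 1024-sample length is an arbitrary choice for illustration, not something taken from the slides.

```python
import numpy as np

# An idealized transient: a single-sample click in an otherwise silent signal.
n_samples = 1024  # arbitrary length, chosen only for illustration
impulse = np.zeros(n_samples)
impulse[0] = 1.0

# Magnitude spectrum of the click (real-input FFT).
magnitude = np.abs(np.fft.rfft(impulse))

# Every frequency bin has the same magnitude, so the spectrum is flat.
print("min magnitude:", magnitude.min())
print("max magnitude:", magnitude.max())
```

Real release bursts last a few milliseconds rather than a single sample, so their spectra are only approximately flat.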

  13. Burst Filtering • The spectra of more posterior release bursts may be filtered by the cavity in front of the burst. • Ex: [t] bursts tend to lack energy at the lowest end of the frequency scale. • And higher frequency components are somewhat more intense. [t] burst

  14. Release Bursts: [k] • Velar release bursts are relatively intense. • They also often have a strong concentration of energy in the 1500-2000 Hz range (F2/F3). • There can often be multiple [k] release bursts. [k] burst

  15. Another Look • [k] bursts tend to be intense right where F2 and F3 meet in the velar pinch: Armenian: [bag]

  16. Finally, Fricatives • The last type of sound we need to consider in speech is an aperiodic, continuous noise. • (Transients are aperiodic but not continuous.) • Ideally: • Q: What would the spectrum of this waveform look like?

  17. White Noise Spectrum • Technical term: White noise • has an unlimited range of frequency components • Analogy: white light is what you get when you combine all visible frequencies of the electromagnetic spectrum
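
As a rough illustration of the white noise idea, the sketch below generates Gaussian random samples and compares the average power in the lower and upper halves of the spectrum. It assumes NumPy; the seed and signal length are arbitrary. Any single stretch of noise has a jagged spectrum, so the comparison uses averages over many frequency bins.

```python
import numpy as np

# Gaussian white noise: each sample is drawn independently.
rng = np.random.default_rng(0)           # fixed seed, only for repeatability
noise = rng.standard_normal(2 ** 16)     # arbitrary length

# Power spectrum, leaving out the DC and Nyquist bins.
power = np.abs(np.fft.rfft(noise)) ** 2
power = power[1:-1]

# Compare average power in the low-frequency and high-frequency halves.
half = len(power) // 2
low_mean = power[:half].mean()
high_mean = power[half:].mean()
ratio = low_mean / high_mean

print("low/high power ratio:", ratio)  # close to 1 for white noise
```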

  18. Turbulence • We can create aperiodic noise in speech by taking advantage of the phenomenon of turbulence. • Some handy technical terms: • laminar flow: a fluid flowing in parallel layers, with no disruption between the layers. • turbulent flow: a fluid flowing with chaotic property changes, including rapid variation in pressure and velocity in both space and time • Whether or not airflow is turbulent depends on: • the volume velocity of the fluid • the area of the channel through which it flows

  19. Turbulence • Turbulence is more likely with: • a higher volume velocity • less channel area • All fricatives therefore require: • a narrow constriction • high airflow

  20. Fricative Specs • Fricatives require great articulatory precision. • Some data for [s] (Subtelny et al., 1972): • alveolar constriction ≈ 1 mm • incisor constriction ≈ 2-3 mm • Larger constrictions result in [ʃ]-like sounds. • Generally, fricatives have a cross-sectional area between 6 and 12 mm². • Cross-sectional areas greater than 20 mm² result in laminar flow. • Airflow = 330 cm³/sec for voiceless fricatives • …and 240 cm³/sec for voiced fricatives
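
To get a feel for these numbers, the sketch below converts the airflow and area figures from the slide into an average air speed through the constriction, using speed = volume velocity / area. This assumes uniform flow across the channel, which is a simplification; the function name is hypothetical.

```python
def particle_velocity_cm_per_s(volume_velocity_cm3_per_s, area_mm2):
    """Average air speed through a constriction (uniform flow assumed)."""
    area_cm2 = area_mm2 / 100.0  # 1 cm^2 = 100 mm^2
    return volume_velocity_cm3_per_s / area_cm2


VOICELESS_AIRFLOW = 330.0  # cm^3/sec, from the slide

# Areas from the slide: typical fricative range (6-12 mm^2) and the 20 mm^2 limit.
for area_mm2 in (6, 12, 20):
    speed = particle_velocity_cm_per_s(VOICELESS_AIRFLOW, area_mm2)
    print(f"{area_mm2:>2} mm^2 -> {speed:7.1f} cm/sec")
```

At the same airflow, halving the constriction area doubles the air speed, which is consistent with narrower channels being more likely to produce turbulence.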

  21. Turbulence Sources • For fricatives, turbulence is generated by forcing a stream of air at high velocity through either a narrow channel in the vocal tract or against an obstacle in the vocal tract. • Channel turbulence • produced when airflow escapes from a narrow channel and hits inert outside air • Obstacle turbulence • produced when airflow hits an obstacle in its path

  22. Channel vs. Obstacle • Almost all fricatives involve an obstacle of some sort. • General rule of thumb: obstacle turbulence is much noisier than channel turbulence • [f] vs. [ɸ] • Also: obstacle turbulence is louder, the more perpendicular the obstacle is to the airflow • [s] vs. [x] • [x] is a “wall fricative”

  23. Sibilants • Alveolar, dental and post-alveolar fricatives form a special class (the sibilants) because their obstacle is the back of the upper teeth. • This yields high intensity turbulence at high frequencies.

  24. [ʃ] vs. [θ] • “shy” vs. “thigh”

  25. Fricative Noise • Fricative noise has some inherent spectral shaping • …like “spectral tilt” • Note: this is a source characteristic • This resembles what is known as pink noise: • Compare with white noise:
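
The difference in spectral tilt between white and pink noise can be sketched numerically. The example below assumes NumPy and uses the textbook definition of pink noise (power proportional to 1/f); it is synthetic noise, not measured fricative data, and the helper name `log_log_slope` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed, only for repeatability
n = 2 ** 16                      # arbitrary length
white = rng.standard_normal(n)

# Shape the white spectrum so that power falls off as 1/f
# (amplitude as 1/sqrt(f)); the DC bin is left unchanged.
spectrum = np.fft.rfft(white)
freqs = np.fft.rfftfreq(n)
shaping = np.ones_like(freqs)
shaping[1:] = 1.0 / np.sqrt(freqs[1:])
pink = np.fft.irfft(spectrum * shaping, n=n)


def log_log_slope(signal):
    """Slope of log10(power) against log10(frequency), DC and Nyquist excluded."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    f = np.fft.rfftfreq(len(signal))
    slope, _intercept = np.polyfit(np.log10(f[1:-1]), np.log10(power[1:-1]), 1)
    return slope


white_slope = log_log_slope(white)
pink_slope = log_log_slope(pink)
print("white noise slope:", white_slope)  # near 0: no tilt
print("pink noise slope:", pink_slope)    # near -1: power halves per octave
```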

  26. Fricative Shaping • The turbulence spectrum may be filtered by the resonating tube in front of the fricative. • (Due to narrowness of constriction, back cavity resonances don’t really show up.) • As usual, resonance is determined by length of the tube in front of the constriction. • → The longer the tube, the lower the “cut-off” frequency. • A basic example: • [s] vs. [ʃ]
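
One common way to approximate the effect of front cavity length is to treat the cavity as a tube closed at the constriction and open at the lips, whose lowest resonance is c / 4L. The sketch below uses that quarter-wavelength model. The speed of sound (35000 cm/sec) and the two cavity lengths are assumed values for illustration only, not measurements from the slides.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed value for air in the vocal tract


def front_cavity_resonance_hz(length_cm):
    """Lowest resonance of a tube closed at one end and open at the other."""
    return SPEED_OF_SOUND_CM_PER_S / (4.0 * length_cm)


# Hypothetical front cavity lengths: shorter for [s], longer for a
# post-alveolar fricative.
shorter_cavity_cm = 1.5
longer_cavity_cm = 2.5

print("shorter cavity:", front_cavity_resonance_hz(shorter_cavity_cm), "Hz")
print("longer cavity: ", front_cavity_resonance_hz(longer_cavity_cm), "Hz")
```

The longer tube gives the lower resonance, which matches the direction of the rule stated on the slide.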

  27. [ʃ] vs. [s] • Examples: “sigh” and “shy”

  28. Sampling Rates Revisited • Remember: Digital representations of speech can only capture frequency components up to half the sampling rate • = the Nyquist frequency • → Speech should be sampled at a rate of at least 44100 Hz • (although there is little frequency information in speech above 10,000 Hz) • [s] has higher acoustic energy from about 3500 to 10000 Hz • Note: telephones sample at 8000 Hz • 44100 Hz • 8000 Hz
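
The arithmetic behind this slide can be sketched as follows. The [s] energy band (3500 to 10000 Hz) and the two sampling rates come from the slide; the function names are hypothetical, and treating the band as a sharp-edged range is a simplification.

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sampling_rate_hz / 2


S_BAND_HZ = (3500, 10000)  # approximate [s] energy range, from the slide


def captured_fraction(sampling_rate_hz, band=S_BAND_HZ):
    """Fraction of the band's width that lies below the Nyquist frequency."""
    low, high = band
    limit = nyquist_hz(sampling_rate_hz)
    captured = max(0.0, min(high, limit) - low)
    return captured / (high - low)


for rate in (44100, 8000):
    print(f"{rate} Hz: Nyquist = {nyquist_hz(rate)} Hz, "
          f"[s] band captured = {captured_fraction(rate):.0%}")
```

At 8000 Hz only the bottom 500 Hz of the [s] band survives, which is why [s] is hard to identify over the telephone.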

  29. Further Back • In more posterior fricatives, turbulence noise is generally shaped like a vowel made at the same place of articulation. • [xoma] • palatal vs. velar

  30. Even Further Back • Examples from Hebrew:

  31. At the Tail End • [h] exhibits a lot of coarticulation • [h] is not really a “fricative”; • it’s more like a whispered or breathy voiced vowel. “heed” “had”

  32. Aspirated Fricatives • Like stops, fricatives can be aspirated. • [h] follows the supraglottal frication in the vocal tract. • Examples from Chinese: [tsa] [tsʰa]

  33. Back at the Ranch • There is not much of a resonating filter in front of labial fricatives… • so their spectrum is flat and diffuse • (like bilabial stop release bursts) • Note: labio-dentals are more intense than bilabial fricatives • (channel vs. obstacle turbulence)

  34. Fricative Internal Cues • The articulatory precision required by fricatives means that they are less affected by context than stops. • It’s easy for listeners to distinguish between the various fricative places on the basis of the frication noise alone. • Result of both filter and source differences. • Examples: • There is, however, one exception to the rule…

  35. Huh? • The two most confusable consonants in the English language are [f] and [θ]. • (Interdentals also lack a resonating filter)
