
Learning Objectives


nubia

Presentation Transcript


  1. Learning Objectives • Describe how speakers control frequency and amplitude of vocal fold vibration • Describe psychophysical attributes of pitch, loudness and quality in physiological and acoustic terms • Define terms such as speaking fundamental frequency, speaking fundamental frequency variability, harmonics (or signal) to noise ratio, jitter, shimmer, cepstrum, quefrency, and rahmonic amplitude

  2. What is the difference between pitch and frequency?

  3. Quantifying frequency • Hertz (Hz): cycles per second • Non-linear scales • Octave scale • 1/3 octave bands • Semitones • Cents • Other “auditory scales”: e.g. mel, Bark
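The octave, semitone, and cent scales are all logarithmic transforms of a frequency ratio, which a short sketch can make concrete. The function names are illustrative, and the mel conversion uses one common formula, 2595·log10(1 + f/700), which is an assumption here because the slide does not specify one.

```python
import math


def octaves_between(f_ref_hz, f_hz):
    """Octaves from f_ref_hz up to f_hz (one octave = a doubling of frequency)."""
    return math.log2(f_hz / f_ref_hz)


def semitones_between(f_ref_hz, f_hz):
    """Semitones from f_ref_hz up to f_hz (12 semitones per octave)."""
    return 12.0 * math.log2(f_hz / f_ref_hz)


def cents_between(f_ref_hz, f_hz):
    """Cents from f_ref_hz up to f_hz (100 cents per semitone)."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


def hz_to_mel(f_hz):
    """Hz to mel using one common formula (assumed; other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)
```

For instance, `semitones_between(120.0, 220.0)` comes out at roughly 10.5 semitones, while the same two values differ by 100 Hz on the linear scale.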

  4. Fundamental Frequency (F0) Control What factors dictate the vibratory frequency of the vocal folds? • Anatomical factors • Males: ↑ vocal fold (VF) mass and length = ↓ F0 • Females: ↓ VF mass and length = ↑ F0 • Subglottal pressure (Psg) adjustment (show example) • ↑ Psg = ↑ F0 • Laryngeal and vocal fold adjustments • ↑ cricothyroid (CT) activity = ↑ F0 • ↑ thyroarytenoid (TA) activity = ↑ F0 or ↓ F0 • Extralaryngeal adjustments • ↑ height of larynx = ↑ F0

  5. Characterizing Fundamental Frequency (F0) • Average F0: speaking fundamental frequency (SFF) • Correlate of pitch • Infants ~350-500 Hz • Boys & girls (3-10) ~270-300 Hz • Young adult females ~220 Hz • Young adult males ~120 Hz • Older females: F0 ↓ • Older males: F0 ↑ • F0 variability • F0 varies due to syllabic & emphatic stress, syntactic and semantic factors, and phonetic factors (in some languages) • Provides a melody (prosody) • Measures • F0 standard deviation: ~2-4 semitones for normal speakers • F0 range: maximum F0 – minimum F0 within a speaking task
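The SFF, F0 standard deviation, and F0 range can be computed from a list of F0 estimates, as in the minimal sketch below. It assumes the values come from voiced frames only and that the standard deviation is taken in semitones relative to the mean F0; the function name and the example values are hypothetical.

```python
import math
import statistics


def f0_summary(f0_hz):
    """Summarize a list of F0 values (Hz) taken from voiced frames only."""
    mean_hz = statistics.mean(f0_hz)
    # Convert each value to semitones relative to the mean so the standard
    # deviation is expressed on the semitone scale.
    semitones = [12.0 * math.log2(f / mean_hz) for f in f0_hz]
    return {
        "sff_hz": mean_hz,
        "sd_semitones": statistics.stdev(semitones),
        "range_hz": max(f0_hz) - min(f0_hz),
        "range_semitones": 12.0 * math.log2(max(f0_hz) / min(f0_hz)),
    }


# Hypothetical F0 track for illustration, not measured data.
example = f0_summary([110.0, 118.0, 125.0, 131.0, 120.0, 112.0])
```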

  6. Estimating the limits of vocal fold vibration Maximum Phonational Frequency Range • Highest possible F0 - lowest possible F0 • Not a speech measure • Measured in Hz, semitones or octaves • Males ~80-700 Hz (Baken, 1987) • Females ~135-1000 Hz (Baken, 1987) • Around a 3 octave range is often considered “normal”

  7. Approaches to Measuring Fundamental Frequency (F0) • Time domain vs. frequency domain • Manual vs. automated measurement • Specific Approaches • Peak picking • Zero crossing • Autocorrelation • The cepstrum & cepstral analysis

  8. Autocorrelation [Figure: data and correlation panels; example correlation values +1.0, +0.1, -0.82, +0.92]
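Autocorrelation estimates F0 by correlating a signal with time-shifted copies of itself and taking the shift (lag) with the highest correlation as the fundamental period. The sketch below is a simplified illustration, not a production pitch tracker: the search limits of 75-500 Hz, the function names, and the synthetic test tone are all assumptions.

```python
import math


def autocorr_f0(signal, sample_rate, f0_min=75.0, f0_max=500.0):
    """Return (f0_hz, peak_correlation) from a simple autocorrelation search."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]          # remove any DC offset
    energy = sum(v * v for v in x)          # correlation at lag 0
    min_lag = int(sample_rate / f0_max)     # shortest period considered
    max_lag = int(sample_rate / f0_min)     # longest period considered
    best_lag, best_r = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        # Correlation of the signal with a copy shifted by `lag` samples,
        # normalized so that lag 0 would give +1.0.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    return sample_rate / best_lag, best_r


def sine_wave(f0_hz, sample_rate, n_samples):
    """Synthetic test tone (illustrative only)."""
    return [math.sin(2.0 * math.pi * f0_hz * i / sample_rate)
            for i in range(n_samples)]


f0_hz, peak_r = autocorr_f0(sine_wave(200.0, 8000, 800), 8000)
```

Because fewer samples overlap at longer lags, this normalization slightly favors shorter lags, which helps avoid picking a multiple of the true period.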

  9. What is a cepstrum? • A cepstrum involves performing a spectral analysis of an amplitude spectrum • Returns sound representation to a “time-like” domain analysis: quefrency-domain • Location of the dominant energy in the cepstrum is typically associated with the fundamental frequency of the signal

  10. What is a cepstrum? [Figure: time-domain waveform (sound pressure vs. time) passed through a Fourier transform to give a frequency-domain amplitude spectrum (amplitude vs. frequency)]

  11. What is a cepstrum? [Figure: frequency-domain amplitude spectrum (amplitude vs. frequency)]

  12. What is a cepstrum? [Figure: cepstrum produced by a second Fourier transform, plotted against quefrency (msec)] • Dominant rahmonic • Quefrency location: fundamental period • Height: degree of periodicity
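The two-transform idea in the cepstrum slides can be sketched with a plain discrete Fourier transform. This is an illustration under stated assumptions: it takes the logarithm of the amplitude spectrum before the second transform (the usual definition of the cepstrum, though the slides only say "amplitude spectrum"), applies a Hann window, adds a small floor to avoid log(0), and uses an idealized pulse train as a stand-in for voiced sound. All names and parameter values are hypothetical.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (fine for short frames)."""
    n = len(values)
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [sum(values[t] * twiddle[(k * t) % n] for t in range(n))
            for k in range(n)]


def cepstral_f0(signal, sample_rate, f0_min=75.0, f0_max=500.0, floor=1e-6):
    """Return (f0_hz, quefrency_ms) of the dominant cepstral peak."""
    n = len(signal)
    # Hann window to reduce edge effects.
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(signal)]
    # First transform: waveform -> log amplitude spectrum.
    log_spectrum = [math.log(abs(c) + floor) for c in dft(windowed)]
    # Second transform: log spectrum -> cepstrum (quefrency domain).
    cepstrum = [abs(c) for c in dft(log_spectrum)]
    # Search only quefrencies that correspond to plausible F0 values.
    min_q = int(sample_rate / f0_max)
    max_q = min(int(sample_rate / f0_min), n // 2)
    peak_q = max(range(min_q, max_q + 1), key=lambda q: cepstrum[q])
    return sample_rate / peak_q, 1000.0 * peak_q / sample_rate


def pulse_train(period_samples, n_samples):
    """Idealized periodic pulse train (illustrative only)."""
    return [1.0 if i % period_samples == 0 else 0.0 for i in range(n_samples)]


# 40-sample period at 8000 Hz corresponds to a 5 ms period (200 Hz).
f0_hz, quefrency_ms = cepstral_f0(pulse_train(40, 400), 8000)
```

The quefrency of the peak gives the fundamental period, and its height relative to the rest of the cepstrum reflects how periodic the signal is.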

  13. Learning Objectives • Describe how speakers control frequency and amplitude of vocal fold vibration • Describe psychophysical attributes of pitch, loudness and quality in physiological and acoustic terms • Explain what the decibel is and why it is a preferred way to quantify amplitude

  14. What is the difference between amplitude and loudness?

  15. Quantifying amplitude • Sound pressure level • Pressure = force/area • Units: micropascals • Sound intensity • Intensity = power/area, where power = work/time and work = force × distance • Units: watts/m² • Intensity is proportional to pressure squared ($I \propto P^2$)

  16. What is the decibel scale? • We prefer to use the decibel scale to represent signal amplitude • We are used to using measurement scales that are absolute and linear • The decibel scale is relative and logarithmic

  17. Linear vs. logarithmic • Linear scale: 1,2,3… • For example, the difference between 2 and 4 is the same as the difference between 8 and 10. • We say these are additive

  18. Linear vs. logarithmic • Logarithmic scales are multiplicative • Recall from high school math and hearing science: $10 = 10^{1} = 10 \times 1$; $100 = 10^{2} = 10 \times 10$; $1000 = 10^{3} = 10 \times 10 \times 10$; $0.1 = 10^{-1} = 1/10 \times 1$ • Logarithmic scales use the exponents for the number scale: $\log_{10} 10 = 1$; $\log_{10} 100 = 2$; $\log_{10} 1000 = 3$; $\log_{10} 0.1 = -1$

  19. Logarithmic Scale • base doesn’t have to be 10 • In computer science, base = 2 • In the natural sciences, the base is often 2.7… or e

  20. Logarithmic Scale • Why use such a complicated scale? • A logarithmic scale squeezes a very wide range of magnitudes into a relatively compact scale • This is roughly how our hearing works, in that a logarithmic scale matches our perception of loudness change

  21. Absolute vs. relative measurement • Relative measures are a ratio of a measure to some reference • Relative scales can be referenced to anything you want • The decibel scale doesn’t measure amplitude (intensity or pressure) absolutely, but as a ratio to some reference value

  22. Typical reference values • Intensity: $10^{-12}$ watts/m² • Sound Pressure Level (SPL): 20 micropascals • Why do we use these particular values?

  23. However… • You can reference intensity/pressure to anything you want For example, • Post therapy to pre therapy • Sick people to healthy people • Sound A to sound B

  24. Now, let us combine the idea of logarithmic and relative… $\text{bel} = \log_{10}(I_m / I_r)$, where $I_m$ is the measured intensity and $I_r$ is the reference intensity • A bel is pretty big, so we tend to use the decibel, where deci is 1/10, so 10 decibels make one bel • $\text{dB IL} = 10 \log_{10}(I_m / I_r)$
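The bel and decibel definitions translate directly into code. The sketch below assumes intensities in watts/m² and uses the 10⁻¹² watts/m² reference from the reference-values slide as the default; the function names are illustrative.

```python
import math

# Reference intensity from the slides: 10^-12 watts/m^2.
REFERENCE_INTENSITY = 1e-12


def bels(measured, reference):
    """Level in bels: log10 of the ratio of measured to reference."""
    return math.log10(measured / reference)


def db_il(measured_intensity, reference_intensity=REFERENCE_INTENSITY):
    """Intensity level in decibels: 10 * log10(Im / Ir)."""
    return 10.0 * math.log10(measured_intensity / reference_intensity)
```

Because the scale is relative, the reference can be anything: `db_il(2.0, 1.0)` shows that doubling an intensity adds about 3 dB regardless of the absolute values involved.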

  25. Intensity vs. Pressure • Intensity is trickier to measure • Pressure is easy to measure: a microphone is a pressure measuring device • Intensity is proportional to pressure squared ($I \propto P^2$)

  26. Extending the formula to pressure • Using some logarithmic tricks, this translates our equation for the decibel to $\text{dB SPL} = (2)(10)\log_{10}(P_m / P_r) = 20 \log_{10}(P_m / P_r)$
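The "logarithmic tricks" can be written out step by step. This derivation assumes only the relation stated earlier in the slides, that intensity is proportional to pressure squared; the constant of proportionality cancels in the ratio.

```latex
\begin{align*}
\text{dB} &= 10\log_{10}\!\left(\frac{I_m}{I_r}\right) \\
          &= 10\log_{10}\!\left(\frac{P_m^{2}}{P_r^{2}}\right)
             && \text{since } I \propto P^{2} \\
          &= 10\log_{10}\!\left[\left(\frac{P_m}{P_r}\right)^{2}\right] \\
          &= 20\log_{10}\!\left(\frac{P_m}{P_r}\right)
             && \text{since } \log x^{2} = 2\log x
\end{align*}
```

As a check, doubling the pressure adds $20\log_{10} 2 \approx 6.02$ dB, whereas doubling the intensity adds $10\log_{10} 2 \approx 3.01$ dB.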

  27. Amplitude control during speech • Subglottal pressure adjustment ↑ Psg = ↑ sound pressure • Laryngeal and vocal fold adjustments ↑ medial compression = ↑ sound pressure • Supralaryngeal adjustments • Optimizing sound radiation from vocal tract

  28. Sound Pressure Level (SPL) • Average SPL • Correlate of loudness • Conversation: ~65-80 dB SPL • SPL variability • ↑ SPL to mark stress • Contributes to prosody • Measure • Standard deviation for neutral reading material: ~10 dB SPL

  29. Estimating the limits of sound pressure generation Dynamic Range • Amplitude analogue to maximum phonational frequency range • ~50 – 115 dB SPL

  30. Learning Objectives • Describe psychophysical attributes of pitch, loudness and quality in physiological and acoustic terms • Define terms such as speaking fundamental frequency, speaking fundamental frequency variability, harmonics (or signal) to noise ratio, jitter, shimmer, cepstrum, quefrency, and rahmonic amplitude

  31. Vocal Quality • No clear acoustic correlates like pitch and loudness • However, terms have invaded our vocabulary that suggest distinct categories of voice quality • Common terms • Breathy • Tense/strained • Rough • Hoarse

  32. Are there features in the acoustic signal that correlate with these quality descriptors?

  33. Breathiness Perceptual Description • Audible air escape in the voice Physiologic Factors • Diminished or absent closed phase • Increased airflow Potential Acoustic Consequences • Change in harmonic (periodic) energy • Sharper harmonic roll off • Change in aperiodic energy • Increased level of aperiodic energy (i.e. noise), particularly in the high frequencies

  34. Harmonics (signal)-to-noise ratio (SNR/HNR) • Harmonic/noise amplitude • ↑ HNR • Relatively more signal • Indicative of normality • ↓ HNR • Relatively more noise • Indicative of disorder • Normative values depend on method of calculation • “Normal” HNR ~ 15
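Since normative values depend on how HNR is calculated, it helps to see one concrete method. The sketch below uses a simplified autocorrelation-based estimate, HNR = 10·log10(r / (1 − r)), where r is the normalized autocorrelation at the fundamental period. This is one approach among several and is not necessarily the method behind the value quoted on the slide; the function names, the known-period input, and the synthetic noisy tone are assumptions.

```python
import math
import random


def normalized_autocorrelation(signal, lag):
    """Correlation between the signal and a copy shifted by `lag` samples."""
    a = signal[:-lag]
    b = signal[lag:]
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return numerator / denominator


def hnr_db(signal, period_samples):
    """Simplified HNR estimate in dB, given the period in samples."""
    r = normalized_autocorrelation(signal, period_samples)
    # Clamp so a perfectly periodic or uncorrelated signal stays finite.
    r = min(max(r, 1e-12), 1.0 - 1e-12)
    return 10.0 * math.log10(r / (1.0 - r))


def noisy_tone(noise_sd, f0_hz=200.0, sample_rate=8000, n=800, seed=0):
    """Sine wave plus Gaussian noise (synthetic, for illustration only)."""
    rng = random.Random(seed)
    return [math.sin(2.0 * math.pi * f0_hz * i / sample_rate)
            + rng.gauss(0.0, noise_sd) for i in range(n)]


# The period is 40 samples for a 200 Hz tone sampled at 8000 Hz.
hnr_low_noise = hnr_db(noisy_tone(0.05), 40)
hnr_high_noise = hnr_db(noisy_tone(0.5), 40)
```

Adding more noise lowers r and therefore lowers the estimate, which matches the direction of the ↑ HNR and ↓ HNR interpretations above.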

  35. [Figure: amplitude spectra (amplitude vs. frequency) with the harmonic peaks and the noise ‘floor’ labeled]

  36. First harmonic amplitude From Hillenbrand et al. (1996)

  37. Prominent Cepstral Peak

  38. Spectral Tilt: Voice Source

  39. Spectral Tilt: Radiated Sound

  40. Peak/average amplitude ratio

  41. From Hillenbrand et al. (1996)

  42. WMU Graduate Students

  43. Tense/Pressed/Effortful/Strained Voice Perceptual Description • Sense of effort in production Physiologic Factors • Longer closed phase • Reduced airflow Potential Acoustic consequences • Change in harmonic (periodic) energy • Flatter harmonic roll off

  44. Spectral Tilt [Figure: spectral tilt compared for pressed and breathy voice]

  45. Acoustic Basis of Vocal Effort • Perception of effort: F0 + RMS + open quotient • Tasko, Parker & Hillenbrand (2008)

  46. Roughness • Perceptual Description • Perceived cycle-to-cycle variability in voice • Physiologic Factors • Vocal folds vibrate, but in an irregular way • Potential Acoustic Consequences • Cycle-to-cycle variations in F0 and amplitude • Elevated jitter • Elevated shimmer

  47. Period/frequency & amplitude variability • Jitter: variability in the period of each successive cycle of vibration • Shimmer: variability in the amplitude of each successive cycle of vibration …

  48. Jitter and Shimmer • Sources of jitter and shimmer • Small structural asymmetries of vocal folds • “Material” on the vocal folds (e.g. mucus) • Biomechanical events, such as raising/lowering the larynx in the neck • Small variations in tracheal pressures • “Bodily” events – system noise • Measuring jitter and shimmer • Variability in measurement approaches • Variability in how measures are reported • Jitter • Typically reported as % or msec • Normal ~0.2-1% • Shimmer • Can be % or dB • Norms not well established
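Jitter and shimmer can be computed from the period and peak amplitude of each successive cycle. Because measurement approaches vary, the sketch below picks one common "local" definition as an assumption: the mean absolute difference between consecutive cycles, expressed in msec, as a percentage of the mean, or (for shimmer) in dB. The function names and example values are hypothetical.

```python
import math


def _mean_abs_consecutive_diff(values):
    """Mean absolute difference between each value and the next one."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


def jitter_ms(periods_ms):
    """Local jitter in msec from a list of cycle periods (msec)."""
    return _mean_abs_consecutive_diff(periods_ms)


def jitter_percent(periods_ms):
    """Local jitter as a percentage of the mean period."""
    mean_period = sum(periods_ms) / len(periods_ms)
    return 100.0 * jitter_ms(periods_ms) / mean_period


def shimmer_percent(amplitudes):
    """Local shimmer as a percentage of the mean cycle amplitude."""
    mean_amplitude = sum(amplitudes) / len(amplitudes)
    return 100.0 * _mean_abs_consecutive_diff(amplitudes) / mean_amplitude


def shimmer_db(amplitudes):
    """Local shimmer in dB: mean |20 * log10(A[i+1] / A[i])|."""
    ratios_db = [abs(20.0 * math.log10(b / a))
                 for a, b in zip(amplitudes, amplitudes[1:])]
    return sum(ratios_db) / len(ratios_db)


# Hypothetical cycle-by-cycle measurements, not real data.
example_jitter = jitter_percent([5.00, 5.02, 4.99, 5.01, 5.00])
example_shimmer = shimmer_db([0.80, 0.82, 0.79, 0.81, 0.80])
```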
