Research focusing on physiology, acoustics, and perception of voice source for speech synthesis, speech recognition, and diagnostics of voice pathologies. Methods include acoustic measurements, vocal intensity range, and vocal registers analysis.
Voice source characterisation Gerrit Bloothooft UiL-OTS Utrecht University
Voice research To describe and model the properties of the vocal sound source from the viewpoints of: • Physiology • Acoustics • Perception Voice Source Characterization
Importance of the voice • Speech synthesis • Towards natural sounding synthesis • Speech recognition • Using source properties in recognition • Speaker recognition/identification • Voice source characteristics are essential • Diagnosis • Pathologies, voice classifications
Voice possibilities Limited use of voice in speech • Range of the fundamental frequency • Vocal intensity range • Spectral variation
Focus in this presentation How do acoustic voice source characteristics vary as a function of F0 and vocal intensity?
Voice profile measurement 1930s: Intensity range as a function of pitch • manual measurement 1980s: Automatic computation of F0 and intensity • computer measurement • visual feedback • additional parameters
Measurement unit • One decibel • One semi-tone
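The one-decibel by one-semitone unit can be illustrated with a minimal sketch that maps a single (F0, intensity) sample to its cell. The reference pitch `REF_HZ`, the function names, and the choice to centre semitone cells on the nearest semitone are assumptions made for illustration, not details taken from the presentation.

```python
import math

REF_HZ = 55.0  # assumed reference pitch; any fixed reference would do


def f0_to_semitone(f0_hz: float, ref_hz: float = REF_HZ) -> float:
    """Distance in semitones between f0_hz and the reference frequency."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def cell_of(f0_hz: float, level_db: float) -> tuple[int, int]:
    """Map one sample to its (semitone, dB) cell.

    Semitone cells are centred on the nearest semitone (assumption);
    intensity cells cover [n, n + 1) dB.
    """
    return (round(f0_to_semitone(f0_hz)), math.floor(level_db))
```

Every analysis frame of a recording would be assigned to one such cell, so that any further parameter can be studied per cell.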
Measurement procedure • Subject in front of computer screen • Microphone on headset (30 cm) • Just phonate, sing, and see the result immediately • Best results with recording protocol • Feedback stimulates extreme phonations
[Figure: Voice profile / density. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: sample density]
[Figure: Voice profile / speech area. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: sample density]
Acoustic voice quality parameters • Jitter • Stability of periodicity • Asymmetry in vocal folds • Crest factor • Max amplitude divided by RMS amplitude (root of the average energy) • Relates to spectral slope • Many more …
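Both parameters can be computed in a few lines. The sketch below assumes the conventional definitions: crest factor as peak amplitude over RMS amplitude, expressed in dB, and jitter as the mean absolute difference between consecutive periods relative to the mean period ("local jitter"). The presentation does not state which jitter variant was used, so that choice and the function names are assumptions.

```python
import math


def crest_factor_db(samples):
    """Peak amplitude over RMS amplitude of one analysis frame, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        raise ValueError("crest factor is undefined for a silent frame")
    return 20.0 * math.log10(peak / rms)


def local_jitter_percent(periods):
    """Mean absolute difference of consecutive periods, as % of the mean period."""
    if len(periods) < 2:
        raise ValueError("at least two periods are needed")
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period
```

A pure sine wave has a crest factor of about 3 dB, which is why low crest factors point to a sine-like source with few higher harmonics.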
[Figure: Crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: crest factor]
Real time presentation Screen presentation • One data point per F0-I cell Advanced data storage [new] • Full audio signal • Full distribution of data per F0-I cell • Data for screen presentation
Advantages • Reusability of recordings • Statistical analysis per F0-I cell • Study of time-varying behavior
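The difference between one screen value per cell and a full distribution per cell can be sketched as a small container class. The class name, its methods, and the use of the median as the single displayed value are hypothetical choices for illustration; the presentation does not specify the storage format or which statistic is shown on screen.

```python
from statistics import median


class VoiceProfileStore:
    """Keeps every measured value per F0-I cell (hypothetical layout)."""

    def __init__(self):
        self._cells = {}  # (semitone, dB) -> list of parameter values

    def add(self, cell, value):
        """Append one frame's parameter value to its cell."""
        self._cells.setdefault(cell, []).append(value)

    def distribution(self, cell):
        """All stored values for a cell, in recording order."""
        return list(self._cells.get(cell, []))

    def screen_value(self, cell):
        """One data point per cell for display; the median is an assumption."""
        values = self._cells.get(cell)
        return median(values) if values else None
```

Because values are kept in recording order, the same store supports statistics per cell as well as a look at how a parameter changes over time within a cell.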
[Figure: Crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: crest factor]
[Figure: Median smoothing of crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: crest factor, median smoothed]
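Median smoothing across the profile can be sketched as follows: each occupied cell takes the median of the occupied cells in its neighbourhood. The 3 x 3 window (`radius=1`) and the dictionary layout are assumptions; the window actually used for the slide is not given.

```python
from statistics import median


def median_smooth(grid, radius=1):
    """Median-smooth a sparse F0-I grid.

    grid maps (semitone, dB) -> value. Only occupied cells are smoothed,
    and only occupied neighbours contribute to the median.
    """
    smoothed = {}
    for (st, db) in grid:
        neighbours = [
            grid[(st + i, db + j)]
            for i in range(-radius, radius + 1)
            for j in range(-radius, radius + 1)
            if (st + i, db + j) in grid
        ]
        smoothed[(st, db)] = median(neighbours)
    return smoothed
```

A median filter removes isolated outlier cells while leaving sharp transitions between regions largely intact, which matters when the boundaries themselves are of interest.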
Vocal Registers Different movement patterns of the vocal folds • Pulse register (creaky voice) • Modal register • Falsetto register
Pulse register • Less than 50 Hz • Irregular • Long closed period
[Figure: Pulse register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL)]
Modal register • “Normal” use of voice • Active role of M. Vocalis • Vocal folds thick and completely vibrating • Wide range in F0 and intensity • Flat spectrum
[Figure: Modal register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL)]
Falsetto register • Higher pitches • M. Vocalis passive, tense vocal ligaments through M. Cricothyroideus • Edge vibration of vocal folds • Sound poor in higher harmonics (in untrained subjects)
[Figure: Falsetto register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL)]
[Figure: Register overlap. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL)]
Chest and head voice Refer to secondary vibratory sensations in the body • Chest voice: loud modal register • Head voice: • males: higher, softer modal register in overlap area with falsetto register • women: falsetto register
[Figure: Chest voice and head voice. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); regions labeled: chest, head]
Registers and voice profiles With a description using • Iso-crest factor lines • Iso-jitter lines
[Figure: Iso-crest factor lines at 4 dB and 6 dB. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: crest factor]
[Figure: Iso-jitter line at 3 %. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: jitter (%)]
New representation • Areas defined by iso-parameter lines • crest factor < 4 dB • crest factor > 4 dB, < 6 dB • crest factor > 6 dB • jitter < 3 % • [relative rise time < 6 %]
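The thresholds above can be turned into a simple per-cell labelling, sketched below. The thresholds (4 dB, 6 dB, 3 %, 6 %) and the labels "sine-like", "unstable" and "pressed-like" come from the slides; the function name, the handling of values exactly on a threshold, and the decision to return all matching labels rather than rank them are assumptions.

```python
def classify_cell(crest_db, jitter_pct, rrt_pct=None):
    """Return the area labels that apply to one F0-I cell.

    crest_db:   crest factor in dB
    jitter_pct: jitter in percent
    rrt_pct:    relative rise time in percent, if available
    """
    labels = []
    if jitter_pct > 3.0:
        labels.append("unstable")
    if crest_db < 4.0:
        labels.append("sine-like")
    elif crest_db <= 6.0:  # boundary handling is an assumption
        labels.append("crest 4-6 dB")
    else:
        labels.append("crest > 6 dB")
    if rrt_pct is not None and rrt_pct < 6.0:
        labels.append("pressed-like")
    return labels
```

Applied to every cell, this gives a map of areas rather than a single contour, which is the point of the new representation.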
[Figure: Areas in the phonetogram. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); regions labeled: RRT < 6 % (pressed-like), crest factor < 4 dB (sine-like), jitter > 3 % (unstable)]
[Figure: Vocal registers in the phonetogram. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); lines labeled: chest voice boundary, falsetto upper boundary, modal lower boundary]
Comparison of voice profiles Characterisation of • Voice pathologies • Voice classifications Reuse stored voice profiles of subjects with known voice history
Important features • Contour has limited value • but most research goes into that direction (norm profiles) • Distribution of acoustical parameters across the voice profile tells much more
We need • Unit for comparison • Voice profile unit defined by small range of F0 and Vocal Intensity • Distributions of acoustic voice parameters per unit • Probability density function per parameter • Model • Hidden Markov Model
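A probability density function per unit can be sketched with one Gaussian per F0-I unit and per parameter. The Gaussian form, the variance floor `min_var`, and the mean log-likelihood score are simplifying assumptions chosen for illustration; the presentation names a Hidden Markov Model but does not give its emission densities or training procedure, and this sketch does not implement an HMM.

```python
import math
from statistics import fmean, pvariance


def fit_gaussian(values, min_var=1e-3):
    """Mean and (floored) variance of one parameter within one unit."""
    mu = fmean(values)
    var = max(pvariance(values, mu), min_var)
    return mu, var


def log_pdf(x, mu, var):
    """Log of the Gaussian density N(x; mu, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def fit_units(observations):
    """observations maps unit -> list of values; returns unit -> (mu, var)."""
    return {unit: fit_gaussian(v) for unit, v in observations.items() if v}


def mean_log_likelihood(models, frames):
    """Score (unit, value) frames against the models; unseen units are skipped."""
    scores = [log_pdf(x, *models[unit]) for unit, x in frames if unit in models]
    return fmean(scores) if scores else None
```

Since every frame already carries its F0 and intensity, the unit it belongs to is known in advance, so scoring a new recording against stored models reduces to a lookup per frame.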
Unit model • two unconnected states per phonetogram unit • vocal registers • start and end of phonation
Correspondences (Speech : Voice Profile) • phoneme model : F0/I unit model • not labeled : labeled by F0 and I • spectral envelope : acoustic voice parameters • language model : unrestricted transitions • “forced alignment • recognition”
[Figure: Crest factor distributions]
[Figure: Most distinctive states. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted value: distinctiveness]
Conclusions • Voice profiles can enhance our understanding of vocal behaviour in a visually attractive way • Current data storage opens a series of important research topics • Market opportunities for “light” versions