Assessment of Vocal Noise via Bi-directional Long-term Linear Prediction of Running Speech
F. Bettens*, F. Grenez*, J. Schoentgen*,**
*Université Libre de Bruxelles
**National Fund for Scientific Research, Belgium
Existing Cues of Vocal Noise
• Detection of individual vocal cycles (or harmonics)
• Steady vowel fragments
• (Pseudo)-Periodicity
• Period Perturbation Quotient
• Amplitude Perturbation Quotient
• Harmonics-to-Noise Ratio
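To illustrate why these cues depend on detecting individual vocal cycles, here is a minimal sketch of one common definition of the Period Perturbation Quotient (five-point smoothing). The slide does not say which variant the authors have in mind, so the formula, the function name and the cycle lengths below are assumptions chosen for illustration only.

```python
import numpy as np

def period_perturbation_quotient(periods):
    """Five-point PPQ: mean |T_i - local 5-cycle average| divided by the mean period."""
    T = np.asarray(periods, dtype=float)
    if T.size < 5:
        raise ValueError("need at least 5 cycle lengths")
    # Moving average over 5 consecutive cycles, aligned with T[2:-2]
    local_mean = np.convolve(T, np.ones(5) / 5.0, mode="valid")
    return float(np.mean(np.abs(T[2:-2] - local_mean)) / np.mean(T))

# Hypothetical cycle lengths (in samples), alternating between 9 and 11
ppq = period_perturbation_quotient([9, 11, 9, 11, 9, 11, 9])
print(f"PPQ = {100 * ppq:.2f} %")
```

The input is a list of cycle lengths, which is exactly what becomes hard to obtain for very hoarse voices or running speech.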
Objectives: Analyses of Dysperiodicities
• Drop the requirement that speech fragments be:
  • (Pseudo)-periodic
  • Steady
• Analyze any speech fragment:
  • Modal voices & (very) hoarse voices
  • Sustained vowels & running speech
Motivation: Analysis of Running Speech
• Voicing in running speech
• Variable acoustic impedance
• Voicing onsets & offsets
• Variable pressure drops
• Variable laryngeal positions
• Voice loading
Double Linear Predictive Analysis
• Conventional short-term linear prediction: yields the forward short-term prediction error eS[n]
• Long-term linear prediction: yields the forward double prediction error eL[n]
• Purpose: remove existing correlations, leaving the unpredictable noise component (Qi, 1999)
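The equations on this slide did not survive extraction. As a hedged reminder, the textbook forms of the two prediction stages can be written as below. The prediction orders K and J and the coefficient names a_k and b_j are assumptions, not taken from the slide; x[n] is the speech signal and P is the long-term prediction distance.

```latex
% Short-term stage: predict x[n] from its K most recent samples
e_S[n] = x[n] - \sum_{k=1}^{K} a_k \, x[n-k]

% Long-term stage: predict e_S[n] from samples about P positions earlier
e_L[n] = e_S[n] - \sum_{j=-J}^{J} b_j \, e_S[n-P-j]
```

In this form the long-term stage operates on eS[n] rather than on the speech signal itself.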
Double Linear Predictive Analysis
Drawbacks:
• eS[n] is an artificial signal
• the dysperiodicities in weighted sum x[n] are omitted
• eL[n] is inflated to the right of unvoiced/voiced boundaries
Solutions:
• remove the short-term linear predictive analysis stage
• proceed to bi-directional analysis
Bi-directional Long-term Prediction
• Forward long-term linear prediction: forward long-term prediction error
• Backward long-term linear prediction: backward long-term prediction error
• Bi-directional long-term linear prediction: keep the “best” of the two (frame by frame) as the bi-directional long-term prediction error
Long-term Prediction Distance: P
• Maximum of the auto-correlation function
• Example: steady vowel [a] (dysphonic speaker), P = 184 (2 cycles)
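A minimal sketch of picking P as the lag that maximizes the auto-correlation function. The search bounds `min_lag` and `max_lag`, the function name, and the synthetic test signal are assumptions; the slide does not state how the search range is chosen.

```python
import numpy as np

def prediction_distance(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Full autocorrelation; index len(x) - 1 is lag 0, so keep non-negative lags
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    return int(min_lag + np.argmax(acf[min_lag:max_lag + 1]))

# Synthetic tone with a 100-sample cycle (arbitrary choice for illustration)
n = np.arange(2000)
x = np.sin(2 * np.pi * n / 100)
P = prediction_distance(x, min_lag=50, max_lag=150)
print(P)
```

Widening the search range lets the maximum fall on a multiple of the cycle length, as in the slide's example where P spans 2 cycles.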
Vocal Noise Cue
• Signal-to-Dysperiodicity Ratio (SDR)
• Example: steady vowel [a]; the figure shows the speech signal x[n] and the bi-directional long-term prediction error eL[n]
• Healthy speaker: SDR = 31.2 dB
• Dysphonic speaker: SDR = 10.1 dB
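The SDR formula itself was lost in extraction. A plausible reading, assumed here and not confirmed by the slide, is the ratio of signal energy to prediction-error energy expressed in decibels. The function name `sdr_db` and the test signals are hypothetical.

```python
import numpy as np

def sdr_db(x, e):
    """Energy of the signal x over energy of the prediction error e, in dB."""
    x = np.asarray(x, dtype=float)
    e = np.asarray(e, dtype=float)
    return float(10.0 * np.log10(np.sum(x ** 2) / np.sum(e ** 2)))

# Synthetic check: an error with one tenth of the signal amplitude
n = np.arange(1000)
x = np.sin(2 * np.pi * n / 100)
e = 0.1 * x
print(f"SDR = {sdr_db(x, e):.1f} dB")  # energy ratio of 100, i.e. about 20 dB
```

Under this definition a larger SDR means a smaller dysperiodicity relative to the signal, consistent with the healthy speaker scoring higher than the dysphonic speaker.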
Results 1: Sentence (1 female speaker; modal phonation type)
• Source: http://www.limsi.fr/VOQUAL/, sentence “Il est sorti avant le jour”
• Segments [il]
• Figure panels: speech signal, bi-directional long-term prediction error, forward long-term prediction error
Results 2: Sentence (1 female speaker; 5 phonation types)
• Source: http://www.limsi.fr/VOQUAL/, sentence “Il est sorti avant le jour”
Conclusion
• The forward & backward long-term prediction of speech enables the analysis of any speech signal with a view to assessing vocal noise (i.e. vocal dysperiodicities).
• The analysis is not based on any assumptions regarding the periodicity or stationarity of the speech signals.