Voice quality and F0 cues for affect expression
By I. Yanushevskaya, C. Gobl and A. Ní Chasaide
Outline
• Introduction
• Synthetic stimuli
• Experiment setup
• Results
• Conclusion
Introduction
• F0 cues are crucial for emotional speech
• What about voice quality?
• Based on previous work:
  • Adding voice quality cues enhances speech synthesis
  • Several voice quality stimuli give similar results:
    • Tense ~= harsh
    • Breathy ~= whispery
  • Varying voice quality can influence the listener's judgment
• Goal: find the effect of varying voice quality alone
Synthetic stimuli
• 15 synthetic stimuli of one utterance: Jaadjö (Hello Goodbye)
• KLSYN88 used as the formant synthesizer
• 3 groups of stimuli: “VQ”, “F0”, “VQ+F0”
VQ only stimuli
• Modal, breathy, whispery, lax-creaky and tense stimuli
• Harsh and creaky voice, included in previous work, are omitted
• Modal: copy of the natural utterance in KLSYN88
• Breathy: lower AV, higher OQ, lower SQ, higher TL, wider B1
• Whispery: aspiration noise
• Lax-creaky: Creaky+Breathy-Whispery
• Tense: lower OQ, higher SQ, lower TL, narrower B1, higher F0
• NOT normalized for F0
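The breathy and tense settings listed above can be written down as a small table of directions, which makes it easy to see that the two qualities are near mirror images of each other. This is only a sketch: the slide gives directions, not magnitudes, so the `+1`/`-1` coding, the `ADJUSTMENTS` name and the `opposed_parameters` helper are assumptions made for illustration, not part of the authors' KLSYN88 setup.

```python
# Direction of change relative to the modal baseline, as listed on the slide.
# +1 = higher / wider, -1 = lower / narrower.
# Magnitudes are not given in the source, so only signs are encoded here.
ADJUSTMENTS = {
    "breathy": {"AV": -1, "OQ": +1, "SQ": -1, "TL": +1, "B1": +1},
    "tense": {"OQ": -1, "SQ": +1, "TL": -1, "B1": -1, "F0": +1},
}


def opposed_parameters(quality_a, quality_b):
    """Return the parameters that the two voice qualities move in opposite directions."""
    a = ADJUSTMENTS[quality_a]
    b = ADJUSTMENTS[quality_b]
    shared = a.keys() & b.keys()
    return sorted(p for p in shared if a[p] == -b[p])


print(opposed_parameters("breathy", "tense"))  # ['B1', 'OQ', 'SQ', 'TL']
```

AV and F0 do not appear in the result because the slide lists each of them for only one of the two qualities.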
VQ+F0 stimuli
• Are these good pairs? We’ll see…
Experiment setup
• 20 native speakers
• 10 of the 15 stimuli presented
• Responses given on pairs of opposite affective attributes:
  • Sad-happy
  • Intimate-formal
  • Relaxed-stressed
  • Bored-interested
  • Apologetic-indignant
  • Fearless-scared
• Analysis: ANOVA
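The slide names ANOVA without saying which design was used. As a rough illustration of the idea, the sketch below computes the F statistic of the simplest variant, a one-way between-groups ANOVA, in plain Python. The function name `one_way_anova_f` and the toy ratings are made up for this example and are not data from the study; a listening test in which every listener rates every stimulus would more likely call for a repeated-measures design.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way between-groups ANOVA.

    `groups` is a list of lists, one list of ratings per stimulus type.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy ratings on one attribute scale (invented numbers, NOT from the study).
toy_ratings = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(toy_ratings))  # approximately 13.0
```

A large F means the stimulus types differ in mean rating by more than the scatter within each type would explain; turning F into a p-value additionally requires the F distribution with (`df_between`, `df_within`) degrees of freedom.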
Conclusion
• Some voice qualities are more strongly associated with certain emotions than others
  • X Intimacy, sadness -> breathy
  • O Intimacy, sadness -> lax-creaky
• On average, voice quality cues work better than F0 cues in speech synthesis
• Possibly because the voice quality stimuli already include F0 information