Turn-taking in Mandarin Dialogue: Interactions of Tone and Intonation

Gina-Anne Levow, University of Chicago. October 14, 2005.

Presentation Transcript


  1. Turn-taking in Mandarin Dialogue: Interactions of Tone and Intonation • Gina-Anne Levow • University of Chicago • October 14, 2005

  2. Roadmap • Motivation • Enabling fluent conversation • Data Collection and Processing • Acoustic Analysis of Turn-taking • Tone and Intonation • Recognizing Boundaries and Interruptions • Conclusions and Future Work

  3. Turn-taking in Dialogue • Goal: Enable fluent conversation • Turn-taking is collaborative (Duncan 1974) • Requires producing and understanding cues • Crucial for dialogue agents and understanding • End-pointing in spoken dialogue systems • Confusion of barge-in and backchannel

  4. Challenges • Silence is neither sufficient nor necessary • Dialogue involves overlap • Overlaps are not arbitrary (Ward et al., 2000) • Proposed cues: • Multimodal: Gesture, Gaze • Not always available • Prosodic • Attested in English, Japanese • Tone languages?

  5. Approach • Identify significant differences in • Pitch, intensity between initial/final positions • Intensity for different transition types • Pitch, intensity of interruptions vs smooth • Assess interaction of tone and intonation • Exploit contrasts for recognition of • Turn unit boundaries: ~93% • Interruptions: 62%

  6. Data Collection • Taiwanese Putonghua Corpus • 5 spontaneous dialogues • ~20 minutes each • 7 female, 3 male speakers • Manually transcribed and word segmented • Turn beginnings and overlaps • Manually labelled and time-stamped

  7. Data Processing • Automatic forced alignment • CU Sonic (Pellom et al.) language porting • Dictionary-based, manual pinyin-ARPABET mapping • Yields phone, syllable, word, silence duration, position • Acoustic analysis • Pitch, Intensity: Praat (Boersma, 2001) • Log-scaled and z-score normalized per conversation side
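
The per-side normalization on this slide could look roughly like the following. This is a minimal sketch, not the author's code: the function name `normalize_per_side`, the dictionary input, and the choice of natural log and population standard deviation are all assumptions. The slide also does not say whether the log is applied to intensity as well as pitch.

```python
import math
from statistics import mean, pstdev


def normalize_per_side(values_by_side):
    """Log-scale, then z-score, each conversation side independently.

    values_by_side maps a hypothetical side/speaker id to a list of
    positive measurements (for example pitch in Hz). Non-positive values,
    such as unvoiced frames coded as 0, are dropped before taking logs.
    """
    normalized = {}
    for side, values in values_by_side.items():
        logs = [math.log(v) for v in values if v > 0]
        if not logs:
            normalized[side] = []
            continue
        mu = mean(logs)
        sigma = pstdev(logs)  # assumption: population standard deviation
        if sigma == 0:
            # All values identical: every z-score is defined as 0 here.
            normalized[side] = [0.0 for _ in logs]
        else:
            normalized[side] = [(x - mu) / sigma for x in logs]
    return normalized
```

Normalizing each side separately puts speakers with different pitch ranges on a common scale, so later contrasts compare relative rather than absolute values.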

  8. Turn Unit Types

  9. Turn Unit Initial-Final Contrasts

  10. Turn Unit Boundary Contrasts • Unit initial versus final syllables • Pitch significantly lower in final than initial • Intensity significantly lower in final than initial • Across all transition types • Rough versus smooth transitions • Final syllables • Intensity significantly higher
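
The slides do not name the significance test behind these contrasts. As one possible illustration only, the sketch below computes Welch's t statistic for two groups of normalized values, such as unit-initial versus unit-final syllable pitch. The function name and the sample numbers are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom.

    a and b are lists of normalized measurements for the two conditions.
    Each list needs at least two values, and at least one list must have
    non-zero variance (otherwise the denominator is zero).
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical z-scored pitch values, not data from the corpus.
initial_pitch = [0.4, 0.6, 0.5, 0.7]
final_pitch = [-0.5, -0.3, -0.6, -0.4]
t_stat, dof = welch_t(initial_pitch, final_pitch)
```

A positive `t_stat` here means the initial group has the higher mean, which matches the direction reported on the slide.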

  11. Characterizing Interruptions • Contrast first syllable of “inter” vs “smooth” • Pitch significantly higher in interruptions • Intensity significantly higher in interruptions

  12. Interactions of Tone and Intonation • Clear intonational cues in tone language • What effect on tones? • Contrast tones in final vs non-final position • Mean pitch lowered in each tone • Relative height largely preserved • Contour lowered but largely preserved • Distinguishing tone characteristics retained
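
The final versus non-final comparison can be pictured as a simple grouping of normalized pitch by tone and position. This is a sketch under assumptions: the tuple layout, the function names, and the example values are hypothetical and are not taken from the corpus.

```python
from collections import defaultdict
from statistics import mean


def mean_pitch_by_tone(syllables):
    """Average normalized pitch for each (tone, is_final) pair.

    syllables is an iterable of (tone, is_final, pitch) tuples, where
    is_final marks turn-unit-final position.
    """
    groups = defaultdict(list)
    for tone, is_final, pitch in syllables:
        groups[(tone, is_final)].append(pitch)
    return {key: mean(vals) for key, vals in groups.items()}


def height_order(means, is_final):
    """Tones sorted from highest to lowest mean pitch in one position."""
    tones = [tone for (tone, final) in means if final == is_final]
    return sorted(tones, key=lambda t: means[(t, is_final)], reverse=True)


# Hypothetical values for tones 1 and 3 only.
data = [
    (1, False, 1.0), (1, False, 0.8), (1, True, 0.4), (1, True, 0.2),
    (3, False, -0.5), (3, False, -0.7), (3, True, -1.1), (3, True, -1.3),
]
means = mean_pitch_by_tone(data)
```

If each tone's final mean is below its non-final mean while `height_order` returns the same ranking in both positions, the pattern matches the slide: pitch is lowered but relative height is preserved.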

  13. Interactions of Tone and Intonation • Mean pitch across tones • Tone contour changes

  14. Recognizing Turn Unit Boundaries and Turn Types • Classifier: BoosTexter (Schapire 2000); 10-fold cross-validation • Comparable results for C4.5, SVMs • Prosodic features: • Local: • Pitch, Intensity: Mean, Max; Duration • Word, syllable • Contextual: • Difference between current and following word: pitch, intensity • Silence • Text features: • N-grams within preceding, following 5 syllables
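
The feature set listed on this slide might be assembled along the following lines. This is an illustrative sketch, not the original pipeline: the `Word` fields, the feature names, and the bigram choice (`n=2`) are assumptions, since the slide does not give the n-gram order or the exact feature encoding.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Word:
    # Hypothetical per-word measurements, assumed already normalized.
    pitch_mean: float
    pitch_max: float
    intensity_mean: float
    intensity_max: float
    duration: float
    silence_after: float  # seconds of silence following the word


def prosodic_features(words: List[Word], i: int) -> Dict[str, float]:
    """Local features for word i plus differences to the following word."""
    cur = words[i]
    feats = {
        "pitch_mean": cur.pitch_mean,
        "pitch_max": cur.pitch_max,
        "intensity_mean": cur.intensity_mean,
        "intensity_max": cur.intensity_max,
        "duration": cur.duration,
        "silence_after": cur.silence_after,
    }
    if i + 1 < len(words):
        nxt = words[i + 1]
        # Contextual features: current minus following word.
        feats["d_pitch_next"] = cur.pitch_mean - nxt.pitch_mean
        feats["d_intensity_next"] = cur.intensity_mean - nxt.intensity_mean
    return feats


def context_ngrams(syllables: List[str], i: int, window: int = 5, n: int = 2) -> List[str]:
    """N-grams over up to `window` syllables before and after position i."""
    before = syllables[max(0, i - window):i]
    after = syllables[i + 1:i + 1 + window]

    def grams(seq: List[str], tag: str) -> List[str]:
        return [tag + "_" + "_".join(seq[k:k + n]) for k in range(len(seq) - n + 1)]

    return grams(before, "prev") + grams(after, "next")
```

The last word of a sequence has no following word, so the difference features are simply left out for it in this sketch; a real system would need an explicit policy for that case.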

  15. Recognizing Turn Unit Boundaries • Word: Boundary/non-boundary • 3200 instances; down-sampled, balanced set • Key features: Silence, max intensity • Lexical features: preceding ‘ta’, following ‘dui’ • Prosodic features more robust without silence
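
Down-sampling to a balanced set and 10-fold cross-validation, both mentioned in the slides, can be sketched as below. The function names, the fixed random seed, and the interleaved fold assignment are assumptions made for illustration. The slides do not describe how the sampling or the folds were actually produced.

```python
import random
from collections import defaultdict


def downsample_balanced(instances, labels, seed=0):
    """Randomly keep an equal number of instances from every class.

    Returns a shuffled list of (instance, label) pairs in which each
    class has as many items as the smallest class.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for x, y in zip(instances, labels):
        by_label[y].append(x)
    n = min(len(xs) for xs in by_label.values())
    balanced = []
    for y, xs in by_label.items():
        for x in rng.sample(xs, n):
            balanced.append((x, y))
    rng.shuffle(balanced)
    return balanced


def k_folds(items, k=10):
    """Yield (train, test) splits; each item is in exactly one test fold."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [it for j, fold in enumerate(folds) if j != i for it in fold]
        yield train, test
```

On a balanced two-class set, always guessing one class gives 50% accuracy, which is the chance baseline against which figures such as 93% and 62% are read.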

  16. Recognizing Interruptions • Initial words: Interruption/smooth start • >400 instances; down-sampled, balanced set • Contextual features: • Difference of current word pitch, intensity from previous word • Preceding silence • Best results: 62%, all feature sets • Key feature: silence • Without silence, accuracy drops to chance

  17. Discussion • Turn-taking in Mandarin Dialogue • Significant intonational, prosodic cues • Initiation/Finality: Lower final pitch, intensity • Turn transition types: • Rough vs smooth: higher final intensity • Interruptions vs smooth: higher pitch, intensity • Tones globally lowered; shape, relative height largely preserved • Exploit cues for boundary, interruption recognition • 93%, 62% respectively, with silence

  18. Conclusions & Future Work • Intonational cues to turn-taking in Mandarin • Pitch jointly encodes lexical, dialogue meaning • Basic tone contrasts largely preserved • Prosodic information supports dialogue flow • Silence important, but other cues co-signal • Integrate dialogue information for tone recognition • Turn-taking, topic structure, etc.
