Back Channel Communication Antoine Raux Dialogs on Dialogs 02/25/2005
Outline • From Back Channel to backchannels • Function of the Back Channel • Characteristics of the Back Channel • The Back Channel in Spoken Dialogue Systems
From back channel… • 70s: Conversation Analysts attempt to describe systematic rules for turn-taking management • Goal: minimize gaps and overlaps between speakers • BUT many overlaps in natural speech • E.g.: “mm-hmm”, “okay”, “yeah”… • “Back channel” (Yngve 1970): Parallel channel for communication (Duncan 1972) • “Back channel communication does not constitute a turn or a claim for a turn” • But it “may participate in a variety of communication functions, including the regulation of speaking turns.”
…to backchannels • “Backchannel”: listener-produced signal such as “mm-hmm”, “yeah”…(“To backchannel”: to produce such signals) • Does not imply the will to take the turn • Implies some form of acknowledgment (in general)
Front-channel cues to back-channel signals • Koiso et al (1998) • Analyze the relationship between different syntactic and prosodic features and the occurrence of backchannels
Koiso et al (Methodology) • Data: 8 dialogs from Japanese Map Task corpus: • replica of the Edinburgh MT • Face-to-face and speech only (no difference) • Features • Syntactic: POS • Duration of last mora (normal/long/short) • F0 pattern of last mora (flat-fall, rise…) • Peak F0 (low/high) • Energy pattern (late-decr, decr, no-decr) • Peak energy (low/high)
Koiso et al (Results) • Frequency of feature values
Koiso et al (Results) • Decision Tree analysis • Compare the loss in performance by not using each feature • POS: single best feature • Prosodic features altogether: as good as POS
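The "loss in performance by not using each feature" comparison can be sketched as a leave-one-feature-out ablation. This is a minimal illustration only: a majority-vote lookup table stands in for the decision tree, accuracy is measured on the training examples themselves, and the names `table_accuracy` and `ablation` and the feature keys are hypothetical, not taken from Koiso et al.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

# One example = (categorical feature values, did a backchannel follow?)
Example = Tuple[Dict[str, str], bool]


def table_accuracy(examples: Sequence[Example], features: Sequence[str]) -> float:
    """Training accuracy of a majority-vote lookup table over the chosen features.

    Stand-in for a decision tree: each distinct combination of feature values
    predicts the label seen most often for that combination.
    """
    if not examples:
        return 0.0
    votes = defaultdict(Counter)
    for feats, label in examples:
        key = tuple(feats[f] for f in features)
        votes[key][label] += 1
    # For each combination, the majority label gets max(count) examples right.
    correct = sum(max(counter.values()) for counter in votes.values())
    return correct / len(examples)


def ablation(examples: Sequence[Example], features: List[str]) -> Dict[str, float]:
    """Accuracy lost when each feature is removed in turn (larger = more useful)."""
    full = table_accuracy(examples, features)
    return {
        f: full - table_accuracy(examples, [g for g in features if g != f])
        for f in features
    }
```

A feature whose removal costs the most accuracy is the "single best feature" in the sense used on the slide; removing a group of features at once (e.g. all prosodic ones) works the same way by passing the reduced feature list to `table_accuracy`.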
Koiso et al (Discussion) • Some POS strongly inhibit BC • Individual prosodic features are not good indicators of BC occurrence • BC occurrence is conditioned by both POS and prosody (as a whole) • What about other languages? • What about BC overlapping with speech?
BC cues in English and Japanese • Ward and Tsukahara (2000) • Tests one hypothesis (“BC are triggered by low pitch cues”) for two languages
The Low Pitch Cue • Both in American English and Japanese, it appears that “after a region of low pitch lasting 110 ms the listener tends to produce back-channel feedback”. • Goal of this paper: quantitatively test this on naturally occurring conversations
Ward and Tsukahara (Methodology) • Data: • English: 8 conversations, 12 speakers (first author participates in 5 conversations!) • Japanese: 18 conversations, 24 speakers • Prediction: • Every 10 ms, decide BC/no-BC by applying a hand-coded rule with 5 parameters tuned to the data
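A frame-by-frame low-pitch rule of this kind might look like the sketch below. Only the 10 ms decision step and the 110 ms low-pitch duration come from the slides; the pitch threshold, the refractory gap between predictions, and the function name `predict_backchannels` are assumptions made for illustration, and the sketch does not reproduce the paper's five tuned parameters.

```python
from typing import List, Optional

FRAME_MS = 10  # decision interval stated on the slide


def predict_backchannels(
    f0: List[Optional[float]],  # one pitch value per 10 ms frame; None = unvoiced
    low_pitch_hz: float,        # assumed threshold below which pitch counts as low
    min_low_ms: int = 110,      # low-pitch duration quoted on the slide
    refractory_ms: int = 800,   # assumed minimum spacing between predictions
) -> List[int]:
    """Return the times (in ms) at which the rule predicts a backchannel."""
    predictions: List[int] = []
    low_run_ms = 0
    last_bc_ms: Optional[int] = None
    for i, pitch in enumerate(f0):
        now_ms = (i + 1) * FRAME_MS  # end time of the current frame
        if pitch is not None and pitch < low_pitch_hz:
            low_run_ms += FRAME_MS   # extend the current low-pitch region
        else:
            low_run_ms = 0           # region broken by high pitch or no voicing
        far_enough = last_bc_ms is None or now_ms - last_bc_ms >= refractory_ms
        if low_run_ms >= min_low_ms and far_enough:
            predictions.append(now_ms)
            last_bc_ms = now_ms
            low_run_ms = 0           # require a fresh region for the next prediction
    return predictions
```

In practice the threshold would be set per speaker (for example relative to that speaker's pitch range), since an absolute value in Hz does not transfer across voices.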
Ward and Tsukahara (Results) • Each predicted BC was considered correct if it fell within 500ms of an actual BC • Low pitch region rule is better than chance both in English and Japanese
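The tolerance-window scoring can be sketched as follows. The 500 ms window is from the slide; treating the boundary as inclusive, computing recall as the share of actual BCs that have a prediction nearby, and the name `score_predictions` are assumptions. Several predictions may match the same actual BC here, which is a simplification.

```python
from typing import Sequence, Tuple


def score_predictions(
    predicted_ms: Sequence[int],
    actual_ms: Sequence[int],
    tolerance_ms: int = 500,  # window stated on the slide
) -> Tuple[float, float]:
    """Return (precision, recall) under a +/- tolerance_ms matching window."""
    # A prediction is correct if some actual BC lies within the window.
    correct = sum(
        1 for p in predicted_ms
        if any(abs(p - a) <= tolerance_ms for a in actual_ms)
    )
    # An actual BC is covered if some prediction lies within the window.
    covered = sum(
        1 for a in actual_ms
        if any(abs(p - a) <= tolerance_ms for p in predicted_ms)
    )
    precision = correct / len(predicted_ms) if predicted_ms else 0.0
    recall = covered / len(actual_ms) if actual_ms else 0.0
    return precision, recall
```

Widening `tolerance_ms` raises both scores for any predictor, including a random one, which is one reason the window size matters when comparing against chance.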
Ward and Tsukahara (Results) • Issues: • Evaluation (tolerance window size, speakers produce BCs with different frequencies…) • No actual comparison between languages • Are low pitch regions and BCs simply correlated with other phenomena (syntactic completion, disfluencies…) or is there a direct cause/consequence relationship?
Effects of Native Language and Gender on BC • Feke (2003) • Conversation Analysis study of BC in native-English and native-Spanish, same- and mixed-gender dialogs
Definition of BC • BC: responses of the participant who is “clearly not holding the floor”… • Very loose compared to previous papers: • e.g. “How did you find Quechua?” is a BC • Distinguishes In-Between BC and Overlap BC
Feke (Methodology) • Recorded 8 non-scripted conversations between 8 different speakers (2 native languages x 2 genders x 2 subjects) • Manually coded In-Between BCs and Overlap BCs
Feke (Results) • No differences observed across cultures • Participants of both genders tend to use more BC when conversing with someone of the opposite gender • Difference seems bigger for females than for males
Feke (Discussion) • Interesting/surprising result from the ethnological/sociological point of view • Very few data points, no significance analysis • Only looked at number of BCs • Consequences on SDS? (e.g. using gender information in BC prediction, selecting the gender of an agent…)
BC in Practical Systems… • Takeuchi et al (2003) • Method to determine the timing of turn transitions and aizuchi (≈BC) on Japanese Human-Human corpus
Takeuchi (Approach) • Similar to Koiso et al, but only using automatically extracted features • Every 100 ms decide between: • Take turn • Aizuchi (BC) • Leave turn (wait)
Takeuchi (Approach) • Decision Tree using • Syntax (POS, content/function words) • Utterance duration • Pause duration/pause since last content word • Content word duration • F0 • Power
Takeuchi (Results) • Precision/Recall of frame classification: • Around 80% on the training set • Less than 50% on a test set • Subjective evaluation: • Artificially insert BC at predicted time • Timing was judged “good” in 70-80% of cases • On real utterances: 72% (!)
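Frame-level precision and recall for the three-way decision can be computed as in the sketch below. The slides do not say whether the reported figures are per class or averaged, so scoring each class separately is an assumption, as are the label strings and the function name `per_class_precision_recall`.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical label names for the three decisions made every 100 ms.
LABELS = ("take_turn", "aizuchi", "wait")


def per_class_precision_recall(
    predicted: Sequence[str],
    reference: Sequence[str],
) -> Dict[str, Tuple[float, float]]:
    """Return {label: (precision, recall)} over aligned frame labels."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same number of frames")
    scores: Dict[str, Tuple[float, float]] = {}
    for label in LABELS:
        # Frames where the prediction and the reference agree on this label.
        tp = sum(1 for p, r in zip(predicted, reference) if p == label and r == label)
        n_pred = sum(1 for p in predicted if p == label)
        n_ref = sum(1 for r in reference if r == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_ref if n_ref else 0.0
        scores[label] = (precision, recall)
    return scores
```

Because most frames are likely to be "wait", exact frame matching penalizes a BC predicted one frame early or late as heavily as one that is badly mistimed, which may partly explain the gap between the frame-level scores and the subjective judgments.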
Takeuchi (Discussion) • Found that syntactic information did not help (contradicts Koiso?) • Underscores the difficulty of evaluating turn-taking/backchanneling systems
Conclusion • Hard to account for simultaneous turns in conversation • Back Channel framework offers one explanation • But most work remains very specific • Missing a good theory of conversation…