Dialogue Act Tagging Discourse and Dialogue CMSC 35900-1 November 4, 2004
Roadmap • Maptask overview • Coding • Transactions • Games • Moves • Assessing agreement
Maptask • Conducted by HCRC – Edinburgh/Glasgow • Task structure: • 2 participants: Giver, Follower • 2 slightly different maps • Giver guides Follower to the destination marked on the Giver's own map • Forces interaction, ambiguities, disagreements, etc. • Conditions: Familiar/not; Visible/not
Dialogue Tagging • Goal: Represent dialogue structure as generically as possible • Three-level scheme: • Transactions • Major subtasks in participants' overall task • Conversational Games • Correspond to Grosz & Sidner (G&S) discourse segments • Conversational Moves • Initiation and response steps
Basic Dialogue Moves • Initiations and responses • Cover acts observed in dialogue – generalized • Initiations: Instruct: tell the partner to carry out some action; Explain: give unelicited information; Check: ask for confirmation of information; Align: check the partner's attention or agreement; Query-yn: ask a yes/no question; Query-wh: ask a wh-question • Responses: Acknowledge: signal understanding & acceptance; Reply-y; Reply-n; Reply-wh; Clarify • Ready: inter-game move signaling readiness to begin a new game
Game Coding • Initiation: • Identified by first move • Purpose – carried through to completion • May embed other games – mark embedding level • Mark completion/abandonment
Interrater Agreement • How good is tagging? A tagset? • Criterion: How accurate/consistent is it? • Stability: • Is the same rater self-consistent? • Reproducibility: • Do multiple annotators agree with each other? • Accuracy: • How well do coders agree with some “gold standard”?
Agreement Measure • Krippendorff's Kappa (K) • Applies to classification into discrete categories • Corrects for chance agreement • K < 0: agree less than expected by chance • Quality intervals: • K ≥ 0.8: very good; 0.6 ≤ K < 0.8: good; etc. • Maptask: K = 0.92 on segmentation, K = 0.83 on move labels
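For reference, the standard chance-corrected form behind this coefficient, where P(A) is the observed agreement rate and P(E) the agreement expected by chance:

```latex
K = \frac{P(A) - P(E)}{1 - P(E)}
```

K = 1 means perfect agreement; K = 0 means agreement no better than chance.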
Dialogue Act Tagging • Other tagsets • DAMSL, SWBD-DAMSL, VERBMOBIL, etc • Many common move types • Vary in granularity • Number of moves, types • Assignment of multiple moves
Dialogue Act Recognition • Goal: Identify dialogue act tag(s) from surface form • Challenge: Surface form can be ambiguous • “Can you X?” – yes/no question, or info-request • “Flying on the 11th, at what time?” – check, statement • Requires interpretation by hearer • Strategies: Plan inference, cue recognition
Plan-inference-based • Classic AI (BDI) planning framework • Model Belief, Knowledge, Desire • Formal definition in predicate calculus • Axiomatization of plans and actions as well • STRIPS-style: Preconditions, Effects, Body • Rules for plan inference • Elegant, but… • Labor-intensive rule, KB, and heuristic development • Effectively AI-complete
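To make the STRIPS-style encoding concrete, here is a minimal sketch of a speech-act operator in the spirit of plan-based accounts (e.g. Cohen & Perrault's REQUEST); the predicate strings and structure are illustrative, not the course's exact formalization:

```python
from dataclasses import dataclass

# Minimal STRIPS-style speech-act operator: preconditions, effects, body.
# The strings stand in for the predicate-calculus formulas the slide
# mentions; a real system would reason over them with inference rules.
@dataclass
class Operator:
    name: str
    preconditions: list  # must hold before the act applies
    effects: list        # made true by performing the act
    body: list           # decomposition into surface-level acts

# Hypothetical REQUEST operator, loosely after Cohen & Perrault
request = Operator(
    name="REQUEST(S, H, act)",
    preconditions=["S believes H CanDo(act)",
                   "S believes H believes H CanDo(act)"],
    effects=["H believes S wants act done"],
    body=["surface form: imperative, or indirect 'Can you ...?'"],
)
```

Plan inference runs such operators backwards: from an observed surface form, infer which operator, and hence which dialogue act and speaker goal, is being executed.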
Cue-based Interpretation • Employs sets of features to identify acts • Words and collocations: Please -> request • Prosody: rising pitch -> yes/no question • Conversational structure: prior act • Example: Check: • Syntax: tag question “, right?” • Syntax + prosody: fragment with rise • N-gram: d* = argmax_d P(d) P(W|d) • Cue n-grams: “so you”, “sounds like”, etc. • Details later…
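A minimal sketch of the argmax_d P(d) P(W|d) cue model, realized as a unigram naive-Bayes classifier; the tiny training set and add-alpha smoothing are invented for illustration:

```python
import math
from collections import Counter, defaultdict

# Toy labeled data: (utterance words, dialogue-act tag)
train = [
    (["so", "you", "go", "left"], "check"),
    (["right", "okay"], "acknowledge"),
    (["do", "you", "have", "a", "lake"], "query-yn"),
]

prior = Counter(d for _, d in train)   # counts for P(d)
word_counts = defaultdict(Counter)     # counts for P(w|d)
for words, d in train:
    word_counts[d].update(words)
vocab = {w for c in word_counts.values() for w in c}

def classify(words, alpha=1.0):
    """Return argmax_d log P(d) + sum_i log P(w_i|d), add-alpha smoothed."""
    best, best_score = None, float("-inf")
    for d in prior:
        score = math.log(prior[d] / len(train))
        total = sum(word_counts[d].values())
        for w in words:
            score += math.log((word_counts[d][w] + alpha) /
                              (total + alpha * len(vocab)))
        if score > best_score:
            best, best_score = d, score
    return best

print(classify(["so", "you", "have", "a", "lake"]))
```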
Recognizing Maptask Acts • Assume: • Word-level transcription • Segmentation into utterances • Ground-truth DA tags • Goal: Train classifier for DA tagging • Exploit: • Lexical and prosodic cues • Sequential dependencies between DAs • 14,810 utterances, 13 classes
Features for Classification • Acoustic-prosodic features: • Pitch, energy, duration, speaking rate • Raw and normalized; whole utterance and last 300ms • 50 real-valued features • Text features: • Counts of unigrams, bigrams, trigrams • Only n-grams that appear multiple times • 10,000 features, sparse • Features z-score normalized
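A sketch of the text side of this pipeline; the slide doesn't name a toolkit, so scikit-learn is an assumption and the utterances are toy data:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import StandardScaler

utterances = ["so you go left", "okay", "do you have a lake"]  # toy data

# Unigram/bigram/trigram counts, keeping only n-grams seen in >1 utterance
vectorizer = CountVectorizer(ngram_range=(1, 3), min_df=2)
X_text = vectorizer.fit_transform(utterances)

# z-score normalization; with_mean=False preserves sparsity
X_text = StandardScaler(with_mean=False).fit_transform(X_text)
```

The 50 real-valued prosodic features would be z-scored the same way and concatenated with the text features.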
Classification with SVMs • Support Vector Machines • One-vs-one: create n(n-1)/2 binary classifiers • Weight classes by inverse frequency • Learn weight vector and bias; classify by sign • Platt scaling to convert outputs to probabilities
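In scikit-learn terms (an assumed toolkit, with random arrays standing in for the real features), this setup maps almost directly onto SVC: its multiclass mode is one-vs-one, class_weight="balanced" implements inverse-frequency weighting, and probability=True applies Platt scaling:

```python
import numpy as np
from sklearn.svm import SVC

X = np.random.randn(200, 50)        # stand-in for the 50 prosodic features
y = np.random.randint(0, 13, 200)   # 13 dialogue-act classes

clf = SVC(kernel="linear",
          class_weight="balanced",  # inverse-frequency class weights
          probability=True)         # Platt-scaled P(y|x) via predict_proba
clf.fit(X, y)
posteriors = clf.predict_proba(X[:5])  # one column per class; rows sum to 1
```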
Incorporating Sequential Constraints • Some sequences of DA tags are more likely than others: • E.g. P(affirmative | yes-no question) = 0.5, P(affirmative | other) = 0.05 • Learn transition probabilities P(y_i | y_{i-1}) from the corpus • Platt-scaled SVM outputs give P(y_i | x_i) • Viterbi decoding to find the optimal tag sequence
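A minimal Viterbi sketch combining the two probability sources just listed; treating the Platt-scaled posteriors as per-utterance emission scores is the hybrid scheme the slide describes, and the array shapes are assumptions:

```python
import numpy as np

def viterbi(log_post, log_trans, log_prior):
    """log_post: (T, K) log P(y_t|x_t) from the SVM; log_trans: (K, K)
    log P(y_t|y_{t-1}); log_prior: (K,) log P(y_1). Returns best tag path."""
    T, K = log_post.shape
    delta = log_prior + log_post[0]           # best score ending in each tag
    back = np.zeros((T, K), dtype=int)        # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # indexed [prev tag, next tag]
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_post[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):             # follow backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```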
From Human to Computer • Conversational agents • Systems that (try to) participate in dialogues • Examples: Directory assistance, travel info, weather, restaurant and navigation info • Issues: • Limited understanding: ASR errors, interpretation • Computational costs: • broader coverage -> slower, less accurate
Dialogue Manager Tradeoffs • Flexibility vs Simplicity/Predictability • System vs User vs Mixed Initiative • Order of dialogue interaction • Conversational “naturalness” vs Accuracy • Cost of model construction, generalization, learning, etc • Models: FST, Frame-based, HMM, BDI • Evaluation frameworks