You talking to me? A Corpus and Algorithm for Conversation Disentanglement Micha Elsner and Eugene Charniak Brown University ACL 2008
Research Objective • Conversation Disentanglement
Research Objective
(Chanel) Felicia: google works :)
(Gale) Arlie: you guys have never worked in a factory before have you
(Gale) Arlie: there’s some real unethical stuff that goes on
(Regine) hands Chanel a trophy
(Arlie) Gale, of course ... thats how they make money
(Gale) and people lose limbs or get killed
(Felicia) excellent
Research Objective • Conversation disentanglement • New corpus • New annotator-agreement metrics
Application • Public chat • QA
Outline • Research Objective • Corpus • Annotator-Agreement Metrics • Disentanglement Method • Experiment
New Conversation Corpus • IRC (Internet Relay Chat) • Linux topic • Training: 706 utterances (2:06 hr) • Testing: 800 utterances (1:39 hr) • 7 university student annotators
New Annotator-Agreement Metrics
• 1-to-1 Accuracy
  • Pair up conversations from 2 annotators
  • Maximize overlap percentage
• Local Agreement
  • Is each of previous k utterances from the same conversation as the current utterance?
  • Determine annotator agreement
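The two metrics above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes each annotation is a list with one conversation label per utterance, the function names `one_to_one` and `local_agreement` are made up here, and the one-to-one pairing is found by brute force over permutations (fine for a handful of conversations, too slow for a full transcript, where a bipartite-matching solver would be the usual choice).

```python
from itertools import permutations

def one_to_one(a, b):
    """Best overlap fraction when conversations in a are paired 1-to-1 with those in b."""
    assert len(a) == len(b)
    labels_a, labels_b = sorted(set(a)), sorted(set(b))
    # overlap[(x, y)] = number of utterances labeled x in a and y in b
    overlap = {(x, y): 0 for x in labels_a for y in labels_b}
    for x, y in zip(a, b):
        overlap[(x, y)] += 1
    # Permute the larger label set against the smaller one.
    if len(labels_a) <= len(labels_b):
        small, large, key = labels_a, labels_b, lambda s, l: (s, l)
    else:
        small, large, key = labels_b, labels_a, lambda s, l: (l, s)
    best = 0
    for perm in permutations(large, len(small)):
        best = max(best, sum(overlap[key(s, l)] for s, l in zip(small, perm)))
    return best / len(a)

def local_agreement(a, b, k=3):
    """Fraction of same/different judgments on which a and b agree,
    taken over each utterance and each of its previous k utterances."""
    agree = total = 0
    for i in range(len(a)):
        for j in range(max(0, i - k), i):
            agree += (a[i] == a[j]) == (b[i] == b[j])
            total += 1
    return agree / total if total else 1.0

# Two hypothetical annotations of the same five utterances.
ann_a = [0, 0, 1, 1, 2]
ann_b = [5, 5, 5, 6, 6]
print(one_to_one(ann_a, ann_b))            # 0.6
print(local_agreement(ann_a, ann_b, k=1))  # 0.25
```

Label values themselves do not matter; only which utterances share a label does, which is why the two annotators can use different numbering.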
Automatic Conversation Identification • 2 Steps • Utterance pair judgment • Cluster
Utterance Pair Classification • Maximum Entropy Classifier
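For two classes, a maximum entropy classifier is equivalent to logistic regression, so the pair-judgment step can be illustrated with a small hand-rolled trainer. Everything specific below is an assumption for illustration only: the two binary features (a short time gap between the utterances, and one speaker mentioning the other by name), the toy training pairs, and the hyperparameters are invented here and are not the features or data from the slides.

```python
import math

def sigmoid(z):
    # Clamp to avoid overflow in math.exp for extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train_maxent(examples, labels, epochs=500, lr=0.5):
    """Binary maxent (logistic regression) trained by stochastic gradient ascent."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = y - p  # gradient of the log-likelihood w.r.t. the score
            bias += lr * err
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias

def same_conversation_prob(weights, bias, x):
    """Probability that an utterance pair with features x is in one conversation."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical features per utterance pair: [short_time_gap, mentions_other_speaker]
pairs = [[1, 1], [1, 0], [0, 1], [0, 0]]
same = [1, 1, 1, 0]  # toy labels: 1 = same conversation, 0 = different

weights, bias = train_maxent(pairs, same)
print(same_conversation_prob(weights, bias, [1, 1]) > 0.5)
print(same_conversation_prob(weights, bias, [0, 0]) > 0.5)
```

The output of `same_conversation_prob` is the kind of pairwise score that a later clustering step can consume.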
Features and Inside Test Results • Outside Test: Acc 68.2, Prec 53.3, Rec 71.3, F 60
Cluster Method • Window size n • Choose most similar preceding utterance within window or create a new conversation
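One possible reading of this slide as code: walk through the utterances in order, score each against the preceding utterances in the window, and either join the best match's conversation or open a new one. This is a sketch under stated assumptions. The window is counted in utterances, the `threshold` below which a new conversation starts is invented, and `jaccard` is a toy stand-in for the pair classifier's score.

```python
def cluster(utterances, pair_score, window=5, threshold=0.5):
    """Greedy clustering: attach each utterance to the conversation of its most
    similar predecessor within the window, or start a new conversation."""
    labels = []
    next_label = 0
    for i, utt in enumerate(utterances):
        best_j, best_score = None, threshold
        for j in range(max(0, i - window), i):
            score = pair_score(utterances[j], utt)
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            # No predecessor is similar enough: new conversation.
            labels.append(next_label)
            next_label += 1
        else:
            labels.append(labels[best_j])
    return labels

def jaccard(u, v):
    """Toy similarity (word overlap); a trained pair classifier would go here."""
    a, b = set(u.lower().split()), set(v.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

utts = [
    "google works",
    "never worked in a factory",
    "yes google works fine",
    "the factory is unsafe",
]
print(cluster(utts, jaccard, window=5, threshold=0.1))  # [0, 1, 0, 1]
```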
Automatic Conversation Identification Baselines
• All different
• All same
• Blocks of k
  • k consecutive utterances
• Pause of k
  • Within k seconds
• Speaker
  • 1 conversation/speaker
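The baselines listed above are simple enough to write out directly. In this sketch an utterance is assumed to be a `(timestamp_seconds, speaker)` tuple, which is a representation chosen here for illustration. "Pause of k" is read as: an utterance within k seconds of the previous one stays in the same conversation, and a longer gap starts a new one. Treatment of a gap of exactly k seconds is a guess.

```python
def all_different(utts):
    """Every utterance is its own conversation."""
    return list(range(len(utts)))

def all_same(utts):
    """The whole transcript is one conversation."""
    return [0] * len(utts)

def blocks_of_k(utts, k):
    """Each run of k consecutive utterances is one conversation."""
    return [i // k for i in range(len(utts))]

def pause_of_k(utts, k):
    """A gap of more than k seconds starts a new conversation."""
    labels, current = [], 0
    for i, (t, _speaker) in enumerate(utts):
        if i > 0 and t - utts[i - 1][0] > k:
            current += 1
        labels.append(current)
    return labels

def by_speaker(utts):
    """One conversation per speaker."""
    ids = {}
    return [ids.setdefault(speaker, len(ids)) for _t, speaker in utts]

# Hypothetical transcript: (timestamp in seconds, speaker)
utts = [(0, "Chanel"), (5, "Gale"), (50, "Gale"), (52, "Arlie")]
print(blocks_of_k(utts, 2))   # [0, 0, 1, 1]
print(pause_of_k(utts, 30))   # [0, 0, 1, 1]
print(by_speaker(utts))       # [0, 1, 1, 2]
```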
Resource • http://cs.brown.edu/people/melsner