
Automatically Detecting Action Items in Audio Meeting Records



Presentation Transcript


Automatically Detecting Action Items in Audio Meeting Records
William Morgan, Pi-Chuan Chang, Surabhi Gupta, Jason M. Brenier
Natural Language Processing Group, Department of Computer Science, Stanford University, USA

Summary
• TASK: Detection of action items (decisions made during a meeting that require post-meeting attention) from multi-party audio meeting recordings.
• APPROACH: Maximum entropy model trained on lexical, syntactic, temporal, semantic, and prosodic features (a rough sketch of this setup follows the transcript).
• RESULTS: The combination of features described below achieves an F score of 31.92.

Data
• ICSI Meeting Corpus with annotations for action items (Gruenstein et al. 2005).
• Inter-annotator agreement between the 2 annotators: 0.364 kappa score (see the kappa sketch after the transcript).
• Imbalanced data: 590 action-item utterances out of 24,250 total utterances (2.4%).
• We focus on the top 15 meetings as ranked by kappa; minimum kappa = 0.435.
• Gold standard for evaluation: the union of the action items from both annotators.

Formulation
• Binary classification task at the utterance level: each utterance is labeled yes/no for being an action item.
• Because the data are so imbalanced, the F measure is more suitable for evaluation than accuracy (both baselines below are worked through in the first sketch after the transcript):
• If every utterance is marked false, accuracy = 97.5% but F score = 0.
• If we flip a coin weighted in proportion to the number of positive examples, accuracy = 95.25% and P = R = F = 2.4%.
• Classifier: maximum entropy model.

Features
• Immediate lexical features: word unigrams and bigrams.
• Contextual lexical features: lexical features of neighboring utterances.
• Syntactic features: POS tags (UH, MD, NN*, VB*, VBD).
• Prosodic features: intensity, pitch, duration.
• Temporal features: utterance duration; time of occurrence relative to the end of the meeting.
• General semantic features: temporal expressions from Identifinder (e.g. "next Tuesday").
• Dialog-specific semantic features: dialog acts (56 fine DAs, e.g. rhetorical question; 7 coarse DAs, e.g. statement, question).

Analysis
[Results chart: feature-set comparison, including unigram+bigram and the best-performing unigram features.]

Conclusion
• Temporal, contextual, and fine dialog-act features were found to be most useful.
• Raw system performance numbers are low, but the relative usefulness of the features for this task is indicative of their usefulness on more mature corpora and related tasks.
• Thanks to Dan Jurafsky, Chris Manning, Stanley Peters, Matthew Purver, and all the anonymous reviewers.
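The accuracy-versus-F argument in the Formulation section can be verified with a few lines of arithmetic. The sketch below (Python, purely illustrative) reproduces the two baselines using only the corpus counts quoted on the poster.

```python
# Baseline sanity checks for the imbalanced action-item detection task.
# Counts are taken from the poster: 590 positive utterances out of 24,250.

positives = 590
total = 24_250
p = positives / total  # prior probability of an action-item utterance (~2.4%)

# Baseline 1: label every utterance "not an action item".
# Accuracy is high, but recall (and therefore F) is zero.
acc_all_false = (total - positives) / total
print(f"all-false accuracy: {acc_all_false:.2%}")  # 97.57%; the poster quotes 97.5%
print("all-false F score:  0.0")

# Baseline 2: flip a coin weighted by the positive-class prior,
# independently of the true label.
# Expected accuracy = p^2 + (1 - p)^2; expected P = R = F = p.
acc_weighted_coin = p ** 2 + (1 - p) ** 2
print(f"weighted-coin accuracy: {acc_weighted_coin:.2%}")  # ~95.25%
print(f"weighted-coin P = R = F: {p:.1%}")                 # ~2.4%
```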
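The 0.364 agreement figure in the Data section is a kappa score over per-utterance yes/no annotations. Below is a minimal sketch of a standard Cohen's kappa computation for two annotators; the toy label lists are hypothetical placeholders, not the actual ICSI annotations, and the annotation study's exact procedure may differ in detail.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same set of labeled utterances."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)

    # Observed agreement: fraction of utterances on which the annotators agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labeled at their own marginal rates.
    p_expected = 0.0
    for label in set(labels_a) | set(labels_b):
        p_expected += (labels_a.count(label) / n) * (labels_b.count(label) / n)

    return (p_observed - p_expected) / (1 - p_expected)

# Toy example (hypothetical labels: 1 = action-item utterance, 0 = not):
annotator_1 = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
annotator_2 = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0]
print(cohens_kappa(annotator_1, annotator_2))
```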
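The approach itself is a maximum entropy model over per-utterance features. As a rough illustration of that setup, the sketch below trains scikit-learn's LogisticRegression (a regularized maximum-entropy-style classifier) on unigram/bigram and duration features and evaluates with F score rather than accuracy. The utterances, feature choices, and library are illustrative assumptions, not the authors' implementation, and only a small subset of the poster's feature types is shown.

```python
# Minimal sketch of per-utterance binary classification with a
# maximum-entropy-style model. Only lexical (unigram/bigram) and one
# temporal (duration) feature are shown; the poster's full feature set
# also includes contextual, POS, prosodic, semantic, and dialog-act features.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Hypothetical utterances: (text, duration_in_seconds, is_action_item)
utterances = [
    ("so you will send the report by next tuesday", 3.1, 1),
    ("um yeah i think so", 1.0, 0),
    ("let's revisit the budget after the demo", 2.6, 1),
    ("right", 0.4, 0),
]
texts, durations, labels = zip(*utterances)

# Immediate lexical features: word unigrams and bigrams.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X_lex = vectorizer.fit_transform(texts)

# A single temporal feature: utterance duration.
X_dur = csr_matrix(np.array(durations).reshape(-1, 1))
X = hstack([X_lex, X_dur])

# Multinomial logistic regression plays the role of the maximum entropy model.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)

# Report F score (here on the training data, purely to show the call).
print(f1_score(labels, clf.predict(X)))
```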
