Improving Automatic Meeting Understanding by Leveraging Meeting Participant Behavior. Satanjeev Banerjee, Thesis Proposal, April 21, 2008
Using Human Knowledge • Knowledge of human experts is used to build systems that function under uncertainty • Often captured through in-lab data labeling • Another source of knowledge: Users of the system • Can provide subjective knowledge • System can adapt to the users and their information needs • Reduce data needed in the lab • Technical goal: Improve system performance by automatically extracting knowledge from users
Domain: Meetings • Problem: • Large parts of meetings contain unimportant information • Some small parts contain important information • How to retrieve the important information? • Impact goal: Help humans get information from meetings (Romano and Nunamaker, 2001) What information do people need from meetings?
Understanding Information Needs • Survey of 12 CMU faculty members • How often do you need information from past meetings? • On average, 1 missed-meeting and 1.5 attended-meeting information needs per month • What information do you need? • Missed-meeting: “What was discussed about topic X?” • Attended-meeting: Detail question – “What was the accuracy?” • How do you get the information? • From notes if available – high satisfaction • If meeting missed – ask face-to-face (Banerjee, Rosé & Rudnicky, 2005) Task 1: Detect agenda item being discussed Task 2: Identify utterances to include in notes
Existing Approaches to Accessing Meeting Information • Meeting recording and browsing (Cutler et al., 2002), (Ionescu et al., 2002), (Ehlen et al., 2007), (Waibel et al., 1998) • Automatic meeting understanding: • Meeting transcription (Stolcke et al., 2004), (Huggins-Daines et al., 2007) • Meeting topic segmentation (Galley et al., 2003), (Purver et al., 2006) • Activity recognition through vision (Rybski & Veloso, 2004) • Action item detection (Ehlen et al., 2007) • These approaches rely on classic supervised or unsupervised learning; meeting participants are used only after the meeting • Our goal: Extract high quality supervision ...from meeting participants (the best judges of noteworthy info) ...during the meeting (when participants are most available)
Challenges for Supervision Extraction During the Meeting • Giving feedback costs the user time and effort • Creates a distraction from the user’s main task – participating in the meeting • Our high-level approach: • Develop supervision extraction mechanisms that help meeting participants do their task • Interpret participants’ responses as labeled data
Thesis Statement Develop approaches to extract high-quality supervision from system users by designing extraction mechanisms that help them do their own task, and by interpreting their actions as labeled data
Roadmap for the Rest of this Talk • Review of past strategies for supervision extraction • Approach: • Passive supervision extraction for agenda item labeling • Active supervision extraction to identify noteworthy utterances • Success criteria, contribution and timeline
Past Strategies for Extracting Supervision from Humans • Two types of strategies: Passive and Active • Passive: System does not choose which data points the user will label • E.g.: Improving ASR from user corrections (Burke et al., 2006) • Active: System chooses which data points the user will label • E.g.: Have users label traffic images as risky or not (Saunier et al., 2004)
Research Issue 1: How to Ask Users for Labels? • Categorical labels • Associate desktop documents with a task label (Shen et al., 2007) • Label images of safe roads for robot navigation (Fails & Olsen, 2003) • Item scores/ranks • Rank report items for inclusion in a summary (Garera et al., 2007) • Pick the best schedule from system-provided choices (Weber et al., 2007) • Feedback on features: • Tag movies with new text features (Garden et al., 2005) • Identify terms that signify document similarity (Godbole et al., 2004)
Research Issue 2: How to Interpret User Actions as Feedback? Depends on similarity between user and system behavior • Interpretation simple when behaviors are similar • E.g.: Email classification (Cohen, 1996) • Interpretation may be difficult when user behavior and target system behavior are starkly different • E.g.: User corrections of ASR output (Burke et al., 2006)
Research Issue 3: How to Select Data Points for Label Query (Active Strategy)? • Typical active learning approach: • Goal: Minimize the number of labels sought to reach a target error • Approach: Choose data points most likely to improve the learner • E.g.: Pick data points closest to the decision boundary (Monteleoni et al., 2007) • Typical assumption: The human’s task is labeling • But a system user’s task is usually not the same as labeling data
Our Overall Approach to Extracting Data from System Users • Goal: Extract high quality subjective labeled data from system users. • Passive approach: Design the interface to ease interpretation of user actions as feedback • Task: Label meeting segments with agenda item • Active approach: Develop label query mechanisms that: • Query for labels while helping the user do his task • Extract labeled data from user actions • Task: Identify noteworthy utterances in meetings
Talk Roadmap • Review of past strategies for supervision extraction • Approach: • Passive supervision extraction for agenda item labeling (next) • Active supervision extraction to identify noteworthy utterances • Success criteria, contribution and timeline
Passive Supervision: General Approach • Goal: Design the interface to enable interpretation of user actions as feedback • Recipe: 1. Identify the kind of labeled data needed 2. Target a user task 3. Find a relationship between the user task and the data needed 4. Build an interface for the user task that captures the relationship
Supervision for Agenda Item Detection • Goal: Automatically detect the agenda item being discussed • Labeled data needed: Meeting segments labeled with agenda item • User task: Note taking during meetings • Relationship: Most notes refer to discussions in the preceding segment, so a note and its related segment belong to the same agenda item • Note taking interface: 1. Time-stamp speech and notes 2. Enable participants to label notes with agenda items
[Screenshot of the SmartNotes note-taking interface: an agenda pane listing items such as “Speech recognition research status”, “Topic detection research status” and “FSGs”, an “Insert Agenda” control, a shared note taking area, and a personal notes area that is not shared.]
Getting Segmentation from Notes [Diagram: time-stamped notes on the meeting timeline (“Speech recognition research status” at 300 seconds, “Topic detection research status” at 700 seconds, then “Speech recognition research status” again) induce segment boundaries labeled with agenda items.]
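To make the note-to-segment relationship concrete, here is a minimal sketch (names and data shapes are illustrative assumptions, not the actual SmartNotes implementation) of deriving a labeled segmentation from time-stamped, agenda-labeled notes:

```python
def segments_from_notes(notes, meeting_start=0.0):
    """notes: (timestamp_sec, agenda_item) pairs, sorted by time.
    Each note is assumed to describe the discussion that precedes it,
    so the stretch of meeting ending at a note gets that note's label;
    adjacent stretches with the same label are merged."""
    segments, start = [], meeting_start
    for ts, item in notes:
        if segments and segments[-1][2] == item:
            segments[-1] = (segments[-1][0], ts, item)  # extend last segment
        else:
            segments.append((start, ts, item))
        start = ts
    return segments

# Notes as in the diagram above, with a later return to the first topic:
print(segments_from_notes([(300, "Speech recognition research status"),
                           (700, "Topic detection research status"),
                           (1100, "Speech recognition research status")]))
# [(0.0, 300, 'Speech recognition research status'),
#  (300, 700, 'Topic detection research status'),
#  (700, 1100, 'Speech recognition research status')]
```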
Evaluate the Segmentation • How accurate is the extracted segmentation? • Compare to a human annotator • Also compare to standard topic segmentation algorithms • Evaluation metric: Pk • For every pair of time points k seconds apart, ask: • Are the two points in the same segment or not, in the reference? • Are the two points in the same segment or not, in the hypothesis? • Pk = (# of time-point pairs where hypothesis and reference disagree) / (total # of time-point pairs in the meeting)
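A minimal sketch of the Pk computation, assuming each segmentation is represented as a sorted list of boundary times and that time points are sampled every `step` seconds (both representational choices are mine, not the proposal's):

```python
def pk(reference, hypothesis, k, duration, step=1.0):
    """reference, hypothesis: sorted lists of segment boundary times (sec).
    Pk = fraction of time-point pairs k seconds apart on which the two
    segmentations disagree about 'same segment or not'."""
    def same_segment(boundaries, t1, t2):
        # Same segment iff no boundary falls between the two time points.
        return not any(t1 < b <= t2 for b in boundaries)

    disagreements, pairs = 0, 0
    t = 0.0
    while t + k <= duration:
        if same_segment(reference, t, t + k) != same_segment(hypothesis, t, t + k):
            disagreements += 1
        pairs += 1
        t += step
    return disagreements / pairs
```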
SmartNotes Deployment in Real Meetings • Has been used in 75 real meetings • 16 unique participants overall • 4 sequences of meetings • Sequence = 3 or more longitudinal meetings
Data for Evaluation • Data: 10 consecutive related meetings • Reference segmentation: Meetings segmented into agenda items by two different annotators • Inter-annotator agreement: Pk = 0.062
Results • Baseline: TextTiling (Hearst, 1997) • State of the art: (Purver et al., 2006) [Chart: Pk for each method; annotations mark which differences are significant and which are not.]
Does Agenda Item Labeling Help Retrieve Information Faster? • Two 10-minute meetings, manually labeled with agenda items • 5 questions prepared for each meeting • Questions prepared without access to agenda items • 16 subjects, none of whom participated in the test meetings • Within-subjects user study • Experimental manipulation: Access to segmentation versus no segmentation
Minutes to Complete the Task [Chart: average minutes to complete the retrieval task with versus without segmentation; the difference is significant.]
Shown So Far • Method of extracting meeting segments labeled with agenda item from note taking • Resulting data produces high quality segmentation • Likely to help participants retrieve information faster • Next: Learn to label meetings that don’t have notes
Proposed Task: Learn to Label Related Meetings that Don’t Have Notes • Plan: Implement language-model-based detection similar to (Spitters & Kraaij, 2001) • Train agenda-item-specific language models on automatically extracted labeled meeting segments • Perform segmentation similar to (Purver et al., 2006) • Label new meeting segments with the agenda item whose LM has the lowest perplexity
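A minimal sketch of the proposed lowest-perplexity labeling, assuming add-one-smoothed unigram language models for simplicity (the proposal does not commit to a particular LM; all names here are illustrative):

```python
import math
from collections import Counter

def train_lms(labeled_segments):
    """labeled_segments: {agenda_item: [segment transcript, ...]},
    e.g. the segments extracted automatically from participants' notes.
    Returns one add-one-smoothed unigram LM per agenda item."""
    lms = {}
    for item, texts in labeled_segments.items():
        counts = Counter(w for t in texts for w in t.lower().split())
        lms[item] = (counts, sum(counts.values()), len(counts) + 1)
    return lms

def perplexity(lm, text):
    counts, total, vocab = lm
    words = text.lower().split()
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab)) for w in words)
    return math.exp(-log_prob / max(len(words), 1))

def label_segment(lms, segment_text):
    # Label with the agenda item whose LM has the lowest perplexity.
    return min(lms, key=lambda item: perplexity(lms[item], segment_text))
```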
Proposed Evaluation • Evaluate agenda item labeling of meetings with no notes • 3 real meeting sequences with 10 meetings each • For each meeting i in each sequence: • Train agenda item labeler on automatically extracted labeled data from previous meetings in the same sequence • Compute labeling accuracy against manual labels • Show improvement in accuracy from meeting to meeting • Baseline: Unsupervised segmentation + text matching between speech and agenda item label text • Evaluate effect on retrieving information • Ask users to answer questions from each meeting • With agenda item labeling output by the improved labeler, versus • With agenda item labeling output by the baseline labeler
Talk Roadmap • Review of past strategies for supervision extraction • Approach: • Passive supervision extraction for agenda item labeling • Active supervision extraction to identify noteworthy utterances (next) • Success criteria, contribution and timeline
Active Supervision • System goal: Select data points, and query user for labels • In active learning, human’s task is to provide the labels • But system user’s task may be very different from labeling data • General approach • Design query mechanisms such that • Each label query also helps the user do his own task • The user’s response to the query can be interpreted as a label • Choose data points to query by balancing • Estimated benefit of query to user • Estimated benefit of label to learner
Task: Noteworthy Utterance Detection • Goal: Identify noteworthy utterances – utterances that participants would include in notes • Labeled data needed: Utterances labeled as either “noteworthy” or “not noteworthy”
Extracting Labeled Data • Noteworthy utterance detector (proposed) • Label query mechanism (completed): • Notes assistance: Suggest utterances for inclusion in notes during the meeting • Helps participants take notes • Interpret participants’ acceptances / rejections as “noteworthy” / “not noteworthy” labels • Method of choosing utterances for suggestion (proposed): • Benefit to user’s note taking • Benefit to learner (detector) from user’s acceptance/rejection
Proposed: Noteworthy Utterance Detector Binary classification of utterances as noteworthy or not • Support Vector Machine classifier • Features: • Lexical: Keywords, tf-idf, named entities, numbers • Prosodic: speaking rate, f0 max/min • Agenda item being discussed • Structural: Speaker identity, utterances since last accepted suggestion • Similar to meeting summarization work of (Zhu & Penn, 2006)
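A minimal sketch of such a classifier, using scikit-learn purely for illustration; the feature extraction shown (the field names and the particular lexical/prosodic/structural features) is an assumption standing in for the real transcript and audio processing:

```python
import numpy as np
from sklearn.svm import SVC

def featurize(utt):
    """utt: dict of pre-extracted features (illustrative field names)."""
    return np.array([
        utt["tfidf_max"],            # lexical: highest tf-idf score
        utt["num_named_entities"],   # lexical
        utt["contains_number"],      # lexical: 0/1
        utt["speaking_rate"],        # prosodic: words per second
        utt["f0_range"],             # prosodic: f0 max minus f0 min
        utt["agenda_item_id"],       # agenda item being discussed
        utt["speaker_id"],           # structural
        utt["utts_since_accepted"],  # structural: utterances since the
                                     # last accepted suggestion
    ])

def train_detector(utterances, labels):
    """labels: 1 = noteworthy, 0 = not noteworthy."""
    X = np.stack([featurize(u) for u in utterances])
    clf = SVC(kernel="rbf", probability=True)  # probabilities are used
    return clf.fit(X, labels)                  # when choosing suggestions
```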
Mechanism 1: Direct Suggestion [Screenshot: the system shows a single suggested note, e.g. “Fix the problem with emailing”, which the participant can accept into the notes or reject.]
Mechanism 2: “Sushi Boat” [Screenshot: suggested notes scroll past continuously, e.g. “pilot testing has been successful”, “most participants took twenty minutes”, “ron took much longer to finish tasks”, “there was no crash”; the participant can pick any suggestion into the notes.]
Differences between the Mechanisms • Direct suggestion • User can provide accept/reject labels • Higher cost for the user if the suggestion is not noteworthy • Sushi boat suggestion • User only provides accept labels • Lower cost for the user
Will Participants Accept Suggestions? • Wizard of Oz study • Wizard listened to the audio and suggested text • 6 meetings – 2 with the direct mechanism, 4 with the sushi boat mechanism
Percentage of Notes from Sushi Boat [Chart: percentage of participants’ notes drawn from accepted sushi-boat suggestions in the Wizard of Oz meetings.]
Method of Choosing Utterances for Suggestion • One idea: Pick utterances that have either high benefit for the detector or high benefit for the user • Most beneficial for the detector: Least confident utterances • Most beneficial for the user: Noteworthy utterances with high confidence • But this does not take into account the user’s past acceptance pattern • Our approach: • Estimate and track the user’s likelihood of acceptance • Pick utterances that either have high detector benefit or are very likely to be accepted
Estimating Likelihood of Acceptance • Features: • Estimated user benefit of the suggested utterance: Benefit(utt) = T(utt) − R(utt) if utt is noteworthy according to the detector, and −R(utt) if it is not, where T(utt) = time to type the utterance and R(utt) = time to read it • # of suggestions, acceptances and rejections in this and previous meetings • Amount of speech in the preceding window of time • Time since last suggestion • Combine features using logistic regression • Learn per participant from past acceptances/rejections
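A minimal sketch of the two estimates above; the logistic-regression weights would be learned per participant from past accept/reject behavior (function and parameter names are illustrative):

```python
import math

def benefit(typing_time, reading_time, detector_says_noteworthy):
    """Benefit(utt): accepting a noteworthy suggestion saves typing it but
    costs reading it; a non-noteworthy suggestion only costs reading time."""
    if detector_says_noteworthy:
        return typing_time - reading_time
    return -reading_time

def acceptance_likelihood(features, weights, bias=0.0):
    """Logistic regression over the per-participant features listed above:
    estimated benefit, suggestion/acceptance/rejection counts, amount of
    recent speech, and time since the last suggestion."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```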
Overall Algorithm for Choosing Utterances for Direct Suggestion • Given: An utterance and a participant • Decision to make: Suggest the utterance to the participant? 1. Estimate the benefit of the utterance’s label to the detector 2. Estimate the likelihood of acceptance 3. Combine the two estimates; if the combined score is above the threshold, suggest the utterance to the participant, otherwise don’t suggest it
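A sketch of that decision rule. It assumes the detector outputs a probability and that a label’s benefit to the learner peaks for the least confident predictions (per the earlier slides); the linear combination and the names alpha, beta and threshold are illustrative assumptions:

```python
def should_suggest(p_noteworthy, p_accept, alpha, beta, threshold):
    """p_noteworthy: detector's probability that the utterance is noteworthy.
    p_accept: estimated likelihood that the participant accepts it."""
    # Least confident predictions (p near 0.5) give the most useful labels.
    benefit_to_learner = 1.0 - abs(2.0 * p_noteworthy - 1.0)
    score = alpha * benefit_to_learner + beta * p_accept
    return score > threshold

# A fairly uncertain prediction that the user is likely to accept:
print(should_suggest(p_noteworthy=0.55, p_accept=0.8,
                     alpha=0.5, beta=0.5, threshold=0.5))  # True
```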
Learning the Threshold and Combination Weights • Train on the WoZ data • Split meetings into a development and a test set • For each parameter setting: • Select utterances for suggestion to the user in the development set • Compute the acceptance rate by comparing against those the user actually accepted in the meeting • Of those shown, use acceptances and rejections to retrain the utterance detector • Evaluate the utterance detector on the test set • Pick the parameter setting with an acceptable tradeoff between utterance detector error rate and acceptance rate
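A sketch of that tuning loop as a grid search; the three procedure arguments stand in for the WoZ-data steps above, and the minimum-acceptance cutoff is one assumed way of encoding “acceptable tradeoff”:

```python
from itertools import product

def tune(alphas, betas, thresholds, simulate, acceptance_rate,
         retrain_and_error, min_acceptance=0.3):
    """simulate(a, b, t): run suggestion selection on the development set.
    acceptance_rate(shown): fraction the user actually accepted.
    retrain_and_error(shown): retrain the detector on the resulting
    accept/reject labels and return its error rate on the test set."""
    best, best_err = None, float("inf")
    for alpha, beta, thresh in product(alphas, betas, thresholds):
        shown = simulate(alpha, beta, thresh)
        if acceptance_rate(shown) < min_acceptance:
            continue  # too few accepted suggestions, regardless of error
        err = retrain_and_error(shown)
        if err < best_err:
            best, best_err = (alpha, beta, thresh), err
    return best, best_err
```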
Proposed Evaluation • Evaluate improvement in noteworthy utterance detection • 3 real meeting sequences with 15 meetings each • Initial noteworthy detector trained on prior data • Retrain over first 10 meetings by suggesting notes • Test over next 5 • Evaluate: After each test meeting, ask participants to grade automatically identified noteworthy utterances • Baseline: Grade utterances identified by prior-trained detector • Evaluate effect on retrieving information • Ask users to answer questions from test meetings • With utterances identified by detector trained on 10 meetings, vs. • With utterances identified by prior-trained detector
Talk Roadmap • Review of past strategies for supervision extraction • Approach: • Passive supervision extraction for agenda item labeling • Active supervision extraction to identify noteworthy utterances • Success criteria, contribution and timeline (next)
Thesis Success Criteria • Show agenda item labeling improves with labeled data automatically extracted from notes • Show participants can retrieve information faster • Show noteworthy utterance detection improves with actively extracted labeled data • Show participants retrieve information faster
Expected Technical Contribution • Framework to actively acquire data labels from end users • Learning to identify noteworthy utterances by suggesting notes to meeting participants • Improving topic labeling of meetings by acquiring labeled data from note taking
Summary: Tasks Completed/Proposed
Proposal Timeline