Auto-Summarization of Audio-Video Presentations
Li-wei He, Elizabeth Sanocki, Anoop Gupta, Jonathan Grudin
Collaboration and Multimedia Group, Microsoft Research
ACM Multimedia 99
Motivation
• On-demand multimedia is becoming pervasive
  • Corporate training and communication
    • At Microsoft, over 360 courses online in two years
  • Research seminars
    • Microsoft Research archives about 2 talks daily
Motivation (Cont.)
• Effective summarization and browsing techniques can help viewers utilize time better
• Audio-video is different from text
• Many approaches possible
  • Time-compression, indexes, highlights, …
• This talk focuses on:
  • Informational presentations
  • Automatic summarization methods
What Is a Video Summary?
• Assembled from segments of the original
The 4 C’s of a Good Summary
• Conciseness: as short as possible
• Coverage: covers key points
• Context: defines terms before using them
• Coherence: flows naturally and fluidly
Talk Outline
• Introduction
• Automatic summarization
  • Sources of information in A/V presentations
  • Three algorithms
• Evaluation
Sources of Information
• Audio and video
  • Pitch and pause information
• Speaker actions
  • Slide-transition points
• End-user actions
  • Video segments watched by earlier viewers
Auto-summarization Methods
1. Slide-based Method (S)
• Rationale
  • Beginning of a slide marks a new topic
  • Time devoted to a slide indicates its importance
• Algorithm
  • First N% of video for each slide
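The slide-based rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `slide_based_summary`, the use of seconds as the time unit, and the example timings are all assumptions made for the sketch.

```python
def slide_based_summary(slide_starts, talk_end, fraction):
    """Return (start, end) segments covering the first `fraction` of each slide.

    slide_starts: slide-transition times in seconds, ascending (hypothetical input).
    talk_end:     time at which the last slide ends, in seconds.
    fraction:     the "N%" from the slide, expressed as a value in (0, 1].
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    boundaries = list(slide_starts) + [talk_end]
    segments = []
    for start, end in zip(boundaries, boundaries[1:]):
        # Longer slides contribute proportionally longer segments.
        segments.append((start, start + fraction * (end - start)))
    return segments


# Toy talk: three slides lasting 60 s, 120 s and 120 s, summarized at 25%.
segments = slide_based_summary([0.0, 60.0, 180.0], talk_end=300.0, fraction=0.25)
print(segments)
```

Because every slide contributes the same fraction of its own duration, a slide the speaker dwelt on automatically receives more summary time, which matches the rationale stated on the slide.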
2. Pitch-based Method (P)
• Rationale
  • Pitch activity indicates the speaker’s emphasis
• Algorithm (based on Arons, ICSLP 94)
  • Compute pitch for every 1 ms frame
  • Count the number of frames above a threshold in 15-second windows
  • Select the windows with the highest counts
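The three algorithm steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: it takes the per-frame pitch values as already computed, uses non-overlapping windows, and leaves the threshold to the caller. The function name and the toy frame and window sizes are hypothetical; the slide itself specifies 1 ms frames and 15-second windows.

```python
def pitch_based_summary(pitch_hz, frame_sec, window_sec, threshold_hz, summary_sec):
    """Pick the fixed-length windows that contain the most high-pitch frames."""
    frames_per_window = max(1, round(window_sec / frame_sec))
    talk_sec = len(pitch_hz) * frame_sec

    # Count frames above the threshold in each (non-overlapping) window.
    counts = []
    for first in range(0, len(pitch_hz), frames_per_window):
        window = pitch_hz[first:first + frames_per_window]
        counts.append(sum(1 for p in window if p > threshold_hz))

    # Keep as many windows as fit in the summary budget, highest counts first,
    # then put the chosen windows back into chronological order.
    n_keep = int(summary_sec // window_sec)
    ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    chosen = sorted(ranked[:n_keep])
    return [(i * window_sec, min((i + 1) * window_sec, talk_sec)) for i in chosen]


# Toy input: 1-second frames and 3-second windows keep the example readable.
pitch = [100, 100, 100, 200, 210, 100, 100, 220, 100, 230, 240, 250]
picked = pitch_based_summary(pitch, frame_sec=1.0, window_sec=3.0,
                             threshold_hz=180, summary_sec=6.0)
print(picked)
```

Returning the windows in chronological order keeps the assembled summary in the order of the original talk.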
3. Combined Method (SPU)
• The amount of time that previous viewers spent on a slide indicates importance
• $\text{Importance of slide } N = \dfrac{\text{Average viewer count of slide } N}{\text{Average viewer count of slide } N-1}$
• Algorithm
  • Compute importance measure for each slide
  • Allocate summary time for each slide according to the importance measure
  • Use pitch-based algorithm to pick the segments in each slide
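The first two steps of the combined method can be sketched as below. The importance ratio follows the slide; everything else is an assumption made for illustration: the function names, giving the first slide an importance of 1.0 (it has no predecessor), and splitting the summary time in direct proportion to importance. The third step would then run a pitch-based selection inside each slide's time budget.

```python
def slide_importance(avg_viewers):
    """Importance of slide N = avg viewer count of slide N / that of slide N-1."""
    importance = [1.0]  # assumption: the first slide has no predecessor
    for prev, cur in zip(avg_viewers, avg_viewers[1:]):
        importance.append(cur / prev if prev > 0 else 0.0)
    return importance


def allocate_summary_time(importance, summary_sec):
    """Split the summary budget across slides in proportion to importance."""
    total = sum(importance)
    if total == 0:
        return [0.0 for _ in importance]
    return [summary_sec * w / total for w in importance]


# Toy viewer counts: audience drops on slides 2 and 3, then holds on slide 4.
importance = slide_importance([100, 80, 40, 40])
budget = allocate_summary_time(importance, summary_sec=330.0)
print(importance)
print([round(b, 1) for b in budget])
```

Dividing by the previous slide's viewer count compensates for the overall decline in viewers over the course of a talk, so a late slide is not penalized just for coming late.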
Talk Outline
• Introduction
• Automatic summarization
• Evaluation
  • Experimental design
  • Results
Experimental Design
• To compare summarization techniques
  • Original presenters (authors) created summaries (A) as gold standard
  • Authors wrote quiz questions that covered the content of summaries
• Objective measure: quiz score improvement after watching a summary
• Subjective measures: user survey
Experimental Design (Cont.)
• 4 summary types (S, P, SPU, A)
• 4 talks chosen from Microsoft training site
• 24 Microsoft employees were subjects
• Summary types and talks are counter-balanced within each subject
Demo Summary
Quiz Score Improvement
• As expected, author-created summaries did best
• No significant difference among the automatic methods
Survey Rating Results
• A >> SPU > P = S
Percent of Value Derived
• From slide content: 46%
• From audio content: 36%
• From video content: 18%
Interesting Sequence Effect
Conclusions
• Ability to skim/browse will be key to wide use
• Automated methods can add significant value
• Adding domain knowledge is important
• Increasing acceptance over time
• Evaluation is key but very difficult
Conclusions (Cont.)
• Getting the human into the loop
  • Speakers
  • End-users as a group
    • E.g. collaborative filtering
  • End-users as individuals
    • E.g. interactive browsing
• Visit us at: http://research.microsoft.com/coet
Interface of a Typical Talk
[Screenshot; labeled parts: video, VCR-like controls, slides, table of contents]
Summary Characteristics
• Talks were from MS internal training site
  • UI Design, Internet Explorer, Dynamic HTML, Microsoft Transaction Server
• Average length
  • 20% to 25% of the original
  • 10 to 14 minutes
• Overlap with author-created summaries was no better than chance
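One way to read the "no better than chance" comparison is to measure how many seconds an automatic summary shares with the author-created one and compare that with what a random selection of the same length would share. The sketch below shows that calculation on made-up segments. The function names, the segment values, and the chance model (an automatic summary covering a fraction p of the talk, placed independently of the author's choices, is expected to overlap p times the author summary's length) are assumptions for illustration, not the procedure or data from the study.

```python
def total_seconds(segments):
    """Total duration of a list of (start, end) segments."""
    return sum(end - start for start, end in segments)


def overlap_seconds(a, b):
    """Seconds covered by both summaries; each list is assumed non-overlapping."""
    total = 0.0
    for a_start, a_end in a:
        for b_start, b_end in b:
            total += max(0.0, min(a_end, b_end) - max(a_start, b_start))
    return total


def chance_overlap_seconds(auto, author, talk_sec):
    """Expected overlap if the automatic segments were placed at random."""
    return total_seconds(auto) / talk_sec * total_seconds(author)


# Hypothetical 300-second talk with two 30-second segments in each summary.
auto = [(0.0, 30.0), (100.0, 130.0)]
author = [(20.0, 50.0), (120.0, 150.0)]
observed = overlap_seconds(auto, author)
expected = chance_overlap_seconds(auto, author, talk_sec=300.0)
print(observed, expected)
```

An observed overlap close to the expected value would indicate agreement no better than chance.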
Survey on the Summary Just Watched
• Concise: It captured the essence of the talk without using too many sentences
• Coverage: My confidence that it covered the key points of the talk is …
• Context: It is clear and easy to understand
• Coherent: It provided reasonable context, transitions, and sentence flow
Information Not Used
• Spoken text content
• Speaker gestures
Talk Outline
• Introduction
  • Motivation
  • Definition of a video summary
  • Attributes of a good summary
• Automatic summarization
• Evaluation
Viewers Over Time for One Talk
• Viewer number decreases overall and within each slide
Importance Measure
• $\text{Importance of slide } N = \dfrac{\text{Average viewer count of slide } N}{\text{Average viewer count of slide } N-1}$
Author-created Summary (A)
• Original presenters (authors) were asked to produce summaries of the talks
  • Author marked the text transcript
  • Video summaries were generated manually by aligning the video with the marked portions
Summary
• Automatic algorithms performed respectably
  • “That’s pretty cool for a computer. I thought someone had sat down and made them”
• SPU was preferred over S and P
• Will viewers get used to auto summary?
Future Work
• Compare audio/video and text summaries
• Interactive and intelligent video browser
• Visit us at http://research.microsoft.com/coet