
Improving Lecture Speech Summarization using Rhetorical Information


Presentation Transcript


  1. Improving Lecture Speech Summarization using Rhetorical Information Presenter: Shih-Hsiang Lin 02/21/2008

  2. Introduction
  • Unlike conversational speech, lectures and presentations are planned speech
  • Lecture speakers follow a relatively rigid rhetorical structure
    • overview (introduction) → more detailed description (content) → conclusions
  • Lecture speech differs stylistically from broadcast news (BN)
    • Lecture speech exhibits a wide range of speaking styles, unlike the nearly fixed set of anchors and reporters in BN
    • A typical lecture speaker often sounds dull and monotonic, unlike BN speakers, who use prosody to emphasize important points
  • It remains an open question whether systems trained to summarize BN are directly applicable to lecture speech

  3. Rhetorical Structure Characteristics in Lecture Speech
  • How can the rhetorical structure be extracted?
  • Lexical evidence
    • Using term distribution
      • When a writer writes from subtopic to subtopic in a linear text, s/he generates sentences that are tightly linked together within a subtopic
      • When s/he proceeds to the next subtopic, the newly generated sentences are less related to the previous ones, but are again tightly linked to each other
    • Using sentence cohesiveness
      • Cohesiveness is measured by the cosine similarity between content-word-frequency vectors, each consisting of more than a fixed number of content words
  • Acoustic/discourse evidence
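The cohesiveness idea above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it naively splits on whitespace to get "content words" (a real system would filter stop words and enforce the fixed minimum vector size the slide mentions), and dips in the resulting curve suggest subtopic boundaries.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two content-word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cohesiveness_curve(sentences):
    """Cosine between adjacent sentences' word-frequency vectors.

    Low values mark weakly linked sentence pairs, i.e. candidate
    subtopic boundaries in the rhetorical structure.
    """
    vecs = [Counter(s.split()) for s in sentences]
    return [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]
```

For example, two adjacent sentences sharing most of their content words score near 1, while a pair with no overlap scores 0, signaling a likely topic shift.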

  4. Rhetorical Structure Characteristics in Lecture Speech (cont.)
  • A PCA projection of all acoustic/phonetic, lexical, and discourse features of lecture speech renders the underlying rhetorical structure

  5. Extractive Summarization of Lecture Speech
  • Extractive summarization is cast as a binary classification problem: each sentence is either included in or excluded from the summary
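As an illustration of this binary-classification framing (not the paper's classifier; the feature layout and weights here are hypothetical), a linear scorer can rank sentences by feature score and label the top fraction as in-summary:

```python
def classify_sentences(features, weights, ratio=0.3):
    """Label each sentence 1 (in summary) or 0 (out).

    features: one feature vector per sentence (e.g. acoustic,
              lexical, discourse values) -- hypothetical layout.
    weights:  classifier weights, assumed already trained.
    ratio:    fraction of sentences to keep in the summary.
    """
    scores = [sum(w * f for w, f in zip(weights, feat)) for feat in features]
    k = max(1, round(len(features) * ratio))
    # Indices of the k highest-scoring sentences.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    labels = [0] * len(features)
    for i in top:
        labels[i] = 1
    return labels
```

The ratio parameter corresponds to the 30% summarization ratio reported in the results slide.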

  6. Acoustic/Phonetic/Lexical Features

  7. Discourse Feature
  • The probability distributions of words in texts can be adequately estimated by a Poisson mixture
  • The Poisson Noun score is based on the following assumptions
    • First, if a sentence contains new noun words, it probably contains new information
      • A noun word's Poisson score varies according to its position
    • Second, if a noun word occurs frequently, it is likely to be more important than other noun words, and sentences containing these high-frequency noun words should be included in the summary
  • In the slide's formula, p denotes that word k appears for the p-th time within section j, and the normalizer is the number of noun words in sentence i, which belongs to section j
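The slide does not reproduce the exact Poisson Noun formula, so the following is only a toy sketch of the two stated assumptions: each noun token's weight decays with its occurrence index p (novelty) and scales with the noun's total frequency in the section (importance), normalized by the sentence's noun count. The decay rate lam=0.5 is an illustrative choice, not a value from the paper.

```python
import math
from collections import Counter

def poisson_pmf(p, lam=0.5):
    """Poisson probability mass at p with mean lam."""
    return lam ** p * math.exp(-lam) / math.factorial(p)

def poisson_noun_scores(sentences_nouns):
    """Toy sentence scores from noun words of one rhetorical section.

    sentences_nouns: list of noun-word lists, one per sentence.
    Weight of a token = (total frequency of that noun in the section)
    * Poisson term decaying with its occurrence index p; the sentence
    score is the token-weight sum divided by the sentence's noun count.
    """
    totals = Counter(w for s in sentences_nouns for w in s)
    seen = Counter()
    scores = []
    for nouns in sentences_nouns:
        s = 0.0
        for w in nouns:
            seen[w] += 1  # this is the p-th time w appears in the section
            s += totals[w] * poisson_pmf(seen[w] - 1)
        scores.append(s / len(nouns) if nouns else 0.0)
    return scores
```

With this weighting, a sentence introducing new nouns outscores a later sentence that only repeats them, matching the first assumption above.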

  8. Extraction of Rhetorical Structure

  9. Experiments and Evaluation
  • 40 of the 60 well-organized presentations, together with PowerPoint files and manual transcriptions
    • 34 presentations containing 6,049 sentences as the training set
    • The remaining 6 presentations containing 1,116 (auto) or 1,033 (manual) sentences as the held-out test set
  • Sentence boundary detection
    • Using an HMM segmenter
      • 3–7 HMM states; each GMM contains 256 components
      • Events: silence, noise, Mandarin initial speech, Mandarin final speech, and non-English-word speech
  • Multiple-pass ASR system
    • Word accuracy: 69.7% and 70.3% for manually and automatically segmented sentences, respectively
  • ROUGE-L (longest common subsequence) as the evaluation metric

  10. ROUGE-L
  • Longest Common Subsequence (LCS)
    • Given two sequences X and Y, a longest common subsequence of X and Y is a common subsequence with maximum length
  • Example
    1. police killed the gunman (reference)
    2. police kill the gunman
    3. the gunman kill police
  • ROUGE-N: S2 = S3 (both match "police" and "the gunman")
  • ROUGE-L
    • S2 = 3/4 ("police the gunman")
    • S3 = 2/4 ("the gunman")
    • S2 > S3
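The slide's numbers can be reproduced with a standard LCS dynamic program. This sketch computes only LCS-based recall against the reference (full ROUGE-L also combines precision and recall into an F-measure), which is what the 3/4 and 2/4 scores above correspond to:

```python
def lcs_len(x, y):
    """Length of the longest common subsequence of token lists x and y."""
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xi in enumerate(x, 1):
        for j, yj in enumerate(y, 1):
            if xi == yj:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(x)][len(y)]

def rouge_l_recall(reference, candidate):
    """LCS length divided by reference length (ROUGE-L recall)."""
    ref, cand = reference.split(), candidate.split()
    return lcs_len(ref, cand) / len(ref)
```

On the slide's example, S2 ("police kill the gunman") shares the subsequence "police the gunman" with the reference (3/4), while S3 ("the gunman kill police") shares only "the gunman" (2/4), so S2 ranks higher.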

  11. Results
  • With lexical features, the segmental summarizer yields the best performance
    • This shows the contribution of rhetorical structure in lecture speech
  • Lexical features rank higher than acoustic features in all experiments
    • What is said is more important than how it is said
  • The discourse feature is even less important in the segmental summarizer than in the whole-lecture summarizer
    • This clearly shows that discourse features from BN are not applicable to lecture speech, as they are based on sentence position
  • (Results are reported at a 30% summarization ratio)
