Improving Lecture Speech Summarization using Rhetorical Information Presenter: Shih-Hsiang Lin 02/21/2008
Introduction
• Unlike conversational speech, lectures and presentations are planned speech
• Lecture speakers tend to follow a relatively rigid rhetorical structure
  • overview (introduction) → more detailed description (content) → conclusions
• Lecture speech differs stylistically from broadcast news (BN)
  • Lecture speech covers a wide range of speaking styles
    • BN, by contrast, has an almost fixed set of anchors and reporters
  • A typical lecture speaker often sounds dull and monotonic
    • BN speakers, by contrast, use prosody to emphasize important points
• It remains an open question whether systems trained to summarize BN are directly applicable to lecture speech
Rhetorical Structure Characteristics in Lecture Speech
• How can the rhetorical structure be extracted?
• Lexical evidence
  • Using term distribution
    • When a writer moves from subtopic to subtopic in a linear text, the sentences generated within a subtopic are tightly linked together
    • On proceeding to the next subtopic, the new sentences are less related to the previous ones, but they are again tightly linked to each other
  • Using sentence cohesiveness
    • Cohesiveness is measured as the cosine between content-word frequency vectors, each consisting of more than a fixed number of content words
• Acoustic/discourse evidence
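The cohesiveness measure above can be sketched in a few lines of Python. This is a minimal illustration and not the presenter's implementation: it treats every whitespace-separated token of a sentence as a content word (no stop-word filtering, no fixed-size window), and the function names `cosine` and `adjacent_cohesion` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-frequency vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def adjacent_cohesion(sentences):
    """Cosine between each sentence and the one that follows it.

    Assumption: every token counts as a content word; a real system
    would filter function words and could use fixed-size windows.
    """
    vectors = [Counter(s.lower().split()) for s in sentences]
    return [cosine(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]


sentences = [
    "speech summarization uses lexical features",
    "lexical features help speech summarization",
    "the weather was sunny yesterday",
]
scores = adjacent_cohesion(sentences)
```

A sharp drop in the score between neighbouring sentences, as between the second and third sentence here, is the kind of signal that would suggest a subtopic boundary.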
Rhetorical Structure Characteristics in Lecture Speech (cont.)
• A PCA projection of all acoustic/phonetic, lexical, and discourse features of lecture speech reveals the underlying rhetorical structure
Extractive Summarization of Lecture Speech
• Treated as a binary classification problem
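One way to turn per-sentence classifier output into an extractive summary is sketched below. It is only an illustration under stated assumptions: the slide says nothing about how scores are produced or thresholded, so the scores are taken as given, the top-scoring sentences are kept up to a target ratio, and the function name `select_summary` is hypothetical.

```python
def select_summary(scores, ratio=0.3):
    """Return indices of the sentences kept in the summary, in document order.

    Assumptions (not from the slides): `scores` holds one classifier
    confidence per sentence, and the summary keeps the highest-scoring
    sentences up to `ratio` of the document length.
    """
    keep = max(1, round(len(scores) * ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


# Ten sentences at a 30% ratio: three sentences are selected.
example_scores = [0.1, 0.9, 0.4, 0.8, 0.2, 0.7, 0.3, 0.05, 0.6, 0.5]
selected = select_summary(example_scores)
```

Returning the indices in document order keeps the extracted sentences in the sequence the speaker delivered them.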
Discourse Feature
• The probability distributions of words in texts can be adequately estimated by a Poisson mixture
• The Poisson noun feature is based on the following assumptions
  • First, if a sentence contains new noun words, it probably contains new information
    • A noun word’s Poisson score varies according to its position
  • Second, if a noun word occurs frequently, it is likely to be more important than other noun words, and sentences containing these high-frequency noun words should be included in the summary
• Notation
  • p: word k appears for the p-th time within section j
  • The number of noun words is counted in sentence i, which belongs to section j
Experiments and Evaluation
• 40 of the 60 well-organized presentations, together with PowerPoint files and manual transcriptions
  • 34 presentations containing 6049 sentences as the training set
  • The remaining 6 presentations, containing 1116 (automatically segmented) or 1033 (manually segmented) sentences, as the held-out test set
• Sentence boundary detection
  • Using an HMM segmenter
    • 3 to 7 HMM states; each GMM contains 256 components
    • Speech events: silence, noise, Mandarin initial speech, Mandarin final speech, and non-English word speech
• Multi-pass ASR system
  • Word accuracy: 69.7% and 70.3% for manually and automatically segmented sentences, respectively
• ROUGE-L (longest common subsequence) as the evaluation metric
ROUGE-L
• Longest Common Subsequence (LCS)
  • Given two sequences X and Y, a longest common subsequence of X and Y is a common subsequence with maximum length
• Example
  • S1: police killed the gunman
  • S2: police kill the gunman
  • S3: the gunman kill police
• ROUGE-N: S2 = S3, since both share “police” and “the gunman” with S1
• ROUGE-L, with S1 as the reference
  • S2 = 3/4 (“police the gunman”)
  • S3 = 2/4 (“the gunman”)
  • S2 > S3
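The LCS scores in the example can be reproduced with a short dynamic-programming sketch. It computes only the ratio of LCS length to reference length, which is what the 3/4 and 2/4 figures correspond to; it assumes whitespace tokenization and is not the official ROUGE toolkit. The function names are hypothetical.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)  # extend the common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip a token on one side
        prev = curr
    return prev[-1]


def lcs_recall(reference, candidate):
    """LCS length divided by the number of reference tokens."""
    ref = reference.split()
    cand = candidate.split()
    return lcs_length(ref, cand) / len(ref) if ref else 0.0


s1 = "police killed the gunman"  # reference
s2 = "police kill the gunman"
s3 = "the gunman kill police"

score_s2 = lcs_recall(s1, s2)  # 3/4, LCS is "police the gunman"
score_s3 = lcs_recall(s1, s3)  # 2/4, LCS is "the gunman"
```

Because the LCS must respect word order, S3 is penalized for placing “police” after “the gunman”, which an order-insensitive unigram count would not detect.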
Results
• Using lexical features, the segmental summarizer yields the best performance
  • This shows the contribution of rhetorical structure in lecture speech
• Lexical features rank higher than acoustic features in all experiments
  • What is said is more important than how it is said
• The discourse feature is even less important in the segmental summarizer than in the whole summarizer
  • This shows that discourse features from BN are not applicable to lecture speech, as they are based on sentence position
• Summarization ratio: 30%