Enhancing Diversity, Coverage and Balance for Summarization through Structure Learning
Outline
• Introduction
• Diversity, Coverage and Balance
• Optimization Problem and Structure Learning Framework
• Experiments
• Conclusion
Traditional summarization approaches
• Consider the summarization task as a binary classification problem
• 0-1 loss function
• Raises serious redundancy, imbalance, and low-recall problems

Example document (excerpt):
Robert H. Bork, who once hoped for a Supreme Court seat, instead stood before the nation's highest court Monday. He was there in the capacity of a lawyer _ apparently making him the first defeated Supreme Court nominee ever to argue before the justices. Representing Citibank in a big-stakes battle, Bork argued that U.S. banks with branch offices overseas should not be required to pay depositors after foreign governments seize or freeze those accounts. He was branded a conservative extremist by opponents. Bork, nominated by then-President Reagan, has said he was the victim of a campaign of lies and distortions led by liberals. There were no references to that fight Monday as Bork engaged in debate with the justices and opposing lawyers over the complexities of federal banking law and related matters. Bork was treated much the same as any attorney who appears before the justices. They questioned him vigorously, occasionally interrupting him for clarification and elaboration. Justice Anthony M. Kennedy, who occupies the seat that eluded Bork, directed a few questions at Bork. Bork, 63, a former federal appeals court judge, is a fellow at the American Enterprise Institute and will begin teaching constitutional law in the fall at the George Mason University law school in Arlington, Va. He will be paid $25,000 a year to teach one course each semester as a part-time professor. Bork was only the sixth man this century to be denied a Supreme Court seat by the Senate and the 26th in its history.
Three Key Requirements in Summarization
• Diversity: fewer redundant sentences
• Coverage: little information loss
• Balance: emphasize the various aspects of the document in a balanced way
Example
• Document AP890616-0192 from DUC2001
• Generally three topics (views from different perspectives):
  • Michael Milken himself
  • Involved people
  • The company

Document excerpt (AP890616-0192, "Indicted Bond Trader Quits Drexel, Sets Up Own Firm", by Stefan Fatsis, AP Business Writer, NEW YORK (AP)):
Michael Milken, the fallen Drexel Burnham Lambert financier, is striking out on his own. But what isn't clear is whether loyal ex-colleagues will follow the Pied Piper of junk bonds to his new firm _ and how long the venture will last with Milken facing a lengthy jail term. ……
Example for Diversity
• A: Milken, 42, resigned Thursday after 19 years at Drexel, where he began a wildly successful career that helped reshape corporate America in the 1980s through the pioneering use of low-grade securities called junk bonds. (Milken himself)
• B: Milken, who made a reported $550 million in 1987, said he is forming a consulting firm to assist companies that want to raise money to start up, grow or stay in business. (Milken himself)
• C: People involved in Milken's plans for the new firm, International Capital Assets Group, said Milken does not intend to raid Drexel's Beverly Hills, Calif., junk bond division, which he founded and ran until his March indictment. (Involved people)
• B & C is better than A & B
• Select sentences belonging to different topics
Example for Coverage
• A: Milken, who made a reported $550 million in 1987, said he is forming a consulting firm to assist companies that want to raise money to start up, grow or stay in business. (Relevant)
• B: Milken joined Drexel full-time in 1970 after graduating from the University of Pennsylvania's Wharton School. (Irrelevant)
• A is better than B
• Select sentences more relevant to one of the topics
Example for Balance
• A: Milken, who made a reported $550 million in 1987, said he is forming a consulting firm to assist companies that want to raise money to start up, grow or stay in business. (Milken himself)
• B: But he faces $1.85 billion in forfeitures of alleged illegal profits and a lengthy jail term if convicted on a 98-count fraud and racketeering indictment, the government's largest securities crime prosecution to date.
• C: A Drexel official, speaking on condition of anonymity, said the Wall Street giant does not anticipate conflicts with Milken's new firm because it will not be in the brokerage business.
• D: "Michael Milken made many important contributions to Drexel Burnham, and his resignation, although not unexpected, is a sad event," Drexel stated. (The company)
• A & B & C is better than B & C & D
• For each topic, select the same percentage of sentences according to its corresponding weight
Problem Formulation
• Predicting a summary: y* = argmax_y f(x, y)
• Learning a model: f(x, y) = <w, ψ(x, y)>
• Joint feature representation: ψ(x, y)
• Loss function: Δ(y, y')
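To make the formulation concrete, here is a minimal sketch of scoring and prediction with a linear compatibility function. The feature map, the brute-force candidate enumeration, and the fixed summary length k are illustrative assumptions, not the paper's actual choices (the paper searches a constrained space instead of enumerating):

```python
import numpy as np
from itertools import combinations

def feature_map(x, y):
    """Hypothetical joint feature map psi(x, y): here simply the mean
    feature vector of the selected sentences (index set y)."""
    return np.mean([x[i] for i in y], axis=0)

def score(w, x, y):
    """Linear compatibility function f(x, y) = <w, psi(x, y)>."""
    return float(np.dot(w, feature_map(x, y)))

def predict(w, x, k):
    """Brute-force argmax over all summaries of k sentences; only
    feasible for tiny documents, shown here for clarity."""
    candidates = combinations(range(len(x)), k)
    return max(candidates, key=lambda y: score(w, x, y))

# Toy usage: 5 sentences with 4-dimensional features, pick 2 sentences.
x = np.random.rand(5, 4)
w = np.random.rand(4)
print(predict(w, x, k=2))
```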
Structural Support Vector Machines (Tsochantaridis et al., 2005)
• Large margin approach
• The parameter C controls the tradeoff between model complexity and the sum of the slack variables
• The constraints enforce that the ground-truth summary scores higher than any other candidate summary
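The optimization problem shown on the original slide did not survive conversion. For reference, the standard margin-rescaling formulation from Tsochantaridis et al. (2005), which the bullets above describe, is:

```latex
\min_{\mathbf{w},\,\boldsymbol{\xi}\ge 0}\ \frac{1}{2}\lVert \mathbf{w}\rVert^2 + \frac{C}{n}\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad
\langle \mathbf{w}, \psi(x_i, y_i)\rangle - \langle \mathbf{w}, \psi(x_i, y)\rangle \ \ge\ \Delta(y_i, y) - \xi_i
\qquad \forall i,\ \forall y \ne y_i
```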
Constraint for Diversity
• Diversity: little overlap between summary sentences
• The sum of the summary sentences' individual scores should be no more than the overall score when they are scored together as a whole set
• Each sentence should focus on a different subtopic
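The formula for this constraint also did not survive conversion. Read literally, the second bullet describes a condition of roughly the following form; this is a reconstruction from the prose, not necessarily the paper's exact notation:

```latex
\sum_{s \in y} \big\langle \mathbf{w}, \psi(x, \{s\}) \big\rangle \ \le\ \big\langle \mathbf{w}, \psi(x, y) \big\rangle
```

Intuitively, if the selected sentences overlap heavily, their joint representation gains little over the sentences taken one by one, so redundant selections violate this condition.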
Constraint for Coverage
• Coverage: cover all subtopics as much as possible
• Vector v: a sentence's coverage of the subtopic set
• Subtopic coverage degree
Subtopic Set
• A subtopic set T is built for each document; each subtopic is associated with a set of words
• cover(t, s) is employed to define sentence s's coverage of subtopic t: it represents the proportion of the words in subtopic t that also appear in sentence s
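A minimal sketch of this definition, assuming a subtopic is represented as a plain word set and using naive whitespace tokenization (a real system would normalize tokens, e.g. lowercase and stem them; the subtopic word set below is hypothetical):

```python
def cover(topic_words, sentence):
    """Proportion of the subtopic's words that also appear in the
    sentence, as defined above."""
    sent_words = set(sentence.lower().split())
    topic_words = {w.lower() for w in topic_words}
    if not topic_words:
        return 0.0
    return len(topic_words & sent_words) / len(topic_words)

# Toy usage with a hypothetical subtopic word set.
t = {"consulting", "firm", "companies", "raise", "money", "grow"}
a = ("Milken said he is forming a consulting firm to assist companies "
     "that want to raise money to start up, grow or stay in business")
print(cover(t, a))  # 1.0 for this toy subtopic: all its words appear in the sentence
```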
Example
• A topic may own several subtopics, which indicate its importance
• Topic: Michael Milken himself
  • Subtopics: Milken's contribution, Milken's fall, Milken's current situation
• For subtopic t: Milken's contribution
  • a: Milken, who made a reported $550 million in 1987, said he is forming a consulting firm to assist companies that want to raise money to start up, grow or stay in business.
  • b: But he faces $1.85 billion in forfeitures of alleged illegal profits and a lengthy jail term if convicted on a 98-count fraud and racketeering indictment, the government's largest securities crime prosecution to date.
  • c: "I am naturally disappointed to be forced to leave Drexel as part of the firm's settlement with the government, but I look forward to the opportunity of helping people build companies," Michael Milken said in a statement.
• cover(t, a) = 1; cover(t, b) = 0; cover(t, c) = 0.16
Constraint for Balance
• Balance: relatively equal coverage for each subtopic
• Variation of the subtopics' coverage
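The measure of variation shown on the slide is missing from this transcript. One plausible reading is the plain variance of the subtopics' coverage degrees over the selected sentences; the sketch below (reusing cover() from the earlier sketch, with a max-over-sentences coverage degree chosen purely for illustration) is an assumption, not the paper's exact measure, which may also weight subtopics:

```python
import numpy as np

def coverage_degree(topic_words, summary_sentences):
    """Coverage degree of one subtopic by a candidate summary: here,
    the best cover(t, s) over the selected sentences (illustrative)."""
    return max(cover(topic_words, s) for s in summary_sentences)

def balance_variation(subtopics, summary_sentences):
    """Variation of the subtopics' coverage degrees; a balanced
    summary keeps this value small."""
    degrees = [coverage_degree(t, summary_sentences) for t in subtopics]
    return float(np.var(degrees))
```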
Structure Learning
• Independence graphs
  • Measure the similarity between sentences
  • Shrink the search space
• Learning algorithm
  • Cutting plane algorithm
• Making predictions
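As a reference point, here is a sketch of the standard cutting-plane training loop for structural SVMs (Tsochantaridis et al., 2005). find_most_violated and solve_qp are hypothetical helpers: the former performs loss-augmented inference (in this paper it would also respect the diversity, coverage and balance constraints), the latter re-solves the QP over the current working set of constraints:

```python
import numpy as np

def cutting_plane_train(examples, psi, delta, find_most_violated,
                        solve_qp, C=1.0, eps=1e-3, max_iter=50):
    """Standard cutting-plane loop: repeatedly add each example's most
    violated constraint to a working set and re-solve the restricted QP,
    until no constraint is violated by more than eps."""
    working_sets = [[] for _ in examples]       # one constraint set per example
    w = np.zeros(psi(*examples[0]).shape)       # initial weight vector
    for _ in range(max_iter):
        added = False
        for i, (x, y) in enumerate(examples):
            # argmax_y' of delta(y, y') + <w, psi(x, y')>
            y_hat = find_most_violated(w, x, y)
            slack = max((delta(y, yb) - np.dot(w, psi(x, y) - psi(x, yb))
                         for yb in working_sets[i]), default=0.0)
            violation = delta(y, y_hat) - np.dot(w, psi(x, y) - psi(x, y_hat))
            if violation > slack + eps:         # new, sufficiently violated constraint
                working_sets[i].append(y_hat)
                w = solve_qp(examples, working_sets, psi, delta, C)
                added = True
        if not added:                           # converged within tolerance eps
            return w
    return w
```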
Experiments Setup
• Dataset: DUC2001
  • Bigset, Docset1, Docset2
  • Bigset: contains 147 document-summary pairs from the DUC2001 dataset
  • Docset1, Docset2: two main subsets of Bigset
• Evaluation metrics
  • F1 evaluation
  • ROUGE evaluation
    • Comparable to the F1 evaluation
    • ROUGE-N-R, ROUGE-N-P, ROUGE-N-F
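For reference, ROUGE-N scores are n-gram overlap statistics between a candidate summary and a reference summary. A minimal sketch of the recall (R), precision (P) and F-measure (F) variants named above, assuming simple whitespace tokenization (the official ROUGE toolkit applies its own preprocessing):

```python
from collections import Counter

def ngrams(text, n):
    toks = text.lower().split()
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall, precision and F-measure based on clipped
    n-gram overlap between candidate and reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())
    r = overlap / max(sum(ref.values()), 1)
    p = overlap / max(sum(cand.values()), 1)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {f"ROUGE-{n}-R": r, f"ROUGE-{n}-P": p, f"ROUGE-{n}-F": f}

print(rouge_n("milken resigned from drexel",
              "milken resigned thursday after 19 years at drexel"))
```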
Overall Performance
• Our approach performs best
• Results on the smaller data sets show the robustness of our approach
Constraint Selection
• The coverage-biased constraint makes the greatest contribution to summarization
• The model trained with all three constraints performs best
Conclusion
• Diversity, Coverage and Balance
  • Proved to be of great importance to the summarization task
• Structure learning framework
  • Structural SVM
  • Three constraints enforce diversity, coverage and balance separately
  • Independence graphs and the cutting plane algorithm
• Experimental results
  • Our approach outperforms state-of-the-art ones
  • The constraints improve the performance significantly