Exploiting Structured Ontology to Organize Scattered Online Opinions

Exploiting Structured Ontology to Organize Scattered Online Opinions. Yue Lu, Huizhong Duan, Hongning Wang, ChengXiang Zhai, University of Illinois at Urbana-Champaign. August 24, COLING’2010, Beijing, China. Online Opinions: Valuable Resource … Need to organize them.

Presentation Transcript


  1. Exploiting Structured Ontology to Organize Scattered Online Opinions • Yue Lu, Huizhong Duan, Hongning Wang, ChengXiang Zhai • University of Illinois at Urbana-Champaign • August 24, COLING’2010 • Beijing, China

  2. Online Opinions: Valuable Resource … • Need to organize them • in a meaningful way!

  3. Aspect Summarization • What are “good aspects”? • 1. Concise • 2. Relevant to topic • 3. Captures major opinions • 4. Reasonable order

  4. Existing Work • Clustering + Phrase Selection [Chen&Dumais 2000] • What are “good aspects”? • 1. Concise • 2. Relevant to topic • 3. Captures major opinions • 4. Reasonable order (NA) • Our idea: • use structured ontology

  5. Why Use Ontology? • Clustering based vs. Ontology based • What are “good aspects”? • 1. Concise • 2. Relevant to topic • 3. Captures major opinions • 4. Reasonable order (NA) • In addition: • Great coverage • 12 million entities, e.g. person, place, or thing • Consistently growing • Anyone can contribute data

  6. Problem Definition • Two Main Tasks: • - Aspect Selection • - Aspect Ordering • Input: Topic = “Abraham Lincoln” • Ontology (>50 aspects): Professions, Parents, Quotations, Children, Date of Birth, Books written, Place of Death, Place of Birth, Spouse, … • Online Opinion Sentences • Output: Selected Subset of Aspects • Selected Matching Opinions • Ordered to optimize readability

  7. Aspect Selection: Task Definition • What are “good aspects”? • 3. Captures major opinions • Aligned relevant opinions: KL-divergence retrieval model • Query: an aspect, e.g. Professions, Parents, … • Collection: … • Task: Select a subset of K aspects
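The slide names a KL-divergence retrieval model for aligning opinions to aspects but gives no details. The following is a minimal sketch of a standard negative-KL scoring function, assuming Dirichlet-smoothed document models, whitespace tokenization, and a query built from the aspect label plus a few related words. The function name `kl_score`, the smoothing parameter `mu`, the add-one collection smoothing, and the toy sentences are all illustrative assumptions, not taken from the presentation.

```python
import math
from collections import Counter

def kl_score(query_tokens, doc_tokens, coll_counts, coll_len, mu=2000.0):
    """Return -KL(query model || smoothed document model); higher means a better match."""
    q = Counter(query_tokens)
    d = Counter(doc_tokens)
    q_len, d_len = sum(q.values()), sum(d.values())
    vocab = len(coll_counts)
    score = 0.0
    for w, n in q.items():
        p_q = n / q_len
        # Add-one smoothing (assumption) keeps the collection probability positive.
        p_c = (coll_counts.get(w, 0) + 1) / (coll_len + vocab)
        # Dirichlet-smoothed document language model (assumption).
        p_d = (d.get(w, 0) + mu * p_c) / (d_len + mu)
        score -= p_q * math.log(p_q / p_d)
    return score

# Hypothetical opinion sentences and an aspect query.
sentences = [
    "he was a lawyer and a politician",
    "his parents were farmers",
    "he was born in kentucky",
]
docs = [s.split() for s in sentences]
coll_counts = Counter(w for doc in docs for w in doc)
coll_len = sum(coll_counts.values())

query = "professions lawyer politician".split()
scores = [kl_score(query, doc, coll_counts, coll_len) for doc in docs]
best = max(range(len(docs)), key=lambda i: scores[i])
```

Here `best` is the index of the sentence that would be aligned most strongly to the aspect; in practice a score threshold or top-N cut-off would decide which sentences count as aligned.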

  8. Aspect Selection: Methods (1) (2) • (1) Size-based • Size = Number of aligned relevant opinions • Select K aspects of largest size • (2) Opinion Coverage-based • Reduce redundancy, maximize coverage • Select K aspects sequentially (max cover problem) • Example: Professions (Size=800) covers opinions 1, 2, 3 • Position (Size=600) covers opinions 4, 5, 3 • Parents (Size=500) covers opinions 4, 5, 6
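The difference between the two methods can be sketched on the toy opinion IDs from the slide's example. This is an illustrative sketch only: the function names are hypothetical, the greedy step is the usual approximation for maximum coverage, and the size-based variant ranks by the number of toy IDs rather than by the Size values shown on the slide.

```python
def size_based(aspect_opinions, k):
    """Pick the k aspects with the most aligned opinions (ties keep input order)."""
    ranked = sorted(aspect_opinions, key=lambda a: len(aspect_opinions[a]), reverse=True)
    return ranked[:k]

def greedy_max_cover(aspect_opinions, k):
    """Pick k aspects one at a time, each adding the most not-yet-covered opinions."""
    candidates = dict(aspect_opinions)
    selected, covered = [], set()
    for _ in range(min(k, len(candidates))):
        best = max(candidates, key=lambda a: len(candidates[a] - covered))
        selected.append(best)
        covered |= candidates.pop(best)
    return selected

# Toy opinion IDs as read from the slide's figure.
aspects = {
    "Professions": {1, 2, 3},
    "Position": {4, 5, 3},
    "Parents": {4, 5, 6},
}

by_size = size_based(aspects, 2)
by_coverage = greedy_max_cover(aspects, 2)
```

With K = 2, the coverage-based pick skips Position because it shares opinion 3 with Professions, and takes Parents instead, which covers all six opinions.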

  9. Aspect Selection: Method (3) Conditional Entropy-based • Collection is clustered, e.g. by K-means, into clusters C = {C1, C2, C3, …} • Aspect subset A = {A1: Professions, A2: Position, A3: Parents, …} • $A = \arg\min H(C \mid A) = \arg\min \left[ -\sum_i p(A_i, C_i) \log \frac{p(A_i, C_i)}{p(A_i)} \right]$ • Use a greedy algorithm to approximate the solution
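A rough sketch of how the conditional entropy objective and its greedy approximation might be computed. Several details here are assumptions rather than the authors' method: the joint probability is estimated from counts of sentences that are both aligned to an aspect and assigned to a cluster, the sum runs over all (aspect, cluster) pairs, and sentences not covered by any selected aspect are placed in a catch-all `<none>` aspect so that an empty selection is not trivially optimal. The cluster assignments are taken as given (e.g. from K-means run beforehand).

```python
import math
from collections import Counter

def conditional_entropy(selected, aspect_sents, cluster_of):
    """H(C|A) from sentence counts; uncovered sentences go to a catch-all aspect."""
    joint = Counter()
    covered = set()
    for a in selected:
        for s in aspect_sents[a]:
            joint[(a, cluster_of[s])] += 1
            covered.add(s)
    for s, c in cluster_of.items():
        if s not in covered:
            joint[("<none>", c)] += 1
    marginal = Counter()
    for (a, _), n in joint.items():
        marginal[a] += n
    total = sum(joint.values())
    # p(a, c) = n / total and p(a) = marginal[a] / total, so the ratio is n / marginal[a].
    return -sum((n / total) * math.log(n / marginal[a]) for (a, _), n in joint.items())

def greedy_select(aspect_sents, cluster_of, k):
    """Add, k times, the aspect that gives the lowest H(C|A) so far."""
    selected = []
    for _ in range(k):
        rest = [a for a in aspect_sents if a not in selected]
        if not rest:
            break
        best = min(rest, key=lambda a: conditional_entropy(selected + [a], aspect_sents, cluster_of))
        selected.append(best)
    return selected

# Hypothetical toy data: six sentences in three clusters.
cluster_of = {1: "C1", 2: "C1", 3: "C2", 4: "C2", 5: "C3", 6: "C3"}
aspect_sents = {"Professions": {1, 2}, "Position": {3, 4}, "Parents": {1, 3, 5}}

chosen = greedy_select(aspect_sents, cluster_of, 2)
h_chosen = conditional_entropy(chosen, aspect_sents, cluster_of)
```

In this toy case the aspects that line up with single clusters are preferred over Parents, whose sentences are spread across all three clusters.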

  10. Aspect Ordering: Task Definition • What are “good aspects”? • 4. Reasonable order • [Figure: an un-ordered aspect subset (Date of Birth, Place of Death, Professions, Quotations) is rearranged into an ordered list]

  11. Aspect Ordering: Methods • Ontology Order • Use the order that aspects appear in ontology • Coherence Order • Follow the order of aligned opinions in their original articles (e.g. blog article, customer review)

  12. Aspect Ordering: Coherence Order • Coherence(A1, A2) ∝ #(A1 is before A2 in the original articles) • Coherence(A2, A1) ∝ #(A2 is before A1 in the original articles) • Example: A1 = Place of Death, A2 = Date of Birth, so Coherence(A2, A1) > Coherence(A1, A2) • $\Pi(A) = \arg\max \sum_{A_i \text{ before } A_j} \mathrm{Coherence}(A_i, A_j)$ • Use a greedy algorithm to approximate the solution
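One possible way to turn the coherence counts into an ordering, sketched under stated assumptions: each article is reduced to the sequence of aspects its aligned sentences belong to, only the first occurrence of an aspect in an article is counted, and the greedy step places next the aspect that most often precedes the aspects still unplaced. The slide says a greedy algorithm is used but does not specify it, so this particular greedy rule and the toy articles are hypothetical.

```python
from collections import Counter
from itertools import combinations

def coherence_counts(articles):
    """Count, over all articles, how often aspect a first appears before aspect b."""
    counts = Counter()
    for seq in articles:
        first_seen = {}
        for pos, aspect in enumerate(seq):
            first_seen.setdefault(aspect, pos)
        # Dict keys are in first-occurrence order, so each pair (a, b) has a before b.
        for a, b in combinations(first_seen, 2):
            counts[(a, b)] += 1
    return counts

def greedy_order(aspects, counts):
    """Repeatedly place the aspect that most often precedes the remaining ones."""
    remaining = list(aspects)
    order = []
    while remaining:
        best = max(remaining, key=lambda a: sum(counts[(a, r)] for r in remaining if r != a))
        order.append(best)
        remaining.remove(best)
    return order

# Hypothetical aspect sequences extracted from three articles.
articles = [
    ["Date of Birth", "Professions", "Place of Death"],
    ["Date of Birth", "Place of Death"],
    ["Place of Death", "Date of Birth"],
]
counts = coherence_counts(articles)
order = greedy_order(["Place of Death", "Date of Birth", "Professions"], counts)
```

Here Date of Birth precedes Place of Death in two of the three articles, so it is placed first, which matches the slide's example where Coherence(A2, A1) > Coherence(A1, A2).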

  13. Experiments: Data Sets • Ontology • Freebase • Opinions • Blog entries and CNET customer reviews

  14. Sample Results: Sony Cybershot DSC-W200

  15. Aspect Selection: Evaluation Measures • Aspect Coverage (AC) = 2/3 • Aspect Precision (AP), based on Jaccard similarity, = 0.625 • Average Aspect Precision (AAP) = 0.42 • Example: aspects A1 (Professions), A2 (Position), A3 (Parents); clusters C1, C2, C3 • C1: J(A3,C1)=2/4, AP=0.5 • C2: J(A1,C2)=1, J(A2,C2)=2/4, AP=0.75 • C3: AP=0
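The three numbers on this slide can be reproduced with a short sketch, under one reading of the example: each aspect is mapped to the cluster it overlaps most, a cluster's AP is the mean Jaccard similarity of the aspects mapped to it, AC is the fraction of clusters with at least one mapped aspect, AP averages over covered clusters only, and AAP averages over all clusters with uncovered ones counted as zero. These definitions are inferred from the slide's values and are an assumption; the sentence ID sets below are invented so that the Jaccard values match the slide.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of sentence IDs."""
    return len(a & b) / len(a | b)

def aspect_metrics(aspects, clusters):
    """Return (AC, AP, AAP) under the assumed definitions described above."""
    mapped = {c: [] for c in clusters}
    for sents in aspects.values():
        best = max(clusters, key=lambda c: len(sents & clusters[c]))
        mapped[best].append(jaccard(sents, clusters[best]))
    per_cluster = {c: (sum(js) / len(js) if js else 0.0) for c, js in mapped.items()}
    covered = [c for c, js in mapped.items() if js]
    ac = len(covered) / len(clusters)
    ap = sum(per_cluster[c] for c in covered) / len(covered) if covered else 0.0
    aap = sum(per_cluster.values()) / len(clusters)
    return ac, ap, aap

# Hypothetical sentence IDs chosen to give J(A1,C2)=1, J(A2,C2)=2/4, J(A3,C1)=2/4.
clusters = {"C1": {1, 2, 3}, "C2": {4, 5}, "C3": {6}}
aspects = {"Professions": {4, 5}, "Position": {4, 5, 8, 9}, "Parents": {1, 2, 7}}

ac, ap, aap = aspect_metrics(aspects, clusters)
```

Under this reading AAP equals AC × AP (2/3 × 0.625 ≈ 0.42), so it penalizes aspect subsets that leave clusters of opinions uncovered.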

  16. Conditional Entropy-based method provides the best trade-off for Aspect Selection • Data sets: US Presidents, Digital Cameras

  17. Aspect Ordering: Human Labeling • [Figure: an aspect subset of size K (e.g. Professions, Spouse, Children, Parents, Quotations, …, Date of Birth) is labeled by humans (× 3); human agreement yields Cluster Constraints (e.g. Parents, Spouse, …; Party, Positions, …) and Order Constraints (e.g. Date of Birth, Date of Death; Education, Positions; Spouse, Children; …)]

  18. Aspect Ordering: Measures • Cluster Constraints: {Parents, Spouse, Children}, {Party, Positions} • For each constrained pair: Is this pair presented together in the output? / # aspects placed between this pair in the output? • Parents, Spouse: 1 / 0 • Parents, Children: 0 / 2 • Children, Spouse: 1 / 0 • Party, Positions: 0 / 3 • Cluster Precision = 0.5 • Cluster Penalty = 1.25
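The two summary values follow from the per-pair table if Cluster Precision is read as the mean of the "presented together" column and Cluster Penalty as the mean of the "aspects placed between" column. That reading is an assumption, checked here with a few lines of arithmetic on the slide's own rows:

```python
# (pair, presented together?, number of aspects placed between), as read from the slide.
rows = [
    (("Parents", "Spouse"), 1, 0),
    (("Parents", "Children"), 0, 2),
    (("Children", "Spouse"), 1, 0),
    (("Party", "Positions"), 0, 3),
]

# Mean of the "together" indicators and mean of the "between" counts.
cluster_precision = sum(together for _, together, _ in rows) / len(rows)
cluster_penalty = sum(between for _, _, between in rows) / len(rows)
```

This gives 2/4 = 0.5 and 5/4 = 1.25, matching the values on the slide.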

  19. Aspect Ordering: Evaluation Results • Measures: • Cluster Precision: higher is better • Cluster Penalty: lower is better

  20. Aspect Ordering: Evaluation Results • Order Constraints: Is this order pair preserved in the output? • Date of Birth, Date of Death: 1 • Education, Positions: 0 • Spouse, Children: 1 • Order Precision = 0.67 (higher is better)
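Order Precision can be sketched as the fraction of order constraints that an output ordering preserves. The function name and the output ordering below are hypothetical; the ordering was chosen only so that it preserves the same two of three constraints as the slide's example.

```python
def order_precision(output_order, order_constraints):
    """Fraction of (before, after) constraints that the output order preserves."""
    position = {aspect: i for i, aspect in enumerate(output_order)}
    kept = sum(1 for before, after in order_constraints if position[before] < position[after])
    return kept / len(order_constraints)

# Order constraints as listed on the slide.
constraints = [
    ("Date of Birth", "Date of Death"),
    ("Education", "Positions"),
    ("Spouse", "Children"),
]

# Hypothetical system output that places Positions before Education.
output = ["Date of Birth", "Positions", "Education", "Spouse", "Children", "Date of Death"]
precision = order_precision(output, constraints)
```

Two of the three pairs are preserved, giving 2/3 ≈ 0.67.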

  21. Conclusions • Novel Problem: exploit ontology for structured organization of online opinions • Aspect selection • Aspect ordering • Evaluation: US presidents and digital cameras • Conditional Entropy-based aspect selection • Coherence ordering • Future Directions: • New aspect suggestion for ontology • Better alignment of opinion sentences and aspects • Ontology + well-written articles

  22. Thank you! Questions?
