Statistical Topic Models for Integrating and Analyzing Opinions in Blog articles

Yue Lu, Qiaozhu Mei, ChengXiang Zhai

Presentation Transcript


  1. Statistical Topic Models for Integrating and Analyzing Opinions in Blog articles. Yue Lu, Qiaozhu Mei, ChengXiang Zhai

  2. Why Opinion Integration? • What has been said about Barack Obama? The health care reform? Hurricane Katrina? Al-Qaeda? • 190,451 posts; 4,773,658 results. How to digest it all?

  3. Opinions Come in Two Kinds (4,773,658 results; 190,451 posts). Q1: How to integrate and benefit from both?

  4. Opinions Come with Context: Source, Author, Time, Location. Q2: How to benefit from context?

  5. Topics B Statistical Topic Models: PLSA [Hofmann 99], [Zhai et al. 04] Topic model = unigram language model = multinomial distribution Document Generate a word in a document government0.3 response 0.2.. 1 d1 2 oil 0.1price 0.05 d2 w … dk k pray 0.2bless 0.15 Collection background Is 0.05the 0.04a 0.03 .. B

  6. Topics B PLSA Estimation Generate a word in a document Document ? ? 1 d1 2 ? d2 w Log-likelihood of the collection … ? dk k ? Collection background Is 0.05the 0.04a 0.03 .. B Estimated with Maximum Likelihood Estimator (MLE) through an EM algorithm

  7. Topics 1 - B B Exploiting Expert Opinions in PLSA How to integrate and benefit from both? Q1 [Lu & Zhai www08] Document Add as Dirichlet priors Governmentresponse r1 1 d1 Expert Opinions Oil price r2 2 Blog Opinions d2 w … dk k Collection background Is 0.05the 0.04a 0.03 .. MLE MAP B

  8. Topics 1 - B B Exploiting Opinion Context in PLSA How to benefit from context? Q2 Document [Mei et al. www06] Topic Coverage condition on context 1 c1 Year=06 d1 Spatiotemporal Context Year=08 c2 2 w Blog Opinions d2 P(i|time, location) … dk k Collection background P(i,location|time) B Is 0.05the 0.04a 0.03 .. P(time|i, location)

  9. Integration on Barack Obama

  10. Integration on Hurricane Katrina

  11. Spatiotemporal Analysis on Hurricane Katrina: snapshot of topic coverage, P(i=Government Response, location|time)

  12. Spatiotemporal Analysis on Hurricane Katrina: topic life cycle, P(time|i, location=Texas)

  13. Summary • Problem: opinion integration and analysis • Approaches: unsupervised statistical topic models; domain independent, general and robust • Many potential applications: intelligence analysis, public opinion tracking, … • Future work: system/toolkit building, more interactive support, more NLP (co-reference)

  14. Thank You!