
Rated Aspect Summarization of Short Comments

Yue Lu, ChengXiang Zhai, and Neel Sundaresan. Presented by: Sapan Shah.


Presentation Transcript


  1. Rated Aspect Summarization of Short Comments. Yue Lu, ChengXiang Zhai, and Neel Sundaresan. Presented by: Sapan Shah.

  2. Web 2.0 → Opinions Everywhere. Examples: Novotel, iPhone, Sushi Kame. Overall Rating ……

  3. Seller’s Feedback on eBay. 23,385 feedback comments received. Example comment: "Very fast shipping and awesome price!!!"

  4. Need More Specific Aspects!
      Example aspects: fast shipping, good service.
      • Is this seller rated high/low mainly because of service?
      • Which seller provides fast shipping?

  5. Rated Aspect Summarization
      Input: 23,385 feedback comments received.
      Output per aspect: Aspect, Aspect Rating, Representative Phrase, Support Information.
      Challenges:
      • How to identify coherent aspects? With user interest?
      • How to accurately rate each aspect?
      • How to get meaningful phrases supporting the ratings?

  6. Related Work
      • Review summarization
        • Unsupervised feature extraction + opinion polarity identification: [Hu&Liu 04], OPINE [Popescu&Etzioni 05], …
        • Supervised aspect extraction: [Zhuang et al] …
      • Sentiment classification
        • Binary classification: [Turney02] [Pang&Lee02] [Kim&Hovy04] [Cui et al06] …
        • Rating classification: [Pang&Lee05] [Snyder&Barzilay07] …
      • Hidden aspect discovery
        • [Hofmann99] [Blei et al03] [Zhai et al04] [Li&McCallum06] [Titov&McDonald08] …

  7. Overall Approach. Step 1: Aspect Discovery and Clustering. Step 2: Aspect Rating Prediction. Step 3: Extract Representative Phrases.

  8. Preprocessing of Short Comments
      Comment 1: "Very fast shipping and awesome price!!!"
      Comment 2: "Great business, honest seller"
      Shallow parsing turns each comment into (modifier, head term) phrases:
      Source   Modifier (opinion)   Head Term (feature)
      1        fast                 shipping
      1        awesome              price
      2        great                business
      2        honest               seller
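The slide does not say which shallow parser was used. As a rough illustration only, the sketch below replaces real shallow parsing with a crude heuristic: split the comment on punctuation and the word "and", then treat the last two words of each chunk as (modifier, head term). The function name `extract_phrases` is hypothetical.

```python
import re

def extract_phrases(comment):
    """Crude stand-in for shallow parsing (an assumption, not the authors' parser).

    Splits on punctuation and the word 'and', then takes the last two words
    of each chunk as a (modifier, head term) pair.
    """
    chunks = re.split(r"[,.!?;]+|\band\b", comment.lower())
    pairs = []
    for chunk in chunks:
        words = chunk.split()
        if len(words) >= 2:
            pairs.append((words[-2], words[-1]))
    return pairs

phrases_1 = extract_phrases("Very fast shipping and awesome price!!!")
phrases_2 = extract_phrases("Great business, honest seller")
```

On the two example comments this yields the four phrases in the table above; a real system would rely on part-of-speech tags rather than word position.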

  9. Step 1: Aspect Discovery and Clustering. (Overview: Step 1: Aspect Discovery and Clustering; Step 2: Aspect Rating Prediction; Step 3: Extract Representative Phrases.)

  10. Method (1) Head Term Clustering
      Source   Modifier   Head Term
      1        fast       shipping
      1        honest     seller
      2        fast       shipping
      2        quick      delivery
      2        reliable   seller
      Each head term is represented by the modifiers it occurs with:
      Head Term   Modifiers
      Shipping    fast:100 speedy:80 slow:50 …
      Delivery    fast:120 speedy:85 slow:70 …
      Seller      honest:80 reliable:60 …
      Clustering: e.g. k-means. Support = Cluster Size.
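A minimal sketch of head term clustering, assuming each head term is represented by its normalized modifier counts and clustered with a plain k-means. The counts are the ones visible on the slide (the slide truncates them with "…"). The farthest-point initialization and the choice to measure cluster size in phrase counts are assumptions made here for a deterministic example.

```python
def l1_normalize(counts, vocab):
    total = sum(counts.values())
    return [counts.get(word, 0) / total for word in vocab]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    # Deterministic farthest-point initialization (an assumption for this sketch).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        # Assign each point to its nearest centroid, then recompute centroids.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

modifier_counts = {
    "shipping": {"fast": 100, "speedy": 80, "slow": 50},
    "delivery": {"fast": 120, "speedy": 85, "slow": 70},
    "seller": {"honest": 80, "reliable": 60},
}
vocab = sorted({m for counts in modifier_counts.values() for m in counts})
heads = list(modifier_counts)
labels = kmeans([l1_normalize(modifier_counts[h], vocab) for h in heads], k=2)

# Support = cluster size, counted here as the number of phrases in the cluster.
support = {}
for head, label in zip(heads, labels):
    support[label] = support.get(label, 0) + sum(modifier_counts[head].values())
```

With these counts, "shipping" and "delivery" share almost the same modifier profile and land in one cluster, while "seller" forms its own.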

  11. Method (2) Unstructured PLSA
      Source   Modifier   Head Term
      1        fast       shipping
      1        honest     seller
      2        fast       shipping
      2        quick      delivery
      2        reliable   seller
      Documents d1, d2, …, dk generate each word w from a mixture of topics.
      Topic model = unigram language model = multinomial distribution [Hofmann 99]
      Topic 1: shipping 0.3, delivery 0.2
      Topic 2: email 0.25, comm. 0.22
      …
      Topic k: service 0.32, exchange 0.2

  12. Method (2) Unstructured PLSA
      Same setup as the previous slide, but the topic weights of each document d1, d2, …, dk and the word probabilities of each topic are unknown ("?") and must be estimated.
      Topic model = unigram language model = multinomial distribution [Hofmann 99]
      Topic 1: shipping ?, delivery ?
      Topic 2: email ?, comm. ?
      …
      Topic k: service ?, exchange ?
      Estimation: e.g. EM with MLE

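  13. Method (3) Structured PLSA
      Source   Modifier   Head Term
      1        fast       shipping
      1        honest     seller
      2        fast       delivery
      2        quick      delivery
      2        reliable   seller
      The phrases are regrouped by modifier, with head term counts:
      Modifier   Head Term
      fast       shipping:180, delivery:80
      slow       shipping:70, delivery:30, response:10
      As in Method (2), the topic weights and the word probabilities of each topic are unknown ("?"):
      Topic 1: shipping ?, delivery ?
      Topic 2: email ?, comm. ?
      …
      Topic k: service ?, exchange ?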
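The EM estimation named on the slides can be sketched in a few lines of plain Python. This is a generic PLSA with maximum likelihood estimation, not the authors' code. Under the reading assumed here, unstructured PLSA would pass comments as the "documents", while structured PLSA would pass modifiers as the "documents" and head terms as the words. The function name `plsa`, the random initialization, and the "honest"/"reliable" toy counts (borrowed from the head term clustering slide) are assumptions.

```python
import random

def plsa(counts, num_topics, iters=50, seed=0):
    """counts[d][w] = how often word w occurs in document d."""
    rng = random.Random(seed)
    docs = list(counts)
    vocab = sorted({w for d in docs for w in counts[d]})
    topics = range(num_topics)

    def random_dist(keys):
        weights = [rng.random() + 0.1 for _ in keys]
        total = sum(weights)
        return {key: wt / total for key, wt in zip(keys, weights)}

    p_w_z = [random_dist(vocab) for _ in topics]           # p(w | topic)
    p_z_d = {d: random_dist(list(topics)) for d in docs}   # p(topic | d)

    for _ in range(iters):
        new_w_z = [dict.fromkeys(vocab, 0.0) for _ in topics]
        new_z_d = {d: dict.fromkeys(topics, 0.0) for d in docs}
        for d in docs:
            for w, c in counts[d].items():
                # E-step: posterior over topics for this (d, w) pair.
                post = [p_z_d[d][z] * p_w_z[z][w] for z in topics]
                norm = sum(post)
                for z in topics:
                    share = c * post[z] / norm
                    new_w_z[z][w] += share
                    new_z_d[d][z] += share
        # M-step: renormalize the expected counts.
        for z in topics:
            total = sum(new_w_z[z].values())
            p_w_z[z] = {w: v / total for w, v in new_w_z[z].items()}
        for d in docs:
            total = sum(new_z_d[d].values())
            p_z_d[d] = {z: v / total for z, v in new_z_d[d].items()}
    return p_w_z, p_z_d

# Structured view: modifiers act as documents, head terms as words.
counts = {
    "fast": {"shipping": 180, "delivery": 80},
    "slow": {"shipping": 70, "delivery": 30, "response": 10},
    "honest": {"seller": 80},
    "reliable": {"seller": 60},
}
p_w_z, p_z_d = plsa(counts, num_topics=2)
```

Each entry of `p_w_z` is one topic's word distribution, which the later slides treat as one aspect.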

  14. Methods (2) and (3): Topics → Aspects
      Each topic estimated from documents d1, d2, …, dk is treated as one aspect:
      Topic 1: shipping 0.3, delivery 0.2
      Topic 2: email 0.25, comm. 0.22
      …
      Topic k: service 0.32, exchange 0.2
      Support = Topic Coverage

  15. Methods (2) and (3): Adding Prior to PLSA
      A Dirichlet prior is placed on the topics:
      Prior a1: shipping, delivery → Topic 1: shipping ?, delivery ?
      Prior a2: email, comm. → Topic 2: email ?, comm. ?
      …
      Topic k: service ?, exchange ?
      Estimation: e.g. EM with Maximum A Posteriori (MAP) instead of MLE
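The slide does not show the update formula. For reference, the standard MAP M-step for a topic with a conjugate Dirichlet prior has the form below; this is the textbook form, not taken from the slides. Here c(w, d) is the count of word w in document d, p(z_{d,w} = j) is the E-step posterior for topic j, p(w | a_j) is the prior word distribution for aspect a_j, and σ (a symbol assumed here) is the weight given to the prior.

```latex
p(w \mid \theta_j) =
  \frac{\sum_{d} c(w, d)\, p(z_{d,w} = j) + \sigma\, p(w \mid a_j)}
       {\sum_{w'} \sum_{d} c(w', d)\, p(z_{d,w'} = j) + \sigma}
```

Setting σ = 0 recovers the maximum likelihood update, and a larger σ pulls topic j toward the prior words.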

  16. Step 2: Aspect Rating Prediction. (Overview: Step 1: Aspect Discovery and Clustering; Step 2: Aspect Rating Prediction; Step 3: Extract Representative Phrases.)

  17. Method (1) Local Prediction
      Source   Modifier   Head Term   Aspect
      1        fast       shipping    Shipping
      1        great      product     Product
      2        slow       delivery    Shipping
      2        poorly     packaged    Packaging
      2        fine       product     Product
      …        …          …
      What if? (callout pointing at "slow")
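The slide does not spell out the prediction rule. The sketch below assumes that local prediction gives every phrase the overall rating of the comment it came from, with ratings coded 1 (positive) and 0 (otherwise) as on the data set slide. The ratings given to the two example comments and the `aspect_of` mapping (standing in for the output of Step 1) are hypothetical.

```python
def local_prediction(comments, aspect_of):
    """comments: list of (overall_rating, [(modifier, head_term), ...]).

    Returns (aspect, phrase, rating) triples where each phrase simply
    inherits the overall rating of its comment.
    """
    rated = []
    for rating, phrases in comments:
        for modifier, head in phrases:
            rated.append((aspect_of[head], f"{modifier} {head}", rating))
    return rated

# Hypothetical Step 1 output: head term -> aspect.
aspect_of = {"shipping": "Shipping", "delivery": "Shipping",
             "product": "Product", "packaged": "Packaging"}
comments = [
    (1, [("fast", "shipping"), ("great", "product")]),
    (0, [("slow", "delivery"), ("poorly", "packaged"), ("fine", "product")]),
]
rated = local_prediction(comments, aspect_of)
```

Under this rule "fine product" inherits rating 0 from its comment, which is the kind of mismatch the "What if?" callout seems to point at.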

  18. Method (2) Global Prediction
      Source   Modifier   Head Term   Aspect
      1        fast       shipping    Shipping
      1        great      product     Product
      2        slow       delivery    Shipping
      2        poorly     packaged    Packaging
      2        fine       product     Product
      …        …          …
      Language models for the Shipping aspect:
      • From the modifiers "fast, timely, quick, fast, slow, quickly, fast, great, bad": fast 0.2, timely 0.2, quick 0.2, …, slow 0.01
      • From the modifiers "slow, bad, fast, poor, slowly, unbearable, quick, poor": slow 0.4, bad 0.2, …, quick 0.02, fast 0.01
      What if? slow shipping
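A minimal sketch of the language model idea, under these assumptions: the first modifier list on the slide comes from positively rated comments (rating 1) and the second from the rest (rating 0); each rating class gets a unigram model over modifiers for the aspect; and a phrase is assigned the rating whose model gives its modifier the higher probability. Add-one smoothing is a choice made here, so the probabilities differ from the illustrative values on the slide.

```python
from collections import Counter

def train_models(bags, vocab):
    """bags: rating -> list of modifiers seen for one aspect at that rating."""
    models = {}
    for rating, words in bags.items():
        counts = Counter(words)
        total = len(words) + len(vocab)  # add-one smoothing (an assumption)
        models[rating] = {w: (counts[w] + 1) / total for w in vocab}
    return models

def classify(modifier, models):
    # Pick the rating whose model makes this modifier most likely.
    # Unknown modifiers score 0.0 everywhere and fall to the first rating.
    return max(models, key=lambda r: models[r].get(modifier, 0.0))

shipping_bags = {
    1: ["fast", "timely", "quick", "fast", "slow", "quickly", "fast", "great", "bad"],
    0: ["slow", "bad", "fast", "poor", "slowly", "unbearable", "quick", "poor"],
}
vocab = {w for words in shipping_bags.values() for w in words}
models = train_models(shipping_bags, vocab)
rating_for_fast = classify("fast", models)
rating_for_poor = classify("poor", models)
```

Because the models pool modifiers across all comments, a phrase can receive a rating that differs from the overall rating of its own comment.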

  19. Methods (1) and (2): Rating Aggregation
      Aspect      Phrases                                         Aspect Rating
      Shipping    quick shipping, fast delivery, slow shipping    AVG = 2.33 stars
      Packaging   well packaged, poor packaging, badly wrapped    AVG = 1.67 stars
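Aggregation is an average of the phrase ratings within each aspect. The slide does not show the per-phrase ratings. The sketch below assumes 3 stars for a positive phrase and 1 star for a negative one, which is one assignment that reproduces the 2.33 and 1.67 averages shown.

```python
from collections import defaultdict

def aggregate(rated_phrases):
    """rated_phrases: iterable of (aspect, phrase, rating) triples."""
    by_aspect = defaultdict(list)
    for aspect, _phrase, rating in rated_phrases:
        by_aspect[aspect].append(rating)
    return {aspect: sum(r) / len(r) for aspect, r in by_aspect.items()}

# Per-phrase star values are assumed, not taken from the slide.
rated_phrases = [
    ("Shipping", "quick shipping", 3),
    ("Shipping", "fast delivery", 3),
    ("Shipping", "slow shipping", 1),
    ("Packaging", "well packaged", 3),
    ("Packaging", "poor packaging", 1),
    ("Packaging", "badly wrapped", 1),
]
aspect_ratings = aggregate(rated_phrases)
```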

  20. Step 3: Extract Representative Phrases. (Overview: Step 1: Aspect Discovery and Clustering; Step 2: Aspect Rating Prediction; Step 3: Extract Representative Phrases.)

  21. Step 3: Top K Frequent Phrases
      Example for the Shipping aspect after Steps 1 and 2:
      • Fast shipping, Timely delivery, Quickly arrived, quick shipping, Fast delivery
      • slow delivery, bad shipping, Slow shipment, Bad shipping, Slow delivery (50)
      Support = Phrase Frequency
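Picking the top K phrases amounts to counting phrases within each (aspect, rating) group. A minimal sketch, assuming phrases are compared case-insensitively; the function name and the toy input are hypothetical.

```python
from collections import Counter, defaultdict

def top_k_phrases(rated_phrases, k=3):
    """Return the k most frequent phrases for each (aspect, rating) pair.

    The frequency attached to each phrase serves as its support.
    """
    buckets = defaultdict(Counter)
    for aspect, phrase, rating in rated_phrases:
        buckets[(aspect, rating)][phrase.lower()] += 1
    return {key: counter.most_common(k) for key, counter in buckets.items()}

toy_phrases = (
    [("Shipping", "Fast shipping", 1)] * 3
    + [("Shipping", "Fast delivery", 1)] * 2
    + [("Shipping", "Slow delivery", 0)]
)
top = top_k_phrases(toy_phrases, k=2)
```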

  22. Experiments: eBay Data Set
      28 eBay sellers with high feedback scores for the past year.
      Statistics                     Mean      STD
      # of comments/seller           57,055    62,395
      # of phrases/comment           1.5533    0.0442
      overall rating (positive %)    97.9      0.95
      Rating mapping: Positive → rating 1; Neutral → rating 0; Negative → rating 0.

  23. Experiments: Evaluate Step 1 (Aspect Discovery & Clustering)
      Gold standard: human labeled clusters.
      Questions:
      • Is phrase structure useful?
      • Is topic modeling effective?

  24. Eval Step 1: Aspect Coverage. Aspect Coverage measures the percentage of covered aspects. [Figure: aspect coverage versus top K clusters for k-means, Unstructured PLSA, and Structured PLSA.]

  25. Eval Step 1: Clustering Accuracy
      Clustering Accuracy measures the cluster coherence.
      Method              Clustering Accuracy
      K-means             0.36
      Unstructured PLSA   0.32
      Structured PLSA     0.52
      Still much room for improvement!
      Human Agreement:
                  Seller1   Seller2   Seller3   AVG
      Annot1-2    0.6610    0.5484    0.6515    0.6203
      Annot1-3    0.7846    0.6806    0.7143    0.7265
      Annot2-3    0.7414    0.6667    0.6154    0.6745
      AVG         0.7290    0.6319    0.6604    0.6738
      Low agreement; varies a lot.

  26. Experiments: Evaluate Step 2 (Aspect Rating Prediction)
      Questions:
      • Local prediction vs. global prediction?
      • How does aspect clustering affect this?

  27. Detailed Seller Ratings (DSR) as Gold Standard. Gold standard: user DSR ratings. The DSR criteria are used as priors of aspects.

  28. Eval Step 2: Correlation
      Correlation measures the effectiveness of ranking the four DSRs for a given seller.
      Step 1        Step 2    Kendall's tau      Pearson
      Baseline                0.2892             0.3162
      K-means       Local     0.1106 (-62%)      0.1735 (-45%)
      K-means       Global    0.1225 (-58%)      -0.0250 (-108%)
      Unstr. PLSA   Local     0.2815             0.4158
      Unstr. PLSA   Global    0.4958 (+76%)      0.5781 (+39%)
      Str. PLSA     Local     0.1905             0.4517
      Str. PLSA     Global    0.4167 (+119%)     0.6118 (+35%)
      The k-means percentages appear to be relative to the baseline, and the PLSA Global percentages relative to the corresponding Local row.
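Both correlation measures can be computed without external libraries. The sketch below uses the simple tau-a form of Kendall's tau, which ignores tie corrections; the slide does not say which variant was used. The DSR values in the usage lines are made up for illustration.

```python
import math
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

# Hypothetical true and predicted ratings for one seller's four DSRs.
true_dsr = [4.8, 4.9, 4.7, 4.6]
predicted = [0.95, 0.97, 0.90, 0.93]
tau = kendall_tau(true_dsr, predicted)
r = pearson(true_dsr, predicted)
```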

  29. Eval Step 2: Ranking Loss
      Ranking Loss measures the distance between the true and predicted ratings (smaller → better).
      Step 1        Step 2    AVG of 3 DSR
      Baseline                0.2363
      K-means       Local     0.2170 (-8%)
      K-means       Global    0.6307 (+167%)
      Unstr. PLSA   Local     0.1977 (-16%)
      Unstr. PLSA   Global    0.2101 (-11%)
      Str. PLSA     Local     0.1909 (-19%)
      Str. PLSA     Global    0.1534 (-35%)
      Local prediction: more robust. Global prediction: more accurate.
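The slide describes ranking loss only as a distance between true and predicted ratings. One common definition, assumed here, is the mean absolute difference between the two; the exact formula and rating scale used in the experiments are not given on the slide.

```python
def ranking_loss(true_ratings, predicted_ratings):
    """Mean absolute difference between true and predicted ratings."""
    pairs = list(zip(true_ratings, predicted_ratings))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)

# Made-up ratings on a shared scale, for illustration only.
loss = ranking_loss([5, 4, 3], [5, 3, 5])
```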

  30. Experiments: Evaluate Step 3 (Representative Phrases)
      Questions:
      • How do previous steps affect the phrase quality?

  31. Eval Step 3: Human Labeling
      For each DSR criterion, phrases are labeled for Rating 1 and for Rating 0:
      DSR                              Rating 1   Rating 0
      Item as Described
      Communication
      Shipping Time
      Shipping and Handling Charges
      Example phrases: Fast delivery, Prompt email, Slow shipping, …, Excessive postage, As promised, …

  32. Eval Step 3: Measures & Results
      Information Retrieval measures: human generated phrases → "relevant documents"; computer generated phrases → "retrieved documents".
      Step 1        Step 2    Prec.     Recall
      K-means       Local     0.3055    0.3510
      K-means       Global    0.2635    0.2923
      Unstr. PLSA   Local     0.4127    0.4605
      Unstr. PLSA   Global    0.4008    0.4435
      Str. PLSA     Local     0.5925    0.6379
      Str. PLSA     Global    0.5611    0.5952
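With human phrases as the relevant set and computer phrases as the retrieved set, precision and recall reduce to set overlap. A minimal sketch, assuming exact string match between phrases; the example phrase lists are made up.

```python
def precision_recall(retrieved, relevant):
    """retrieved: computer generated phrases; relevant: human generated phrases."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

computer = ["fast delivery", "slow shipping", "great item"]
human = ["fast delivery", "prompt email", "slow shipping", "as promised"]
precision, recall = precision_recall(computer, human)
```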

  33. Summary
      • Novel problem
        • Rated Aspect Summarization
      • General Methods
        • Three steps
        • Effective on eBay Feedback Comments
      • Future Work
        • Evaluate on other data
        • Three steps → One optimization framework

  34. Thank you!
