Adaptive Subjective Triggers for Opinionated Document Retrieval. Kazuhiro Seki, Organization of Advanced Science & Technology, Kobe University; Kuniaki Uehara, Graduate School of Engineering, Kobe University. 2/10/2009
Background • Increasing user-generated content (UGC) on the web • often contains personal subjective opinions • Can be helpful for personal/corporate decision making → demand to retrieve personal opinions about a given entity • Traditional IR aims to find documents relevant to a given topic (entity) • not concerned with subjectivity • Aim: Retrieve documents that are not only pertinent to a given entity but also contain subjective opinions
An (existing) approach • Lexicon-based (Mishne, 2006; Zhang et al., 2008; etc.) • Look for subjective words/phrases • “like” conveys favorable feelings • “I like the movie.” • Potential drawback • Words/phrases taken in isolation, apart from context, do not reliably indicate subjectivity • “It looks like a cat.” • “She likes singing.”
Another approach considering wider context • n-gram language model • estimates word occurrence probabilities based on the prior context, or history, i.e., the preceding (n–1) words • bigram: P(wi|wi–1) • trigram: P(wi|wi–2, wi–1) • Generally, n is set to 2 to 3
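The bigram case above can be sketched as a maximum-likelihood estimate from co-occurrence counts; the toy corpus below is purely illustrative:

```python
from collections import Counter

def train_bigram(tokens):
    """Estimate bigram probabilities P(w_i | w_{i-1}) by maximum likelihood."""
    unigrams = Counter(tokens[:-1])             # counts of history words w_{i-1}
    bigrams = Counter(zip(tokens, tokens[1:]))  # counts of adjacent pairs
    def prob(w, prev):
        if unigrams[prev] == 0:
            return 0.0
        return bigrams[(prev, w)] / unigrams[prev]
    return prob

corpus = "i like the movie i like the book".split()
p = train_bigram(corpus)
# p("the", "like") = count("like the") / count("like") = 2/2 = 1.0
# p("movie", "the") = count("the movie") / count("the") = 1/2 = 0.5
```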
Trigger models (Lau et al., 1993) • Incorporate long-distance dependencies that cannot be handled by n-gram models • Trigger pairs • word pairs such that one tends to bring about the occurrence of the other • nor → either (syntactic dependency) • memory → GB (semantic dependency) • Used by linearly interpolating with an n-gram model: (1–λ)·PB(w|h) + λ·PT(w|h), where PB is the n-gram model and PT the trigger model
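The linear interpolation above is straightforward to express in code; the component models and the weight λ = 0.2 below are stand-ins for illustration:

```python
def interpolate(p_base, p_trigger, lam=0.2):
    """Interpolated model: (1 - lam) * P_B(w|h) + lam * P_T(w|h)."""
    def prob(w, h):
        return (1.0 - lam) * p_base(w, h) + lam * p_trigger(w, h)
    return prob

# toy component models (constant probabilities, for illustration only)
p_base = lambda w, h: 0.10   # n-gram estimate P_B(w|h)
p_trig = lambda w, h: 0.50   # trigger estimate P_T(w|h)

p = interpolate(p_base, p_trig, lam=0.2)
# p("GB", ["memory"]) = 0.8 * 0.10 + 0.2 * 0.50 = 0.18
```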
Identifying trigger pairs (Tillmann et al., 1996) • Start from a corpus and vocabulary and build an n-gram model P(w|h) • For each potential trigger pair a → b, build an extended model PE(w|h) • Evaluate each pair by the log-likelihood difference Δa→b = ∑i {log PE(wi|hi) – log P(wi|hi)} • Pairs with P(b|h) < t are called low-level triggers • The selected pairs form the trigger model PT(w|h)
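The log-likelihood difference Δa→b can be computed directly from the two models; the toy models below (and the pair memory → GB they encode) are hypothetical:

```python
import math

def loglik_gain(doc_pairs, p_base, p_ext):
    """Delta_{a->b} = sum_i [log P_E(w_i|h_i) - log P(w_i|h_i)]
    over (word, history) pairs from the evaluation corpus."""
    return sum(math.log(p_ext(w, h)) - math.log(p_base(w, h))
               for w, h in doc_pairs)

# hypothetical models: the extended model boosts P("GB"|h) when "memory" is in h
p_base = lambda w, h: 0.01
def p_ext(w, h):
    return 0.05 if (w == "GB" and "memory" in h) else 0.01

pairs = [("GB", ["my", "memory"]), ("cat", ["a"])]
gain = loglik_gain(pairs, p_base, p_ext)
# only the first pair contributes: log(0.05) - log(0.01) = log 5 ~= 1.609
```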
Building trigger model PT • For each identified trigger pair (a → b), compute an association score α(b|a) based on their co-occurrences • Define the trigger model PT using α(·): PT(w|h) is the average association score between the words in history h and word w
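A minimal sketch of PT as the average association score over the history; the α table below is a hypothetical stand-in for scores estimated from co-occurrence counts:

```python
def trigger_prob(alpha):
    """P_T(w|h): average association score alpha(w|a) over words a in history h.
    Pairs absent from the alpha table contribute a score of 0."""
    def prob(w, h):
        if not h:
            return 0.0
        return sum(alpha.get((a, w), 0.0) for a in h) / len(h)
    return prob

# hypothetical association scores (trigger word, triggered word) -> alpha
alpha = {("i", "like"): 0.3, ("memory", "GB"): 0.4}
pt = trigger_prob(alpha)
# pt("like", ["i", "saw"]) = (0.3 + 0.0) / 2 = 0.15
```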
Subjective trigger model • Assumptions • A personal subjective opinion consists of two main components • The subject of the opinion (e.g., “I”, “you”) or the object the opinion is about (e.g., “The Curious Case of Benjamin Button”) • A subjective expression (e.g., “like”, “feel”) • Treat them as triggering and triggered words, respectively • Triggering words are expressed as pronouns • Empirical finding • Proximity of pronouns and subjective expressions to objects is an effective measure of opinionatedness (Zhou et al., 2007; Yang et al., 2007)
Identifying “subjective” trigger pairs • Pronouns considered • I, my, you, it, its, he, his, she, her, we, our, they, their, this • History h: preceding words in the same sentence • Corpus: 5,000 customer reviews from Amazon.com
Identifying “subjective” trigger pairs (cont.) • Low-level triggers (P(w|h) < t) cause a problem • Penalize a frequent word w paired with an infrequent history h
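The slide does not specify the exact penalty; a PMI-style association is one common choice that naturally discounts a globally frequent w that rarely co-occurs with its history word, sketched here under that assumption:

```python
import math
from collections import Counter

def pmi_association(pair_counts, word_counts, total):
    """PMI-style score log[P(a, b) / (P(a) * P(b))]: large when a and b
    co-occur more than chance, small when b is frequent but the pair is rare."""
    def score(a, b):
        p_ab = pair_counts[(a, b)] / total
        if p_ab == 0:
            return float("-inf")
        p_a = word_counts[a] / total
        p_b = word_counts[b] / total
        return math.log(p_ab / (p_a * p_b))
    return score

# toy counts: "a" occurs 2 times, "b" 4 times, the pair (a, b) 2 times, 10 tokens total
s = pmi_association(Counter({("a", "b"): 2}), Counter({"a": 2, "b": 4}), 10)
# s("a", "b") = log(0.2 / (0.2 * 0.4)) = log 2.5
```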
Opinion retrieval • Probability that d is relevant to q AND subjective • product of PINM(q|d) and PE(d) = ∏i PE(wi|hi) • PE(d) is smaller for longer d • PINM(q|d) and PE(d) may have largely different variances • Normalize PE(d) by length m and take a weighted sum of logs • Pipeline: query q → IR by INM produces PINM(q|d) over documents d → rerank documents d with the subjective language model PE(w|h)
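The length-normalized weighted sum of logs can be sketched as follows; the interpolation weight β = 0.5 and the toy log-probabilities are illustrative, not values from the paper:

```python
def opinion_score(log_p_inm, log_pe_tokens, beta=0.5):
    """Rerank score: weighted sum of log P_INM(q|d) and the length-normalized
    subjective score (1/m) * sum_i log P_E(w_i|h_i)."""
    m = len(log_pe_tokens)
    norm_pe = sum(log_pe_tokens) / m
    return (1.0 - beta) * log_p_inm + beta * norm_pe

# toy document with two tokens and their per-token log P_E values
s = opinion_score(-2.0, [-1.0, -3.0], beta=0.5)
# 0.5 * (-2.0) + 0.5 * ((-1.0 - 3.0) / 2) = -2.0
```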
Dynamic model adaptation • Motivation • Language models built from Amazon reviews may not be effective for some types of entities • Procedure • Carry out a keyword search for a given topic • Use the k top-ranked blog posts to identify new trigger pairs (a → b) and compute new association scores α′(·) • Update the trigger model using the new trigger pairs’ association scores
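One simple way to realize the update step is to blend the base association table with the scores re-estimated from the retrieved posts; the blending weight γ is a hypothetical choice, not taken from the slides:

```python
def adapt_alpha(alpha_base, alpha_new, gamma=0.5):
    """Blend base association scores (e.g., from Amazon reviews) with scores
    alpha' re-estimated from the top-k retrieved blog posts."""
    merged = dict(alpha_base)
    for pair, score in alpha_new.items():
        merged[pair] = (1.0 - gamma) * merged.get(pair, 0.0) + gamma * score
    return merged

base = {("i", "like"): 0.3}
new = {("i", "like"): 0.5, ("it", "works"): 0.2}
adapted = adapt_alpha(base, new, gamma=0.5)
# ("i", "like"): 0.5*0.3 + 0.5*0.5 = 0.4 ; ("it", "works"): 0.5*0.2 = 0.1
```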
Empirical evaluation • Data • TREC Blog track test collection 2006 • 3 million blog posts crawled from Dec 2005 to Feb 2006 • 50 “topics” (user information needs) • Relevant & opinionated posts are explicitly labeled • Two types of assessment • Evaluation of the language models • Their effects on opinion retrieval
Evaluation of language models • Perplexity • Uncertainty of a language model L in predicting a word sequence (d = w1,…,wm) • Created two hypothetical documents from the Blog track collection • concatenation of all the opinionated posts → dO • concatenation of all the relevant (but non-opinionated) posts → dN
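Perplexity over a document can be computed from the per-token model probabilities as PP = exp(−(1/m)·∑i log P(wi|hi)); a minimal sketch:

```python
import math

def perplexity(token_probs):
    """PP = exp(-(1/m) * sum_i log P(w_i|h_i)); lower means the model
    is less surprised by the document."""
    m = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / m)

# a model assigning probability 0.25 to every token has perplexity 4
pp = perplexity([0.25, 0.25, 0.25, 0.25])
```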
Perplexity results • Higher-order n-grams monotonically decrease perplexity irrespective of language model and document type • The opinionated document dO leads to lower perplexity • The subjective language model PE produces lower perplexity than the n-gram model PB
Analysis on individual topics • Topics with notable improvement • “MacBook Pro”. Laptop (+0.22) • “Heineken”. Company and brand names (+0.20) • “Shimano”. Company and brand names (+0.19) • “Boardchess”. Board game (+0.13) • “Zyrtec”. Medication (product name) (+0.12) • “Mardi Gras”. Final day of carnival (+0.11) • Most of them are products • The model learned from Amazon reviews is effective for products in general, including beer and medication • Also effective for other types of entities
Analysis on individual topics (cont.) • Topics with performance decline • “Jim Moran”. Congressman (–0.15) • “World Trade Org.”. International organization (–0.05) • “Cindy Sheehan”. Anti-war activist (–0.03) • “Ann Coulter”. Political commentator (–0.01) • “West Wing”. TV drama set in the White House (–0.01) • “Sonic food industry”. Fast-food restaurant chain (–0.01) • Politics and organizations are difficult to improve? • Some exceptions improved: Bruce Bartlett (+0.07), Jihad (+0.06), McDonalds (+0.03), Qualcomm (+0.02)
Results for dynamic model adaptation • Moderately improved performance • For “Zyrtec”, AP improved by 47.7%
Results for model adaptation for difficult topics • For most topics, AP slightly but consistently improved
Conclusions • Proposed subjective trigger models reflecting subjective opinions • Two assumptions + a modification to low-level triggers • Combined with an IR model for opinion retrieval • 22.0% improvement over INM in MAP • Effective for most topics, slight drop for topics concerning politics and organizations • Dynamic model adaptation • Positive effect overall (+25.0% over initial search) • Moderately effective for politics- and organization-related topics
Future work • Use of a larger corpus of customer reviews • Use of labeled data in the blog track test collection • Refine the approach to model adaptation
References
Mishne, G.: Multiple Ranking Strategies for Opinion Retrieval in Blogs, Proceedings of the 15th Text Retrieval Conference (2006).
Zhang, M. and Ye, X.: A generation model to unify topic relevance and lexicon-based sentiment for opinion retrieval, Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 411–418 (2008).
Lau, R., Rosenfeld, R. and Roukos, S.: Trigger-based language models: a maximum entropy approach, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, pp. 45–48 (1993).
Tillmann, C. and Ney, H.: Selection criteria for word trigger pairs in language modeling, in Grammatical Inference: Learning Syntax from Sentences, Lecture Notes in Computer Science, pp. 95–106, Springer Berlin / Heidelberg (1996).
Zhou, G., Joshi, H. and Bayrak, C.: Topic Categorization for Relevancy and Opinion Detection, Proceedings of the 16th Text Retrieval Conference (2007).
Yang, K., Yu, N. and Zhang, H.: WIDIT in TREC 2007 Blog Track: Combining Lexicon-Based Methods to Detect Opinionated Blogs, Proceedings of the 16th Text Retrieval Conference (2007).
Zhang, W., Yu, C. and Meng, W.: Opinion retrieval from blogs, Proceedings of the 16th ACM Conference on Information and Knowledge Management, pp. 831–840 (2007).