1 / 24

A Unified Relevance Model for Opinion Retrieval

A Unified Relevance Model for Opinion Retrieval. (CIKM 09’) Xuanjing Huang, W. Bruce Croft Date: 2010/02/08 Speaker: Yu-Wen, Hsu. Outline. Introduction Formal Framework for Opinion Retrieval Sentiment expansion techniques Test Collections and Sentiment Resources Experiments Conclusion.

margo
Download Presentation

A Unified Relevance Model for Opinion Retrieval

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Unified Relevance Model for Opinion Retrieval (CIKM 09’) Xuanjing Huang, W. Bruce Croft Date: 2010/02/08 Speaker: Yu-Wen, Hsu

  2. Outline • Introduction • Formal Framework for Opinion Retrieval • Sentiment expansion techniques • Test Collections and Sentiment Resources • Experiments • Conclusion

  3. Introduction • Opinion retrieval is very different to traditional topic-based retrieval. • Relevant documents should not only be relevant to the targets, but also contain subjective opinions about them. • The text collections are more informal “word of mouth” web data.

  4. the greatest challenge : the difficulty in representing the user’s information need. • typical queries : content words, some cue words. • the majority of previous work in opinion retrieval • documents are ranked by topical relevance only • candidate relevant documents are re-ranked by their opinion scores

  5. The novelty and effectiveness of the approach: • topic retrieval and sentiment classification can be converted to a unified opinion retrieval procedure through query expansion; • not only suitable for English data, but also to be effective for a Chinese benchmark collection; • extended to other tasks where user queries are inadequate to express the information need, • geographical information retrieval and music retrieval.

  6. Formal Framework for Opinion Retrieval • Suppose we can obtain an opinion relevance model from a query. • R : the opinion relevance model for a query • D : the document model • V : the vocabulary

  7. Content words and opinion words contribute differently to opinion retrieval. • Topical relevance is determined by the matching of content words and the user query • The sentiment and subjectivity of a document is decided by the opinion words. • We divide the vocabulary V into two disjoint subsets: • a content word vocabulary CV • an opinion word vocabulary OV

  8. An opinion relevance model is a unified model of both topic relevance and sentiment. In this model, Score(D), the score of a document is defined as: CV : the set of original query terms, or obtained by any query expansion technique OV : only opinion words are used to expand an original query instead of content words.

  9. Sentiment expansion techniques • Query-independent sentiment expansion • Query-dependent sentiment expansion • Mixture relevance model

  10. *Query-independent sentiment expansion • Sentiment expansion based on seed words • positive seeds, negative seeds • advantage: nearly always express strong sentiment • disadvantage: some of them are infrequent in the text collections

  11. Sentiment expansion based on text corpora • only the most frequent opinion words are selected to expand the original queries • occurrences in a general corpus may be misleading • reliable opinionated corpus: Cornell movie review datasets and MPQA Corpus

  12. Sentiment expansion based on relevance data • The goal is to automatically learn a function from training data to rank documents • Given a set of query relevance judgments, we can define the individual contribution to opinion retrieval for an opinion word • w : an opinion word, used to expand every original query. • contribution of w : the maximum increase in the MAP of the expanded queries over a set of original queries

  13. Qi : the i-th query in the training set • Qi ∪w: the i-th expanded query • weight(w) : the weight of w, • AP(Qi): the average precision of the retrieved documents for Qi • AP(Qi ∪ w,weight(w)) :the average precision of the retrieved documents for the expanded query while the weight of w is set as weight(w)

  14. *Query-dependent sentiment expansion • A target is always associated with some particular opinion words. • e.g. Mozart: Austrian musician (genius , famous) • extract opinion words from a set of user-provided relevant opinionated documents • when there are no user-provided relevant documents. • the relevant document set C can be acquired through pseudo-relevance feedback. • rank documents using query likelihood scores • select some top ranked documents to get the pseudo relevant set of C.

  15. *Mixture relevance model • query-independent and query-dependent. • query-independent: the most valuable opinion words are always general words and can be used to express opinions about any target. • query-dependent: those words most likely to co-occur with the terms in the original query are used for expansion. • These words are used to express opinions about particular targets.

  16. the interpolation of the scores assigned by original query, query-independent sentiment expansion, and query-dependent sentiment expansion

  17. Test Collections and Sentiment Resources • Both evaluations aim at locating documents that express an opinion about a given target.

  18. *Sentiment resources • English resources • General Inquirer : 3,600 opinion words • OpinionFinder’s Subjectivity Lexicon : >5,600

  19. Chinese resources • HowNet Sentiment Vocabulary: 7000

  20. Experiments

  21. If retrieval effectiveness is preferred, mixture approaches should be adopted • If retrieval efficiency is preferred, query independent sentiment expansion should be adopted.

  22. Conclusion • we have proposed the opinion relevance model, a formal framework for directly modeling the information need for opinion retrieval. • query terms are expanded with a small number of opinion words to represent the information need. • We propose a series of sentiment expansion approaches to find the most appropriate query-independent or query-dependent opinion words.

More Related