Relevance Models In Information Retrieval Victor Lavrenko and W. Bruce Croft Center for Intelligent Information Retrieval Department of Computer Science University of Massachusetts Amherst Presenter: Chia-Hao Lee
Outline • Introduction • Related Work • Relevance Models • Estimating a Relevance Model • Experimental Results • Conclusion NTNU Speech Lab
Introduction • The field of information retrieval has been primarily concerned with developing algorithms to identify relevant pieces of information in response to a user’s information need. • The notion of relevance is central to information retrieval, and much research in the area has focused on developing formal models of relevance.
Introduction (cont.) • One of the most popular models, introduced by Robertson and Sparck Jones, ranks documents by their likelihood of belonging to the relevant class of documents for a query. • More recently, the language modeling approach has shifted the focus from developing heuristic weights to represent term importance to estimation techniques for the document model.
Related Work • Classical Probabilistic Approach Underlying most research on probabilistic models of information retrieval is the probability ranking principle, advocated by Robertson, which suggests ranking the documents D by the odds of their being observed in the relevant class: P(D|R) / P(D|N), where R denotes the relevant class and N the non-relevant class.
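The odds-based ranking just described is commonly instantiated with the Robertson/Sparck Jones term weight. A minimal sketch (the function names and the standard +0.5 smoothing constants follow the usual textbook form, not this presentation):

```python
import math

def rsj_weight(df_rel, n_rel, df, n_docs):
    """Robertson/Sparck Jones relevance weight for one term.
    df_rel: relevant docs containing the term, n_rel: number of
    relevant docs, df: docs containing the term, n_docs: collection
    size. The +0.5 terms are the conventional smoothing."""
    p = (df_rel + 0.5) / (n_rel + 1.0)                 # P(term present | relevant)
    q = (df - df_rel + 0.5) / (n_docs - n_rel + 1.0)   # P(term present | non-relevant)
    return math.log((p * (1 - q)) / (q * (1 - p)))

def score(doc_terms, query_terms, weights):
    """Score a document by summing the weights of query terms it contains."""
    return sum(weights[t] for t in query_terms if t in doc_terms)
```

A term concentrated in the relevant class gets a large positive weight; a term distributed evenly across relevant and non-relevant documents gets a weight near zero.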
Related Work (cont.) • Language Modeling Approach Most of these approaches rank the documents in the collection by the probability that a query Q would be observed during repeated random sampling from the model M_D of document D: P(Q|M_D) = ∏_{q∈Q} P(q|M_D)
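A minimal sketch of this query-likelihood ranking under unsmoothed maximum-likelihood unigram models (function names are ours; a single unseen query word drives the product to zero, which motivates the smoothing discussed later):

```python
from collections import Counter

def query_likelihood(query, doc):
    """P(Q | M_D): product of maximum-likelihood unigram
    probabilities of the query words under the document model."""
    tf, n = Counter(doc), len(doc)
    p = 1.0
    for q in query:
        p *= tf[q] / n      # 0 if q never occurs in doc
    return p

def rank(query, docs):
    """Return document ids sorted by decreasing query likelihood.
    docs maps an id to a list of tokens."""
    return sorted(docs, key=lambda d: query_likelihood(query, docs[d]),
                  reverse=True)
```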
Related Work (cont.) • Cross-Language Approach Language-modeling approaches have been extended to cross-language retrieval by Hiemstra and de Jong and Xu et al. The model proposed by Berger and Lafferty applies to the “translation” of a document into a query in a monolingual environment, but it can readily accommodate a bilingual environment.
Relevance Models • Define some parameters: • V: a vocabulary in some language • C: some large collection of documents • R: the subset of relevant documents in C (R ⊆ C) • P(w|R): a relevance model, the probability distribution over V giving the probability of observing a word w when sampling from the relevant documents
Relevance Models (cont.) • The primary goal of Information Retrieval systems is to identify a set of documents relevant to some query Q. • Unigram language models ignore any short-range interactions between the words in a sample of text, so we cannot distinguish between grammatical and non-grammatical samples of text. • Attempts to use higher-order models have been few and did not lead to noticeable improvements.
Relevance Models (cont.) • Two approaches to document ranking: the probability ratio, advocated by the classical probabilistic models, and cross-entropy.
Relevance Models (cont.) • Classical probabilistic models The Probability Ranking Principle suggests that we should rank the documents in order of decreasing probability ratio: P(D|R) / P(D|N) If we assume a document D to be a sequence of independent words d_1 … d_n, the probability ranking principle may be expressed as a product of the ratios: ∏_i P(d_i|R) / P(d_i|N)
Relevance Models (cont.) • Cross-entropy Let P(w|R) denote the language model of the relevant class, and for every document D let P(w|D) denote the corresponding document language model. Cross-entropy is a natural measure of divergence between two language models, defined as: H(R‖D) = −∑_{w∈V} P(w|R) log P(w|D)
Relevance Models (cont.) • Intuitively, documents with small cross-entropy from the relevance model are likely to be relevant, so we rank the documents by increasing cross-entropy. • Cross-entropy enjoys a number of attractive theoretical properties. • One property is of particular importance: if we estimate P(w|R) as the relative frequency of the word w in the user query Q, ranking by cross-entropy becomes equivalent to ranking by query likelihood.
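The cross-entropy ranking can be sketched as follows (function names are ours; the 1e-10 floor for unseen words stands in for proper smoothing):

```python
import math
from collections import Counter

def query_model(query):
    """Estimate P(w|R) as the relative frequency of w in the query."""
    tf = Counter(query)
    return {w: c / len(query) for w, c in tf.items()}

def cross_entropy(rel_model, doc_model):
    """H(R || D) = - sum_w P(w|R) log P(w|D); smaller means the
    document model is closer to the relevance model."""
    return -sum(p * math.log(doc_model.get(w, 1e-10))
                for w, p in rel_model.items())
```

Documents are then sorted by increasing cross-entropy, so the document whose language model best matches the relevance model comes first.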
Estimating a Relevance Model • We discuss a set of techniques that could be used to estimate the set of probabilities P(w|R): • Estimation from a set of examples. • Estimation without examples. • Cross-lingual estimation.
Estimating a Relevance Model (cont.) • Estimation from a set of Examples: Let P(D|R) denote the probability of randomly picking document D from the relevant set R. We assume each relevant document is equally likely to be picked at random, so the estimate is: P(D|R) = 1/|R| The probability of observing a word w if we randomly pick some word from D is simply the relative frequency of w in D: P(w|D) = #(w, D) / |D|
Estimating a Relevance Model (cont.) • Combining the estimates from the above two equations, the probability of randomly picking a document D and then observing the word w is: P(w, D|R) = P(D|R) P(w|D) • The overall probability of observing the word w in the relevant class: P(w|R) = ∑_{D∈R} P(D|R) P(w|D) = (1/|R|) ∑_{D∈R} P(w|D)
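The averaging step above can be sketched directly (the function name is ours):

```python
from collections import Counter

def relevance_model(relevant_docs):
    """P(w|R) = (1/|R|) * sum over D of P(w|D), assuming each
    relevant document is equally likely to be picked.
    relevant_docs is a list of token lists."""
    model = Counter()
    for doc in relevant_docs:
        tf, n = Counter(doc), len(doc)
        for w, c in tf.items():
            # relative frequency in this document, averaged over R
            model[w] += (c / n) / len(relevant_docs)
    return dict(model)
```

The result is a proper distribution: the per-document relative frequencies each sum to one, so their average does as well.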
Estimating a Relevance Model (cont.) • Now suppose we have a sufficiently large, but incomplete, subset of examples S ⊆ R, and would like to estimate the relevance model P(w|R). • Indeed, the resulting estimator has a number of interesting properties: • It is an unbiased estimator of P(w|R) for a random subset S. • It is the maximum-likelihood estimator with respect to the set of examples S. • It is the maximum-entropy probability distribution constrained by S.
Estimating a Relevance Model (cont.) • Most smoothing methods center around a fairly simple idea: P_λ(w|D) = λ P(w|D) + (1 − λ) P(w|C) where λ is a parameter that controls the degree of smoothing. • This connection allows us to interpret smoothing as a way of selecting a different prior distribution.
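The linear interpolation above is a one-liner in code; a minimal sketch (the function name is ours):

```python
def smooth(p_doc, p_background, lam):
    """Linear (Jelinek-Mercer style) interpolation of a document
    model with a background model. lam controls the degree of
    smoothing: lam = 1 returns the unsmoothed document model,
    lam = 0 returns the background model."""
    vocab = set(p_doc) | set(p_background)
    return {w: lam * p_doc.get(w, 0.0) + (1 - lam) * p_background.get(w, 0.0)
            for w in vocab}
```

Because both inputs are distributions, the interpolated model still sums to one, and every word with nonzero background probability gets nonzero smoothed probability.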
Estimating a Relevance Model (cont.) • Estimation without Examples Estimation of relevance models when no training examples are available. As a running example we will consider the task of ad-hoc information retrieval, where we are given a short 2-3 word query, indicative of the user’s information need, and no examples of relevant documents. Figure 2.2: Queries and relevant documents are random samples from an underlying relevance model R. Note: the sampling process could be different for queries and documents.
Estimating a Relevance Model (cont.) Our best bet is to relate the probability of w to the conditional probability of observing w given that we just observed the query words q_1 … q_k: P(w|R) ≈ P(w|q_1 … q_k) We are “translating” a set of words into a single word.
Estimating a Relevance Model (cont.) • Method 1: i.i.d. sampling Assume that the query words q_1 … q_k and the words w in relevant documents are sampled identically and independently from a unigram distribution M. We assume that w and all the q_i are sampled independently and identically to each other: P(w, q_1 … q_k | M) = P(w|M) ∏_{i=1}^{k} P(q_i|M)
Estimating a Relevance Model (cont.) Combination: P(w, q_1 … q_k) = ∑_{M∈C} P(M) P(w|M) ∏_{i=1}^{k} P(q_i|M)
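Method 1 can be sketched as follows, with one unigram model per document and a uniform prior P(M) (function names and the uniform prior are our assumptions):

```python
def rm_method1(query, doc_models):
    """Method 1 (i.i.d. sampling):
    P(w, q1..qk) = sum over M of P(M) P(w|M) prod_i P(qi|M),
    then normalize over w to obtain P(w | q1..qk).
    doc_models is a list of dicts mapping word -> P(word|M)."""
    prior = 1.0 / len(doc_models)          # uniform P(M)
    joint = {}
    for m in doc_models:
        q_lik = prior
        for q in query:
            q_lik *= m.get(q, 0.0)         # prod_i P(qi|M)
        for w, p_w in m.items():
            joint[w] = joint.get(w, 0.0) + q_lik * p_w
    total = sum(joint.values())
    return {w: p / total for w, p in joint.items()} if total > 0 else joint
```

Documents whose models assign high probability to the query dominate the sum, so the estimated relevance model is a query-weighted mixture of document models.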
Estimating a Relevance Model (cont.) • Method 2: conditional sampling We fix a value of w according to some prior P(w). Then we perform the following process k times: pick a distribution M_i according to P(M_i|w), then sample the query word q_i from M_i with probability P(q_i|M_i). The effect of this sampling strategy is that we assume the query words to be independent of each other, but keep their dependence on w.
Estimating a Relevance Model (cont.) To estimate P(M_i|w) we compute the expectation over the universe C of our unigram models. Combination: P(w, q_1 … q_k) = P(w) ∏_{i=1}^{k} ∑_{M∈C} P(q_i|M) P(M|w)
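Method 2 can be sketched in the same style; here P(M|w) is obtained by Bayes' rule from a uniform prior over models, and the word prior is passed in explicitly (both are assumptions of this sketch, as are the names):

```python
def rm_method2(query, doc_models, w_prior):
    """Method 2 (conditional sampling):
    P(w, q1..qk) = P(w) * prod_i sum over M of P(qi|M) P(M|w),
    then normalize over w. w_prior maps word -> P(w)."""
    joint = {}
    for w, p_w in w_prior.items():
        # posterior P(M|w) proportional to P(w|M) under uniform P(M)
        post = [m.get(w, 0.0) for m in doc_models]
        z = sum(post)
        if z == 0:
            continue                        # w unseen in every model
        post = [p / z for p in post]
        val = p_w
        for q in query:
            val *= sum(pm * m.get(q, 0.0)
                       for pm, m in zip(post, doc_models))
        joint[w] = val
    total = sum(joint.values())
    return {w: v / total for w, v in joint.items()} if total else joint
```

Unlike Method 1, each candidate word w carries its own posterior over models, so the query likelihood is evaluated relative to the models that actually generate w.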
Estimating a Relevance Model (cont.) Figure 2.3: Dependence networks for the two methods of estimating the joint probability P(w, q_1 … q_k).
Estimating a Relevance Model (cont.) • Cross-lingual Estimation Goal: estimate the relevance model in some target language, different from the language of the query. Let Q = q_1 … q_k be the query in the source language and let R be the unknown set of target documents that are relevant to that query.
Estimating a Relevance Model (cont.) • We want to estimate the probability distribution P(t|R) for every word t in the vocabulary of the target language. • An implicit assumption behind equation (2.22) is that there exists a joint probabilistic model from which we can compute the joint probability P(t, q_1 … q_k). • Note that the q_i and t represent words from different languages, and so will not naturally occur in the same documents.
Estimating a Relevance Model (cont.) • 1. Estimation with a parallel corpus Suppose we have at our disposal a parallel corpus C, a collection of document pairs (D_s, D_t), where D_s is a document in the source language and D_t is a document in the target language discussing the same topic as D_s. The two estimation methods carry over by scoring the query against the source side and taking word probabilities from the target side: Method 1: P(t, q_1 … q_k) = ∑_{(D_s,D_t)∈C} P(D_s, D_t) P(t|D_t) ∏_{i=1}^{k} P(q_i|D_s) Method 2: P(t, q_1 … q_k) = P(t) ∏_{i=1}^{k} ∑_{(D_s,D_t)∈C} P(q_i|D_s) P(D_s, D_t|t)
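The parallel-corpus variant of Method 1 can be sketched by pairing a source-language model with a target-language model per aligned document pair (function names and the uniform pair prior are assumptions of this sketch):

```python
def clrm_method1(query, pairs):
    """Cross-lingual Method 1 over a parallel corpus.
    pairs is a list of (m_src, m_tgt) tuples, where m_src maps
    source words -> P(word|D_s) and m_tgt maps target words
    -> P(word|D_t). The query is scored against the source model;
    target-word probabilities come from the aligned target model."""
    prior = 1.0 / len(pairs)               # uniform P(D_s, D_t)
    joint = {}
    for m_src, m_tgt in pairs:
        q_lik = prior
        for q in query:
            q_lik *= m_src.get(q, 0.0)     # prod_i P(qi|D_s)
        for t, p_t in m_tgt.items():
            joint[t] = joint.get(t, 0.0) + q_lik * p_t
    z = sum(joint.values())
    return {t: v / z for t, v in joint.items()} if z else joint
```

Target words from pairs whose source side matches the query inherit high probability, even though query and target words never co-occur in a single document.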
Estimating a Relevance Model (cont.) • 2. Estimation with a statistical lexicon A statistical lexicon is a special dictionary which gives the translation probability P(s|t) for every source word s and every target word t. In this case we let C be the set of target documents. In order to compute P(s|D_t) for a source word s in a target document D_t, we translate through the lexicon: P(s|D_t) = ∑_t P(s|t) P(t|D_t)
Experiment • Experiment Setup: English Resources
Experiment (cont.) • Chinese Resources
Experiment (cont.) • Ad-hoc Retrieval Experiments
Experiment (cont.) • Comparison of Ranking Methods
Experiment (cont.) • Relevance Feedback
Experiment (cont.) • Cross-Lingual Experiments
Conclusions • In this work we introduced a formal framework for modeling the notion of relevance in Information Retrieval. • We defined a relevance model to be the language model that reflects word frequencies in the class of documents that are relevant to some given information need.