Dependence Language Model for Information Retrieval. Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu, Guihong Cao. SIGIR 2004.
References • Ciprian Chelba, David Engle, et al. Structure and Performance of a Dependency Language Model. Eurospeech 1997. • Daniel D. K. Sleator and Davy Temperley. Parsing English with a Link Grammar. Technical Report CMU-CS-91-196, 1991.
Why do we use the independence assumption? • The independence assumption is one of the assumptions widely adopted in probabilistic retrieval theory. • Why? • It makes retrieval models simpler. • It keeps retrieval operations tractable. • The shortcoming of the independence assumption • The assumption does not hold in textual data: query terms are often related.
Recent ideas for modeling dependence • Bigram • Some language modeling approaches try to incorporate word dependence by using bigrams. • Shortcomings: • Word dependencies exist not only between adjacent words but also at greater distances. • Adjacent words are not always genuinely related. • The bigram language model showed only marginally better effectiveness than the unigram model. • Bi-term • The bi-term language model is similar to the bigram model except that the order constraint on terms is relaxed. • "information retrieval" and "retrieval of information" are assigned the same probability of generating the query, as the sketch below illustrates.
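To make the bi-term relaxation concrete, here is a minimal sketch (all names hypothetical, not from the paper) contrasting bigram counts, which are order-sensitive, with bi-term counts, which pool both word orders into one parameter:

```python
from collections import Counter

def bigram_counts(tokens):
    # Ordered adjacent pairs: ("information", "retrieval") != ("retrieval", "information")
    return Counter(zip(tokens, tokens[1:]))

def biterm_counts(tokens, window=2):
    # Unordered pairs within a short window: the order constraint is relaxed,
    # so both phrasings update the same frozenset key.
    pairs = Counter()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            pairs[frozenset((w, v))] += 1
    return pairs

s1 = "information retrieval".split()
s2 = "retrieval of information".split()
print(bigram_counts(s1), bigram_counts(s2))   # disjoint keys
print(biterm_counts(s1) + biterm_counts(s2))  # shared key {information, retrieval}
```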
Introduction • This paper presents a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar. • Dependency grammar: expresses the relations between words by a directed graph, which can incorporate the predictive power of words that lie outside of bigram or trigram range.
Introduction • Why we use N-grams • If we wanted to record the full conditional $P(w_i \mid w_1, \dots, w_{i-1})$ over a vocabulary $V$, we would need to store on the order of $|V|^i$ independent parameters, which is infeasible, so the history is truncated to the previous $N-1$ words. • The drawback of N-grams • The N-gram blindly discards relevant words that lie N or more positions in the past.
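As a back-of-the-envelope illustration of the storage argument (the vocabulary size is an assumed example value, not from the paper):

```python
# Rough parameter counts for an N-gram model over vocabulary V.
# For each history of N-1 words we need |V| - 1 free probabilities,
# so the table has (|V| - 1) * |V|**(N - 1) independent parameters.
V = 20_000  # assumed example vocabulary size

for n in (1, 2, 3):
    params = (V - 1) * V ** (n - 1)
    print(f"{n}-gram: {params:.3e} parameters")
# 1-gram: ~2e4, 2-gram: ~4e8, 3-gram: ~8e12 -- which is why N stays small
```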
Structure of the model • Develop an expression for the joint probability $P(W, K)$, where $W$ is the word sequence and $K$ is the set of linkages in the sentence. • Then we get $P(W) = \sum_K P(W, K)$. • Assume that the sum is dominated by a single term: $P(W) \approx P(W, K^*)$, where $K^* = \arg\max_K P(W, K)$.
A dependency language model for IR • Given a query $Q = q_1 q_2 \cdots q_m$, we want to rank documents $D$ by $P(Q \mid D)$. • Previous work: • Assume independence between query terms: $P(Q \mid D) = \prod_{i=1}^{m} P(q_i \mid D)$. • New work: • Assume that term dependencies in a query form a linkage $L$, a set of links between pairs of query terms.
A dependency language model for IR • Assume that the sum over all possible linkages $L$ is dominated by a single term: $P(Q \mid D) = \sum_L P(Q, L \mid D) \approx P(L \mid D)\, P(Q \mid L, D)$. • Assume that each term is generated depending on exactly one related query term generated previously, so $P(Q \mid L, D)$ factors into pairwise conditionals $P(q_j \mid q_i, D)$ over the links $(i, j) \in L$.
A dependency language model for IR • Assume • The generation of a single term is independent of $L$: $P(q_i \mid L, D) = P(q_i \mid D)$. • Under this assumption we would arrive at the same result starting from any term, so $L$ can be represented as an undirected graph. The resulting ranking function is sketched below.
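Putting the assumptions together, a minimal sketch of the resulting ranking score, assuming the decomposition $\log P(Q \mid D) = \log P(L \mid D) + \sum_i \log P(q_i \mid D) + \sum_{(i,j) \in L} \log R(q_i, q_j \mid D)$ with $R(q_i, q_j \mid D) = P(q_i, q_j \mid D) \big/ \big(P(q_i \mid D)\, P(q_j \mid D)\big)$; the estimator callables and all names are hypothetical stand-ins:

```python
import math

def score(query, linkage, p_link, p_unigram, p_pair):
    """log P(Q|D) under the dependence LM decomposition:
    log P(L|D) + sum_i log P(q_i|D) + sum_{(i,j) in L} log R(q_i,q_j|D)."""
    s = math.log(p_link(linkage))
    s += sum(math.log(p_unigram(q)) for q in query)
    for i, j in linkage:
        # R rewards pairs that co-occur in D more than independence predicts.
        r = p_pair(query[i], query[j]) / (p_unigram(query[i]) * p_unigram(query[j]))
        s += math.log(r)
    return s

# Toy estimators standing in for the document-specific models.
uni = {"stochastic": 0.02, "parsing": 0.05}
pair = {("stochastic", "parsing"): 0.004}
print(score(["stochastic", "parsing"], [(0, 1)],
            p_link=lambda l: 0.3,
            p_unigram=uni.get,
            p_pair=lambda a, b: pair[(a, b)]))
```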
Parameter Estimation • Estimating $P(L \mid Q)$ • Assume that the links are independent of one another. • Then count the relative frequency of a link between $q_i$ and $q_j$, given that they appear in the same sentence of the training data: $F(q_i, q_j) = C(q_i, q_j, \text{linked}) / C(q_i, q_j)$, i.e. the number of sentences in which $q_i$ and $q_j$ have a link divided by the number of sentences in which they co-occur.
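A sketch of this relative-frequency estimate, assuming a parsed training corpus in which each sentence carries its set of linked word pairs (the data format and all names are hypothetical):

```python
from collections import Counter
from itertools import combinations

def link_frequencies(parsed_sentences):
    """F(w_i, w_j) = C(w_i, w_j, linked) / C(w_i, w_j): the relative frequency
    that two words are linked, given they co-occur in the same sentence."""
    cooc, linked = Counter(), Counter()
    for words, links in parsed_sentences:  # links: set of unordered word pairs
        for a, b in combinations(sorted(set(words)), 2):
            cooc[(a, b)] += 1
            if frozenset((a, b)) in links:
                linked[(a, b)] += 1
    return {p: linked[p] / n for p, n in cooc.items() if linked[p]}

corpus = [
    (["information", "retrieval", "system"], {frozenset(("information", "retrieval"))}),
    (["retrieval", "of", "information"], {frozenset(("retrieval", "information"))}),
]
print(link_frequencies(corpus))  # {('information', 'retrieval'): 1.0}
```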
Parameter Estimation • Assumption: the linkage depends on the query only, so the document-conditioned linkage probability is approximated by the query-side estimate, $P(L \mid D) \approx P(L \mid Q)$.
Parameter Estimation • Estimating $P(q_i \mid D)$ • The document language model is smoothed with a Dirichlet prior: $P(q_i \mid D) = \big(c(q_i; D) + \mu\, P(q_i \mid C)\big) / \big(|D| + \mu\big)$, where $\mu$ is a constant discount and $P(q_i \mid C)$ is the collection model given by the Dirichlet distribution.
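A minimal sketch of the Dirichlet-smoothed estimate (the value of $\mu$ and all names are assumed examples):

```python
def dirichlet_unigram(term, doc_tf, doc_len, coll_prob, mu=2000):
    """P(q|D) = (c(q; D) + mu * P(q|C)) / (|D| + mu).
    mu acts as a constant discount; P(q|C) is the collection model."""
    return (doc_tf.get(term, 0) + mu * coll_prob[term]) / (doc_len + mu)

doc_tf = {"retrieval": 3, "model": 1}
coll_prob = {"retrieval": 1e-4, "model": 5e-4, "language": 3e-4}
print(dirichlet_unigram("retrieval", doc_tf, 120, coll_prob))
print(dirichlet_unigram("language", doc_tf, 120, coll_prob))  # unseen term, nonzero
```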
Parameter Estimation • Estimating $P(q_j \mid q_i, D)$ • The pairwise conditional is estimated from the co-occurrence counts of $q_i$ and $q_j$ in the document and smoothed with the collection model, analogously to the unigram case.
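One plausible estimator in this spirit, shown only as an illustration rather than the paper's exact formula (all names and the smoothing constant are assumptions):

```python
def smoothed_link_prob(qi, qj, doc_pair_tf, doc_tf, coll_cond, mu=100):
    """Rough sketch of P(q_j | q_i, D): document-level relative frequency of
    the linked pair, smoothed toward a collection-level conditional."""
    joint = doc_pair_tf.get((qi, qj), 0)
    base = doc_tf.get(qi, 0)
    return (joint + mu * coll_cond.get((qi, qj), 1e-6)) / (base + mu)

doc_pair_tf = {("language", "model"): 2}
doc_tf = {"language": 5}
print(smoothed_link_prob("language", "model", doc_pair_tf, doc_tf, {}))
```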
Experimental Setting • Documents were stemmed and stop words were removed. • Queries are TREC topics 202 to 250, run on TREC disks 2 and 3.
The flow of the experiment • From the training data, compute the link weights $F(q_i, q_j)$. • For each query, find the best linkage $L^* = \arg\max_L P(L \mid Q)$. • From document counts, estimate $P(L \mid D)$, $P(q_i \mid D)$, and $P(q_j \mid q_i, D)$. • Combine these estimates to rank the documents (a sketch of the linkage-finding step follows).
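Since $P(L \mid Q)$ factorizes over independent links, maximizing $\prod_{(i,j) \in L} F(q_i, q_j)$ amounts to finding a maximum spanning tree over the query terms. A greedy Prim-style sketch (all names hypothetical):

```python
def best_linkage(terms, F):
    """argmax_L prod F(q_i, q_j) over spanning trees of the query terms,
    found greedily (Prim's algorithm with edge weights F)."""
    in_tree, links = {0}, []
    while len(in_tree) < len(terms):
        # Pick the heaviest edge crossing from the tree to a new term.
        i, j = max(((a, b) for a in in_tree for b in range(len(terms))
                    if b not in in_tree),
                   key=lambda e: F.get(tuple(sorted((terms[e[0]], terms[e[1]]))), 0.0))
        in_tree.add(j)
        links.append((i, j))
    return links

F = {("information", "retrieval"): 0.9, ("model", "retrieval"): 0.6,
     ("information", "model"): 0.2}
print(best_linkage(["information", "retrieval", "model"], F))
# [(0, 1), (1, 2)]
```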
Result: BM & UG • BM: binary independence retrieval model • UG: unigram language model approach • UG achieves performance similar to, or worse than, that of BM.
Result: DM • DM: dependency model • The improvement of DM over UG is statistically significant.
Result: BG • BG: bigram language model • BG is slightly worse than DM in five out of six TREC collections but substantially outperforms UG in all collections.
Result: BT1 & BT2 • BT1, BT2: two variants of the bi-term language model.
Conclusion • This paper introduces the linkage of a query as a hidden variable. • Each term is generated in turn, depending on other related terms according to the linkage. • This approach covers several language modeling approaches as special cases. • In the experiments, the proposed model substantially outperforms the unigram, bigram, and classical probabilistic retrieval models.