CS533 Information Retrieval Dr. Michal Cutler Lecture #11 Feb 29, 2000 Cutler cs533
This Class • Inference networks • conditional independence • belief nets • Inference nets in Information Retrieval • Evaluation
Conditional Independence • Variable V is conditionally independent of a set of variables V1 given another set of variables V2 if: P(V|V1,V2)=P(V|V2) • Intuition: V1 gives us no more information about V than we already know because of knowing V2.
Conditional Independence • Our belief that a person will have a Steady job, given evidence on Education and Job, is independent of whether or not the person owns a Home • P(Steady|Education, Job, Home) = P(Steady|Education, Job)
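The definition can be checked numerically. The sketch below builds a small joint distribution in which Home has no edge to Steady and confirms that conditioning on Home changes nothing. The table for Steady is the one used in the loan example later in this lecture; P(Home) = 0.5 is an arbitrary value assumed here purely for illustration.

```python
from itertools import product

# Priors. P_HOME is an assumed, illustrative value (not from the lecture).
P_ED, P_JOB, P_HOME = 0.6, 0.8, 0.5
# P(Steady=True | Education, Job), taken from the loan example's link matrix.
P_STEADY = {(True, True): 0.9, (True, False): 0.6,
            (False, True): 0.7, (False, False): 0.1}

def prob(p_true, value):
    """Probability that a Boolean variable with P(True)=p_true takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(ed, job, home, steady):
    """Joint probability when Home has no edge to Steady."""
    return (prob(P_ED, ed) * prob(P_JOB, job) * prob(P_HOME, home)
            * prob(P_STEADY[(ed, job)], steady))

def p_steady_given(ed, job, home=None):
    """P(Steady=True | Ed, Job[, Home]) by enumeration over the joint."""
    homes = [True, False] if home is None else [home]
    num = sum(joint(ed, job, h, True) for h in homes)
    den = sum(joint(ed, job, h, s) for h in homes for s in (True, False))
    return num / den

# Knowing Home never changes the belief in Steady once Ed and Job are known.
for ed, job, home in product([True, False], repeat=3):
    assert abs(p_steady_given(ed, job, home) - p_steady_given(ed, job)) < 1e-12
```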
Belief networks • A belief network is a graph in which: 1. A set of random variables makes up the nodes of the network 2. A set of directed edges connects pairs of nodes. The intuitive meaning of an arrow from node X to node Y is that X has a direct influence on Y
Belief networks • 3. Each node has a link matrix (conditional probability table) that quantifies the effects the parents have on the node 4. The graph has no directed cycles
Example: Getting a loan • Education and a Job influence having a Steady job • Family influences having a Guarantor for a loan • Steady job and Guarantor influence getting a Loan
Getting a loan (nodes: E, J, F, S, G, L)
P(E) = .6   P(J) = .8   P(F) = .7

E  J  P(S)
T  T  .9
T  F  .6
F  T  .7
F  F  .1

F  P(G)
T  .7
F  .2

S  G  P(L)
T  T  1
T  F  0
F  T  0
F  F  0
Computing the conditional probabilities • The conditional probabilities can be computed from experience (past records of loan applications) • Or, use formulas for computing the conditional probabilities • For example, the link matrix for Loan is of type LAND (logical AND): Loan is true only when both Steady job and Guarantor are true.
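A formula-based link matrix can be generated rather than estimated. The sketch below builds a logical-AND matrix for any number of parents and, for comparison, a logical-OR matrix. The function names are hypothetical, and the OR variant is an addition for illustration that this slide does not use.

```python
from itertools import product

def and_link_matrix(n_parents):
    """P(child=True | parents): 1 only when every parent is true."""
    return {values: 1.0 if all(values) else 0.0
            for values in product([True, False], repeat=n_parents)}

def or_link_matrix(n_parents):
    """P(child=True | parents): 1 when at least one parent is true."""
    return {values: 1.0 if any(values) else 0.0
            for values in product([True, False], repeat=n_parents)}

# Loan has two parents (Steady job, Guarantor), so its LAND matrix has 4 rows.
loan_matrix = and_link_matrix(2)
for (steady, guarantor), p in loan_matrix.items():
    print(steady, guarantor, p)
```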
Semantics of belief networks • To construct a net, think of representing the joint probability distribution. • Every entry in the joint probability distribution can be computed from the information in the network
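As a sketch of what "every entry can be computed" means, each joint entry is the product of one link-matrix entry per node, given the values of that node's parents. The code below uses the numbers from the "Getting a loan" tables; the variable and function names are chosen here for illustration.

```python
from itertools import product

# Priors and link matrices from the "Getting a loan" example.
P_E, P_J, P_F = 0.6, 0.8, 0.7
P_S = {(True, True): 0.9, (True, False): 0.6,
       (False, True): 0.7, (False, False): 0.1}   # P(S=True | E, J)
P_G = {True: 0.7, False: 0.2}                     # P(G=True | F)
P_L = {(True, True): 1.0, (True, False): 0.0,
       (False, True): 0.0, (False, False): 0.0}   # P(L=True | S, G)

def prob(p_true, value):
    """Probability that a Boolean variable with P(True)=p_true takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(e, j, f, s, g, l):
    """One entry of the joint distribution: a product over all six nodes."""
    return (prob(P_E, e) * prob(P_J, j) * prob(P_F, f)
            * prob(P_S[(e, j)], s) * prob(P_G[f], g) * prob(P_L[(s, g)], l))

# The 64 entries form a proper distribution: they sum to 1.
total = sum(joint(*values) for values in product([True, False], repeat=6))
print(round(total, 10))
```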
Probability of getting a loan • We can use the belief network to compute the probability, .39, of getting a loan (next foil) • If we know that a person has a job, an education, and family, the probability of getting a loan is higher: P(L=true|E,J,F) = P(L=true|S,G)*P(S|E,J)*P(G|F) = 1*.9*.7 = .63
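Both numbers can be reproduced with a few lines, as a sketch. Because Loan is a logical AND of Steady job and Guarantor, and those two nodes share no ancestors, P(L) is simply P(S)*P(G). With the table values as transcribed in this lecture the marginal comes out at about .40, close to the .39 quoted above; the small gap may reflect rounding or slightly different table values on the original foil. The conditional uses the network's Family node as the parent of Guarantor.

```python
# Priors and link matrices from the "Getting a loan" example.
P_E, P_J, P_F = 0.6, 0.8, 0.7
P_S = {(True, True): 0.9, (True, False): 0.6,
       (False, True): 0.7, (False, False): 0.1}   # P(S=True | E, J)
P_G = {True: 0.7, False: 0.2}                     # P(G=True | F)

def prob(p_true, value):
    """Probability that a Boolean variable with P(True)=p_true takes `value`."""
    return p_true if value else 1.0 - p_true

# Marginalise out the parents of S and of G.
p_steady = sum(prob(P_E, e) * prob(P_J, j) * P_S[(e, j)]
               for e in (True, False) for j in (True, False))
p_guarantor = sum(prob(P_F, f) * P_G[f] for f in (True, False))

# Loan is an AND of S and G, which are independent here.
p_loan = p_steady * p_guarantor

# With Education, Job, and Family all observed to be true:
p_loan_given_evidence = 1.0 * P_S[(True, True)] * P_G[True]

print(round(p_steady, 3), round(p_guarantor, 3), round(p_loan, 4))
print(round(p_loan_given_evidence, 2))
```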
Inference Networks for IR • Turtle and Croft introduced the inference network model for information retrieval • This is a probability-based method • Ranks documents by probability of satisfying a user's information need.
Document / concept / query network [Figure: document network, concept network, and query network]
The Document/Concept Network • Each document can have many representations • The relationships between concepts generate the concept network
The Query Network • The information need may be based on complex interactions between various sources of evidence and different representations of the user's need.
An inference network [Figure: document nodes d1, d2, d3, ..., di; representation nodes r1, r2, r3, ..., rj; query nodes q1, ..., qk; information need node I]
Inference Networks • There are four kinds of nodes in this network.
Inference Networks • The di nodes represent particular documents and correspond to the event of observing that document. • The rj nodes are concept representation nodes. These correspond to the concepts that describe the contents of a document.
Inference Networks • The qk nodes are query nodes. • They correspond to the concepts used to represent the information need of the user. • The single leaf node I corresponds to the (unknown) information need.
Inference Networks • To evaluate a particular document, the single corresponding node is instantiated and the resulting probabilities are propagated through the network to derive a probability associated with the I node. • To generate a ranking for all documents in the collection, this is repeated for each of the document nodes in the network.
Inference Networks • Each di node is instantiated only once and no other nodes are active at the same time. • The probabilities associated with child nodes are based on the probabilities of their parents and on a "link matrix" that specifies how to combine the evidence from the parents.
Inference Networks • The "link matrices" between the di nodes and the rj nodes represent the evidence for the proposition that this concept occurs in this document. • The link matrices between the rj nodes and the qk nodes specify how the representation concepts are combined to form the final probability.
Inference Networks • The di and rj nodes are static for a given static collection and are constructed independently of any particular query. • The qk and I portions of the network are constructed individually for each query.
Example • D1: Searching the web • D2: Spiders for searching the web • D3: Tools for searching files • After indexing, the terms are: file, search, spider, tool, web • Assume a query “Search tools”
Computing df/idf • df(file)=1, idf(file)=lg(3/1)+1=2.58 • df(search)=3, idf(search)=lg(3/3)+1=1 • df(spider)=1, idf(spider)=lg(3/1)+1=2.58 • df(tool)=1, idf(tool)=lg(3/1)+1=2.58 • df(web)=2, idf(web)=lg(3/2)+1=1.58
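The df and idf values can be reproduced with a short sketch, using idf(t) = lg(N/df(t)) + 1 with lg as the base-2 logarithm. The per-document term lists are an assumption about what indexing (stemming plus stop-word removal) produces for the three example titles.

```python
import math

# Index terms per document after stemming and stop-word removal
# (assumed from the example titles).
docs = {
    "D1": ["search", "web"],
    "D2": ["spider", "search", "web"],
    "D3": ["tool", "search", "file"],
}

def document_frequencies(docs):
    """Number of documents in which each term occurs at least once."""
    df = {}
    for terms in docs.values():
        for term in set(terms):
            df[term] = df.get(term, 0) + 1
    return df

def idf(df_t, n_docs):
    """Inverse document frequency as used on the slide: lg(N/df) + 1."""
    return math.log2(n_docs / df_t) + 1

df = document_frequencies(docs)
idf_values = {term: idf(count, len(docs)) for term, count in df.items()}

for term in sorted(df):
    print(term, df[term], round(idf_values[term], 2))
```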
The term/document matrix ntf
The inference net [Figure: document nodes D1, D2, D3; concept nodes Search and Tools; query node Q]
Link matrix for Q:

S  T  P(Q)
T  T  .9
T  F  .7
F  T  .4
F  F  .1
Link matrices • The link matrix for the query terms is supplied by the user • If the query were Boolean and had AND, OR, and NOT nodes, link matrices for each could be used • Concept probability is: P(rj|Di) = 0.5 + 0.5*ntf_ji*nidf_j
Link matrix for “search” • P(search|D1) = 0.5 + 0.5*ntf*nidf = 0.5 + 0.5*1*0.38 = 0.69
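The calculation for "search" can be sketched as follows. Two things are assumptions here: ntf = 1 for "search" in D1 (the term/document matrix itself is not reproduced in this transcript), and nidf computed as idf divided by the largest idf in the collection, which matches the 0.38 used above up to truncation.

```python
import math

def concept_belief(ntf, nidf):
    """P(concept | document) = 0.5 + 0.5 * ntf * nidf."""
    return 0.5 + 0.5 * ntf * nidf

N_DOCS = 3
idf_search = math.log2(N_DOCS / 3) + 1   # df(search) = 3, so idf = 1
idf_max = math.log2(N_DOCS / 1) + 1      # largest idf in the collection, ~2.58

# Assumed normalisation: divide by the maximum idf.
nidf_search = idf_search / idf_max
p_search_d1 = concept_belief(ntf=1.0, nidf=nidf_search)

print(round(nidf_search, 3), round(p_search_d1, 2))
```

A term whose ntf and nidf are both 1 gets belief 1, which is consistent with P(tools|D3) = 1 in the worked example.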
Link matrix for “tools”
P(Q|D1) • P(search=true|D1) = 0.69 • P(tools=true|D1) = 0
P(Q|D1) = P(Q|~S,~T)P(~S)P(~T) + P(Q|~S,T)P(~S)P(T) + P(Q|S,~T)P(S)P(~T) + P(Q|S,T)P(S)P(T)
        = .1*.31*1 + .4*.31*0 + .7*.69*1 + .9*.69*0 = 0.514
P(Q|D3) • P(search=true|D3) = 0.69 • P(tools=true|D3) = 1
P(Q|D3) = P(Q|~S,~T)P(~S)P(~T) + P(Q|~S,T)P(~S)P(T) + P(Q|S,~T)P(S)P(~T) + P(Q|S,T)P(S)P(T)
        = .1*.31*0 + .4*.31*1 + .7*.69*0 + .9*.69*1 = 0.745
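The two sums follow one pattern: weight each row of the query link matrix by the probability of that combination of parent values. A minimal sketch, with hypothetical function names and the concept beliefs taken from the worked example:

```python
from itertools import product

# Link matrix for Q: P(Q=True | Search, Tools), as supplied for the example query.
LINK_Q = {(True, True): 0.9, (True, False): 0.7,
          (False, True): 0.4, (False, False): 0.1}

def prob(p_true, value):
    """Probability that a Boolean variable with P(True)=p_true takes `value`."""
    return p_true if value else 1.0 - p_true

def belief_in_query(p_search, p_tools, link=LINK_Q):
    """Sum over all parent configurations of link entry times parent probabilities."""
    return sum(link[(s, t)] * prob(p_search, s) * prob(p_tools, t)
               for s, t in product([True, False], repeat=2))

# Concept beliefs per document, from the worked example.
scores = {
    "D1": belief_in_query(p_search=0.69, p_tools=0.0),
    "D3": belief_in_query(p_search=0.69, p_tools=1.0),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print({d: round(p, 3) for d, p in scores.items()}, ranking)
```

D3 ranks above D1 because it contains both query concepts.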
Evaluation • Fallout • Relevance judgements • 11 point recall/precision • Average recall/precision
Problems with Recall & precision • Recall: undefined when there are no relevant documents • Precision: undefined when no documents are retrieved
Fallout • Fallout = (number of nonrelevant documents retrieved) / (number of nonrelevant documents in the collection) • A good system should have high recall and low fallout
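The three measures and their undefined cases can be sketched as below, taking fallout to be the standard ratio of nonrelevant documents retrieved to nonrelevant documents in the collection. Returning None for an undefined measure is a design choice made here, not something the lecture prescribes. The counts assume a 200-document collection with 6 relevant documents and a cutoff of 13 retrieved documents that happens to capture all 6.

```python
def recall(relevant_retrieved, relevant_total):
    """Fraction of relevant documents retrieved; None when there are no relevant documents."""
    if relevant_total == 0:
        return None
    return relevant_retrieved / relevant_total

def precision(relevant_retrieved, retrieved_total):
    """Fraction of retrieved documents that are relevant; None when nothing is retrieved."""
    if retrieved_total == 0:
        return None
    return relevant_retrieved / retrieved_total

def fallout(nonrelevant_retrieved, nonrelevant_total):
    """Fraction of nonrelevant documents retrieved; None when every document is relevant."""
    if nonrelevant_total == 0:
        return None
    return nonrelevant_retrieved / nonrelevant_total

# Assumed counts: 200 documents, 6 relevant, 13 retrieved, all 6 relevant found.
collection, relevant, retrieved, hits = 200, 6, 13, 6
print(recall(hits, relevant))
print(round(precision(hits, retrieved), 2))
print(round(fallout(retrieved - hits, collection - relevant), 3))
```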
Relevance judgment • Exhaustive? • Assume 100 queries • 750,000 documents in collection • Requires 75 million relevance judgments
Relevance judgment • Sampling? • With 100 queries, an average of 200 and a maximum of 900 relevant docs per query, and 750,000 documents, the size of the sample needed for good results is still too large
Relevance judgment • Pooling is used in TREC • It is assumed that all relevant documents have been retrieved by at least one of the competitors • The top 200 documents of 33 runs are pooled, an average of 2398 docs per topic
11 point recall/precision • 11 point (sometimes 20 point) average precision is used to compare systems. • An alternative is to compute recall and precision at N1, N2, N3, …, documents retrieved • Assume 6 relevant docs in collection • 200 documents in collection
Recall & precision
Recall & precision
[Figure: precision (vertical axis) versus recall (horizontal axis) for a ranked list of 200 documents, with relevant and nonrelevant documents marked by rank; precision values shown: 1.0, 0.75, 0.57, 0.5, 0.46]
Interpolated values • The interpolated precision is the maximum precision at this and all higher recall levels
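The interpolation rule can be sketched for the 6-relevant-document example. The ranks of the relevant documents (1, 2, 4, 7, 10, 13) are an assumption: they are not stated in the text, but they reproduce the precision values 1.0, 0.75, 0.57, 0.5, and 0.46 that appear on the plots.

```python
def precision_recall_points(relevant_ranks, n_relevant):
    """(recall, precision) measured at the rank of each relevant document."""
    points = []
    for found, rank in enumerate(sorted(relevant_ranks), start=1):
        points.append((found / n_relevant, found / rank))
    return points

def interpolated_precision(points, level):
    """Maximum precision at this recall level and all higher ones."""
    candidates = [p for r, p in points if r >= level]
    return max(candidates) if candidates else 0.0

# Assumed ranks of the 6 relevant documents in the ranked list.
relevant_ranks = [1, 2, 4, 7, 10, 13]
points = precision_recall_points(relevant_ranks, n_relevant=6)

# The 11 standard recall levels 0.0, 0.1, ..., 1.0.
eleven_point = [interpolated_precision(points, i / 10) for i in range(11)]
print([round(p, 2) for p in eleven_point])
```

Averaging the eleven values gives the single 11 point average precision figure used to compare systems.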
Interpolated Values [Figure: precision versus recall plot for the ranked list of 200 documents, with the interpolated precision values drawn in]
Averaging performance • Average recall/precision for a set of queries is either user or system oriented • User oriented • Obtain the recall/precision values for each query and • then average over all queries
Averaging performance • System oriented - use the following totals for all queries: • relevant documents, • relevant retrieved, • total retrieved • User oriented is commonly used
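The difference between the two averages shows up as soon as queries have different numbers of relevant documents. A sketch with two hypothetical queries (the counts are invented for illustration):

```python
# Hypothetical per-query counts:
# (relevant in collection, total retrieved, relevant retrieved).
queries = [(10, 20, 5), (2, 10, 2)]

def user_oriented(queries):
    """Compute recall and precision per query, then average over queries."""
    recalls = [hits / relevant for relevant, retrieved, hits in queries]
    precisions = [hits / retrieved for relevant, retrieved, hits in queries]
    n = len(queries)
    return sum(recalls) / n, sum(precisions) / n

def system_oriented(queries):
    """Pool the totals over all queries, then compute recall and precision once."""
    relevant = sum(q[0] for q in queries)
    retrieved = sum(q[1] for q in queries)
    hits = sum(q[2] for q in queries)
    return hits / relevant, hits / retrieved

print(user_oriented(queries))
print(system_oriented(queries))
```

The user-oriented average gives every query equal weight, while the system-oriented one lets queries with many relevant documents dominate.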
User oriented recall-level average • Average at each recall level after interpolation
[Figure: precision versus recall curves for Query 1 and Query 2]