
CS533 Information Retrieval

CS533 Information Retrieval. Dr. Michal Cutler Lecture #11 Feb 29, 2000. This Class. Inference networks conditional independence belief nets Inference nets in Information Retrieval Evaluation. Conditional Independence.


Presentation Transcript


  1. CS533 Information Retrieval Dr. Michal Cutler Lecture #11 Feb 29, 2000 Cutler cs533

  2. This Class • Inference networks • conditional independence • belief nets • Inference nets in Information Retrieval • Evaluation

  3. Conditional Independence • Variable V is conditionally independent of a set of variables V1 given another set of variables V2 if: P(V|V1,V2)=P(V|V2) • Intuition: V1 gives us no more information about V than we already know because of knowing V2.

  4. Conditional Independence • Our belief that a person will have a Steady job, given evidence on Education and Job, is independent of whether or not the person owns a Home • P(Steady|Education, Job, Home) = P(Steady|Education, Job)

  5. Belief networks A belief network is a graph in which: 1. A set of random variables makes up the nodes of the network 2. A set of directed edges connects pairs of nodes. The intuitive meaning of an arrow from node X to node Y is that X has a direct influence on Y

  6. Belief networks 3. Each node has a link matrix (conditional probability table) that quantifies the effects the parents have on the node 4. The graph has no directed cycles

  7. Example: Getting a loan • Education and a Job influence having a Steady job • Family influences having a Guarantor for a loan • Steady job and Guarantor influence getting a Loan

  8. Getting a loan
     Network: E → S, J → S, F → G, S → L, G → L
     Priors: P(E) = .6, P(J) = .8, P(F) = .7
     E J | P(S)
     T T | .9
     T F | .6
     F T | .7
     F F | .1
     F | P(G)
     T | .7
     F | .2
     S G | P(L)
     T T | 1
     T F | 0
     F T | 0
     F F | 0

  9. Computing the conditional probabilities • The conditional probabilities can be computed from experience (past records of loan applications) • Or, use formulas for computing the conditional probabilities • For example, the link matrix for Loan is of type L_AND (logical AND).
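A link matrix of the AND type can be generated by formula rather than estimated from records. The following is a minimal sketch, not taken from the lecture; the function names and the dictionary layout (parent truth values mapped to the probability that the child is true) are assumptions made for illustration. An OR matrix is included for comparison.

```python
from itertools import product

def and_link_matrix(n_parents):
    """P(child=True | parents) for a logical AND over n_parents parents."""
    return {states: 1.0 if all(states) else 0.0
            for states in product([True, False], repeat=n_parents)}

def or_link_matrix(n_parents):
    """P(child=True | parents) for a logical OR over n_parents parents."""
    return {states: 1.0 if any(states) else 0.0
            for states in product([True, False], repeat=n_parents)}

# Loan has two parents, Steady job and Guarantor.
loan_matrix = and_link_matrix(2)
for (steady, guarantor), p_loan in loan_matrix.items():
    print(steady, guarantor, p_loan)
```

For two parents this reproduces the Loan table of the example: 1 when both Steady job and Guarantor are true, 0 otherwise.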

  10. Semantics of belief networks • To construct a net, think of representing the joint probability distribution. • Every entry in the joint probability distribution can be computed from the information in the network
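As an illustration of computing a joint entry from the network, the sketch below multiplies each node's probability given its parents, using the loan example's tables read as P(E)=.6, P(J)=.8, P(F)=.7, with the link matrices for S, G and L from slide 8. The variable and function names are assumptions made for this sketch.

```python
from itertools import product

P_E, P_J, P_F = 0.6, 0.8, 0.7
P_S = {(True, True): 0.9, (True, False): 0.6,
       (False, True): 0.7, (False, False): 0.1}   # P(S=True | E, J)
P_G = {True: 0.7, False: 0.2}                     # P(G=True | F)
P_L = {(True, True): 1.0, (True, False): 0.0,
       (False, True): 0.0, (False, False): 0.0}   # P(L=True | S, G)

def prob(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true

def joint(e, j, f, s, g, l):
    """One entry of the joint distribution: product of each node given its parents."""
    return (prob(P_E, e) * prob(P_J, j) * prob(P_F, f)
            * prob(P_S[(e, j)], s) * prob(P_G[f], g) * prob(P_L[(s, g)], l))

all_true = joint(True, True, True, True, True, True)
total = sum(joint(*values) for values in product([True, False], repeat=6))
print(all_true, total)
```

The 64 entries sum to 1, which is a quick check that the tables define a proper joint distribution.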

  11. Probability of getting a loan • We can use the belief network to compute the probability .39 of getting a loan (next foil) • If we have a person with a job, education, and family, the probability of getting a loan is higher: P(L=true|E,J,F) = P(L=true|S,G) * P(S|E,J) * P(G|F) = 1 * .9 * .7 = .63

  12. Cutler cs533
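The computation on this foil did not survive in the transcript. A minimal sketch of one way to do it follows, assuming the loan example's tables are read as P(E)=.6, P(J)=.8, P(F)=.7 with the link matrices of slide 8. Because the Loan matrix is a logical AND and S and G have no common ancestors, P(L) factors into P(S) * P(G). The names are illustrative, not the lecture's.

```python
from itertools import product

P_E, P_J, P_F = 0.6, 0.8, 0.7
P_S_GIVEN_EJ = {(True, True): 0.9, (True, False): 0.6,
                (False, True): 0.7, (False, False): 0.1}
P_G_GIVEN_F = {True: 0.7, False: 0.2}

def prior(p_true, value):
    """Probability of a root variable taking `value`, given P(True)."""
    return p_true if value else 1.0 - p_true

# Marginalize the parents out of S and G.
p_s = sum(prior(P_E, e) * prior(P_J, j) * P_S_GIVEN_EJ[(e, j)]
          for e, j in product([True, False], repeat=2))
p_g = sum(prior(P_F, f) * P_G_GIVEN_F[f] for f in (True, False))

# Loan is the AND of S and G, so only S=True, G=True contributes.
p_loan = p_s * p_g

# With education, job and family all observed to be true:
p_loan_given_evidence = P_S_GIVEN_EJ[(True, True)] * P_G_GIVEN_F[True]
print(round(p_loan, 4), round(p_loan_given_evidence, 2))
```

With the tables as transcribed, this sketch gives P(L) of about 0.40, close to but not exactly the .39 quoted on slide 11; the difference may come from rounding or from values on the original foil that the transcript does not preserve. The conditional value agrees with the slide's .63.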

  13. Inference Networks for IR • Turtle and Croft introduced the inference network model for information retrieval • This is a probability-based method • Ranks documents by probability of satisfying a user's information need.

  14. Document / concept / query network [Figure: Document network, Concept network, Query network]

  15. The Document/Concept Network • Each document can have many representations • The relationships between concepts generate the concept network

  16. The Query Network • The information need may be based on complex interactions between various sources of evidence and different representations of the user's need.

  17. An inference network [Figure: document nodes d1, d2, d3, ..., di; representation nodes r1, r2, r3, ..., rj; query nodes q1, ..., qk; information need node I]

  18. Inference Networks • There are four kinds of nodes in this network.

  19. Inference Networks • The di nodes represent particular documents and correspond to the event of observing that document. • The rj nodes are concept representation nodes. These correspond to the concepts that describe the contents of a document.

  20. Inference Networks • The qk nodes are query nodes. • They correspond to the concepts used to represent the information need of the user. • The single leaf node I corresponds to the (unknown) information need.

  21. Inference Networks • To evaluate a particular document, the single corresponding document node is instantiated and the resulting probabilities are propagated through the network to derive a probability associated with the I node. • To generate a ranking for all documents in the collection, this is repeated for each of the document nodes in the network.

  22. Inference Networks • Each di node is instantiated only once, and no other document nodes are active at the same time. • The probabilities associated with child nodes are based on the probabilities of their parents and on a "link matrix" that specifies how to combine the evidence from the parents.

  23. Inference Networks • The "link matrices" between the di nodes and the rj nodes represent the evidence for the proposition that this concept occurs in this document. • The link matrices between the rj nodes and the qk nodes specify how the representation concepts are combined to form the final probability.

  24. Inference Networks • The di and rj nodes are static for a given static collection and are constructed independently of any particular query. • The qk and I portions of the network are constructed individually for each query.

  25. Example • D1: Searching the web • D2: Spiders for searching the web • D3: Tools for searching files • After indexing the terms are: file, search, spider, tool, web • Assume a query “Search tools”

  26. Computing df/idf • df(file)=1, idf(file)=lg(3/1)+1=2.58 • df(search)=3, idf(search)=lg(3/3)+1=1 • df(spider)=1, idf(spider)=lg(3/1)+1=2.58 • df(tool)=1, idf(tool)=lg(3/1)+1=2.58 • df(web)=2, idf(web)=lg(3/2)+1=1.58
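The df and idf values above can be reproduced with a short sketch. It assumes idf(t) = lg(N/df(t)) + 1 with lg the base-2 logarithm, as the slide's numbers suggest, and it assumes the indexed form of each document (stop words dropped, terms stemmed) shown in `DOCS`; the function names are illustrative.

```python
import math

# Index terms per document after stemming and stop-word removal
# (assumed forms, chosen to match the term list in the example).
DOCS = {
    "D1": ["search", "web"],
    "D2": ["spider", "search", "web"],
    "D3": ["tool", "search", "file"],
}

def document_frequencies(docs):
    """Number of documents in which each term occurs."""
    df = {}
    for terms in docs.values():
        for term in set(terms):
            df[term] = df.get(term, 0) + 1
    return df

def inverse_document_frequencies(df, n_docs):
    """idf(t) = log2(N / df(t)) + 1."""
    return {term: math.log2(n_docs / count) + 1 for term, count in df.items()}

df = document_frequencies(DOCS)
idf = inverse_document_frequencies(df, len(DOCS))
for term in sorted(idf):
    print(term, df[term], round(idf[term], 2))
```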

  27. The term/document matrix ntf

  28. The inference net
      Nodes: documents D1, D2, D3; concepts Search (S) and Tools (T); query Q
      S T | P(Q)
      T T | 0.9
      T F | 0.7
      F T | 0.4
      F F | 0.1

  29. Link matrices • The link matrix for the query terms is supplied by the user • If the query was Boolean and had AND, OR, and NOT nodes, link matrices for each could be used • Concept probability is: P(rj|Di) = 0.5 + 0.5 * ntf_ji * nidf_j

  30. Link matrix for “search” P(search|D1) = 0.5 + 0.5*ntf*nidf = 0.5 + 0.5*1*0.38 = 0.69

  31. Link matrix for “tools”
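The concept beliefs for "search" and "tools" can be sketched as follows. Two things here are assumptions rather than statements from the slides: nidf is taken to be idf divided by the largest idf in the collection (which gives about 0.38 for "search", in line with slide 30), and ntf is taken to be 1 for a term that occurs once in a document whose most frequent term also occurs once. A term absent from the document gets belief 0, as the slides do for P(tools|D1).

```python
IDF = {"file": 2.58, "search": 1.0, "spider": 2.58, "tool": 2.58, "web": 1.58}
MAX_IDF = max(IDF.values())

DOCS = {
    "D1": ["search", "web"],
    "D2": ["spider", "search", "web"],
    "D3": ["tool", "search", "file"],
}

def nidf(term):
    # Assumption: idf normalized by the largest idf in the collection.
    return IDF[term] / MAX_IDF

def concept_belief(term, doc_terms):
    """P(term | document) = 0.5 + 0.5 * ntf * nidf, or 0 if the term is absent."""
    if term not in doc_terms:
        return 0.0
    # Assumption: ntf = tf / (largest tf in the document).
    max_tf = max(doc_terms.count(t) for t in doc_terms)
    ntf = doc_terms.count(term) / max_tf
    return 0.5 + 0.5 * ntf * nidf(term)

p_search_d1 = concept_belief("search", DOCS["D1"])
p_tools_d1 = concept_belief("tool", DOCS["D1"])
p_tools_d3 = concept_belief("tool", DOCS["D3"])
print(round(p_search_d1, 2), p_tools_d1, p_tools_d3)
```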

  32. P(Q|D1) • P(search=true|D1)=0.69 • P(tools=true|D1)=0 • P(Q|D1) = P(Q|~S,~T)P(~S)P(~T) + P(Q|~S,T)P(~S)P(T) + P(Q|S,~T)P(S)P(~T) + P(Q|S,T)P(S)P(T) = .1*.31*1 + .4*.31*0 + .7*.69*1 + .9*.69*0 = 0.514

  33. P(Q|D3) • P(search=true|D3)=0.69 • P(tools=true|D3)=1 • P(Q|D3) = P(Q|~S,~T)P(~S)P(~T) + P(Q|~S,T)P(~S)P(T) + P(Q|S,~T)P(S)P(~T) + P(Q|S,T)P(S)P(T) = .1*.31*0 + .4*.31*1 + .7*.69*0 + .9*.69*1 = 0.745
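The two sums above follow one pattern: weight each row of the query link matrix by the probability of that combination of concept states. A minimal sketch, using the link matrix values of slide 28 and the rounded beliefs 0.69, 0 and 1 from the slides; the function name is an assumption.

```python
from itertools import product

# P(Q=True | Search, Tools), from the query link matrix.
LINK_Q = {(True, True): 0.9, (True, False): 0.7,
          (False, True): 0.4, (False, False): 0.1}

def query_belief(p_search, p_tools):
    """Sum over the four (Search, Tools) states of P(Q|S,T) * P(S) * P(T)."""
    total = 0.0
    for s, t in product([True, False], repeat=2):
        ps = p_search if s else 1.0 - p_search
        pt = p_tools if t else 1.0 - p_tools
        total += LINK_Q[(s, t)] * ps * pt
    return total

p_q_d1 = query_belief(0.69, 0.0)
p_q_d3 = query_belief(0.69, 1.0)
print(round(p_q_d1, 3), round(p_q_d3, 3))
```

D3 receives the higher belief, so it would be ranked above D1 for the query "Search tools".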

  34. Evaluation • Fallout • Relevance judgements • 11 point recall/precision • Average recall/precision

  35. Problems with Recall & precision • Recall • undefined when there are no relevant documents • Precision • undefined when no documents are retrieved

  36. Fallout • Fallout = (number of nonrelevant documents retrieved) / (total number of nonrelevant documents in the collection) • A good system should have high recall and low fallout
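The three measures and their undefined cases can be sketched together. The sketch takes fallout to be the fraction of the collection's nonrelevant documents that were retrieved (the standard definition) and returns `None` where a measure is undefined; the function name and the document ids in the usage line are hypothetical.

```python
def evaluate(retrieved, relevant, collection_size):
    """Return (recall, precision, fallout); None marks an undefined measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    nonrelevant_total = collection_size - len(relevant)

    recall = hits / len(relevant) if relevant else None
    precision = hits / len(retrieved) if retrieved else None
    fallout = ((len(retrieved) - hits) / nonrelevant_total
               if nonrelevant_total else None)
    return recall, precision, fallout

# Hypothetical run: 4 documents retrieved, 6 relevant, 200 in the collection.
print(evaluate([1, 2, 3, 4], [1, 2, 4, 7, 10, 13], 200))
```

Fallout stays defined in both problem cases for recall and precision, as long as the collection contains at least one nonrelevant document.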

  37. Relevance judgment • Exhaustive? • Assume 100 queries • 750,000 documents in collection • Requires 75 million relevance judgments

  38. Relevance judgment • Sampling? • with 100 queries, average 200 and maximum 900 relevant docs per query, and 750,000 documents, the size of the sample needed for good results is still too large

  39. Relevance judgment • Pooling used in TREC • It is assumed that all relevant documents have been retrieved by one of the competitors • 200 top documents of 33 runs, an average of 2398 docs per topic
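The idea of judging only the union of the top-ranked documents from each run can be sketched in a few lines. The runs and document ids below are hypothetical, and the depth of 2 is chosen only to keep the example small (the slide mentions the top 200 documents of 33 runs).

```python
def build_pool(runs, depth):
    """Union of the top `depth` documents of every run; only these get judged."""
    pool = set()
    for ranking in runs:
        pool.update(ranking[:depth])
    return pool

# Three hypothetical ranked runs for one topic.
runs = [
    ["d3", "d1", "d7", "d2"],
    ["d1", "d9", "d3", "d4"],
    ["d5", "d1", "d2", "d8"],
]
pool = build_pool(runs, depth=2)
print(sorted(pool))
```

Because runs overlap, the pool is usually much smaller than depth times the number of runs, which is consistent with the average of 2398 documents per topic rather than 200 * 33 = 6600.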

  40. 11 point recall/ precision • 11 point (sometimes 20 point) average precision is used to compare systems. • An alternative is to compute recall and precision at N1, N2, N3,…, documents retrieved • Assume 6 relevant docs in collection • 200 documents in collection

  41. Recall & precision

  42. Recall & precision

  43. [Figure: precision (y axis) versus recall (x axis) for the ranked list, with relevant and nonrelevant documents marked by rank (1 to 13, and 200); precision values shown: 1.0, 0.75, 0.57, 0.5, 0.46]
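The points on this plot can be recomputed from the ranks of the relevant documents. The sketch below assumes the 6 relevant documents sit at ranks 1, 2, 4, 7, 10 and 13; the ranks are not stated in the transcript, but this assumption reproduces the precision values 1.0, 0.75, 0.57, 0.5 and 0.46 that survive from the figure.

```python
# Assumed ranks of the relevant documents (not given in the transcript).
RELEVANT_RANKS = [1, 2, 4, 7, 10, 13]
TOTAL_RELEVANT = 6

def recall_precision_points(relevant_ranks, total_relevant):
    """(recall, precision) measured each time a relevant document is retrieved."""
    points = []
    for found, rank in enumerate(sorted(relevant_ranks), start=1):
        points.append((found / total_relevant, found / rank))
    return points

points = recall_precision_points(RELEVANT_RANKS, TOTAL_RELEVANT)
for recall, precision in points:
    print(round(recall, 2), round(precision, 2))
```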

  44. Interpolated values • The interpolated precision at a recall level is the maximum precision at this and all higher recall levels

  45. Interpolated Values [Figure: the same precision versus recall plot with interpolated precision values; precision values shown: 1.0, 0.75, 0.57, 0.5, 0.46]
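The interpolation rule can be applied at the 11 standard recall levels 0.0, 0.1, ..., 1.0. This is a sketch under the assumption that the relevant documents are at ranks 1, 2, 4, 7, 10 and 13 (consistent with the precision values on the plot, but not stated in the transcript); function names are illustrative.

```python
# Assumed ranks of the 6 relevant documents (not given in the transcript).
RELEVANT_RANKS = [1, 2, 4, 7, 10, 13]

def measured_points(relevant_ranks):
    """(recall, precision) at each rank where a relevant document appears."""
    total = len(relevant_ranks)
    return [(found / total, found / rank)
            for found, rank in enumerate(sorted(relevant_ranks), start=1)]

def interpolated_precision(points, levels=None):
    """Maximum precision at this and all higher recall levels."""
    if levels is None:
        levels = [i / 10 for i in range(11)]
    result = []
    for level in levels:
        # Small tolerance guards against floating-point noise in recall values.
        candidates = [p for r, p in points if r >= level - 1e-12]
        result.append(max(candidates, default=0.0))
    return result

eleven_point = interpolated_precision(measured_points(RELEVANT_RANKS))
print([round(p, 2) for p in eleven_point])
print(round(sum(eleven_point) / len(eleven_point), 3))  # 11 point average
```

Interpolation turns the sawtooth of measured points into a non-increasing step function, which is what makes averaging across queries at fixed recall levels possible.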

  46. Averaging performance • Average recall/precision for a set of queries is either user or system oriented • User oriented • Obtain the recall/precision values for each query and • then average over all queries

  47. Averaging performance • System oriented - use the following totals for all queries: • relevant documents, • relevant retrieved, • total retrieved • User oriented is commonly used
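The difference between the two averaging schemes shows up clearly on a small example. The sketch below uses hypothetical per-query counts and assumes every query has at least one relevant and one retrieved document, so that recall and precision are defined; the function names are illustrative.

```python
def user_oriented_average(per_query):
    """Average the per-query recall and precision values over all queries."""
    recalls = [rel_ret / rel for rel, rel_ret, ret in per_query]
    precisions = [rel_ret / ret for rel, rel_ret, ret in per_query]
    n = len(per_query)
    return sum(recalls) / n, sum(precisions) / n

def system_oriented_average(per_query):
    """Compute recall and precision from totals summed over all queries."""
    total_rel = sum(rel for rel, _, _ in per_query)
    total_rel_ret = sum(rel_ret for _, rel_ret, _ in per_query)
    total_ret = sum(ret for _, _, ret in per_query)
    return total_rel_ret / total_rel, total_rel_ret / total_ret

# Hypothetical counts per query: (relevant, relevant retrieved, total retrieved).
queries = [(6, 3, 4), (20, 5, 50)]
print(user_oriented_average(queries))
print(system_oriented_average(queries))
```

In the system oriented scheme the second query, with more relevant and more retrieved documents, dominates the result; the user oriented scheme weights every query equally.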

  48. User oriented recall-level average • Average at each recall level after interpolation

  49. [Figure: precision (y axis) versus recall (x axis) curves for Query 1 and Query 2]
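Averaging at each recall level amounts to a column-wise mean over the queries' interpolated curves. In the sketch below both curves are illustrative: the first uses values that would follow if a query's 6 relevant documents were at ranks 1, 2, 4, 7, 10 and 13 (an assumption), and the second is entirely hypothetical. Neither is read off the figure.

```python
def recall_level_average(curves):
    """Mean interpolated precision at each recall level across queries."""
    if not curves:
        raise ValueError("at least one query curve is required")
    return [sum(values) / len(curves) for values in zip(*curves)]

# Interpolated precision at recall 0.0, 0.1, ..., 1.0 (illustrative values).
query1 = [1.0, 1.0, 1.0, 1.0, 0.75, 0.75, 0.57, 0.5, 0.5, 0.46, 0.46]
query2 = [1.0, 0.8, 0.8, 0.6, 0.6, 0.5, 0.4, 0.4, 0.3, 0.2, 0.2]

average = recall_level_average([query1, query2])
print([round(p, 3) for p in average])
```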
