
CS533 Information Retrieval


Presentation Transcript


  1. CS533 Information Retrieval Dr. Michal Cutler Lecture #11 March 1, 1999

  2. This Class • Inference networks • conditional independence • belief nets • Inference nets in Information Retrieval • Evaluation

  3. Conditional Independence • Variable V is conditionally independent of a set of variables V1 given another set of variables V2 if: P(V|V1,V2)=P(V|V2) • Intuition: V1 gives us no more information about V than we already know because of knowing V2.

  4. Conditional Independence • Our belief that a person will have a Steady job, given evidence on Education and Job, is independent of whether or not the person owns a Home • P(Steady | Education, Job, Home) = P(Steady | Education, Job)

  5. Belief networks A belief network is a graph in which: 1. A set of random variables makes up the nodes of the network 2. A set of directed edges connects pairs of nodes. The intuitive meaning of an arrow from node X to node Y is that X has a direct influence on Y

  6. Belief networks 3. Each node has a link matrix (conditional probability table) that quantifies the effects the parents have on the node 4. The graph has no directed cycles

  7. Example: Getting a loan • Education and a Job influence having a Steady job • Family influences having a Guarantor for a loan • Steady job and Guarantor influence getting a Loan

  8. Getting a loan: the network and its link matrices • Nodes: E (Education), J (Job), and F (Family) are roots; S (Steady job) has parents E and J; G (Guarantor) has parent F; L (Loan) has parents S and G • Priors: P(E)=.6, P(J)=.8, P(F)=.7 • P(S|E,J): TT=.9, TF=.6, FT=.7, FF=.1 • P(G|F): T=.7, F=.2 • P(L|S,G): TT=1, TF=0, FT=0, FF=0
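To make the link matrices concrete, here is a minimal sketch (mine, not from the slides) that encodes the network above as plain Python dictionaries:

```python
# Priors and link matrices of the loan network (values from slide 8).
p_E, p_J, p_F = 0.6, 0.8, 0.7   # P(Education), P(Job), P(Family)

# P(Steady | Education, Job), keyed by (E, J)
p_S = {(True, True): 0.9, (True, False): 0.6,
       (False, True): 0.7, (False, False): 0.1}

# P(Guarantor | Family)
p_G = {True: 0.7, False: 0.2}

# P(Loan | Steady, Guarantor): an AND link matrix
p_L = {(s, g): 1.0 if (s and g) else 0.0
       for s in (True, False) for g in (True, False)}

# One entry of the joint distribution (slide 10): all variables true.
joint_all_true = p_E * p_J * p_F * p_S[(True, True)] * p_G[True] * p_L[(True, True)]
print(joint_all_true)   # 0.6 * 0.8 * 0.7 * 0.9 * 0.7 * 1.0 ~ 0.212
```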

  9. Computing the conditional probabilities • The conditional probabilities can be computed from experience (past records of loan applications) • Or, use formulas for computing the conditional probabilities • For example, the link matrix for Loan is of type AND: L is true only when both S and G are true.

  10. Semantics of belief networks • To construct a net, think of representing the joint probability distribution. • Every entry in the joint probability distribution can be computed from the information in the network

  11. Probability of getting a loan • We can use the belief network to compute the probability .39 of getting a loan (next foil) • If we have a person with a job, education, and family, the probability of getting a loan is higher: P(L=true|E,J,F) = P(L=true|S=true,G=true)*P(S=true|E,J)*P(G=true|F) = 1*.9*.7 = .63
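As a check on the arithmetic above, a small sketch that sums out S and G, assuming the CPT values from slide 8:

```python
# P(L=true | E=T, J=T, F=T) = sum over S, G of P(L|S,G) * P(S|E=T,J=T) * P(G|F=T)
p_S_given_EJ = 0.9   # P(S=true | E=true, J=true), from slide 8
p_G_given_F = 0.7    # P(G=true | F=true)

total = 0.0
for s in (True, False):
    for g in (True, False):
        p_l = 1.0 if (s and g) else 0.0                 # AND link matrix for L
        p_s = p_S_given_EJ if s else 1 - p_S_given_EJ
        p_g = p_G_given_F if g else 1 - p_G_given_F
        total += p_l * p_s * p_g

print(total)   # 0.63
```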

  12. Inference Networks for IR • Turtle and Croft introduced the inference network model for information retrieval • This is a probability-based method • Ranks documents by probability of satisfying a user's information need.

  13. Document / concept / query network [diagram: the document network feeds the concept network, which feeds the query network]

  14. The Document/Concept Network • Each document can have many representations • The relationships between concepts generate the concept network

  15. The Query Network • The information need may be based on complex interactions between various sources of evidence and different representations of the user's need.

  16. An inference network [diagram: document nodes d1, d2, d3, …, di; representation nodes r1, r2, r3, …, rj; query nodes q1, …, qk; and the information-need node I]

  17. Inference Networks • There are four kinds of nodes in this network.

  18. Inference Networks • The di nodes represent particular documents and correspond to the event of observing that document. • The rj nodes are concept representation nodes. These correspond to the concepts that describe the contents of a document.

  19. Inference Networks • The qk nodes are query nodes. • They correspond to the concepts used to represent the information need of the user. • The single leaf node I corresponds to the (unknown) information need.

  20. Inference Networks • To evaluate a particular document, the single corresponding node is instantiated and the resulting probabilities are propagated through the network to derive a probability associated with the I node. • To generate a ranking for all documents in the collection, this is done for each of the document nodes in the network.

  21. Inference Networks • Each di node is instantiated only once and no other nodes are active at the same time. • The probabilities associated with child nodes are based on the probabilities of their parents and on a "link matrix" that specifies how to combine the evidence from the parents.

  22. Inference Networks • The link matrices between the di nodes and the rj nodes represent the evidence for the proposition that a concept occurs in a document. • The link matrices between the rj nodes and the qk nodes specify how the representation concepts are combined to form the final probability.

  23. Inference Networks • The di and rj nodes are static for a given static collection and are constructed independently of any particular query. • The qk and I portions of the network are constructed individually for each query.

  24. Example • D1: Searching the web • D2: Spiders for searching the web • D3: Tools for searching files • After indexing the terms are: file, search, spider, tool, web • Assume a query “Search tools”

  25. Computing df/idf • df(file)=1, idf(file)=lg(3/1)+1=2.58 • df(search)=3, idf(search)=lg(3/3)+1=1 • df(spider)=1, idf(spider)=lg(3/1)+1=2.58 • df(tool)=1, idf(tool)=lg(3/1)+1=2.58 • df(web)=2, idf(web)=lg(3/2)+1=1.58
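A minimal sketch of this computation, assuming the documents reduce to the stemmed terms listed on slide 24:

```python
import math

# The three documents from the example (slide 24), after simple stemming.
docs = {
    "D1": ["search", "web"],             # "Searching the web"
    "D2": ["spider", "search", "web"],   # "Spiders for searching the web"
    "D3": ["tool", "search", "file"],    # "Tools for searching files"
}

N = len(docs)
vocab = sorted({t for terms in docs.values() for t in terms})

for term in vocab:
    df = sum(1 for terms in docs.values() if term in terms)
    idf = math.log2(N / df) + 1          # idf = lg(N/df) + 1, as on the slide
    print(f"df({term})={df}, idf({term})={idf:.2f}")
```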

  26. The term/document matrix (normalized tf) [table not preserved in the transcript]

  27. The inference net [diagram: document nodes D1, D2, D3; concept nodes Search (S) and Tools (T); query node Q] • Link matrix for Q: P(Q|S,T): TT=.9, TF=.7, FT=.4, FF=.1

  28. Link matrices • The link matrix for the query terms is supplied by the user • If the query were Boolean, with AND, OR, and NOT nodes, a link matrix for each operator could be used • Concept probability is: P(rj|Di) = 0.5 + 0.5*ntf(rj,Di)*nidf(rj)

  29. Link matrix for “search” P(search|D1)=0.5+0.5*ntf*nidf=0.5+0.5*1*0.38=0.69
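A sketch of the concept-probability formula; the normalization nidf = idf / max idf is an assumption on my part, but it reproduces the ~0.38 value the slide uses for "search":

```python
import math

max_idf = math.log2(3 / 1) + 1   # largest idf in the collection, 2.58

def concept_prob(ntf, idf):
    """P(rj|Di) = 0.5 + 0.5 * ntf * nidf, with nidf = idf / max idf (assumed)."""
    return 0.5 + 0.5 * ntf * (idf / max_idf)

idf_search = math.log2(3 / 3) + 1       # idf("search") = 1
print(concept_prob(1.0, idf_search))    # ~0.69 = P(search|D1)
```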

  30. Link matrix for "tools" [values not preserved in the transcript]

  31. P(Q|D1) • P(search=true|D1)=0.69 • P(tools=true|D1)=0 • P(Q|D1) = P(Q|~S,~T)P(~S)P(~T) + P(Q|~S,T)P(~S)P(T) + P(Q|S,~T)P(S)P(~T) + P(Q|S,T)P(S)P(T) = .1*.31*1 + .4*.31*0 + .7*.69*1 + .9*.69*0 = 0.514

  32. P(Q|D3) • P(search=true|D3)=0.69 • P(tools=true|D3)=1 • P(Q|D3) = P(Q|~S,~T)P(~S)P(~T) + P(Q|~S,T)P(~S)P(T) + P(Q|S,~T)P(S)P(~T) + P(Q|S,T)P(S)P(T) = .1*.31*0 + .4*.31*1 + .7*.69*0 + .9*.69*1 = 0.745
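The same sums, as a short sketch (the helper name p_query is mine):

```python
# Link matrix for Q, from slide 27: P(Q=true | Search, Tools)
p_q = {(True, True): 0.9, (True, False): 0.7,
       (False, True): 0.4, (False, False): 0.1}

def p_query(p_s, p_t):
    """P(Q|Di): sum over the four (S, T) states, weighted by the concept probabilities."""
    total = 0.0
    for s in (True, False):
        for t in (True, False):
            total += (p_q[(s, t)]
                      * (p_s if s else 1 - p_s)
                      * (p_t if t else 1 - p_t))
    return total

print(p_query(0.69, 0.0))   # D1 -> ~0.514
print(p_query(0.69, 1.0))   # D3 -> ~0.745
```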

  33. Evaluation • Fallout • Relevance judgments • 11-point recall/precision • Average recall/precision

  34. Problems with Recall & precision • Recall • undefined when there are no relevant documents • Precision • undefined when no documents are retrieved

  35. Fallout • Fallout = (number of nonrelevant documents retrieved) / (total number of nonrelevant documents in the collection) • A good system should have high recall and low fallout
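A small sketch of the three measures from raw counts (function names and the example numbers are illustrative, not from the slides):

```python
def recall(rel_retrieved, rel_in_collection):
    return rel_retrieved / rel_in_collection        # undefined when there are no relevant docs

def precision(rel_retrieved, total_retrieved):
    return rel_retrieved / total_retrieved          # undefined when nothing is retrieved

def fallout(nonrel_retrieved, nonrel_in_collection):
    return nonrel_retrieved / nonrel_in_collection

# Hypothetical numbers: a 200-document collection with 6 relevant docs;
# the system retrieves 10 documents, 4 of them relevant.
print(recall(4, 6), precision(4, 10), fallout(6, 194))
```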

  36. Relevance judgment • Exhaustive? • Assume 100 queries • 750,000 documents in collection • Requires 75 million relevance judgments

  37. Relevance judgment • Sampling? • with 100 queries, average 200 and maximum 900 relevant docs per query, and 750,000 documents, the size of the sample needed for good results is still too large

  38. Relevance judgment • Pooling is used in TREC • It is assumed that the pool contains all the relevant documents • The top 200 documents of 33 runs are pooled, an average of 2398 docs per topic

  39. 11-point recall/precision • 11-point (sometimes 20-point) average precision is used to compare systems. • An alternative is to compute recall and precision at N1, N2, N3, …, documents retrieved • Assume 6 relevant docs in the collection • 200 documents in the collection

  40. Recall & precision [table not preserved in the transcript]

  41. Recall & precision [table not preserved in the transcript]

  42. [Recall/precision plot: relevant documents at ranks 1, 2, 4, 7, 10, and 13 of the 200 retrieved; precision is 1.0, 0.75, 0.57, 0.5, and 0.46 at the recall points where each relevant document is found]

  43. Interpolated values • The interpolated precision is the maximum precision at this and all higher recall levels

  44. Precision: Interpolated Values [plot: the interpolated precision curve for the same ranking steps down through 1.0, 0.75, 0.57, 0.5, and 0.46 as recall increases]
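A sketch of the interpolation rule from slide 43, using the ranking recovered from the plot above (an assumption based on the precision values shown):

```python
def precision_recall_points(relevant_ranks, total_relevant):
    """(recall, precision) at each rank where a relevant document is retrieved."""
    return [(i / total_relevant, i / rank)
            for i, rank in enumerate(sorted(relevant_ranks), start=1)]

def interpolate(points, levels):
    """Interpolated precision at a recall level = max precision at that or any higher recall."""
    return {lv: max((p for r, p in points if r >= lv), default=0.0) for lv in levels}

# Relevant docs at ranks 1, 2, 4, 7, 10, 13; 6 relevant in the collection.
pts = precision_recall_points([1, 2, 4, 7, 10, 13], 6)
levels = [i / 10 for i in range(11)]
print(interpolate(pts, levels))   # steps down from 1.0 to ~0.46, never increasing
```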

  45. Averaging performance • Average recall/precision for a set of queries is either user or system oriented • User oriented: obtain the recall/precision values for each query and then average over all queries

  46. Averaging performance • System oriented: use the following totals over all queries: relevant documents, relevant retrieved, total retrieved • User oriented is commonly used

  47. User oriented recall-level average • Average at each recall level after interpolation
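A sketch of this recall-level averaging; the per-query interpolated values below are made up for illustration:

```python
levels = [i / 10 for i in range(11)]   # the 11 standard recall levels

# Interpolated precision per query at each level (hypothetical values).
query1 = {lv: max(1.0 - lv, 0.46) for lv in levels}
query2 = {lv: max(0.9 - 0.5 * lv, 0.40) for lv in levels}

# User-oriented average: mean over queries at each recall level.
avg = {lv: (query1[lv] + query2[lv]) / 2 for lv in levels}
for lv in levels:
    print(f"recall {lv:.1f}: average precision {avg[lv]:.2f}")
```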

  48. [Plot: interpolated recall/precision curves for Query 1 and Query 2; the user-oriented average is taken at each recall level]
