This lecture discusses evaluation measures such as precision, recall, accuracy, mean average precision, and more for assessing the performance of information retrieval systems.
INFORMATION RETRIEVAL TECHNIQUES BY DR. ADNAN ABID Lecture # 25 BENCHMARKS FOR THE EVALUATION OF IR SYSTEMS
ACKNOWLEDGEMENTS The presentation of this lecture has been drawn from the following sources • “Introduction to Information Retrieval” by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze • “Managing Gigabytes” by Ian H. Witten, Alistair Moffat, and Timothy C. Bell • “Modern Information Retrieval” by Ricardo Baeza-Yates • “Web Information Retrieval” by Stefano Ceri, Alessandro Bozzon, and Marco Brambilla
Outline • Evaluation Measures • Precision and Recall • Unranked retrieval evaluation • Trade-off between Recall and Precision • Computing Recall/Precision Points
Evaluation Measures • Precision • Recall • Accuracy • Mean Average Precision • F-Measure/E-Measure • Non-Binary Relevance • Discounted Cumulative Gain • Normalized Discounted Cumulative Gain
Precision and Recall • [Figure: the entire document collection split into relevant and irrelevant documents, with the retrieved set overlapping both, yielding four regions: retrieved & relevant, retrieved & irrelevant, not retrieved but relevant, and not retrieved & irrelevant]
Unranked retrieval evaluation: Precision and Recall • Precision: fraction of retrieved docs that are relevant = P(relevant|retrieved) • Recall: fraction of relevant docs that are retrieved = P(retrieved|relevant) • Precision P = tp/(tp + fp) • Recall R = tp/(tp + fn)
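As a quick sanity check of these formulas, here is a minimal Python sketch (the counts and function names are illustrative, not part of the lecture) that computes precision and recall from the confusion-matrix counts tp, fp, and fn:

```python
def precision(tp, fp):
    # Fraction of retrieved docs that are relevant: tp / (tp + fp)
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp, fn):
    # Fraction of relevant docs that are retrieved: tp / (tp + fn)
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# Hypothetical query: 20 relevant docs retrieved, 10 irrelevant docs retrieved,
# 30 relevant docs missed.
print(precision(20, 10))  # 0.666... = 20 / 30
print(recall(20, 30))     # 0.4      = 20 / 50
```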
Should we instead use the accuracy measure for evaluation? • Given a query, an engine classifies each doc as “Relevant” or “Nonrelevant” • The accuracy of an engine: the fraction of these classifications that are correct • ACCURACY = (tp + tn) / ( tp + fp + fn + tn) • Accuracy is a commonly used evaluation measure in machine learning classification work • Why is this not a very useful evaluation measure in IR?
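One standard answer, illustrated below with invented numbers: in IR the overwhelming majority of documents are nonrelevant to any given query, so an engine that retrieves nothing at all still scores near-perfect accuracy, while precision and recall immediately expose the failure.

```python
# Hypothetical collection of 1,000,000 docs, only 50 of them relevant to the query.
# An engine that labels every doc "Nonrelevant" (i.e., retrieves nothing):
tp, fp, fn, tn = 0, 0, 50, 999_950

accuracy = (tp + tn) / (tp + fp + fn + tn)
print(accuracy)  # 0.99995 -- looks excellent, yet no relevant doc was returned

# Recall = tp / (tp + fn) = 0, revealing that the engine is useless for this query.
```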
Precision and Recall • Precision • The ability to retrieve top-ranked documents that are mostly relevant. • Recall • The ability of the search to find all of the relevant items in the corpus.
Determining Recall is Difficult • The total number of relevant items is sometimes not available, so it is estimated: • Sample across the database and perform relevance judgments on the sampled items. • Apply different retrieval algorithms to the same database for the same query; the aggregate of relevant items found is taken as the total relevant set.
Trade-off between Recall and Precision • [Figure: plot of precision (0 to 1) against recall (0 to 1). The high-precision end returns relevant documents but misses many useful ones; the high-recall end returns most relevant documents but includes lots of junk; the ideal system sits at high precision and high recall.]
Computing Recall/Precision Points • For a given query, produce the ranked list of retrievals. • Adjusting a threshold on this ranked list produces different sets of retrieved documents, and therefore different recall/precision measures. • Mark each document in the ranked list that is relevant according to the gold standard. • Compute a recall/precision pair for each position in the ranked list that contains a relevant document.
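A minimal Python sketch of this procedure, assuming the ranked list is given as a sequence of document IDs and the gold standard as a set of relevant IDs (both names are illustrative):

```python
def recall_precision_points(ranked_docs, relevant_docs):
    """Return (recall, precision) pairs at each rank holding a relevant document."""
    points = []
    hits = 0
    total_relevant = len(relevant_docs)
    for rank, doc_id in enumerate(ranked_docs, start=1):
        if doc_id in relevant_docs:
            hits += 1
            # Recall and precision measured at this position in the ranking.
            points.append((hits / total_relevant, hits / rank))
    return points
```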
Computing Recall/Precision Points: Example 1 • Let total # of relevant docs = 6 • Check each new recall point: • R=1/6=0.167; P=1/1=1 • R=2/6=0.333; P=2/2=1 • R=3/6=0.5; P=3/4=0.75 • R=4/6=0.667; P=4/6=0.667 • R=5/6=0.833; P=5/13=0.385 • One relevant document is never retrieved, so 100% recall is never reached.
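Feeding the sketch above a ranking consistent with Example 1 (hypothetical document IDs, with relevant documents assumed at ranks 1, 2, 4, 6, and 13, and one relevant document never retrieved) reproduces these points:

```python
ranked = [f"d{i}" for i in range(1, 14)]            # 13 retrieved docs: d1 .. d13
relevant = {"d1", "d2", "d4", "d6", "d13", "d99"}   # d99 is never retrieved

for r, p in recall_precision_points(ranked, relevant):
    print(f"R={r:.3f}  P={p:.3f}")
# R=0.167 P=1.000
# R=0.333 P=1.000
# R=0.500 P=0.750
# R=0.667 P=0.667
# R=0.833 P=0.385
```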