Evaluation of IR Performance Dr. Bilal IS 530 Fall 2005
Searching for Information • Imprecise • Incomplete • Tentative • Challenging
IR Performance Precision Ratio = (number of relevant documents retrieved) / (total number of documents retrieved)
IR Performance Recall Ratio = (number of relevant documents retrieved) / (total number of relevant documents)
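The two ratios can be computed directly once the retrieved set and the relevant set are known. The following is a minimal sketch in Python; the function name `precision_recall` and the document identifiers are illustrative assumptions, not part of the original slides.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: identifiers of the documents the search returned
    relevant:  identifiers of all documents judged relevant to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 5 relevant documents exist,
# and 2 of the retrieved documents are relevant.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5", "d6", "d7"}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.50 recall=0.40
```

Here 2 of the 4 retrieved documents are relevant (precision 0.50), while only 2 of the 5 relevant documents were found (recall 0.40).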
Why Do We Miss Items? • Indexing errors • Wrong search terms • Wrong database • Language variations • Other???
Why Do We Get Unwanted Items? • Indexing errors • Wrong search terms • Homographs • Incorrect term relations • Other???
Boolean Operators • OR increases recall • AND increases precision • NOT increases precision
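The effect of each operator can be illustrated with set operations on a toy collection. This is a sketch under stated assumptions: the five documents, the relevance judgments (information need: jaguars as animals), and the helper names `search` and `precision_recall` are all hypothetical.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    1: "jaguar cat habitat",
    2: "jaguar car review",      # homograph: "jaguar" the car, not relevant
    3: "panthera onca habitat",  # relevant, but uses a different term
    4: "jaguar cat diet",
    5: "leopard cat habitat",
}
# Assumed relevance judgments for the need "jaguars as animals".
relevant = {1, 3, 4}


def search(term):
    """Return ids of documents containing the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


def precision_recall(retrieved, relevant):
    """Return (precision, recall); 0.0 when a denominator would be zero."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Boolean operators map onto set union, intersection, and difference.
queries = {
    "jaguar": search("jaguar"),
    "jaguar OR habitat": search("jaguar") | search("habitat"),
    "jaguar AND cat": search("jaguar") & search("cat"),
    "jaguar NOT car": search("jaguar") - search("car"),
}
results = {q: precision_recall(hits, relevant) for q, hits in queries.items()}

for query, (p, r) in results.items():
    print(f"{query:18} precision={p:.2f} recall={r:.2f}")
```

In this toy collection the single term gives precision 0.67 and recall 0.67. OR raises recall to 1.00 while precision falls to 0.60; AND and NOT both raise precision to 1.00 while recall stays at 0.67. Real collections will not behave this neatly.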
Recall and Precision in Practice • Inversely related • Search strategies are designed for high precision, high recall, or a balance of the two • User needs dictate whether the search strategy favors recall or precision • Practice helps in adjusting queries to favor recall or precision
Recall and Precision [Figure: recall plotted against precision, each axis scaled from 0 to 1.0]
Problems with Relevance, Recall, and Precision • Yes or no decision • Things are more or less relevant • In practice not easy to measure • Not focused on user needs
Relevance • A match between a query and information retrieved • Is a judgment • Can be judged by anyone who is informed of the query and views the retrieved information
Relevance (cont.) • Judgments may differ • Is the basis for information retrieval evaluation methods (recall and precision) • Documents can be ranked by likely relevance
Pertinence • Based on information need rather than request and documents • Can only be judged by user • May differ from relevance judgments
Pertinence (cont.) • Transient, varies with many factors • Not often used in evaluation • May be used as a measure of satisfaction
High Precision Searching • Controlled vocabulary • Limits: specific fields, major descriptors, date, language, etc. • AND operator • Proximity operators • Careful with truncation
High Recall Searching • OR logic • Keyword searching • No limits • Truncate • Broader terms
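Truncation appears in both strategies: it broadens a search for recall, and it calls for care when precision matters. The sketch below shows why, using simple prefix matching on a hypothetical four-document collection; the helper names `search_exact` and `search_truncated` are assumptions for illustration and do not correspond to any particular search system's syntax.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    1: "library automation systems",
    2: "libraries and digital archives",
    3: "librarian training programs",
    4: "liberal arts curriculum",
}


def search_exact(term):
    """Match the term only as a whole word."""
    return {i for i, text in docs.items() if term in text.split()}


def search_truncated(stem):
    """Match any word that begins with the stem (right-hand truncation)."""
    return {
        i for i, text in docs.items()
        if any(word.startswith(stem) for word in text.split())
    }


exact = search_exact("library")          # whole word only
truncated = search_truncated("librar")   # library, libraries, librarian
too_short = search_truncated("lib")      # also pulls in "liberal"

print(sorted(exact))      # [1]
print(sorted(truncated))  # [1, 2, 3]
print(sorted(too_short))  # [1, 2, 3, 4]
```

Truncating at "librar" recovers the word variants an exact keyword search misses, which helps recall. Truncating too early, at "lib", also retrieves the unrelated "liberal" document, which costs precision.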
Related Concepts • Topicality • Aboutness • Utility • Pertinence • Satisfaction
Hints for Improving Performance • Good interview • User presence, if possible • Preliminary search and user response • Evaluation during search (you or you and user) • User feedback • Search refinement as you progress