Evaluation of IR Performance Dr. Bilal IS 530 Fall 2006
Searching for Information • Imprecise • Incomplete • Tentative • Challenging
IR Performance Precision Ratio = (the number of relevant documents retrieved) ÷ (the total number of documents retrieved)
IR Performance Recall Ratio = (the number of relevant documents retrieved) ÷ (the total number of relevant documents in the collection)
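The two ratios above can be sketched in a few lines of Python. This is a minimal illustration with made-up document IDs, not part of the original lecture:

```python
# Toy example: documents are identified by integer IDs (hypothetical data).
retrieved = {1, 2, 3, 4, 5, 6}   # documents the search returned
relevant = {2, 3, 5, 7, 9}       # documents actually relevant to the query

hits = retrieved & relevant       # relevant documents that were retrieved
precision = len(hits) / len(retrieved)   # 3 / 6 = 0.5
recall = len(hits) / len(relevant)       # 3 / 5 = 0.6
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Note that the denominators differ: precision divides by what was retrieved, recall by everything relevant in the collection.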
Why Are Some Items Not Retrieved? • Indexing errors • Wrong search terms • Wrong database • Language variations • Other (to be answered by students)
Why Do We Get Unwanted Items or Results? • Indexing errors • Wrong search terms • Homographs • Incorrect term relations • Other (to be answered by students)
Boolean Operators • OR increases recall • AND increases precision • NOT increases precision by elimination
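The effect of each Boolean operator can be seen by treating a term's postings as a set of document IDs. A small sketch with invented postings lists:

```python
# Hypothetical postings lists: document IDs containing each term.
cats = {1, 2, 3, 4}
dogs = {3, 4, 5, 6}

or_result = cats | dogs    # OR: union -> more documents, higher recall
and_result = cats & dogs   # AND: intersection -> fewer, more focused documents
not_result = cats - dogs   # NOT: eliminates every document mentioning dogs
print(len(or_result), len(and_result), len(not_result))  # 6 2 2
```

OR can only grow the result set, while AND and NOT can only shrink it, which is why they push recall and precision in opposite directions.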
Recall and Precision in Practice • Inversely related • Search strategies are designed for high precision, high recall, or a balance of the two • Users' needs dictate whether the strategy favors recall or precision • Practice helps in refining queries to favor recall or precision
Recall and Precision [Graph: inverse trade-off between recall and precision; both axes run from 0 to 1.0]
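The inverse relationship can be demonstrated by cutting off a ranked result list at different depths. This is a toy relevance pattern chosen for illustration, not data from the lecture:

```python
# Ranked results; True marks a relevant document (invented pattern).
ranked = [True, True, False, True, False, False, True, False]
total_relevant = 4   # relevant documents in the whole collection

for k in (2, 4, 8):
    hits = sum(ranked[:k])           # relevant docs among the top k
    precision = hits / k
    recall = hits / total_relevant
    print(f"top-{k}: precision={precision:.2f} recall={recall:.2f}")
```

As the cutoff deepens, recall rises toward 1.0 while precision falls: the searcher picks a point on that curve.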
Relevance • A match between a query and information retrieved • A judgment • Can be judged by anyone who is informed of the query and views the retrieved information
Relevance • Judgment is dynamic • Documents can be ranked by likely relevance • In practice, not easy to measure • Not focused on user needs
Pertinence • Based on information need rather than a match between a query and retrieved documents • Can only be judged by user • May differ from relevance judgment
Pertinence • Transient, varies with many factors • Not often used in evaluation • May be used as a measure of satisfaction • User-based, as opposed to relevance
High Precision Search • Use these strategies, as appropriate: • Controlled vocabulary • Limit feature (e.g., specific fields, major descriptors, date(s), language, as appropriate) • AND operator • Proximity operators (used carefully) • Truncation (used carefully)
High Recall Search • Use these strategies, as appropriate: • OR logic • Keyword searching • No or minimal limits to specific field(s) • Truncation • Broader terms
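Truncation broadens recall by matching every term that shares a stem. A minimal sketch, assuming a truncated query like "comput*" and an invented term list:

```python
# Hypothetical index vocabulary; "comput*" is the truncated query.
terms = ["computer", "computing", "computation", "compute", "catalog"]
stem = "comput"

matches = [t for t in terms if t.startswith(stem)]
print(matches)  # ['computer', 'computing', 'computation', 'compute']
```

One truncated term stands in for an OR of all its variants, which is exactly why it raises recall, and why careless truncation (too short a stem) also drags in unwanted terms and hurts precision.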
Relevance Judgment • Users base it on: • Topicality • Aboutness • Utility • Pertinence • Satisfaction
Improving IR Performance • Good mediation of search topic before searching • User presence during search, if possible • Preliminary search judged by user • Evaluation during search (by searcher or by searcher and user) • Refinement of search strategies • Searcher evaluation of final results • User evaluation of final results
Improving IR Performance • Better system design • Better indexing and word parsing • Better structure of thesauri • Better user interface (e.g., more effective help feature) • Better error recovery feedback • User-centered design