This tutorial guides users through a user-centered evaluation of information retrieval systems, focusing on comparing expert finding systems against user-centered criteria. It provides step-by-step instructions and examples, including the use of user-generated queries and precision- and recall-based measures.
Goal
• Understand how to conduct a user-centered evaluation of an information system
• Use the performance of user-generated queries to measure the effectiveness of different information retrieval systems
• Evaluate systems using precision- and recall-based measures
Example
• Scenario: Finding experts is important, especially when you want to find potential collaborators, ask for consultancy, or look for potential reviewers.
• Aim: compare two expert finding systems
  • Pair 1: http://vivo.cornell.edu vs. http://experts.umich.edu
  • Pair 2: http://vivosearch.org vs. http://www.arnetminer.org
  • Pair 3: Google vs. Yahoo
• Goal: decide which system is the better expert finding system based on the following criteria
  • Ease of use / user-friendliness
  • Quality of search
How
• Approach (work in groups of 2)
• Demographic questionnaire
  • About your user: whether he/she is familiar with the system
• Pre-evaluation questionnaire
  • Whether the user is familiar with the topic
• Evaluation
  • Ease of use / user-friendliness
    • Develop a survey/questionnaire with 3-5 five-point-scale questions and 1-2 open-ended questions
    • E.g., easy to find the help file, easy to know where you are; focus on the design of the system
  • Quality of search (2 queries)
    • Pair 1: Q1: find experts working on "stem cell"; Q2: create another query
    • Pair 2: Q1: find experts good at "data mining"; Q2: create another query
    • Pair 3: Q1: create your own query; Q2: create another query
• Post-evaluation questionnaire
  • Usability: whether the system is easy to search, whether it takes too long to get results, whether the results are ranked, and whether each ranked URL has a short summary
• Exit questionnaire
  • Preference
Measuring quality of search
• Repeat the same procedure for Query 2
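For reference, the measure used in the worked example below can be written formally (this is the standard precision-at-N definition, where rel_i is your partner's 0/1 relevance judgment for the result at rank i):

```latex
\text{Precision@Top}N = \frac{1}{N} \sum_{i=1}^{N} rel_i,
\qquad rel_i \in \{0, 1\}
```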
User judgment of relevance
• Fill in the following table with the title and a short summary of the top 5 results
• Show the results to your partner and let your partner decide the relevance (0 = irrelevant, 1 = relevant)

Rank | Title | Short summary | Partner's relevance (0/1)
-----+-------+---------------+--------------------------
  1  |       |               |
  2  |       |               |
  3  |       |               |
  4  |       |               |
  5  |       |               |
Calculate Precision@TopN
• Precision@Top3 = (1 + 0 + 1) / 3 = 0.67
• Precision@Top5 = (1 + 0 + 1 + 0 + 1) / 5 = 0.6
• Average over the two queries to get a per-system score, e.g. for System 1:
  Precision@Top3_S1 = (Precision@Top3_Q1 + Precision@Top3_Q2) / 2
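As a minimal sketch, this calculation can be scripted in Python. The q1 list below holds the illustrative judgments from the example above (1, 0, 1, 0, 1); the q2 list and the function name are hypothetical, not part of the original exercise:

```python
def precision_at_k(judgments, k):
    """Fraction of the top-k results judged relevant (judgments are 0/1)."""
    return sum(judgments[:k]) / k

# Partner's 0/1 judgments for the top 5 results of each query.
q1 = [1, 0, 1, 0, 1]   # example values from the slide above
q2 = [1, 1, 0, 0, 0]   # hypothetical judgments for the second query

print(precision_at_k(q1, 3))  # ~0.67, matching the worked example
print(precision_at_k(q1, 5))  # 0.6

# Per-system score: average over the two queries (System 1 here).
p_at_3_s1 = (precision_at_k(q1, 3) + precision_at_k(q2, 3)) / 2
print(p_at_3_s1)
```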
Compare your results with your partner's
• Will your results be consistent with your partner's?
• Compare precision@top3 and precision@top5
• If your results are consistent, you can draw a conclusion about which system performs better
• If your results are not consistent, what conclusions can you draw, and why?
  • Discuss limitations and potential reasons
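One simple way to quantify whether two judges' results are consistent is per-result agreement: the fraction of results on which both gave the same 0/1 judgment. This sketch is illustrative and goes beyond the original assignment; the judgment lists are hypothetical:

```python
def agreement(judge_a, judge_b):
    """Fraction of results on which two judges gave the same 0/1 relevance."""
    matches = sum(a == b for a, b in zip(judge_a, judge_b))
    return matches / len(judge_a)

# Hypothetical 0/1 judgments from you and your partner on the same top-5 list.
mine    = [1, 0, 1, 0, 1]
partner = [1, 1, 1, 0, 0]

print(agreement(mine, partner))  # 0.6 -> the judges disagree on 2 of 5 results
```

Low agreement suggests the relevance criteria were understood differently, which is itself a useful finding to report as a limitation.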
Evaluation Report
• Write a short report (1-2 pages) about this evaluation, including:
  • The evaluation outputs
  • Quantitative analysis
  • Qualitative analysis
  • Final conclusions
  • Limitations
  • Lessons learned
    • If you wanted to run this as a real evaluation, what would you improve?