Advanced Practices in Information Retrieval Performance Evaluation

Explore the theory and practice of information retrieval performance evaluation, including recall and precision measures. Learn how to analyze recall/precision curves and utilize R-precision for optimal retrieval methods. Discover how to assess database structure with the blue sheet.



  1. LIS618 lecture 2 Thomas Krichel 2004-02-07

  2. Structure • Theory: information retrieval performance • Practice: more advanced dialog.

  3. retrieval performance evaluation • "Recall" and "Precision" are two classic measures of information retrieval performance for a single query. • Both assume that there is an answer set of documents that contain the answer to the query. • Performance is optimal if • the database returns all the documents in the answer set • the database returns only documents in the answer set • Recall is the fraction of the relevant documents that the query result has captured. • Precision is the fraction of the retrieved documents that is relevant.

  4. recall and precision curves • Assume that all the retrieved documents arrive at once and are examined one by one. • During that process, the user discovers more and more relevant documents. Recall increases. • During the same process, at least eventually, there will be fewer and fewer useful documents. Precision declines (usually). • This can be represented as a curve.

  5. Example • Let the answer set be {0,1,2,3,4,5,6,7,8,9} and non-relevant documents be represented by letters. • A query reveals the following result: 7,a,3,b,c,9,n,j,l,5,r,o,s,e,4. • For the first document, (recall, precision) is (10%,100%), for the third (20%,66%), for the sixth (30%,50%), for the tenth (40%,40%), and for the fifteenth (50%,33%).
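
The (recall, precision) pairs in this example can be reproduced with a short Python sketch. The function name `recall_precision_points` is an illustrative assumption and is not part of the lecture; the data is the answer set and ranking given on the slide.

```python
# Answer set and ranked result list from the example slide.
answer_set = set("0123456789")
ranking = "7,a,3,b,c,9,n,j,l,5,r,o,s,e,4".split(",")

def recall_precision_points(ranking, answer_set):
    """Return (rank, recall, precision) at each rank where a relevant document appears."""
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in answer_set:
            hits += 1
            # recall: share of the answer set found so far
            # precision: share of the documents examined so far that are relevant
            points.append((rank, hits / len(answer_set), hits / rank))
    return points

points = recall_precision_points(ranking, answer_set)
for rank, recall, precision in points:
    print(f"rank {rank:2d}: recall {recall:.2f}, precision {precision:.2f}")
```

Plotting precision against recall for these points gives the recall/precision curve for this single query.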

  6. recall/precision curves • Such curves can be formed for each query. • An average curve, for each recall level, can be calculated for several queries. • Recall and precision levels can also be used to calculate two single-valued summaries. • average precision at seen document • R-precision

  7. R-precision • A more ad-hoc measure. • Let R be the size of the answer set. • Take the first R results of the query. • Find the number of relevant documents among them. • Divide by R. • In our example, R is 10, four of the first ten results are relevant, and the R-precision is 40%. • An average can be calculated for a number of queries.
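
As a minimal sketch of the steps listed for R-precision, using the example ranking from slide 5 (the function name `r_precision` is an assumption made for illustration):

```python
# Answer set and ranked result list from the example on slide 5.
answer_set = set("0123456789")
ranking = "7,a,3,b,c,9,n,j,l,5,r,o,s,e,4".split(",")

def r_precision(ranking, answer_set):
    """Precision over the first R results, where R is the size of the answer set."""
    r = len(answer_set)
    top_r = ranking[:r]
    relevant_in_top_r = sum(1 for doc in top_r if doc in answer_set)
    return relevant_in_top_r / r

print(r_precision(ranking, answer_set))
```

Averaging over several queries would amount to calling this function once per query and taking the mean of the results.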

  8. average precision at seen document • To find it, sum the precision levels at each new relevant document discovered by the user and divide by the number of relevant documents discovered. • In our example, it is (100+66+50+40+33)/5, or about 58%. • This measure favors retrieval methods that get the relevant documents to the top.
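
The calculation can be sketched in Python as follows. This assumes, as the worked example does, that the sum is divided by the number of relevant documents the user has actually seen (five here) and not by the size of the full answer set. The function name is hypothetical. With unrounded precision values the example ranking comes out at 0.58.

```python
# Answer set and ranked result list from the example on slide 5.
answer_set = set("0123456789")
ranking = "7,a,3,b,c,9,n,j,l,5,r,o,s,e,4".split(",")

def average_precision_at_seen(ranking, answer_set):
    """Mean of the precision values observed at each relevant document in the ranking."""
    precisions = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in answer_set:
            hits += 1
            precisions.append(hits / rank)
    # Assumption: divide by the number of relevant documents seen, as in the slide's example.
    return sum(precisions) / len(precisions) if precisions else 0.0

print(round(average_precision_at_seen(ranking, answer_set), 2))
```

Moving a relevant document up the ranking raises the precision recorded at that point, which is why the measure rewards methods that put relevant documents first.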

  9. critique of recall & precision • Recall has to be estimated by an expert • Recall is very difficult to estimate in a large collection • They focus on one query only. No serious user works like this. • There are some other measures, but that is more for an advanced course in IR.

  10. Looking at database structure • Up until now, we have looked at commands that take a full-text view of the database. • Such commands can be executed for every database. • If we want to make more precise queries, we have to take account of database structure.

  11. blue sheet • Each database name is linked to a blueish pop-up window, called the bluesheet for the database. • It contains the details of the database.

  12. closer look at the bluesheet • file description • subject coverage (free vocabulary) • format options, lists all formats • by number (internal) • by dialog web format (external, i.e. cross-database) • search options • basic index, i.e. subject contents • additional index, i.e. non-subject

  13. basic vs additional index • the basic index • has information that is relevant to the substantive contents of the data • usually is indexed by word, i.e. connectors are required • the additional index • has data that is not relevant to the substantive matter • usually indexed by phrase, i.e. connectors are not required

  14. search options: basic index • select without qualifiers searches in all fields in the basic index • bluesheet lists field indicators available for a database • also note if field is indexed by word or phrase. proximity searching only works with word indices. when phrases are indexed you don't need proximity indicators

  15. search in basic index • a field in the basic index is queried through term/IN, where term is a search term and IN is a field indicator • Thomas calls this an appending indicator • several field indicators can be ORed by giving a comma-separated list • for example mate/ti,de searches for mate in the title or descriptor fields

  16. limiters and sorting • Some databases allow you to restrict the search using limiters. For example • /ABS requires an abstract to be present • /ENG English language publication • Some fields are sortable with the sort command, i.e. records can be sorted by the values in the fields. Example: “sort /ti”. Such features are database specific.

  17. additional indices • The additional indices list those terms that can lead a query. Often, these are phrase indexed. • Such fields are queried by the prefix form IN=term, where IN is the field abbreviation and term is the search term • Thomas calls this a pre-pending indicator
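
The two query forms, term/IN for the basic index and IN=term for the additional indices, can be contrasted with a small Python sketch that only builds the command strings. The helper names `suffix_query` and `prefix_query` are hypothetical, and the leading `s` is assumed to be the abbreviated select command as used in the lecture's own examples. Nothing here talks to DIALOG itself.

```python
def suffix_query(term, fields):
    """Basic-index form: term/IN1,IN2 (comma-separated field indicators are ORed)."""
    return f"s {term}/{','.join(fields)}"

def prefix_query(field, term):
    """Additional-index form: IN=term."""
    return f"s {field}={term}"

print(suffix_query("mate", ["ti", "de"]))  # title or descriptor fields
print(prefix_query("au", "barrueco?"))     # author field, truncated term
```

Which field indicators are valid depends on the database; the bluesheet for each file lists them.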

  18. expanding queries • names have to be entered as they appear in the database. • The "expand" command can be used to see varieties of spelling of a value • It has to be used in conjunction with a field identifier, example • expand au=cruz, b? • expand au=barrueco? to search for misspellings of José Manuel Barrueco Cruz

  19. expanding queries II • expand produces results of the form Ref Items Index-term • Ref is a reference number • Items is the number of items where the index term appears • Index-term is the index term • "s Ref" searches for the term with that reference number.

  20. expand topics • You can also expand a topic in a database to see what index terms are available that start with the term. • If you expand an entry in the expansion list again, you can see a list of related terms to the term, if such a list is available.

  21. Example • How many domain names are currently registered in Novosibirsk, Russia? • Hint: use domain name database file 225. • Note that this database also covers non-current domains.

  22. ranking • The rank command can be used to show the most frequent values of a phrase-indexed field in a search set. • Example • rank au s1 shows the most frequent authors in a set of results • rank de s1 shows the most frequent descriptors

  23. example • Who wrote on interest rates and growth rates? Use EconLit • b 139 • s interest(n)rate? and growth(n)rate? • rank au s1 • You can then select some authors you are interested in, 1-5 for example • exit / exs to search for those authors.

  24. topic searches • Often we want to know what literature is available on a certain topic. • Many times authors do not use obvious words that occur to the searcher. • Using descriptors can be very helpful. • Conduct a search • Look for descriptors • Use those in other searches

  25. Initial file selection • On the main menu, go to the database menu. • After the principal menu, you get a search box. • There you can enter full-text queries for all the databases. • You can then select the database you want • and get to the begin databases stage.

  26. database categories • In order to help people to find databases (files), DIALOG have grouped databases by categories. • categories are listed at http://library.dialog.com/bluesheets/html/blo.html • 'b category' will select databases from the category category at the start. • 'sf category' selects files belonging to a category category at other times.

  27. add/repeat • add number, number adds databases by file number to the last query • example "add 297" to see what the bible says about it • repeat repeats the previous query with the added databases included

  28. http://openlib.org/home/krichel Thank you for your attention!
