Resource Selection in Distributed Information Retrieval – an Experimental Study Hans Friedrich Witschel (formerly) University of Leipzig (now) SAP Research CEC Karlsruhe
Overview • Motivation • Problem definition • Solutions to be explored • Experimental setup • Results • Conclusions
Motivation Resource selection • Whom could I ask about "information retrieval"? [Slide diagram: example peers characterised by their term profiles, e.g. agriculture (combine harvester, cattle, crops, tractor, acres), construction (statics, building project, landscape, anti-seismic design, cubature), medicine (treatment, surgery, radiation, oncology, diagnostic, bone marrow, urology) and P2P/IR (client, server, servent, p2p, terms, algorithm, ranking)]
Motivation Resource selection • Reason for selecting only a subset of all available resources/peers: cost reduction • Distributed IR (DIR): time and load on databases • Peer-to-peer IR (P2PIR): number of messages; we will concentrate on P2PIR here • Basic approach: treat peers/resources as giant documents, use existing (slightly modified) retrieval functions to rank them, and visit the top-ranked ones
Problem definition Assumptions • Peers have profiles = lists of terms with weights (unigram language models) • Two options: • Represent peers by what they have → extract terms from a peer's shared documents • Represent peers by queries for which they provide relevant documents • Profiles have to be compact in order to reduce communication overhead • Absolute size of profiles is dictated by the available (network) resources
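As a concrete illustration of the first option, here is a minimal sketch that builds a compact unigram-language-model profile from a peer's shared documents and prunes it to a fixed number of terms. Plain relative frequencies are assumed here; the experiments actually weight profile terms with CORI, as described on the next slide.

```python
from collections import Counter
import re

def build_profile(shared_documents, max_terms):
    """Unigram language model over a peer's shared documents,
    pruned to the max_terms highest-weighted terms."""
    counts = Counter()
    for doc in shared_documents:
        counts.update(re.findall(r"[a-z0-9]+", doc.lower()))
    total = sum(counts.values())
    # P(t|p) estimated as relative frequency; keep only the top terms
    return {t: tf / total for t, tf in counts.most_common(max_terms)}

# e.g. a medical peer, pruned to a 10-term profile
profile = build_profile(["radiation oncology of the bone marrow", "urology surgery"], max_terms=10)
```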
Problem definition Research questions • How much will profile pruning degrade the quality of resource selection? That is, how many terms can we prune from a profile and still have acceptable results? • What can be done to improve peer selection? • Improve queries → Query Expansion? • Improve profiles → Profile adaptation?
Solutions to be explored Preliminaries • Profiles: • Use CORI for weighting terms t in the collection of peer p, rank by P(t|p) • Compression: apply simple thresholding • Profile sizes: 10, 20, 40, 80, 160, 320, 640, unpruned • Global term weights (I component of CORI): • Use an external reference corpus for estimating idf values • Local retrieval function at each peer: BM25 • Uses the same idf estimations as above => document scores are comparable across all peers => we can concentrate on the resource selection process; results are not blurred by result merging effects
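A sketch of this weighting and pruning step, assuming the standard CORI constants (50, 150, default belief b = 0.4). The slides only state that CORI is used and that its I component is estimated from an external reference corpus, so the exact form below is an assumption:

```python
import math

def cori_weight(df, cw, avg_cw, n_collections, coll_freq, b=0.4):
    """CORI-style belief for a term in peer p's collection (standard constants assumed).
    df            -- number of p's documents containing the term
    cw            -- total number of term occurrences in p's collection
    avg_cw        -- average collection size over all peers
    n_collections -- number of collections in the external reference corpus
    coll_freq     -- number of those collections containing the term
    """
    T = df / (df + 50 + 150 * cw / avg_cw)
    I = math.log((n_collections + 0.5) / coll_freq) / math.log(n_collections + 1.0)
    return b + (1 - b) * T * I

def prune(weights, size):
    """Simple thresholding: keep only the `size` highest-weighted profile terms."""
    return dict(sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:size])
```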
Solutions to be explored Baselines • Random: rank peers in random order • By-size: rank peers by the number of documents they hold, independent of the offered content • Base CORI: rank peers by the sum of the CORI weights of the terms contained in both the query and the peer's profile
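For concreteness, a minimal sketch of the base CORI baseline; `profiles` is assumed to map each peer to its pruned term-weight dictionary from the previous sketches:

```python
def rank_peers(query_terms, profiles):
    """Base CORI baseline: rank peers by the sum of the CORI weights of the
    terms occurring both in the query and in the peer's pruned profile."""
    scores = {
        peer: sum(w for t, w in profile.items() if t in query_terms)
        for peer, profile in profiles.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```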
Solutions to be explored Query expansion • All methods use Local Context Analysis (LCA) • Input passages are taken from: • The web: top 10 result snippets returned by the Yahoo! API for the query • Local documents: the 10 best documents returned by the highest-ranked peer (local pseudo feedback) • For comparison ("upper QE baseline"): use a global view on the collection (global pseudo feedback)
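The sketch below is a much-simplified stand-in for LCA (full LCA scores concept candidates with an idf-weighted co-occurrence product); it only illustrates the overall flow of expanding a query from feedback passages, whatever their source:

```python
from collections import Counter

def expand_query(query_terms, passages, k=5):
    """Simplified LCA-like expansion: score candidate terms by how often
    they co-occur with query terms in the feedback passages (web snippets,
    local top documents, or a global top-document sample) and append the
    k best candidates to the query."""
    query = set(query_terms)
    scores = Counter()
    for passage in passages:
        terms = set(passage.lower().split())
        if terms & query:                     # passage mentions the query
            scores.update(terms - query)
    return list(query_terms) + [t for t, _ in scores.most_common(k)]
```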
Solutions to be explored Profile adaptation • Idea: • Boost weight of term t in peer p's profile if p has successfully answered a query containing t • Aim: profile allows the peer to answer popular queries for which it has many relevant documents • Can be done using a query log • Extensions: collaborative tagging approach, allow user interaction etc. (hard to evaluate)
Solutions to be explored Profile adaptation • Update formula for term i in the profile of peer p, based on: • Dp = documents returned by p • Do = documents returned by all peers contacted • AVGRP = average relative precision (RP) over all peers the query has reached • The update is only executed if the ratio > 1, i.e. if p's results are "better" than the average • For evaluation purposes: split a query log into a training and a test set, use the training set for updating profiles
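The update formula itself is not reproduced in this text version, so the sketch below only implements what the bullets state, with an assumed multiplicative boost driven by the ratio of p's relative precision to AVGRP; both the boost shape and the learning rate are assumptions:

```python
def adapt_profile(profile, query_terms, rp_peer, avg_rp, learning_rate=0.1):
    """Boost the weights of query terms in peer p's profile after a query it
    answered well: rp_peer is the RP of p's results (Dp), avg_rp the average
    RP over all contacted peers. The boost shape is an assumption."""
    if avg_rp <= 0:
        return profile
    ratio = rp_peer / avg_rp
    if ratio <= 1:                        # only update if p beat the average
        return profile
    for t in query_terms:
        if t in profile:
            profile[t] *= 1 + learning_rate * (ratio - 1)
    return profile
```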
Experimental Setup Simplifying it… • Evaluate distributed IR only, instead of running a full P2PIR simulation • Decouple query routing from other aspects (overlay topology etc.) • Considerably reduces the number of free parameters • Underlying assumption: a resource selection algorithm A that works better than algorithm B for DIR will also be better for P2PIR (i.e. when only a subset of all resources is visible) • A DIR scenario corresponds to a fully connected P2P overlay (e.g. PlanetP)
Experimental Setup Parameterising it… • DIR evaluation, but: use parameters typical of P2PIR settings: • Pruned profiles • Far more than 1,000 peers • Peer collections: small and semantically (relatively) homogeneous • All of this in contrast to typical DIR settings
Experimental Setup Applying it… • Basic evaluation procedure: • Obtain a ranking R of all peers w.r.t. query q • Visit the top 100 peers in the order implied by R • After visiting each peer: merge documents found so far into a ranking S, judge quality of R by the quality of S using e.g. relevance judgments for documents
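A sketch of this evaluation loop; `retrieve` and `score` are hypothetical callbacks standing in for the peer-local BM25 search and the quality measure (relative precision, introduced below):

```python
def evaluate_resource_selection(peer_ranking, query, retrieve, score):
    """Visit the top 100 peers in the order given by ranking R; after each
    peer, merge all documents retrieved so far into one ranking S and
    record its quality."""
    merged = []
    quality_per_step = []
    for peer in peer_ranking[:100]:
        merged.extend(retrieve(peer, query))      # peer-local BM25 results
        # scores use shared idf estimates, so they are comparable across peers
        merged.sort(key=lambda doc: doc.score, reverse=True)
        quality_per_step.append(score(merged))
    return quality_per_step
```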
Experimental Setup Test collections • Digital library scenario: peers = topics • Ohsumed: medical abstracts, annotated with Medical Subject Headings (MeSH) • GIRT: German sociology abstracts, annotated with terms from a thesaurus • For both collections, queries and relevance judgments are available • Individuals sharing publications: • CiteSeer abstracts with peers = (co-)authors • Query log available, but no relevance judgments
Experimental Setup Evaluation measures • Missing relevance judgments: introduce a new measure, relative precision (RP) • Idea: compare a given ranking D with the ranking C of a reference retrieval system (here: a centralised system) • The probability of relevance of a document is estimated as its inverse rank in the reference ranking • RP@k = average probability of relevance among the first k documents of ranking D • Example: C = [K, L, M, N, O, P], D = [L, M, O]
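Taking "inverse rank" literally as 1/rank (and 0 for documents missing from C), the slide's example works out as follows:

```python
def relative_precision_at_k(D, C, k):
    """RP@k: average estimated probability of relevance over the first k
    documents of ranking D, where a document's probability of relevance is
    1/rank in the centralised reference ranking C (0 if it does not occur)."""
    rank_in_C = {doc: r for r, doc in enumerate(C, start=1)}
    probs = [1 / rank_in_C[d] if d in rank_in_C else 0.0 for d in D[:k]]
    return sum(probs) / k

# L is at rank 2 in C, M at rank 3, O at rank 5:
# RP@3 = (1/2 + 1/3 + 1/5) / 3 ≈ 0.344
print(relative_precision_at_k(["L", "M", "O"], ["K", "L", "M", "N", "O", "P"], k=3))
```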
Results Profile pruning, CiteSeer
Results Profile pruning, GIRT
Results Profile pruning, space savings
Results Qualitative analysis
Results Query expansion • M = intervals where QE runs significantly better than the baseline • M' = intervals where QE runs significantly worse than the baseline
Results Profile adaptation
Results Profile adaptation, delayed updates
Conclusions • Profile pruning: • Pruning profiles hurts performance less than expected • Whether or not pruning to a predefined size hurts does not necessarily depend on the original profile size • In the experiments, it was always safe to prune for (total) space savings of 90% • "Advanced" techniques: • Query expansion: more often hurts than improves performance • Profile adaptation: • Stable improvement of over 10% among the first 15 peers visited • Especially high improvement for the highest-ranked peer • Delayed updates do not hurt effectiveness (weak locality)