Analyzing Retrieval Models using Retrievability Measurement Shariq Bashir Supervisor: ao. Univ. Prof. Dr. Andreas Rauber Institute of Software Engineering and Interactive Systems Vienna University of Technology bashir@ifs.tuwien.ac.at http://www.ifs.tuwien.ac.at/~bashir/
Outline • Introduction to Retrievability (Findability) Measure • Setup for Experiments • Findability Scoring Functions • Relationship between Findability and Query Characteristics • Relationship between Findability and Document Features • Relationship between Findability and Effectiveness Measures
Introduction • Retrieval systems are used for searching information • They rely on retrieval models for ranking documents • How to select the best retrieval model? • Evaluate retrieval models • State of the art: • Effectiveness analysis, or • Efficiency (speed/memory)
Effectiveness Measures • (Precision, Recall, MAP) depend upon • Few topics • Few judged documents • Suitable for precision-oriented retrieval tasks • Less suitable for recall-oriented retrieval tasks • (e.g. patent or legal retrieval)
Findability Measure • Considers all documents • The goal is to maximize the findability of documents • Under a retrieval model with higher findability, documents are easier to find than under a retrieval model with lower findability • Applications • Offers another measure for comparing retrieval models • Identifies subsets of documents that are hard or easy to find
Findability Measure • Factors that affect findability • The user query • [Query = Data Mining books] vs. [Query = Han Kamber books] • when searching for the book “Data Mining: Concepts and Techniques” • The maximum number of top links/docs the user checks • The ranking strategy of the retrieval model
Retrievability Measure [Leif Azzopardi and Vishwa Vinay, CIKM 2008]
Given a collection D of documents and a query set Q, the retrievability of d ∈ D is
r(d) = Σ_{q ∈ Q} f(k_dq, c)
where k_dq is the rank of d in the result set of query q ∈ Q, c is the point in the rank list where the user stops, and f(k_dq, c) = 1 if k_dq ≤ c, 0 otherwise.
Gini coefficient = summarizes the findability scores
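The cumulative-score definition above can be sketched in a few lines of Python; this is a minimal illustration (function names are my own, not the authors' code), with the Gini coefficient computed by the standard sorted-scores formula:

```python
from collections import defaultdict

def retrievability(result_lists, c):
    """r(d) = sum over queries of f(k_dq, c): add 1 each time d is
    ranked at or above the cutoff c in a query's result list."""
    r = defaultdict(int)
    for ranking in result_lists:          # one ranking per query, best-first
        for rank, d in enumerate(ranking, start=1):
            if rank > c:                  # user stops scanning at rank c
                break
            r[d] += 1
    return dict(r)

def gini(scores):
    """Gini coefficient over findability scores: 0 = no bias,
    values near 1 = only a few documents are findable."""
    xs = sorted(scores)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # G = sum_i (2i - n - 1) * x_i / (n * total), x sorted ascending
    return sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1)) / (n * total)
```

With three toy result lists and c = 2, a document ranked in the top 2 for all three queries gets r(d) = 3; perfectly equal scores give G = 0.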
Outline • Introduction to Findability Measure • Setup for Experiments • Findability Scoring Functions • Relationship between Findability and Query Characteristics • Relationship between Findability and Document Features • Relationship between Findability and Effectiveness Measures
Setup for Experiments • Collections • TREC Chemical Retrieval Track Collection 2009 (TREC-CRT) • USPTO Patent Collections • USPC Class 433 (Dentistry) (DentPat) • USPC Class 422 (Chemical apparatus and process disinfecting, deodorizing, preserving, or sterilizing) (ChemAppPat) • Austrian News Dataset (ATNews) • TREC-CRT and ATNews are more skewed; the USPTO collections are less skewed
Setup for Experiments • Retrieval Models • Standard retrieval models • TFIDF, NormTFIDF, BM25, SMART • Language models • Jelinek-Mercer Smoothing (JM), Dirichlet Smoothing (DirS), Two-Stage Smoothing (TwoStage), Absolute Discounting Smoothing (AbsDis) • Query Generation • All sections of patent documents • Terms with document frequency (df) > 25% removed • All 3- and 4-term combinations
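The query-generation step described above (drop frequent terms, then enumerate all 3- and 4-term combinations) can be sketched as follows; the function name and parameters are illustrative, not from the original experiments:

```python
from itertools import combinations

def generate_queries(doc_terms, df, n_docs, max_df_ratio=0.25, sizes=(3, 4)):
    """Build the exhaustive query set for one document: keep only terms
    whose collection document frequency is <= max_df_ratio, then emit
    every 3-term and 4-term combination of the remaining vocabulary."""
    kept = sorted(t for t in set(doc_terms)
                  if df.get(t, 0) / n_docs <= max_df_ratio)
    queries = []
    for k in sizes:
        queries.extend(combinations(kept, k))
    return queries
```

For a document with four surviving terms this yields C(4,3) + C(4,4) = 5 queries; real patent documents produce very large query sets, which motivates the normalization discussed later.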
Setup for Experiments 52 443 583 746 962 1474 Docs. Ordered by Increasing Vocabulary Size 5 101 155 198 255 427 Docs. Ordered by Increasing Vocabulary Size TREC-CRT ATNews 243 597 690 776 895 Docs. Ordered by Increasing Vocabulary Size 284 381 426 463 504 559 866 Docs. Ordered by Increasing Vocabulary Size DentPat ChemAppPat
Outline • Introduction to Retrievability Measure • Setup for Experiments • Findability Scoring Functions • Relationship between Findability and Query Characteristics • Relationship between Findability and Document Features • Relationship between Findability and Effectiveness Measures
Findability Scoring Functions • Standard findability scoring function r(d) • Does not account for differences in document vocabulary size • Biased towards long documents • With r(d), Doc2 has higher findability than Doc5 • But Doc5, with its small vocabulary, cannot generate as large a query subset • Example (all 3-term combinations): findability percentage of Doc2 = 3600/6545 = 0.55, of Doc5 = 90/120 = 0.75
Findability Scoring Functions • Normalized findability r̂(d) • Normalize r(d) relative to the number of queries generated from d: r̂(d) = r(d) / |Q_d|, where Q_d is the set of queries generated from d • This accounts for the difference between document lengths
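The normalization is a single division per document; a minimal sketch (names are my own) that reproduces the Doc2/Doc5 example from the previous slide:

```python
def normalized_retrievability(r, n_queries):
    """r_hat(d) = r(d) / |Q_d|: raw retrievability divided by the number
    of queries generated from d, removing the bias towards documents
    that simply generate more queries."""
    return {d: r.get(d, 0) / n_queries[d] for d in n_queries}
```

Applied to the slide's numbers, Doc2 (3600 of 6545 queries) drops below Doc5 (90 of 120 queries) once normalized.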
Findability Scoring Functions • Comparison between r(d) and r̂(d) • Retrieval models ordered by Gini coefficients (retrieval bias) • Findability ranks of documents
Findability Scoring Functions • Correlation between r(d) and r̂(d) in terms of Gini coefficients • Retrieval models are ordered by r(d) and by r̂(d) • [Figures: ChemAppPat, TREC-CRT]
Findability Scoring Functions • Correlation between r(d) and r̂(d) in terms of document findability ranks • TREC-CRT and ATNews • The correlation between r(d) and r̂(d) is low (large difference) • Due to the large variation in document lengths • ChemAppPat and DentPat • The correlation between r(d) and r̂(d) is high (small difference) • Due to little variation in document lengths • [Figure: correlation between r(d) and r̂(d)]
Findability Scoring Functions • Which findability function is better, r(d) or r̂(d)? • On the Gini coefficient alone it is difficult to decide • Known-item search experiment: • Order the documents by findability score and partition them into 30 buckets (Bucket 1 … Bucket 30, from low-findability to high-findability) • From each bucket, draw 40 random documents (known items) • For each known item, create one query of 4–6 terms from the document itself • The goal is to find each known item using its own query • Effectiveness per bucket is measured by Mean Reciprocal Rank (MRR) • Expected result: low-findability buckets yield low MRR effectiveness, high-findability buckets yield high MRR effectiveness
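The bucket evaluation above rests on MRR over known-item queries; a minimal sketch (function name mine), where each known-item query has exactly one relevant document:

```python
def mean_reciprocal_rank(rankings, known_items):
    """MRR over known-item queries: the reciprocal rank of the target
    document in its query's result list, 0 if it is not retrieved."""
    total = 0.0
    for ranking, target in zip(rankings, known_items):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(known_items)
```

Running this once per bucket yields the per-bucket effectiveness curve that the findability scores are correlated against.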
Findability Scoring Functions • Which findability function is better, r(d) or r̂(d)? • Expected results • High-findability buckets should have high effectiveness, since their documents are easier to find than those in low-findability buckets • i.e., a positive correlation with MRR • The r̂(d) buckets show a stronger positive correlation with MRR than the r(d) buckets • [Figures: correlation between findability and MRR, TREC-CRT and ChemAppPat]
Outline • Introduction to Findability Measure • Setup for Experiments • Findability Scoring Functions • Relationship between Findability and Query Characteristics • Relationship between Findability and Document Features • Relationship between Findability and Effectiveness Measures
Query Characteristics and Findability • Current findability analysis style: compute findability scores of documents over the full query set Q, then summarize with Gini coefficients • But queries do not all have the same quality • Some queries are more specific (target-oriented) than others • What is the effect of query quality on findability? • Need to analyze findability over query subsets of different quality • Creating query-quality subsets • Supervised quality labels: not available • Query characteristics (QC): • Query result-list size • Query term frequencies in the documents • Query quality based on query performance prediction methods • For each QC, the large query set is partitioned into 50 subsets
Query Characteristics and Findability • Query subsets by predicted query quality • Query quality is predicted with the Simplified Clarity Score (SCS) [He & Ounis, SPIRE 2004] • Q is ordered by SCS score and partitioned into 50 subsets (Query Subset 1 … Query Subset 50) • Findability analysis is run on each subset • Results (TREC-CRT collection) • X-axis = query subsets ordered from low to high SCS score • Y-axis = Gini coefficients • Low-SCS subsets → high Gini coefficients • High-SCS subsets → low Gini coefficients
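As I recall the He & Ounis definition, SCS is the KL divergence of the query's maximum-likelihood term distribution from the collection model; a sketch under that assumption (names mine):

```python
import math

def simplified_clarity_score(query_terms, coll_tf, coll_len):
    """SCS(Q) = sum over query terms of P(w|Q) * log2(P(w|Q) / P(w|C)),
    with P(w|Q) = qtf/|Q| and P(w|C) = collection tf / collection length.
    Higher scores indicate more specific (higher-quality) queries."""
    n = len(query_terms)
    score = 0.0
    for t in set(query_terms):
        p_q = query_terms.count(t) / n
        p_c = coll_tf.get(t, 0) / coll_len
        if p_c > 0:                      # skip terms unseen in the collection
            score += p_q * math.log2(p_q / p_c)
    return score
```

A query made of rare terms scores much higher than one made of common terms, matching the slide's finding that high-SCS subsets exhibit lower retrieval bias.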
Outline • Introduction to Findability Measure • Setup for Experiments • Findability Scoring Functions • Relationship between Findability and Query Characteristics • Relationship between Findability and Document Features • Relationship between Findability and Effectiveness Measures
Document Features and Findability • Computing findability via exhaustive query processing requires large processing time and computational resources • Can we predict findability without processing an exhaustive set of queries? • Idea: exploit the relationship between document features and findability scores • Does not require heavy processing • Can only predict findability ranks, not Gini coefficients
Document Features and Findability • Three classes of document features are considered • Surface-level features • Based on term frequencies within documents and term document frequencies within the collection • Features based on term weights • Based on the term-weighting strategy of the retrieval model • Density around nearest neighbors • Based on the density around the nearest neighbors of a document
Document Features and Findability Surface Level Features
Document Features and Findability [Figures: TREC-CRT, ChemAppPat]
Document Features and Findability • Combining multiple features • No single feature performs best across all collections and all retrieval models • Worth analyzing to what extent combining multiple features increases the correlation • Regression tree, 50%/50% training/testing split • [Table: correlation from combining multiple features vs. correlation with the best single feature, and % increase in correlation]
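The per-feature correlations reported throughout this section are plain Pearson correlations between a feature and the findability scores; a self-contained sketch (the regression-tree combiner itself would use a library such as scikit-learn's `DecisionTreeRegressor`, which is my assumption about tooling, not stated in the slides):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between a document feature (xs)
    and the documents' findability scores (ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

A feature that rises linearly with findability gives r close to +1; one that falls gives r close to -1.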
Outline • Introduction to Findability Measure • Setup for Experiments • Findability Scoring Functions • Relationship between Findability and Query Characteristics • Relationship between Findability and Document Features • Relationship between Findability and Effectiveness Measures
Relationship between Findability and Effectiveness • Motivation: automatic ranking of retrieval models; tuning/increasing retrieval model effectiveness on the basis of the findability measure • Effectiveness measures (Recall, Precision, MAP) • Goal: maximizing effectiveness • Depend upon relevance judgments • Findability measure • Goal: maximizing findability • Does not need relevance judgments • Does any relationship exist between the two? • If so, maximizing findability → maximizing effectiveness
Relationship between Findability and Effectiveness • Retrieval Models • Standard retrieval models and language models • Low-level IR features (tf, idf, document length, vocabulary size, collection frequency) • Term-proximity-based retrieval models
Relationship between Findability and Effectiveness • A correlation exists • Not perfect, but retrieval models with low retrieval bias consistently appear in at least the top half of the effectiveness ranks • Correlations = 0.80, 0.75, 0.80, 0.73
Relationship between Findability and Effectiveness • Tuning parameter values over findability • Retrieval models contain parameters • They control query term normalization or smooth the document relevance score for unseen query terms • We tune the parameter values over findability • and examine the effect on the Gini coefficient and on Recall/Precision/MAP
Relationship between Findability and Effectiveness • Parameter b is varied between 0 and 1
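The b parameter being swept here is BM25's length-normalization knob; a sketch of one common BM25 term-scoring variant (this exact formula is my assumption, the slides do not state which BM25 formulation was used) shows why b matters for retrieval bias:

```python
import math

def bm25_score(tf, df, doc_len, avg_len, n_docs, k1=1.2, b=0.75):
    """BM25 score of one query term for one document.
    b in [0, 1] controls document-length normalization:
    b = 0 ignores length entirely, b = 1 normalizes fully."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
    norm = tf + k1 * (1 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1) / norm
```

With b = 0, long and short documents score identically for the same tf, which favors long documents (they match more queries); raising b penalizes long documents, shifting findability, and hence the Gini coefficient, across the collection.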
Relationship between Findability and Effectiveness • For JM, the smoothing parameter is varied between 0 and 1
Relationship between Findability and Effectiveness • Evolving a retrieval model using genetic programming and findability • Genetic programming is a branch of soft computing • Helps to solve exhaustive search-space problems • Process: • Initial population: randomly combine IR features • Select the best retrieval models (by findability measure) • Recombination (crossover, mutation) produces the next generation • Repeat until 100 generations are complete
Relationship between Findability and Effectiveness • Evolving a retrieval model using genetic programming and findability • Solutions (retrieval models) are represented as trees • Tree nodes are either operators (+, /, *) or ranking features • Ranking features • Low-level retrieval features • Term-proximity-based retrieval features • Constant values (0.1 to 1) • 100 generations are evolved with 50 solutions per generation
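Scoring a document with such a solution tree is a simple recursive evaluation; a sketch (the nested-tuple representation and names are mine, the node set of operators, features, and constants follows the slide):

```python
# Hypothetical GP individual: nested tuples (op, left, right);
# leaves are ranking-feature names or constants in [0.1, 1].
OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else 0.0,  # guard divide-by-zero
}

def evaluate(tree, features):
    """Score one document by recursively evaluating a GP solution tree
    against that document's ranking-feature values."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, features), evaluate(right, features))
    if isinstance(tree, str):
        return features[tree]   # leaf: feature value for this document
    return tree                 # leaf: constant
```

For example, the tree (+ (* tf idf) 0.5) scores a document as tf * idf + 0.5; crossover and mutation then swap and perturb subtrees of such individuals.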
Relationship between Findability and Effectiveness • Evolving a retrieval model using genetic programming and findability • Two correlation analyses are tested • (1) Relationship between findability and effectiveness on the basis of the fittest individual of each generation • (2) Relationship between findability and effectiveness on the basis of the average fitness of each generation
Relationship between Findability and Effectiveness • Evolving a retrieval model using genetic programming and findability • (First): relationship between findability and effectiveness on the basis of the fittest individual of each generation
Relationship between Findability and Effectiveness • Evolving a retrieval model using genetic programming and findability • (Second): relationship between findability and effectiveness on the basis of the average fitness of each generation • Generations with a low average Gini coefficient also have high effectiveness at Recall@100
Conclusions • Findability considers all documents, not a small set of judged documents • We propose a normalized findability scoring function that produces better findability ranks of documents • Analysis of findability and query characteristics • Different ranges of query characteristics show different retrieval bias • Analysis of findability and document features • Suitable for predicting document findability ranks • Relationship between findability and effectiveness • Findability can be used for automatic ranking of retrieval models • and to fine-tune IR systems in an unsupervised manner
Future Work • Query popularity and findability • We currently do not differentiate between popular and unpopular queries • Visualizing findability • Documents that are highly findable with one model • Documents that are highly findable with multiple models • Documents that are not findable with any model • Effect of retrieval bias in k-nearest-neighbor classification • Highly findable samples also affect the classification voting in k-NN
Gini Coefficient The Gini coefficient measures the retrievability inequality between documents, and thus represents retrieval bias; it provides a bird's-eye view. If G = 0, there is no bias; if G = 1, only one document is findable and all other documents have r(d) = 0.
Findability Scoring Functions [Figures: TREC-CRT, ChemAppPat]
Document Features and Retrievability • Features based on term weights • Document terms are weighted by the retrieval model • The terms are then added to inverted lists • Term weights in the inverted lists are sorted by decreasing score
Document Features and Retrievability • On highly skewed collections, these features show good correlation • On less skewed collections, they do not • Possibly because, in less skewed collections, the term weights of documents are less extreme due to similar document lengths • [Figures: TREC-CRT, ChemAppPat]
Document Features and Retrievability • Document-density-based features • Based on the average density around the k nearest neighbors of a document • k is set to 50, 100, and 150 • Density is computed both with all terms of a document and with its top 40 (highest-frequency) terms
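The density feature can be sketched as the mean similarity to a document's k nearest neighbors; a minimal version using cosine similarity over sparse term-weight vectors (the similarity function and names are my assumptions, the slides do not specify one):

```python
def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = sum(w * w for w in u.values()) ** 0.5
    nv = sum(w * w for w in v.values()) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def knn_density(doc, others, k):
    """Average similarity to the k nearest neighbours of doc: a dense
    neighborhood means many competing documents for the same queries,
    which is expected to relate to the document's findability."""
    sims = sorted((cosine(doc, o) for o in others), reverse=True)
    return sum(sims[:k]) / k
```

Restricting `doc` to its top 40 highest-frequency terms before calling `knn_density` gives the second variant mentioned on the slide.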