Relevance Feedback for the Earth Mover’s Distance. Marc Wichterich, Christian Beecks, Martin Sundermeyer, Thomas Seidl. Data Management and Data Exploration Group, RWTH Aachen University, Germany.
Introduction • Distance-based Adaptable Similarity Search • Similarity of objects defined by distance function • Small distance → similar, large distance → dissimilar • Query by example: user-given object, find similar ones • Query and distance only approximate descriptions of user’s desired result • If delivered result does not meet expectations: • Bad query? Bad distance? Bad database? • How to do it better? • Relevance Feedback attempts to adapt query/similarity model based on simple user input (result relevancy)
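Query by example under a distance function can be sketched in a few lines. This is only an illustration: objects are assumed to be plain coordinate tuples, and the names `euclidean` and `query_by_example` are hypothetical, not taken from the slides.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def query_by_example(query, database, distance, k):
    """Return the k database objects with the smallest distance to the query."""
    return sorted(database, key=lambda obj: distance(query, obj))[:k]

# Small distance -> similar: the two points closest to the query come first.
db = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.2)]
result = query_by_example((0.0, 0.1), db, euclidean, k=2)
```

Swapping `euclidean` for another distance changes the notion of similarity without touching the search itself, which is the hook that relevance feedback uses.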
Relevance Feedback and Earth Mover’s Distance • EMD • RF • RF for EMD [Figure: the user sends a query and feedback to the feedback system, which uses the similarity model and the DB to return results. Photo: Flickr / Caro Wallis]
Overview • Introduction • Adaptive Similarity Model • Feature Signatures • The Earth Mover’s Distance • Relevance Feedback for the Earth Mover’s Distance • Experimental Evaluation • Conclusion
Similarity Model – Feature Signatures [Figure: feature signature of an image over the dimensions x, y, and color]
Similarity Model – Earth Mover’s Distance • Introduced in Computer Vision by Rubner et al. • Used in many different application domains • Idea: transform features of Q into features of P • EMD: minimum transformation cost [Figure: signatures Q and P, each plotted over x and y]
EMD – Formal Definition • Modeled as linear optimization (transportation problem)
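For reference, the transportation problem behind the EMD is commonly stated as follows (after Rubner et al.). The notation is assumed here and may differ from the slide's: signatures $Q = \{(q_i, w_{q_i})\}_{i=1}^{m}$ and $P = \{(p_j, w_{p_j})\}_{j=1}^{n}$, flows $f_{ij}$, and ground distance $gd$.

```latex
\[
\mathrm{EMD}_{gd}(Q, P) \;=\;
\min_{f}\;
\frac{\sum_{i=1}^{m}\sum_{j=1}^{n} f_{ij}\, gd(q_i, p_j)}
     {\sum_{i=1}^{m}\sum_{j=1}^{n} f_{ij}}
\]
subject to
\begin{align*}
f_{ij} &\ge 0 && 1 \le i \le m,\; 1 \le j \le n \\
\sum_{j=1}^{n} f_{ij} &\le w_{q_i} && 1 \le i \le m \\
\sum_{i=1}^{m} f_{ij} &\le w_{p_j} && 1 \le j \le n \\
\sum_{i=1}^{m}\sum_{j=1}^{n} f_{ij} &= \min\Bigl(\sum_{i=1}^{m} w_{q_i},\; \sum_{j=1}^{n} w_{p_j}\Bigr)
\end{align*}
```

The ground distance $gd$ is the only free component of this definition, which is why the adaptation techniques in the following slides target it.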
Overview • Introduction • Adaptive Similarity Model • Relevance Feedback for the Earth Mover’s Distance • The Feedback Loop • Query Adaptation • Heuristic EMD Adaptation • Optimization-based EMD Adaptation • Experimental Evaluation • Conclusion
The Feedback Loop [Flowchart: start → get query → retrieve results → display results → satisfied? → yes: exit; no: get feedback → adapt query → adapt distance → retrieve results. The feedback system sits between the user (query, feedback) and the similarity model and DB (results).]
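The loop can be sketched as a small driver function. Everything below is an assumption made for illustration: the function names, the convention that `get_feedback` returns `None` when the user is satisfied, and the toy data in which objects are plain numbers.

```python
def feedback_loop(query, distance, retrieve, get_feedback,
                  adapt_query, adapt_distance, max_iterations=5):
    """Retrieve, collect feedback, adapt query and distance, and repeat."""
    results = retrieve(query, distance)
    for _ in range(max_iterations):
        feedback = get_feedback(results)      # None signals "satisfied"
        if feedback is None:
            break
        query = adapt_query(query, feedback)
        distance = adapt_distance(query, distance, feedback)
        results = retrieve(query, distance)
    return results


# Toy stand-ins: objects are numbers and the simulated user wants 8, 9, 10.
DB = [1.0, 2.0, 8.0, 9.0, 10.0]
WANTED = {8.0, 9.0, 10.0}

def retrieve(query, distance, k=3):
    return sorted(DB, key=lambda obj: distance(query, obj))[:k]

def get_feedback(results):
    relevant = [obj for obj in results if obj in WANTED]
    return None if len(relevant) == len(results) else relevant

def adapt_query(query, relevant):
    # Move the query to the mean of the relevant results.
    return sum(relevant) / len(relevant) if relevant else query

def adapt_distance(query, distance, relevant):
    return distance  # placeholder: the distance is left unchanged here

final = feedback_loop(6.0, lambda a, b: abs(a - b), retrieve,
                      get_feedback, adapt_query, adapt_distance)
```

In this toy run the first result list contains one irrelevant object, the query moves to the mean of the relevant ones, and the second iteration returns only wanted objects.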
Query Adaptation • Input: signatures from relevant objects • Output: new query signature • Idea: cluster signature elements • Refinements by Rubner: • Only keep clusters with elements from a majority of the signatures • Reweight the resulting signature accordingly • Combine with a fixed ground distance (gd) L2 and call it “Query-by-Refinement” • “Query-by-Refinement” is the baseline for our evaluation • We adapt the EMD via its ground distance
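A rough sketch of the clustering idea follows. The slides do not name a clustering algorithm, so a simple leader-style scheme with a fixed `radius` is assumed here; the function name and the representation of a signature as a list of `(centroid, weight)` pairs are also assumptions.

```python
import math

def adapt_query(relevant_signatures, radius):
    """Merge relevant signatures into one new query signature."""
    # Each cluster: (leader centroid, list of (centroid, weight, signature index)).
    clusters = []
    for sig_index, signature in enumerate(relevant_signatures):
        for centroid, weight in signature:
            for leader, members in clusters:
                if math.dist(centroid, leader) <= radius:
                    members.append((centroid, weight, sig_index))
                    break
            else:
                clusters.append((centroid, [(centroid, weight, sig_index)]))

    majority = len(relevant_signatures) / 2
    merged = []
    for _, members in clusters:
        # Keep only clusters fed by more than half of the signatures.
        if len({s for _, _, s in members}) <= majority:
            continue
        total = sum(w for _, w, _ in members)
        dims = len(members[0][0])
        center = tuple(sum(c[d] * w for c, w, _ in members) / total
                       for d in range(dims))
        merged.append((center, total))

    # Reweight so that the new signature has total weight 1.
    weight_sum = sum(w for _, w in merged)
    return [(c, w / weight_sum) for c, w in merged]

s1 = [((0.0, 0.0), 0.5), ((10.0, 10.0), 0.5)]
s2 = [((0.2, 0.0), 0.6), ((5.0, 5.0), 0.4)]
s3 = [((0.1, 0.0), 1.0)]
new_query = adapt_query([s1, s2, s3], radius=1.0)
```

Only the cluster near the origin receives elements from a majority of the three signatures, so the elements at (10, 10) and (5, 5) are dropped from the new query.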
Heuristic EMD Adaptation 1 • Approach: pick the ground distance (gd) based on feedback • gd should reflect user preferences: • Don’t care if the blue cluster in the upper half of the image is moved left/right • Do care if it is moved vertically • Use variance information in relevant feedback • Low variance → assume user cares • High variance → assume user does not care • Measure variance in feedback locally around the query signature elements $c_i(Q)$ • Define gd: $c(Q) \times FS \to \mathbb{R}$, where FS is the feature space
Heuristic EMD Adaptation 2 • Not 1 but m distance functions: • $gd_i(c_i(Q), y) = \left((c_i(Q) - y)\, V_i\, (c_i(Q) - y)^T\right)^{1/2}$ • Weighted Euclidean distances (weights on the diagonal of $V_i$) • $V_i$: inverted variance for $c_i(Q)$ per feature space dimension
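The heuristic can be sketched as follows. How "locally" is measured is not spelled out on the slides, so this sketch assumes that each feedback element is assigned to its nearest query centroid and that variance is taken around that centroid; the `eps` guard and all function names are likewise assumptions.

```python
import math

def inverse_variance_weights(query_centroids, feedback_points, eps=1e-6):
    """One weight vector (diagonal of V_i) per query centroid c_i(Q)."""
    dims = len(query_centroids[0])
    assigned = [[] for _ in query_centroids]
    for p in feedback_points:
        nearest = min(range(len(query_centroids)),
                      key=lambda i: math.dist(p, query_centroids[i]))
        assigned[nearest].append(p)

    weights = []
    for c, pts in zip(query_centroids, assigned):
        if not pts:
            weights.append([1.0] * dims)   # no evidence: plain Euclidean
            continue
        var = [sum((p[d] - c[d]) ** 2 for p in pts) / len(pts)
               for d in range(dims)]
        weights.append([1.0 / (v + eps) for v in var])  # eps avoids division by zero
    return weights

def ground_distance(i, weights, query_centroids, y):
    """Weighted Euclidean distance gd_i between c_i(Q) and a point y."""
    c = query_centroids[i]
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(weights[i], c, y)))

# Relevant feedback spreads widely in x but hardly at all in y.
centroids = [(0.0, 0.0)]
feedback = [(-2.0, 0.1), (2.0, -0.1), (1.0, 0.1), (-1.0, -0.1)]
V = inverse_variance_weights(centroids, feedback)
d_horizontal = ground_distance(0, V, centroids, (1.0, 0.0))
d_vertical = ground_distance(0, V, centroids, (0.0, 1.0))
```

With this feedback, a unit shift along x costs far less than a unit shift along y, matching the example of a cluster that may move left/right but not vertically.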
Optimization-Based EMD Adaptation 1 • Aim: pick the best possible gd • Fallback: find a good one • Q: When is gd good? A: If the ranking it produces is good • New Q: When is a ranking of the DB good? • Given ground truth, a number of measures exist • We used “average precision at relevant positions” • We have ground truth for part of the DB: the feedback • Idea: test candidates for gd on the feedback
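One common reading of "average precision at relevant positions" is sketched below: precision is evaluated at every rank that holds a relevant object and then averaged. The function name and the choice to average over the relevant objects that actually appear in the ranking are assumptions.

```python
def average_precision(ranking, relevant):
    """Mean of precision@k over all ranks k that hold a relevant object."""
    hits = 0
    precision_sum = 0.0
    for position, obj in enumerate(ranking, start=1):
        if obj in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0

# Relevant objects at ranks 1 and 3: (1/1 + 2/3) / 2
ap_mixed = average_precision(["a", "x", "b", "y"], {"a", "b"})
# All relevant objects ranked first gives the maximum value.
ap_perfect = average_precision(["a", "b", "x", "y"], {"a", "b"})
```

A candidate gd can then be scored by ranking the feedback objects with the resulting EMD and passing that ranking to this function.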
Optimization-Based EMD Adaptation 2 • Optimization: • Optimization variable: gd • Objective function: avgPrec(EMD_gd, q, Feedback) • Constraints: m weighted Euclidean distances • Analytic optimization with a closed form for the weights is infeasible (ranking/sorting and EMDs in the objective function) • Probabilistic optimization via Simulated Annealing • Start with some initial solution • Move in solution space • Compute objective function • Adopt solution with certain probability • Iterate & turn more greedy
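The annealing steps listed above can be sketched generically. The geometric cooling schedule, the parameter defaults, and the toy objective are assumptions made for this sketch; in the setting of the slides, `objective` would be the average precision of the EMD ranking on the feedback.

```python
import math
import random

def simulated_annealing(initial, objective, neighbor, steps=1000,
                        start_temp=1.0, cooling=0.995, rng=None):
    """Maximize `objective`, accepting worse moves with shrinking probability."""
    rng = rng or random.Random(0)
    current, current_score = initial, objective(initial)
    best, best_score = current, current_score
    temp = start_temp
    for _ in range(steps):
        candidate = neighbor(current, rng)            # move in solution space
        score = objective(candidate)                  # compute objective
        delta = score - current_score
        # Always adopt improvements; adopt worse solutions with prob. exp(delta/temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, score
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= cooling                               # turn more greedy
    return best, best_score

# Toy problem: find x that maximizes -(x - 3)^2.
def toy_objective(x):
    return -(x - 3.0) ** 2

def toy_neighbor(x, rng):
    return x + rng.uniform(-0.5, 0.5)

best_x, best_value = simulated_annealing(0.0, toy_objective, toy_neighbor)
```

As the temperature falls, `exp(delta / temp)` approaches zero for any worsening move, so the search gradually degenerates into greedy hill climbing.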
Optimization-Based EMD Adaptation 3 • Optimization for EMD based on feedback: • Solution: weights for m weighted Euclidean distances • Initial solution: given by the heuristic • Moving: redistribute weights per Euclidean distance • Objective function: avgPrec(EMD_gd, q, Feedback) • Results for EMD_gd on the DB?
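One possible form of the "move" step is sketched here. The slides only say that weights are redistributed per Euclidean distance; shifting a random share of one weight to another dimension of the same weight vector, with the `fraction` cap, is an assumption of this sketch.

```python
import random

def redistribute_weights(solution, rng, fraction=0.1):
    """Return a copy of `solution` with some weight shifted inside one vector.

    `solution` is a list of m weight vectors, one per weighted Euclidean
    distance. The sum of each vector is preserved.
    """
    new_solution = [list(w) for w in solution]
    weights = new_solution[rng.randrange(len(new_solution))]
    if len(weights) < 2:
        return new_solution
    src, dst = rng.sample(range(len(weights)), 2)
    amount = fraction * weights[src] * rng.random()   # at most `fraction` of the source weight
    weights[src] -= amount
    weights[dst] += amount
    return new_solution

rng = random.Random(1)
start = [[1.0, 1.0, 1.0], [2.0, 0.5]]
moved = redistribute_weights(start, rng)
```

Because the function takes the current solution and a random generator, it can serve directly as the `neighbor` argument of a simulated annealing routine, with the heuristic weights as the initial solution.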
Overview • Introduction • Adaptive Similarity Model • Relevance Feedback for the Earth Mover’s Distance • Experimental Evaluation • Conclusion
Experimental Evaluation: Databases • 72,000 images in ALOI DB • ~60,000 images in COREL DB
Experimental Evaluation: ALOI [Figure: results on ALOI for Query-by-Refinement, Heuristic Adaptation, and Optimization-based adaptation]
Experimental Evaluation: COREL [Figure: results on COREL for Query-by-Refinement, Heuristic Adaptation, and Optimization-based adaptation]
Experimental Evaluation • After 5 iterations of looking for doors in COREL [Figure: result images at positions 1 to 15 for (a) Query-by-Refinement, (b) Heuristic, (c) Optimization-Based]
Conclusion • Exploited adaptability of the EMD in RF framework • Goal: Improve similarity search results • Techniques: • Baseline: fixed ground distance • Statistics-based heuristic adaptation • Optimization-based adaptation • Evaluation: • Experiments on two image datasets • More relevant objects in fewer iterations • Techniques extensible to other adaptable distance functions