PicHunter: A Bayesian Image Retrieval System
Project overview • Target Testing and the PicHunter Bayesian Multimedia Retrieval System, I.J. Cox, Matt Miller, S.M. Omohundro, P.N. Yianilos, Proceedings of the Forum on Research & Technology Advances in Digital Libraries, pp. 66-75, 1996. • PicHunter: Bayesian Relevance Feedback for Image Retrieval, I.J. Cox, Matt Miller, Stephen Omohundro, P.N. Yianilos, 13th International Conference on Pattern Recognition, Vol. III, Track C, pp. 361-369, August 1996. • Introduces PicHunter and the Bayesian framework, and describes a working system including measured user performance. • Hidden Annotation in Content Based Image Retrieval, I.J. Cox, Joumana Ghosn, Matt Miller, T. Papathomas, P.N. Yianilos, IEEE Workshop on Content-Based Access of Image & Video Libraries, pp. 76-81, June 1997. • Introduces the idea of "hidden annotation" and reports results demonstrating that it improves performance.
Project overview • An Optimized Interaction Strategy for Bayesian Relevance Feedback, I. J. Cox, M. L. Miller, T. Minka, P. N. Yianilos, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR '98), Santa Barbara, CA, pp. 553-558, 1998. • Introduces an improved stochastic image display strategy that allows the system to "ask better questions." • Psychophysical Studies of the Performance of an Image Database Retrieval System, T. Papathomas, T. Conway, I. Cox, J. Ghosn, M. Miller, T. Minka, P. Yianilos, Proceedings of Human Vision and Electronic Imaging III, San Jose, CA, Vol. 3299, pp. 591-602, January 1998. • Describes psychophysical studies of the system in a controlled environment.
Project summary • The Bayesian Image Retrieval System, PicHunter: Theory, Implementation, and Psychophysical Experiments, I. J. Cox, M. L. Miller, T. P. Minka, T. V. Papathomas, P. N. Yianilos, IEEE Transactions on Image Processing, 9(1), pp. 20-37, 2000.
Introduction • A search consists of • A query • Repeated relevance feedback • To date, work has emphasized the query phase (better representations), while relevance feedback has been crude or non-existent • There is a lack of quantitative measures for comparing the performance of search algorithms
The main ideas • Bayesian relevance feedback • Learn from human interactions • Model the user's actions, not his/her query • Quantifiable testing • Target testing • Baseline testing • Optimize the image display
Target testing • The user is shown an image from the database and must use the system to find it. We measure the number of interactions required, which is easily compared against a simple linear search • Not a perfect model for all intended uses, but something we can measure and use for comparisons
Features • Pictorial features • Originally 18 global features • Percentage of pixels in each of 11 colors (11 features) • Mean color saturation • Median intensity of the image • Image width and height • A measure of global contrast • Two measures of the number of edgels, computed at different thresholds
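A few of the global features listed above can be sketched in plain Python. This is only an illustration: the function name `global_features`, the pixel layout (rows of RGB tuples in 0..255), and the definition of intensity as the mean of R, G, and B are assumptions, not details taken from PicHunter itself.

```python
import colorsys
from statistics import mean, median


def global_features(pixels):
    """Compute a few global features from rows of (R, G, B) tuples in 0..255.

    The intensity definition (mean of R, G, B) is an assumption made for
    this sketch.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    saturations, intensities = [], []
    for row in pixels:
        for r, g, b in row:
            # colorsys expects channel values in the range 0..1
            _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            saturations.append(s)
            intensities.append((r + g + b) / 3)
    return {
        "width": width,
        "height": height,
        "mean_saturation": mean(saturations),
        "median_intensity": median(intensities),
    }
```

The color-percentage, contrast, and edgel features are omitted because the slides do not say how they are computed.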
Features • Hidden annotation • Provides semantic labels • 147 attributes • Boolean vector, normalized Hamming distance
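The normalized Hamming distance between two Boolean annotation vectors is the fraction of attributes on which they disagree. A minimal sketch follows; the function name is hypothetical, and with 147 attributes the vectors would simply have length 147.

```python
def normalized_hamming(a, b):
    """Fraction of positions at which two equal-length Boolean vectors differ."""
    if len(a) != len(b):
        raise ValueError("annotation vectors must have the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

For example, two images whose annotations differ on 2 of 4 attributes are at distance 0.5, and identical annotations are at distance 0.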
Bayesian relevance feedback • A_t denotes the user's action at iteration t • D_t is the display at iteration t • H_t is the session history up to and including the current display and action: H_t = {D_1, A_1, D_2, A_2, …, D_t, A_t} • T is the target image
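Bayesian relevance feedback • We build a predictive model of the user's action, P(A|T,H) • Bayes' rule then gives the posterior probability of each candidate target given the session history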
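Written out with the session notation (target T, display D_t, action A_t, history H_t), the update takes the following form. This is a reconstruction from those definitions rather than a copy of the slide's equation, and it assumes that the display D_t is chosen by the system from H_{t-1} alone, so that seeing D_t by itself carries no information about T.

```latex
P(T = T_i \mid H_t)
  = \frac{P(A_t \mid T = T_i, D_t, H_{t-1}) \, P(T = T_i \mid H_{t-1})}
         {\sum_{j} P(A_t \mid T = T_j, D_t, H_{t-1}) \, P(T = T_j \mid H_{t-1})}
```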
Bayesian relevance feedback • Assume the user model is time-invariant and the same for all users
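If the user model is time-invariant and shared by all users, the same function can be reused at every iteration, and each feedback step becomes one multiply-and-normalize pass over the candidate targets. The sketch below illustrates this; `bayes_update` and `user_model` are hypothetical names, and `user_model(action, display, target)` is assumed to return P(action | display, target).

```python
def bayes_update(prior, display, action, user_model):
    """Return the posterior over targets after one feedback action.

    prior      -- dict mapping each candidate target to its current probability
    user_model -- hypothetical callable giving P(action | display, target);
                  the same function is reused at every iteration
    """
    unnormalized = {
        target: p * user_model(action, display, target)
        for target, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("action has zero probability under every target")
    return {target: w / total for target, w in unnormalized.items()}
```

Because the posterior from one iteration is the prior for the next, the full history never needs to be stored explicitly.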
Absolute-distance model • Only one image, X_q, in the display D_t can be selected at each iteration • The probability of T_i increases or decreases depending on the distance d(T_i, X_q) • Update: P(T=T_i) ← P(T=T_i) G(d(T_i, X_q)), followed by renormalization so that the probabilities sum to 1
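The absolute-distance update can be sketched as follows. The slides do not give the form of G, so the exponential G(d) = exp(-d / sigma) used here, the parameter `sigma`, and the function name are all assumptions chosen only to make the example concrete.

```python
import math


def absolute_distance_update(prior, distance_to_selected, sigma=1.0):
    """Reweight each target by G(d(T_i, X_q)) and renormalize.

    prior                -- list of P(T = T_i) before the action
    distance_to_selected -- list of d(T_i, X_q), aligned with prior
    sigma                -- scale of the assumed G(d) = exp(-d / sigma)
    """
    weights = [
        p * math.exp(-d / sigma)
        for p, d in zip(prior, distance_to_selected)
    ]
    total = sum(weights)
    return [w / total for w in weights]
```

With a uniform prior, targets nearer to the selected image end up with higher probability than targets farther away.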
Relative-distance model • Let Q = {X_q1, X_q2, …, X_qC} denote the set of selected images in display D_t • Let N = {X_n1, X_n2, …, X_nL} denote the set of unselected images • Compute the distance difference d(T_i, X_qk) - d(T_i, X_nm) for all pairs {X_qk, X_nm} • The probabilities of images T_i that are closer to X_qk are increased, while those of images closer to X_nm are decreased.
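One way to turn the pairwise distance differences into a probability update is sketched below. The sigmoid weighting, the product over all (selected, unselected) pairs, the parameter `sigma`, and the function name are assumptions made for illustration; the slides only state the direction of the change.

```python
import math


def relative_distance_update(prior, dist, selected, unselected, sigma=1.0):
    """Reweight targets using selected-versus-unselected distance differences.

    prior      -- list of P(T = T_i)
    dist       -- dist[i][x] is d(T_i, X_x) for display image x
    selected   -- identifiers of the images the user selected (set Q)
    unselected -- identifiers of the images the user did not select (set N)
    """
    weights = []
    for p, d in zip(prior, dist):
        w = p
        for q in selected:
            for n in unselected:
                # Factor > 0.5 when T_i is closer to the selected image
                # than to the unselected one, < 0.5 otherwise.
                w *= 1.0 / (1.0 + math.exp(-(d[n] - d[q]) / sigma))
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]
```

Unlike the absolute-distance sketch, only differences between distances matter here, so adding a constant to every distance leaves the result unchanged.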
Display updating algorithm • Most probable display • Most informative display (Max. mutual information) • Sampling • Query by example
Experimental setup • Database of 4522 images • 1500 annotated • Conditions are coded as M/N, A/R, P/S/B • Memory / no memory (whether the relevance feedback history is used) • Absolute / relative distance • Pictorial / semantic / both feature sets
Experimental notation • MRB – memory, relative distance, pictorial and semantic features • MAB – memory, absolute distance, pictorial and semantic features • NRB – no memory, … • NAB • MRS – memory, relative, semantic features • MRP – memory, relative, pictorial features
Baseline testing • Similarity testing • How many images are examined before the user sees a similar image? • Compare to number needed when randomly searching the database
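The random-search baseline has a simple closed form. If the database holds N images, k of them would count as similar, and images are examined in random order without repetition, the expected position of the first similar image is (N + 1) / (k + 1). The sketch below computes this and checks it by simulation; the function names and the choice of k are illustrative, not taken from the experiments.

```python
import random


def expected_examined_random(n_images, n_similar):
    """Expected number of images examined, in random order without repeats,
    up to and including the first of n_similar acceptable images."""
    return (n_images + 1) / (n_similar + 1)


def simulate_random_search(n_images, n_similar, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Positions of the similar images within a random viewing order;
        # the smallest one is where the search stops.
        total += min(rng.sample(range(1, n_images + 1), n_similar))
    return total / trials
```

For the 4522-image database with a single acceptable image, `expected_examined_random(4522, 1)` gives 2261.5, which matches the average cost of a linear search for one target.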
Improved pictorial features • Pictorial features • HSV 64-element histogram • HSV 256-element autocorrelogram • RGB 128-element color coherence vector
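As an illustration of the first of these features, a 64-element HSV histogram can be built by quantizing each channel into 4 levels. The 4 x 4 x 4 binning, the flat list of RGB pixels, and the function name are assumptions; the slides give only the element count.

```python
import colorsys


def hsv_histogram(pixels, bins_per_channel=4):
    """Normalized HSV histogram with bins_per_channel**3 elements (64 by default).

    pixels -- iterable of (R, G, B) tuples with channel values in 0..255
    """
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # min() keeps a channel value of exactly 1.0 inside the top bin.
        hi, si, vi = (
            min(int(c * bins_per_channel), bins_per_channel - 1)
            for c in (h, s, v)
        )
        hist[(hi * bins_per_channel + si) * bins_per_channel + vi] += 1
        count += 1
    return [x / count for x in hist] if count else hist
```

The autocorrelogram and color coherence vector additionally encode spatial information and are not sketched here.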
Display updating algorithms • Most probable display • Most informative display (Max. mutual information) • Sampling • Query by example
Most Probable Display • Performs quite well • However, this greedy strategy suffers from "over-learning" • PicHunter "gets stuck" in a local maximum • Display after display of "lions", say
Most-Informative Display • Try to minimize the total number of iterations required in a search • Try to elicit as much information from the user as possible • Information theory suggests entropy as an estimate of the number of questions one needs to ask to resolve the ambiguity
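The entropy of the current distribution over targets can be computed directly. The sketch below measures it in bits, so the value reads as a rough count of ideal yes/no questions still needed; the function name is hypothetical.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```

A uniform distribution over 8 images has entropy 3 bits, and a uniform prior over a 4522-image database has entropy log2(4522), roughly 12.1 bits.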
Most-informative display • Consider the ideal (deterministic) case, in which the display consists of two images
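For the two-image deterministic case, a standard information-theoretic argument runs as follows. It assumes the user always picks whichever of X_1 and X_2 is closer to the target, which is one reading of "deterministic" and is not spelled out on the slide. Let p be the probability mass of the targets that lead to choosing X_1. The action is then a function of the target, so the information gained equals the entropy of the action:

```latex
p = \sum_{i \,:\, d(T_i, X_1) < d(T_i, X_2)} P(T = T_i), \qquad
I(T; A) = H(A) = -p \log_2 p - (1 - p) \log_2 (1 - p) \le 1 \text{ bit}
```

The gain is largest when p = 1/2, that is, when the two images split the probability mass evenly.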
Most-informative display • Generalization to the non-deterministic case
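When the user's choice is probabilistic, a natural objective is the expected entropy remaining after the action, averaged over the possible actions. The expression below is the standard conditional-entropy form, given as a sketch of what such a generalization looks like rather than as the slide's original equation:

```latex
\mathbb{E}\big[ H(T \mid A, D) \big]
  = \sum_{a} P(A = a \mid D) \, H(T \mid A = a, D),
\qquad
P(A = a \mid D) = \sum_{i} P(A = a \mid D, T = T_i) \, P(T = T_i)
```

Choosing the display D that minimizes this quantity is equivalent to maximizing the mutual information between the target and the user's action.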
Most informative display • Performing the minimization exactly is non-trivial • Approximate it by Monte Carlo simulation • Draw random displays {X_1, X_2, …, X_ND} from the distribution P(T=T_i) • Sampling is a special case of the most informative method in which only one Monte Carlo sample is drawn
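The Monte Carlo procedure can be sketched as follows: draw candidate displays from the current distribution over targets, score each by its expected posterior entropy, and keep the best. All names here are hypothetical, targets are identified by their index in `prior`, and `user_model(choice, display, target)` is assumed to return the probability that the user picks the display image at position `choice`.

```python
import math
import random


def entropy_bits(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_posterior_entropy(prior, display, user_model):
    """Expected entropy of P(T) after the user acts on `display`."""
    expected = 0.0
    for choice in range(len(display)):
        joint = [p * user_model(choice, display, t) for t, p in enumerate(prior)]
        p_choice = sum(joint)
        if p_choice > 0:
            expected += p_choice * entropy_bits([j / p_choice for j in joint])
    return expected


def most_informative_display(prior, display_size, user_model,
                             n_samples=50, seed=0):
    """Pick the lowest-cost display among n_samples random candidates."""
    if display_size > sum(1 for p in prior if p > 0):
        raise ValueError("not enough targets with nonzero probability")
    rng = random.Random(seed)
    targets = list(range(len(prior)))
    best_display, best_cost = None, float("inf")
    for _ in range(n_samples):
        # Draw display images from the current distribution, without duplicates.
        display = []
        while len(display) < display_size:
            x = rng.choices(targets, weights=prior)[0]
            if x not in display:
                display.append(x)
        cost = expected_posterior_entropy(prior, tuple(display), user_model)
        if cost < best_cost:
            best_display, best_cost = tuple(display), cost
    return best_display, best_cost
```

Setting `n_samples=1` reduces this to the plain sampling strategy, in which the first display drawn is shown without any comparison.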
Future directions • More efficient algorithms • Automatic detection of hidden features • Explore slightly richer user interfaces • Explore increased use of online learning