Multimedia Database Systems
Department of Informatics, Aristotle University of Thessaloniki, Fall 2008
Relevance Feedback
Outline • Motivation for Relevance Feedback (RF) • Introduction to RF • RF techniques in Image Databases (5 techniques are studied) • Other RF techniques • Conclusions • Bibliography
Motivation • Initial work on content-based retrieval focused on using low-level features like color and texture for image representation. • After each image is associated with a feature vector, similarity between images is measured by computing distances between feature vectors in the feature space. • It is generally assumed that the features are able to locate visually similar images close to each other in the feature space, so that non-parametric approaches, like the k-nearest neighbor search, can be used for retrieval.
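A minimal sketch of this kind of nearest-neighbor retrieval, assuming the feature vectors have already been extracted (the feature matrix and query below are random placeholders):

```python
import numpy as np

def knn_retrieve(features, query, k=10):
    """Return indices of the k images whose feature vectors are closest
    to the query vector (Euclidean distance in feature space)."""
    dists = np.linalg.norm(features - query, axis=1)   # distance to every image
    return np.argsort(dists)[:k]                       # k nearest neighbors

# toy example: 1000 images described by 64-dimensional feature vectors
features = np.random.rand(1000, 64)
query = np.random.rand(64)
print(knn_retrieve(features, query, k=10))
```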
Motivation • There are cases where the user is not satisfied by the answers returned. • Several relevant objects may not be retrieved or in addition to the relevant objects there are a lot of non-relevant ones. • Possible solutions: • Request more answers (e.g., next 10) • Rephrase and reexecute the query • Relevance feedback
A Possible Solution: RF • Take advantage of user relevance judgments in the retrieval process: • User issues a query and gets back an initial hit list • User marks hits as relevant or non-relevant • The system computes a better representation of the information need based on this feedback • This process can be repeated more than once. Idea: you may not know what you’re looking for, but you’ll know when you see it.
Forms of RF • Explicit feedback: users explicitly mark relevant and irrelevant documents • Implicit feedback: system attempts to infer user intentions based on observable behavior • Blind feedback (also known as pseudofeedback): feedback in absence of any evidence, explicit or otherwise
The Goal of RF • Figure: the initial query point is moved toward the relevant objects (o) and away from the non-relevant objects (x), producing a revised query.
RF in Text Retrieval • RF was originally proposed for text-based information retrieval. • The goal is to improve the quality of the returned documents. • Fundamental work: Rocchio
Rocchio Method • Used in practice: $q_m = \alpha q_0 + \beta \frac{1}{|D_r|} \sum_{d_j \in D_r} d_j - \gamma \frac{1}{|D_{nr}|} \sum_{d_j \in D_{nr}} d_j$ • The new query moves toward the relevant objects and away from the irrelevant objects. $q_m$ = modified query vector; $q_0$ = original query vector; $\alpha, \beta, \gamma$: weights (hand-chosen or set empirically); $D_r$ = set of known relevant doc vectors; $D_{nr}$ = set of known irrelevant doc vectors
Rocchio Example Typically, $\gamma < \beta$. With $\alpha = 1$, $\beta = 0.5$, $\gamma = 0.25$:
Original query $q_0$: (0, 4, 0, 8, 0, 0)
(+) Positive feedback: (2, 4, 8, 0, 0, 2), weighted by $\beta$: (1, 2, 4, 0, 0, 1)
(−) Negative feedback: (8, 0, 4, 4, 0, 16), weighted by $\gamma$: (2, 0, 1, 1, 0, 4)
New query $q_m$: (−1, 6, 3, 7, 0, −3)
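A small sketch of the Rocchio update that reproduces the numbers above; the weights α = 1, β = 0.5, γ = 0.25 are inferred from the example rather than stated on the slide:

```python
import numpy as np

def rocchio(q0, relevant, non_relevant, alpha=1.0, beta=0.5, gamma=0.25):
    """Rocchio query refinement: move the query toward the centroid of the
    relevant vectors and away from the centroid of the non-relevant ones."""
    qm = alpha * q0
    if len(relevant) > 0:
        qm += beta * np.mean(relevant, axis=0)
    if len(non_relevant) > 0:
        qm -= gamma * np.mean(non_relevant, axis=0)
    return qm

q0 = np.array([0, 4, 0, 8, 0, 0], dtype=float)
positive = [np.array([2, 4, 8, 0, 0, 2], dtype=float)]
negative = [np.array([8, 0, 4, 4, 0, 16], dtype=float)]
print(rocchio(q0, positive, negative))   # -> [-1.  6.  3.  7.  0. -3.]
```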
Some RF Techniques • Yong Rui, Thomas S. Huang and Sharad Mehrotra. “Content-Based Image Retrieval with Relevance Feedback in MARS”, International Conference on Image Processing (ICIP), 1997. • Selim Aksoy, Robert M. Haralick, Faouzi A. Cheikh, Moncef Gabbouj. “A Weighted Distance Approach to Relevance Feedback”, International Conference on Pattern Recognition (ICPR), 2000. • Zhong Su, Hongjiang Zhang, Stan Li, and Shaoping Ma. “Relevance Feedback in Content-Based Image Retrieval: Bayesian Framework, Feature Subspaces, and Progressive Learning”, IEEE Transactions on Image Processing, 2003. • Deok-Hwan Kim, Chin-Wan Chung. “Qcluster: Relevance Feedback Using Adaptive Clustering for Content-Based Image Retrieval”, SIGMOD, 2003. • Junqi Zhang, Xiangdong Zhou, Wei Wang, Baile Shi, Jian Pei. “Using High-Dimensional Indexes to Support Relevance Feedback Based Interactive Image Retrieval”, VLDB, 2006.
CBIR with RF in MARS • There is an urgent need to develop integration mechanisms that link the image retrieval model to the text retrieval model, so that well-established text retrieval techniques can be utilized. • This paper studies approaches for converting image feature vectors (Image Processing domain) to weighted-term vectors (IR domain). • Furthermore, the relevance feedback technique from the IR domain is used in content-based image retrieval to demonstrate the effectiveness of this conversion. • Experimental results show that image retrieval precision increases considerably with the proposed integration approach. • The method has been implemented in the MARS prototype system developed at the University of Illinois at Urbana-Champaign.
Weighted Distance Approach Selim Aksoy, Robert M. Haralick, Faouzi A. Cheikh, Moncef Gabbouj. “A Weighted Distance Approach to Relevance Feedback”, Proceedings of the International Conference on Pattern Recognition (ICPR), 2000.
Weighted Distance Approach Notation:
K: number of iterations
Q: number of features in the feature vector
R(k): retrieval set after the k-th iteration
R*(k): set of objects in R(k) marked as relevant
F_j: values of the j-th feature component of the images in the database
F*_j(k): values of the j-th feature component of the images in R*(k)
Weighted Distance Approach • The similarity between images is measured by computing distances between feature vectors in the feature space. • Given two feature vectors x and y and the weight vector w, we use the weighted L1 or L2 distance: $d_{L_1}(x, y) = \sum_{j=1}^{Q} w_j |x_j - y_j|$, $d_{L_2}(x, y) = \left( \sum_{j=1}^{Q} w_j (x_j - y_j)^2 \right)^{1/2}$
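A minimal sketch of the two weighted distances (plain NumPy; the example vectors and weights are made up):

```python
import numpy as np

def weighted_l1(x, y, w):
    """Weighted L1 (city-block) distance between feature vectors x and y."""
    return np.sum(w * np.abs(x - y))

def weighted_l2(x, y, w):
    """Weighted L2 (Euclidean) distance between feature vectors x and y."""
    return np.sqrt(np.sum(w * (x - y) ** 2))

x = np.array([0.2, 0.7, 0.1])
y = np.array([0.5, 0.4, 0.3])
w = np.array([1.0, 2.0, 0.5])   # per-feature weights
print(weighted_l1(x, y, w), weighted_l2(x, y, w))
```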
Weighted Distance Approach • From the pattern recognition point of view, for a feature to be good, its variance among all the images in the database should be large, but its variance among the relevant images should be small. • Neither property is enough alone, but together they characterize a good feature.
Weighted Distance Approach Let $w_j^{(k+1)}$ denote the weight of the j-th feature component in the (k+1)-th iteration. This weight is given by the following equation: $w_j^{(k+1)} = \sigma_j / \sigma^*_j(k)$, where $\sigma_j$ is the standard deviation of F_j (over the whole database) and $\sigma^*_j(k)$ is the standard deviation of F*_j(k) (over the images marked as relevant).
Weighted Distance Approach According to the values of $\sigma_j$ and $\sigma^*_j(k)$ there are four different cases:
Case 1: $\sigma_j$ large, $\sigma^*_j(k)$ small — best case
Case 2: $\sigma_j$ large, $\sigma^*_j(k)$ large
Case 3: $\sigma_j$ small, $\sigma^*_j(k)$ small
Case 4: $\sigma_j$ small, $\sigma^*_j(k)$ large — worst case
Weighted Distance Approach Case 1 • When $\sigma_j$ is large and $\sigma^*_j(k)$ is small, $w_j^{(k+1)}$ becomes large. • This means that the feature has a diverse set of values in the database but its values for relevant images are similar. • This is a desired situation and shows that this feature is very effective in distinguishing this specific relevant image set, so a large weight assigns more importance to this feature.
Weighted Distance Approach Case 2 • When both $\sigma_j$ and $\sigma^*_j(k)$ are large, $w_j^{(k+1)}$ is close to 1. • This means that the feature may have good discrimination characteristics in the database but is not effective for this specific relevant image group. • The resulting weight does not give any particular importance to this feature.
Weighted Distance Approach Case 3 • When both $\sigma_j$ and $\sigma^*_j(k)$ are small, $w_j^{(k+1)}$ is again close to 1. • This is a similar, but slightly worse, situation than the previous one. • The feature is not generally effective in the database and is not effective for this relevant set either. • No importance is given to this feature.
Weighted Distance Approach Case 4 • When $\sigma_j$ is small and $\sigma^*_j(k)$ is large, $w_j^{(k+1)}$ becomes small. • This is the worst case among all the possibilities. • The feature is not generally effective and even causes the distance between relevant images to increase. • A small weight forces the distance measure to ignore the effect of this feature.
Weighted Distance Approach Retrieval Algorithm
[1] initialize all weights uniformly: $w_j^{(1)} = 1$, j = 1, 2, …, Q.
[2] compute $\sigma_j$, j = 1, 2, …, Q.
[3] for k = 1, k <= K, k++
- search the DB using $w^{(k)}$ and retrieve R(k)
- get feedback from the user and populate R*(k)
- compute $\sigma^*_j(k)$, j = 1, 2, …, Q
- compute $w_j^{(k+1)} = \sigma_j / \sigma^*_j(k)$, j = 1, 2, …, Q
- normalize $w_j^{(k+1)}$, j = 1, 2, …, Q
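A sketch of the whole loop under the notation above; the image database, the feature vectors and the user-feedback callback are placeholders:

```python
import numpy as np

def weighted_l2(x, y, w):
    return np.sqrt(np.sum(w * (x - y) ** 2))

def relevance_feedback_search(features, query, get_relevant, K=3, top=20):
    """Weighted-distance relevance feedback: weights start uniform and are
    updated to sigma_j / sigma*_j(k) after each round of user feedback."""
    Q = features.shape[1]
    w = np.ones(Q)                               # [1] uniform initial weights
    sigma = features.std(axis=0) + 1e-9          # [2] std of each feature over the DB
    for k in range(K):                           # [3] feedback iterations
        dists = np.array([weighted_l2(f, query, w) for f in features])
        retrieved = np.argsort(dists)[:top]      # retrieval set R(k)
        relevant = get_relevant(retrieved)       # user marks relevant images R*(k)
        if len(relevant) < 2:
            break                                # too little feedback to estimate sigma*
        sigma_star = features[relevant].std(axis=0) + 1e-9
        w = sigma / sigma_star                   # new weights w_j = sigma_j / sigma*_j(k)
        w /= w.sum()                             # normalize
    return retrieved, w

# toy usage: pretend the user always marks the first three retrieved images relevant
features = np.random.rand(500, 32)
query = features[0]
hits, w = relevance_feedback_search(features, query, lambda r: r[:3])
```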
Weighted Distance Approach Precision results
Weighted Distance Approach Precision results
Bayesian Classification: a short tutorial The problem of classification Given a number of classes and an unclassified object x, determine the class that x belongs to. Examples: • given the age and the income of a person, determine if she will buy a laptop or not • given the color, type and origin of a car, determine if it will be stolen • given the age, income and job of a person, determine if the bank will give a loan or not. In Bayesian Classification we use probabilities to determine the “best” class to assign to a new item.
Bayesian Classification: a short tutorial Bayes Theorem: $P(h \mid x) = \frac{P(x \mid h)\,P(h)}{P(x)}$
x: data item; h: hypothesis; P(h): prior probability of hypothesis h; P(x): evidence of training data x; P(h | x): probability of h given x (posterior probability); P(x | h): probability of x given h (likelihood)
Bayesian Classification: a short tutorial Each data item x is composed of several attributes. In our example we are interested in determining if a car with specific characteristics will be stolen or not. Car attributes: color, type and origin. Given a color, type, origin triplet we are interested in determining if the car will be stolen or not. (Thanks to Eric Meisner for this example)
Bayesian Classification: a short tutorial Training data: a table of 10 example cars, each described by color, type and origin, together with whether it was stolen (5 stolen, 5 not stolen).
Bayesian Classification: a short tutorial Naive Bayesian Classification It is evident that each data item has several attributes (color, type, origin in our example). To calculate P(x | h) we use the independence assumption among the different attributes. Therefore: $P(x \mid h) = \prod_i P(x_i \mid h)$, i.e., in our example $P(x \mid h) = P(\text{color} \mid h) \cdot P(\text{type} \mid h) \cdot P(\text{origin} \mid h)$.
Bayesian Classification: a short tutorial The number of available classes is known. For each class ωi a discriminant function gi(x) is defined. Item x is assigned to the k-th class when gk(x) > gj(x) for all j ≠ k. In Bayesian Classification, gi(x) is set to P(ωi | x).
Bayesian Classification: a short tutorial Since for an item x the value of P(x) does not depend on the class, it can be eliminated without affecting the class ranking: $g_i(x) = P(x \mid \omega_i)\,P(\omega_i)$. Logarithms may also be used as follows (e.g., with Gaussian classifiers): $g_i(x) = \ln P(x \mid \omega_i) + \ln P(\omega_i)$.
Bayesian Classification: a short tutorial h1: the car will be stolen (1st hypothesis) h2: the car will NOT be stolen (2nd hypothesis) x: a red domestic sports car (color=“red”, type=“sports”, origin=“domestic”) Determine if the car will be stolen or not by using a Naive Bayesian Classifier.
Bayesian Classification: a short tutorial We need to calculate the following quantities:
P(h1): probability that a car will be stolen regardless of color, type and origin (prior probability)
P(h2): probability that a car will not be stolen regardless of color, type and origin (prior probability)
P(x): probability that a car from our set of cars is a red domestic sports car (evidence)
P(x | h1): probability that the car has a red color, is domestic and is a sports car, given that it is stolen (likelihood)
P(x | h2): probability that the car has a red color, is domestic and is a sports car, given that it is not stolen (likelihood)
P(h1 | x): probability that car x will be stolen given that we know its color, type and origin (posterior probability)
P(h2 | x): probability that car x will not be stolen given that we know its color, type and origin (posterior probability)
Bayesian Classification: a short tutorial
P(h1) = 5/10 = 0.5
P(h2) = 5/10 = 0.5
P(x) = 2/10 = 0.2
P(x | h1) = P(color=“red” | h1) × P(type=“sports” | h1) × P(origin=“domestic” | h1) = 3/5 × 4/5 × 2/5 = 0.192
P(x | h2) = P(color=“red” | h2) × P(type=“sports” | h2) × P(origin=“domestic” | h2) = 2/5 × 2/5 × 3/5 = 0.096
We need to calculate P(h1 | x) and P(h2 | x).
Bayesian Classification: a short tutorial By substitution we get (the common factor P(x) can be ignored since it does not affect the ranking): P(x | h1) × P(h1) = 0.192 × 0.5 = 0.096 and P(x | h2) × P(h2) = 0.096 × 0.5 = 0.048. Since 0.096 > 0.048, we conclude that the car will probably be stolen.
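The same computation as a short sketch; the counts are taken directly from the numbers above:

```python
# Naive Bayes decision for x = (red, sports, domestic), using the estimates above.
p_h1, p_h2 = 0.5, 0.5                       # priors: stolen / not stolen
px_given_h1 = (3/5) * (4/5) * (2/5)         # P(red|h1) * P(sports|h1) * P(domestic|h1)
px_given_h2 = (2/5) * (2/5) * (3/5)         # P(red|h2) * P(sports|h2) * P(domestic|h2)

score_h1 = px_given_h1 * p_h1               # 0.096
score_h2 = px_given_h2 * p_h2               # 0.048
print("stolen" if score_h1 > score_h2 else "not stolen")
```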
RF with Bayesian Estimation Zhong Su, Hongjiang Zhang, Stan Li, and Shaoping Ma. “Relevance Feedback in Content-Based Image Retrieval: Bayesian Framework, Feature Subspaces, and Progressive Learning”, IEEE Transactions on Image Processing, 2003.
RF with Bayesian Estimation • In the proposed relevance feedback approach, positive and negative feedback examples are incorporated in the query refinement process with different strategies. • To incorporate positive feedback in refining image retrieval, we assume that all of the positive examples in a feedback iteration belong to the same semantic class whose features follow a Gaussian distribution. Features of all positive examples are used to calculate and update the parameters of the corresponding semantic Gaussian class, and a Bayesian classifier is used to re-rank the images in the database. • To incorporate negative feedback examples, we apply a penalty function when calculating the final ranking of an image with respect to the query image. That is, if an image is similar to a negative example, its rank is decreased depending on the degree of its similarity to the negative example.
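An illustrative sketch of this ranking scheme. The slides do not give the exact penalty function or its parameters, so the Gaussian fit, the exponential penalty shape and the penalty_weight value below are assumptions, not the paper's formulation:

```python
import numpy as np
from scipy.stats import multivariate_normal

def rerank(features, positives, negatives, penalty_weight=1.0):
    """Rank database images by the likelihood of the Gaussian class fitted to
    the positive examples, minus a penalty for being close to a negative example.
    positives/negatives are 2-D arrays of feedback feature vectors."""
    mu = positives.mean(axis=0)
    cov = np.cov(positives, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    score = multivariate_normal.logpdf(features, mean=mu, cov=cov)
    for neg in negatives:                       # penalize similarity to negatives
        d = np.linalg.norm(features - neg, axis=1)
        score -= penalty_weight * np.exp(-d)    # assumed penalty shape
    return np.argsort(-score)                   # best-ranked images first
```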
RF with Bayesian Estimation Low-level features used
Classifier Details • The Gaussian density is often used for characterizing probability because of its computational tractability and the fact that it adequately models a large number of cases. • The probability density function of the Gaussian distribution is:
Univariate (x assumes single values; μ: mean, σ: standard deviation): $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
Multivariate (x is a d-dimensional vector; μ: mean vector, Σ: d × d covariance matrix, |Σ|: determinant of Σ): $p(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x-\mu)^t \Sigma^{-1} (x-\mu)}$
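The same densities written out directly as a sanity-check sketch (plain NumPy; the values at the end are illustrative only):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

def multivariate_gaussian_pdf(x, mu, cov):
    """Multivariate Gaussian density for a d-dimensional vector x."""
    d = len(mu)
    diff = x - mu
    norm = 1.0 / ((2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)

print(gaussian_pdf(0.0, 0.0, 1.0))                                      # ~0.3989
print(multivariate_gaussian_pdf(np.zeros(2), np.zeros(2), np.eye(2)))   # ~0.1592
```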
Classifier Details • Recall the classifier functions: $g_i(x) = \ln P(x \mid \omega_i) + \ln P(\omega_i)$. • Assuming the measurements are normally distributed, $P(x \mid \omega_i)$ is a multivariate Gaussian with mean $\mu_i$ and covariance matrix $\Sigma_i$. • By substitution we get: $g_i(x) = -\frac{1}{2}(x-\mu_i)^t \Sigma_i^{-1} (x-\mu_i) - \frac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i)$
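A small sketch of a classifier built on these discriminant functions; the class means, covariance matrices and priors are assumed to be already estimated, and the toy numbers are illustrative only:

```python
import numpy as np

def discriminant(x, mu, cov, prior):
    """g_i(x) = -1/2 (x-mu)^T Sigma^-1 (x-mu) - 1/2 ln|Sigma| + ln P(omega_i)."""
    diff = x - mu
    return (-0.5 * diff @ np.linalg.inv(cov) @ diff
            - 0.5 * np.log(np.linalg.det(cov))
            + np.log(prior))

def classify(x, classes):
    """Assign x to the class with the largest discriminant value."""
    scores = [discriminant(x, mu, cov, prior) for (mu, cov, prior) in classes]
    return int(np.argmax(scores))

# toy example: two 2-D Gaussian classes with equal priors
classes = [(np.array([0.0, 0.0]), np.eye(2), 0.5),
           (np.array([3.0, 3.0]), np.eye(2), 0.5)]
print(classify(np.array([2.5, 2.8]), classes))   # -> 1
```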
Positive Feedback Notation:
D: collection of images in the database
n: number of positive examples
Q: the query image
m: number of feature types used
f_i: feature vector for the i-th feature type
n_i: dimensionality of the i-th feature type
An image is represented as a vector x = (f_1, f_2, …, f_m).
Positive Feedback Each feature type is Gaussian distributed: for the i-th feature type, Σ_i is the n_i × n_i covariance matrix and μ_i is the n_i-dimensional mean vector.
Positive Feedback • It is reasonable to assume that all the positive examples belong to the class of images containing the desired object or semantic meaning, and that the features of images belonging to the semantic classes obey the Gaussian distribution. • The parameters for a semantic Gaussian class can be estimated using the feature vectors of all the positive examples. • Hence, image retrieval becomes a process of estimating the probability of belonging to a semantic class, and query refinement by relevance feedback becomes a process of updating the Gaussian distribution parameters.
Positive Feedback U: the set of positive examples in the current iteration; |U|: number of positive examples in the current iteration; u: an item in U; n, n': total number of positive examples accumulated before and after the iteration. The mean of each Gaussian class is updated incrementally from the new positive examples: $n' = n + |U|$ and $\mu' = \frac{n\,\mu + \sum_{u \in U} u}{n'}$; the covariance matrix is re-estimated in the same incremental fashion from the accumulated positive examples.
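A minimal sketch of the mean update for one feature type, assuming a running count n and running mean are kept across iterations; the covariance update is not spelled out on the slide and is therefore omitted here:

```python
import numpy as np

def update_mean(mu, n, U):
    """Incrementally update the Gaussian mean of a semantic class with the
    positive examples U collected in the current feedback iteration."""
    U = np.asarray(U, dtype=float)
    n_new = n + len(U)                          # n' = n + |U|
    mu_new = (n * mu + U.sum(axis=0)) / n_new   # combine old mean with new examples
    return mu_new, n_new

mu, n = np.zeros(3), 0
mu, n = update_mean(mu, n, [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
print(mu, n)   # [2. 2. 2.] 2
```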