Support Vector Machine Active Learning for Image Retrieval
Authors: Simon Tong & Edward Chang
Presented by: Navdeep Dandiwal 800810102
Content
• Motivation
• Introduction
• SVM
• Version Space
• Active Learning
• Image Characterization
• Experimental Data
• Conclusions
Motivation
• Relevance feedback is often a critical component when designing image databases
• It interactively determines a user's desired output by asking the user to label images. Can this become tedious for the user?
• How can we create effective relevance feedback?
Abstract
The proposed Support Vector Machine active learning algorithm:
• Provides effective relevance feedback by grasping the user's query concept accurately and quickly, while asking the user to label only a small number of images
• Selects the most informative images with which to query the user
• Quickly learns a boundary that separates the images satisfying the user's query concept from the rest of the dataset
Introduction
• The user should be able to implicitly inform a database of his or her desired output, or query concept
• Relevance feedback can be used as a query refinement scheme to learn the user's query concept
• Based on the answers, another set of images is brought up for the user to label. We call such a refinement scheme a query-concept learner
Introduction (contd.)
[Figure: the refinement scheme fetches a few image instances; the user labels each instance as relevant or irrelevant]
Introduction (contd.)
• Most machine learning algorithms are passive, in the sense that they are generally applied to a randomly selected training set
• Key idea of active learning: the learner should be able to choose its next pool-query based upon the answers to previous pool-queries
Introduction (contd.)
The Support Vector Machine active learner (SVMActive) works on the following ideas:
• As in learning an SVM binary classifier, a hyperplane separates relevant and irrelevant images in a projected space
• It learns the classifier quickly via active learning
• It returns the top-k most relevant images: the ones farthest from the hyperplane, as sketched below
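The final ranking step can be pictured with a minimal sketch (not the authors' code), assuming scikit-learn; the helper name top_k_relevant and the RBF kernel are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVC

def top_k_relevant(X_labeled, y_labeled, X_pool, k=20):
    """Train an SVM on the labeled images (y in {-1, +1}) and return the
    indices of the k pool images farthest from the hyperplane on the
    relevant (+1) side."""
    clf = SVC(kernel="rbf")                 # kernel choice is an assumption
    clf.fit(X_labeled, y_labeled)
    scores = clf.decision_function(X_pool)  # signed distance to the boundary
    return np.argsort(-scores)[:k]          # largest positive margin first
```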
Support Vector Machines
• In their simplest form, SVMs are hyperplanes that separate the training data by a maximal margin
• All vectors on one side of the hyperplane are labeled as $-1$, and all vectors on the other side as $+1$
• The training instances that lie closest to the hyperplane are called support vectors
Support Vector Machines (contd.)
[Figure: a maximal-margin hyperplane; the points lying on the margin are the support vectors]
• We are given training data $\{x_1, \dots, x_n\}$ that are vectors in some space $\mathcal{X} \subseteq \mathbb{R}^d$
• We are also given their labels $\{y_1, \dots, y_n\}$, where $y_i \in \{-1, 1\}$
Support Vector Machines (contd.)
• SVMs allow one to project the original training data in space $\mathcal{X}$ to a higher-dimensional feature space $\mathcal{F}$ via a Mercer kernel operator $K$, and to consider classifiers of the form $f(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x)$
• When $f(x) \ge 0$ we classify $x$ as $+1$, otherwise as $-1$
Support Vector Machines (contd.)
• When $K$ satisfies Mercer's condition, it can be written as $K(u, v) = \Phi(u) \cdot \Phi(v)$, where $\Phi : \mathcal{X} \to \mathcal{F}$ and "$\cdot$" denotes an inner product. We can then write $f$ as $f(x) = w \cdot \Phi(x)$, with $w = \sum_{i=1}^{n} \alpha_i \Phi(x_i)$
• Thus, by using $K$ we are implicitly projecting the training data into a different (often higher-dimensional) space $\mathcal{F}$
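This identity can be checked concretely for the quadratic kernel $K(u, v) = (u \cdot v)^2$, whose explicit feature map in two dimensions is known; this toy example is illustrative, not from the paper:

```python
import numpy as np

def phi(x):
    # explicit feature map for K(u, v) = (u . v)^2 on 2-D inputs
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

u, v = np.array([1.0, 2.0]), np.array([3.0, 0.5])
# K(u, v) equals the inner product of the projected points in F
assert np.isclose(np.dot(u, v) ** 2, np.dot(phi(u), phi(v)))
```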
Version Space
• Given labeled training data and a Mercer kernel $K$, the set of consistent hyperplanes that separate the data in the induced feature space $\mathcal{F}$ is called the version space
• Our set of possible hypotheses is given as $\mathcal{H} = \{ f \mid f(x) = \frac{w \cdot \Phi(x)}{\lVert w \rVert},\ w \in \mathcal{W} \}$
• where the parameter space $\mathcal{W}$ is simply equal to $\mathcal{F}$
Version Space (contd.)
• The version space $\mathcal{V}$ is defined as $\mathcal{V} = \{ f \in \mathcal{H} \mid y_i f(x_i) > 0,\ i = 1, \dots, n \}$
• Notice that since $\mathcal{H}$ is a set of hyperplanes, there is an exact correspondence between unit vectors $w$ and hypotheses $f$ in $\mathcal{H}$. Thus we can redefine $\mathcal{V}$ as $\mathcal{V} = \{ w \in \mathcal{W} \mid \lVert w \rVert = 1,\ y_i (w \cdot \Phi(x_i)) > 0,\ i = 1, \dots, n \}$
Version Space (contd.)
• SVMs find the hyperplane that maximizes the margin in feature space $\mathcal{F}$. One way to pose this is: $\max_{w} \min_{i} \{ y_i (w \cdot \Phi(x_i)) \}$ subject to $\lVert w \rVert = 1$ and $y_i (w \cdot \Phi(x_i)) > 0$, $i = 1, \dots, n$
• The constraints cause the solution to lie in the version space
Version Space (contd.)
• We want to find the point in the version space that maximizes the minimum distance to any of the delineating hyperplanes: equivalently, the largest sphere whose center lies in the version space and whose surface does not intersect the hyperplanes. Its center corresponds to the SVM, and its radius is the margin of the SVM in feature space
Active Learning
• In pool-based active learning we have a pool of unlabeled instances
• Instances $x$ are independently and identically distributed according to some underlying distribution $F(x)$
• Labels are distributed according to some conditional distribution $P(y \mid x)$
Active Learning (contd.)
• Given an unlabeled pool $U$, an active learner $\ell$ has three components: $(f, q, X)$
• $f$ is the classifier, trained on the current labeled data $X$
• $q(X)$ is the querying function: given the current labeled set $X$, it decides which instance in $U$ to query next. This querying function is the difference between an active and a passive learner
• The learner can return a classifier $f$ after each pool-query or after a fixed number of pool-queries
Active Learning (contd.)
• How do we choose the next unlabeled instance in the pool to query?
• We use an approach that queries points so as to reduce the size of the version space as much as possible; a minimal sketch of this selection step follows
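A minimal sketch of the selection rule, assuming scikit-learn; next_pool_queries is a hypothetical helper name:

```python
import numpy as np
from sklearn.svm import SVC

def next_pool_queries(clf: SVC, X_pool: np.ndarray, n_query: int = 20) -> np.ndarray:
    """Return indices of the pool images closest to the current hyperplane;
    labeling these approximately halves the version space."""
    dist = np.abs(clf.decision_function(X_pool))  # unsigned margin distance
    return np.argsort(dist)[:n_query]             # smallest |margin| first
```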
Active Learning (contd.)
[Figure: version space geometry]
• The surface of the hypersphere represents unit weight vectors $w$
• Each of the two hyperplanes corresponds to a labeled training instance
• The version space is the surface segment closest to the camera
Active Learning (contd.)
[Figure: the largest embedded sphere]
• A large sphere can be embedded in the version space
• The center of this sphere lies in the version space, and its surface does not intersect the hyperplanes
• The center is the SVM; the radius is the margin
Active Learning (contd.)
• Reduce the version space as fast as possible by choosing a pool-query that halves $\mathcal{V}$
[Figure: the next pool-query among the unlabeled instances, the labeled instances, the largest hypersphere that fits inside the version space, the unit vectors $w_i$, the SVM, and the version space]
Active Learning (contd.)
• SVMActive takes a simple approach: it chooses as its pool-query the twenty images closest to its separating hyperplane
• This can be unstable during the first round of relevance feedback, so random images are chosen for the first round instead
SVMActive Algorithm
1. If this is the first feedback round, ask the user to label 20 randomly selected pool images; otherwise, learn an SVM on the current labeled data and ask the user to label the 20 pool images closest to the SVM boundary
2. Repeat step 1 for the desired number of relevance-feedback rounds
3. Learn the final SVM on the labeled data and display the top-k relevant images, those farthest from the SVM boundary
A sketch of this loop appears below.
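Putting the rounds together, here is a hedged end-to-end sketch of the loop (not the authors' code): svm_active and oracle_label are hypothetical names, the RBF kernel is an assumption, and the user is simulated by an oracle function:

```python
import numpy as np
from sklearn.svm import SVC

def svm_active(X_pool, oracle_label, n_rounds=5, n_query=20, k=50):
    """oracle_label(i) stands in for the user, returning +1 (relevant)
    or -1 (irrelevant) for pool image i."""
    rng = np.random.default_rng(0)
    # Round 1: no classifier yet, so sample randomly. As the paper notes,
    # the labeled set must contain at least one relevant and one
    # irrelevant image for the SVM to be trainable.
    queried = list(rng.choice(len(X_pool), n_query, replace=False))
    labels = [oracle_label(i) for i in queried]
    for _ in range(n_rounds - 1):
        clf = SVC(kernel="rbf").fit(X_pool[queried], labels)
        dist = np.abs(clf.decision_function(X_pool))
        dist[queried] = np.inf                 # never re-query labeled images
        batch = np.argsort(dist)[:n_query]     # closest to the boundary
        queried += batch.tolist()
        labels += [oracle_label(i) for i in batch]
    clf = SVC(kernel="rbf").fit(X_pool[queried], labels)  # final learner
    scores = clf.decision_function(X_pool)
    scores[queried] = -np.inf                  # exclude already-seen images
    return np.argsort(-scores)[:k]             # top-k: farthest on the relevant side
```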
Image Characterization
• Our system employs a multi-resolution image representation scheme
• In this scheme, we characterize images by two main features: color and texture
• We consider shape as an attribute of these two main features
Image Characterization (contd.)
[Figure: multi-resolution color features]
Image Characterization (contd.)
Multi-resolution Texture Features
• Three characterizing texture features: structuredness, orientation, and scale
• A Discrete Wavelet Transform (DWT) using quadrature mirror filters is chosen for its computational efficiency; a hedged sketch follows the figure below
Image Characterization (contd.)
[Figure: multi-resolution texture features]
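A minimal sketch of extracting multi-resolution wavelet texture statistics, assuming the PyWavelets package; the paper uses quadrature mirror filters, for which a Daubechies wavelet stands in here, and the specific subband statistics are illustrative:

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_texture_features(gray_image: np.ndarray, levels: int = 3) -> np.ndarray:
    """Mean magnitude and variance of each detail subband at each
    resolution level -- simple stand-ins for statistics capturing
    structuredness, orientation, and scale."""
    coeffs = pywt.wavedec2(gray_image, "db4", level=levels)
    feats = []
    for detail_level in coeffs[1:]:   # (horizontal, vertical, diagonal) bands
        for band in detail_level:
            feats += [np.mean(np.abs(band)), np.var(band)]
    return np.array(feats)
```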
Image Characterization (contd.)
[Figure: feature extraction maps the color and texture of each image to a 144-dimensional vector]
• The space for SVMActive is a 144-dimensional space
• Each image in the database corresponds to a point in this space
Experiments
• 4-category, 10-category, and 15-category datasets
• To enable an objective measure of performance, a query concept is assumed to be an image category
• Accuracy is computed as the fraction of the top-k returned results that belong to the target image category, as in the sketch below
• All SVM algorithms require at least one relevant and one irrelevant image to function
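A small sketch of this accuracy measure; top_k_accuracy is a hypothetical helper assuming NumPy arrays of ground-truth category labels:

```python
import numpy as np

def top_k_accuracy(returned_ids: np.ndarray, categories: np.ndarray,
                   target_category: int, k: int) -> float:
    """Fraction of the top-k returned images whose ground-truth
    category matches the query concept."""
    return float(np.mean(categories[returned_ids[:k]] == target_category))
```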
Experiments (contd.)
[Figure: sample images from the 4-category, 10-category, and 15-category datasets]
Experiments (contd.)
• SVMActive displays 20 images per pool-querying round
• The trade-off: keeping the number of querying rounds constant, fewer images per round yields lower performance
Experiments (contd.)
• "20 random + 2 rounds of 10" beats "20 random + 1 round of 20": the active learner has more control and freedom to adapt when asking two rounds of 10 images than one round of 20
• "20 random + 2 rounds of 10" vs. "20 random + 2 rounds of 20": the increased cost to the user of asking 20 images per round is negligible, since the user can pick out relevant images easily, and there is virtually no additional computational cost in calculating the 20 images to query
Experiments (contd.)
• The other side of the trade-off: keeping the number of images per round constant, conducting more querying rounds increases performance
Experiments (contd.)
[Figure: active vs. regular passive learning on the 15-category dataset, after three and after five rounds of querying]
Experiments (contd.)
[Figures: average top-50 accuracy over the 4-category dataset using a regular SVM trained on 30 images; accuracy on the 4-category dataset after three querying rounds using various kernels]
Experiments (contd.)
Scheme comparison: other schemes (QPM, query-point movement; QEX, query expansion)
• Traditional information-retrieval schemes require a large number of image instances to achieve any substantial refinement
• They tend to be fairly localized in their exploration of the image space, and hence rather slow in exploring the entire space
Experiments (contd.)
Scheme comparison: SVMActive
• During relevance feedback, it takes both the relevant and the irrelevant images into account when choosing the next pool-queries
• It asks the user to label the images regarded as most informative for learning the query concept, rather than relying on their likelihood of being relevant
Experiments (contd.)
[Figure: average top-k accuracy over the 15-category dataset]
Conclusions
In a nutshell, the contributions of this study are:
• SVMActive can produce a well-suited learner that significantly outperforms traditional methods
• Organizing image features in different resolutions gives the learner the flexibility to model subjective perception and to satisfy a variety of search tasks
Conclusions (contd.)
• The running time of the SVMActive algorithm scales linearly with the size of the image database. Possible remedies:
• Subsampling the database: using a few thousand images as the pool with which to query the user, as in the sketch below
• Designing methods to seed the algorithm
• It would also be beneficial to make SVMActive independent of needing a starting relevant image
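A hedged sketch of the subsampling remedy; subsample_pool is a hypothetical helper, and the pool size of a few thousand follows the slide above:

```python
import numpy as np

def subsample_pool(X_database: np.ndarray, pool_size: int = 3000, seed: int = 0):
    """Draw a few thousand images at random to serve as the querying pool,
    so each feedback round's cost is independent of the full database size."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X_database), size=min(pool_size, len(X_database)),
                     replace=False)
    return X_database[idx], idx
```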
Resources
• http://courses.cms.caltech.edu/cs101.2/slides/cs101.2-09-svm-active-learning.pdf
• http://airccse.org/journal/sipij/papers/3112sipij04.pdf