Learning Image Similarity from Flickr Groups Using Stochastic Intersection Kernel Machines. ICCV 2009, UIUC. Gang Wang, Derek Hoiem, David Forsyth. OUTLINE: INTRODUCTION, APPROACH (implementation details), EXPERIMENTS, CONCLUSION.
Learning Image Similarity from Flickr Groups Using Stochastic Intersection Kernel Machines
ICCV 2009, UIUC
Gang Wang, Derek Hoiem, David Forsyth
OUTLINE
• INTRODUCTION
• APPROACH (implementation details)
• EXPERIMENTS
• CONCLUSION
Introduction
• Use online photo sharing sites → Flickr groups
• Determine which images are similar, and how they are similar
• Learn the group membership likelihoods of each image
• Training time is the obstacle: learning this many categories would take too long with existing methods
• Propose a new method for stochastic learning of SVMs using the Histogram Intersection Kernel (HIK): SIKMA
• SIKMA combines the ideas of [14] and [18]
Construction: Related work
Algorithm classes for training very large-scale kernel SVMs:
• Exploit the sparseness of the Lagrange multipliers → SMO [22]
• Use stochastic gradient descent, without touching every example http://0rz.tw/BDHWJ
Kivinen [14] → a stochastic method that applies to kernel machines
Maji [18] → very fast evaluation of a histogram intersection kernel
Construction: Conclusion
• Flickr groups provide an organizational structure that reflects how people like to group images
• The SIKMA classifier allows efficient and accurate learning of these categories
• This property generalizes well, even when the test data set was not obtained from Flickr
Approach (SIKMA)
• Suppose we have a list of training examples
• For the test example u, compute the classification score
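The slide's equations did not survive, so the following is only a minimal sketch of the standard kernel-machine form, f(u) = Σ_i α_i K(x_i, u), with the histogram intersection kernel K(x, u) = Σ_d min(x_d, u_d). The function names, the toy vectors, and the weights `alphas` are illustrative assumptions, not values from the talk.

```python
import numpy as np

def histogram_intersection(x, u):
    """Histogram intersection kernel: sum over dimensions of min(x_d, u_d)."""
    return float(np.minimum(x, u).sum())

def classification_score(support_vectors, alphas, u):
    """Kernel expansion f(u) = sum_i alpha_i * K(x_i, u)."""
    return sum(a * histogram_intersection(x, u)
               for a, x in zip(alphas, support_vectors))

# Toy data (assumed for illustration): two 2-bin histograms with signed weights.
X = np.array([[0.2, 0.8], [0.6, 0.4]])
alphas = [1.0, -0.5]
u = np.array([0.5, 0.5])
score = classification_score(X, alphas, u)
print(score)
```

Evaluating the score this way costs one kernel evaluation per stored example, which is what makes large training sets expensive.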
Approach (SIKMA)
• Approximate the gradient by replacing the sum over all examples (batch) with a sum over some subset chosen at random; it is usual to consider a single example
• Each step yields a new decision function f_t
• It is expensive to calculate f_{t-1}, because it is a kernel expansion over all support vectors collected so far
• The NORMA algorithm [14] keeps a fixed-length set of support vectors by dropping the oldest ones
• Doing so comes at a considerable cost in accuracy!
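The truncation idea can be sketched as follows. This is a simplified NORMA-style update written under assumptions: hinge loss, a constant step size `eta`, a regularization weight `lam`, and a bounded buffer `max_sv`. None of these names or values come from the slides, and the class is not the authors' implementation.

```python
import numpy as np
from collections import deque

def hik(x, u):
    """Histogram intersection kernel."""
    return float(np.minimum(x, u).sum())

class TruncatedKernelSGD:
    """Single-example stochastic updates with a fixed-length support set."""

    def __init__(self, lam=0.01, eta=0.1, max_sv=100):
        self.lam, self.eta, self.max_sv = lam, eta, max_sv
        self.sv = deque(maxlen=max_sv)  # (alpha, x) pairs; oldest is dropped when full

    def score(self, u):
        # f_{t-1}(u): one kernel evaluation per stored support vector.
        return sum(a * hik(x, u) for a, x in self.sv)

    def step(self, x, y):
        violated = y * self.score(x) < 1.0       # hinge loss is active
        shrink = 1.0 - self.eta * self.lam       # effect of the regularizer
        self.sv = deque(((a * shrink, xi) for a, xi in self.sv),
                        maxlen=self.max_sv)
        if violated:
            self.sv.append((self.eta * y, x))    # new support vector

model = TruncatedKernelSGD(max_sv=2)
for x, y in [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([1.0, 0.0], 1)]:
    model.step(np.array(x), y)
print(len(model.sv), model.score(np.array([1.0, 0.0])))
```

With `max_sv=2` the third update silently discards the first support vector, which illustrates where the accuracy loss of truncation comes from.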
Approach (SIKMA)
D is the feature dimension
Approach
• T: number of training examples
• M: number of quantization bins
• D: number of feature dimensions
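One way to see how T, M and D enter the cost is a table-based sketch. The histogram intersection kernel is additive over dimensions, so the decision function can be stored as a D × M lookup table over quantized feature values instead of a list of support vectors. The code below is a hedged illustration of that idea, not the authors' code. It assumes features scaled to [0, 1], uniform quantization levels, hinge loss, and a constant step size; the class name and all parameters are hypothetical.

```python
import numpy as np

class TableHIKClassifier:
    """Stochastic HIK classifier stored as a D x M table (illustrative sketch)."""

    def __init__(self, dim, bins=50, lam=0.01, eta=0.1):
        self.levels = np.linspace(0.0, 1.0, bins)  # M quantization levels (assumed uniform)
        self.table = np.zeros((dim, bins))         # table[d, m] = h_d(levels[m])
        self.lam, self.eta = lam, eta

    def _quantize(self, x):
        x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)
        return np.rint(x * (len(self.levels) - 1)).astype(int)

    def score(self, u):
        # O(D): one table lookup per feature dimension.
        idx = self._quantize(u)
        return float(self.table[np.arange(len(idx)), idx].sum())

    def step(self, x, y):
        # O(M * D): every table entry is touched once per training example.
        violated = y * self.score(x) < 1.0
        self.table *= 1.0 - self.eta * self.lam
        if violated:
            xq = self.levels[self._quantize(x)]
            self.table += self.eta * y * np.minimum(xq[:, None], self.levels[None, :])

clf = TableHIKClassifier(dim=2, bins=11)
clf.step(np.array([1.0, 0.0]), 1)
print(clf.score(np.array([1.0, 0.0])))
```

In this sketch, a pass over T examples costs on the order of T·M·D operations, memory is M·D, and no support vectors need to be dropped.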
Approach: Measuring image similarity
• Image similarity is measured with a simple Euclidean distance between the SVM outputs
• Since the groups have names, we can also perform text-based queries (e.g. get images like "people dancing") and determine how two images are similar
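The distance measure can be sketched in a few lines. Each image is assumed to be represented by a vector of per-group SVM outputs; the function names and the toy score vectors are illustrative assumptions.

```python
import numpy as np

def prediction_distance(scores_a, scores_b):
    """Euclidean distance between two vectors of per-group SVM outputs."""
    return float(np.linalg.norm(np.asarray(scores_a) - np.asarray(scores_b)))

def rank_by_similarity(query_scores, database_scores):
    """Indices of database images, ordered from most to least similar."""
    diffs = np.asarray(database_scores) - np.asarray(query_scores)
    return np.argsort(np.linalg.norm(diffs, axis=1))

query = [1.0, 0.0]
database = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
order = rank_by_similarity(query, database)
print(order)
```

Because every dimension corresponds to a named group, the dimensions on which two vectors agree also hint at how the two images are similar.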
Implementation details
Use four types of features:
• SIFT feature: detect and describe local patches
• Gist feature: 960-dimensional Gist descriptor
• Color feature: RGB space, values range from 1 to 512
• Gradient feature: the whole image is represented as a 256-dimensional vector
Combine the outputs of these four classifiers into a final prediction on a validation data set
SIKMA Training Time and Test Accuracy
• For 103 Flickr categories, using 15,000 to 30,000 positive images and 60,000 negative images
• The average AP (average precision) over these categories is 0.433
Experiments: Image matching with feedback
• Select the top five negative examples and five randomly chosen positive examples from among the top 50 ranked images
• y_i is 1 if example i is positive, otherwise 0
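The selection step might look like the sketch below. It reads "top five negative examples" as the five highest-ranked negatives within the top 50, which is an interpretation of the slide. The function name, the `seed` parameter, and the toy relevance labels are all assumptions, and the slide's formula that uses y_i is not reproduced here.

```python
import random

def select_feedback(ranked_ids, is_positive, top_k=50, n_each=5, seed=0):
    """Pick feedback examples from the top_k ranked images.

    Returns the chosen ids and labels y_i (1 for positive, 0 for negative).
    """
    top = ranked_ids[:top_k]
    negatives = [i for i in top if not is_positive[i]][:n_each]  # highest-ranked negatives
    positives = [i for i in top if is_positive[i]]
    rng = random.Random(seed)
    positives = rng.sample(positives, min(n_each, len(positives)))  # random positives
    chosen = negatives + positives
    labels = [1 if is_positive[i] else 0 for i in chosen]
    return chosen, labels

# Toy ranking (assumed): 100 images, even ids are relevant to the query.
ranked_ids = list(range(100))
is_positive = {i: i % 2 == 0 for i in ranked_ids}
chosen, labels = select_feedback(ranked_ids, is_positive)
print(chosen, labels)
```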
Experiments: Text-based queries
• Since a Flickr category can be described with several words, we can support text-based queries
• Given a word query, find the Flickr groups whose descriptions contain that word
• Tested on the Corel data set with two queries: "airplane" and "sunset"
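A text query could be served roughly as follows. The group lookup follows the slide; ranking images by the summed classifier scores of the matching groups is an assumption made for this sketch, as are the function names and the toy data.

```python
def groups_matching(word, group_descriptions):
    """Groups whose description contains the query word (case-insensitive)."""
    w = word.lower()
    return [g for g, desc in group_descriptions.items()
            if w in desc.lower().split()]

def rank_images_for_query(word, group_descriptions, image_scores):
    """Rank images by the summed scores of the groups matching the query word."""
    groups = groups_matching(word, group_descriptions)
    totals = {img: sum(scores[g] for g in groups)
              for img, scores in image_scores.items()}
    return sorted(totals, key=totals.get, reverse=True)

# Toy data (assumed): two groups and per-image classifier outputs.
group_descriptions = {"g1": "airplane spotting", "g2": "sunset and sunrise"}
image_scores = {
    "a": {"g1": 0.9, "g2": -0.2},
    "b": {"g1": -0.5, "g2": 0.7},
}
print(rank_images_for_query("sunset", group_descriptions, image_scores))
```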
Conclusion
• SIKMA is an algorithm that quickly trains an SVM with the histogram intersection kernel using tens of thousands of training examples
• Two images that are likely to belong to the same Flickr groups are considered similar
• Experimental results show that matching with prediction features works better than matching with visual features