Visual Categorization with Bags of Keypoints Csurka et al. Presenting – Anat Kaspi
The Problem • Generic Visual Categorization • Identifying the object content of an image
The System (pipeline diagram): Images → SIFT keypoint descriptors → K-means clustering of the training descriptors into a vocabulary of clusters → feature vectors (entry i counts how many keypoints are closest to cluster i) → SVM learning on labeled training vectors / SVM testing to predict labels.
The SIFT descriptors • Each keypoint is described by a 128-element vector • The vector contains the values of all orientation histogram entries (see the sketch below)
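As a concrete illustration, here is a minimal sketch of keypoint detection and descriptor extraction using OpenCV's SIFT implementation (the slides do not specify the extraction tooling, and the image path is a placeholder):

```python
import cv2

# Load one training image in grayscale ("image.jpg" is a placeholder path).
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# descriptors has shape (num_keypoints, 128): one vector of orientation
# histogram entries per detected keypoint.
print(descriptors.shape)
```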
Second Stage – Vocabulary construction using K-means clustering K-means is an iterative algorithm: • Divide the descriptors into clusters according to (random) initial conditions • Compute the center (mean) of each cluster • Assign each descriptor to its closest cluster center • Repeat until no descriptor changes its cluster (a minimal sketch follows)
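A minimal NumPy sketch of this loop (the function and variable names are illustrative; in practice a library implementation such as scikit-learn's KMeans would normally be used):

```python
import numpy as np

def kmeans(descriptors, k, max_iter=100, seed=0):
    """Cluster (n, 128) descriptors into k clusters, as outlined above."""
    rng = np.random.default_rng(seed)
    # Random initial conditions: k descriptors chosen as the first centers.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].astype(float)
    labels = np.full(len(descriptors), -1)
    for _ in range(max_iter):
        # Assign each descriptor to its closest cluster center.
        dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop once no descriptor changes its cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Recompute each center as the mean of its assigned descriptors.
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers, labels

# Example: cluster 2,000 random 128-D "descriptors" into 10 words.
centers, labels = kmeans(np.random.default_rng(1).random((2000, 128)), k=10)
```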
Vocabulary construction – Cont. • Extract many keypoints from all training images (~500 + ~500 + … = ~600,000 keypoints) • The vocabulary should be large enough to distinguish relevant changes in image parts, but not so large that it distinguishes irrelevant variation such as noise • Cluster all keypoint descriptors (using k-means) into k clusters; each cluster is a “word” in the vocabulary (see the sketch below)
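At this scale the clustering is typically run on the stacked descriptors of all training images at once. A sketch using scikit-learn's MiniBatchKMeans (random arrays stand in for real SIFT descriptors, and k = 1000 matches the vocabulary size mentioned on the next slide):

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Stand-in for real SIFT output: a list of (n_i, 128) descriptor arrays,
# one per training image (~600,000 descriptors in total in the paper).
rng = np.random.default_rng(0)
per_image_descriptors = [rng.random((500, 128)) for _ in range(20)]

all_descriptors = np.vstack(per_image_descriptors)
k = 1000  # vocabulary size
vocab = MiniBatchKMeans(n_clusters=k, batch_size=2048, random_state=0)
vocab.fit(all_descriptors)
# vocab.cluster_centers_ holds the k vocabulary "words" (128-D centers).
```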
Transforming a single image into a feature vector: Ii → vi • We have the keypoint descriptors of a single image (Ii) • For each descriptor, find the best-matching cluster (the cluster with the closest center) • Feature vector (vi): for each cluster (1…1000), count how many of the keypoints are closest to it (see the sketch below)
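A sketch of this counting step (the names are illustrative; `centers` is the k × 128 array of vocabulary centers from the clustering stage):

```python
import numpy as np

def image_to_feature_vector(descriptors, centers):
    """Histogram v_i for one image: keypoint counts per nearest cluster."""
    # Squared distance from every descriptor to every cluster center.
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)  # best-matching cluster per keypoint
    # v_i[j] = number of keypoints whose closest center is cluster j.
    return np.bincount(nearest, minlength=len(centers))

# Example with random stand-ins: 300 descriptors, a 1000-word vocabulary.
rng = np.random.default_rng(0)
v = image_to_feature_vector(rng.random((300, 128)), rng.random((1000, 128)))
```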
Last Stage – Categorization by linear SVM • The positive training set: images of the target class • The negative training set: all other images • The SVM returns the linear separator (hyperplane) with maximal margin
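A sketch of the binary case with scikit-learn's LinearSVC (random histograms stand in for real feature vectors; note that LinearSVC fits a soft-margin linear separator, an approximation of the maximal-margin formulation described above):

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Stand-in feature vectors: 20 positive and 20 negative training images,
# each a 1000-bin keypoint histogram.
X = rng.integers(0, 20, size=(40, 1000))
y = np.array([1] * 20 + [-1] * 20)

clf = LinearSVC()          # learns a linear separating hyperplane
clf.fit(X, y)
print(clf.predict(X[:3]))  # labels (+1 / -1) for the first three vectors
```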
Training • For each class collect many example images: Iface(1)…Iface(~800), Icar(1)…Icar(~200), … • For each image Ii compute the feature vector vi and its label (1 or -1) → a collection of feature vectors and their labels • Train a multi-class linear SVM to get a function f(vi) ∈ {face, car, bikes, buildings, phone, trees, books} Testing • For each class collect some images to test • For a new image Ii compute the feature vector vi • Use the learned function f to compute the predicted label f(vi) (see the sketch below)
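A sketch of this train/test flow (stand-in data throughout; the class list follows the slide, and LinearSVC handles the multi-class case with one-vs-rest binary SVMs by default):

```python
import numpy as np
from sklearn.svm import LinearSVC

classes = ["face", "car", "bikes", "buildings", "phone", "trees", "books"]

rng = np.random.default_rng(0)
# Training: stand-in feature vectors v_i (10 images per class here; the
# real training sets are much larger, e.g. ~800 faces).
X_train = rng.integers(0, 20, size=(70, 1000))
y_train = np.repeat(classes, 10)
f = LinearSVC().fit(X_train, y_train)  # one-vs-rest multi-class by default

# Testing: compute v_i for a new image and predict its label f(v_i).
v_new = rng.integers(0, 20, size=(1, 1000))
print(f.predict(v_new))
```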
My Experiment – Training: Positive • 10 shoes
Training – Negative • 4 dogs • 5 faces • 6 phones • 5 cars • 1 fruit
Evaluating the Results • Match – a shoe identified as a shoe, or another object identified as a non-shoe • Mismatch – another object identified as a shoe, or a shoe identified as a non-shoe • 86% of the shoes were recognized • 21% were not categorized correctly
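A small sketch of how such a match rate can be computed (the label arrays below are placeholders, not the actual experimental data):

```python
import numpy as np

# Placeholder predictions for a shoe / non-shoe test set.
y_true = np.array(["shoe"] * 7 + ["other"] * 7)
y_pred = np.array(["shoe"] * 6 + ["other", "shoe"] + ["other"] * 6)

match_rate = np.mean(y_true == y_pred)  # matches as defined above
print(f"{match_rate:.0%} categorized correctly")
```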
Goals for the final presentation • Enlarge the database with challenging images • Add more than one class (multiple SVMs) • Check how the results depend on k (the number of clusters) • Compare the SVM with another classifier – Naïve Bayes