Fast IKSVM, or: How to make an intersection kernel SVM about as fast as a linear SVM
Alex Berg, Yahoo! Research & U.C. Berkeley
This is work at U.C. Berkeley with Subhransu Maji and Jitendra Malik
As D.A.F. just said: "Everyone sticks things into bins…"
Classification Methods for Detection/Recognition

Boosted decision trees / cascades
• PRO: (relatively) fast training, very fast evaluation
• Viola & Jones: faces, pedestrians, etc.
• Torralba et al.: multi-class with shared features (very fast evaluation)

SVM
• PRO: fast evaluation for a linear SVM; non-linear decision boundary with a kernel
• CON: slow training; slow evaluation with a kernel
• Linear SVM: Dalal & Triggs (pedestrian detection), Ramanan et al. (excellent on Pascal); both quite fast
• Kernelized SVM: Varma et al. (excellent Caltech 101/256 classification), Chum et al. (excellent Pascal VOC 2007 detection); both based partially on chi-squared kernels over histograms from Bosch et al., among other kernels, which perform very similarly to the intersection kernel used by Grauman and Lazebnik
Why kernelized SVMs are slow to evaluate

The decision value for a feature vector x to evaluate is a sum over all support vectors:
h(x) = sum_l alpha_l K(x, x_l) + b
where x_l is the feature corresponding to support vector l and each term requires a kernel evaluation.

Cost: (# support vectors) x (cost of one kernel computation)
For the intersection kernel, K(x, z) = sum_i min(x(i), z(i)), the cost of the standard implementation is (# support vectors m) x (dimension n) = m x n.
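To make the cost concrete, here is a minimal sketch of the naive evaluation in Python/NumPy; the function and variable names are illustrative, not the talk's released code. Each query pays one full intersection-kernel computation per support vector, i.e. roughly m x n operations.

```python
import numpy as np

def intersection_kernel(x, z):
    """Histogram intersection kernel: sum_i min(x(i), z(i))."""
    return np.minimum(x, z).sum()

def iksvm_decision_naive(x, support_vectors, alphas, bias):
    """Naive IKSVM decision value: one kernel evaluation per support vector.

    support_vectors: (m, n) array; alphas: (m,) signed coefficients.
    Cost is O(m * n): every query touches all m support vectors in all n dimensions.
    """
    return sum(a * intersection_kernel(x, sv)
               for a, sv in zip(alphas, support_vectors)) + bias
```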
The Trick

For the intersection kernel the sums can be exchanged:
h(x) = sum_l alpha_l sum_i min(x(i), x_l(i)) + b = sum_i h_i(x(i)) + b, where h_i(s) = sum_l alpha_l min(s, x_l(i))

By sorting the values x_l(i) in each coordinate, evaluating h_i(s) reduces to finding the index r of the largest x_l(i) less than s. With prefix sums of alpha_l x_l(i) and of alpha_l precomputed, the cost after finding r by binary search is 2 lookups, a multiply and an add.

Cost: n log m instead of n x m
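The following is a hedged sketch (assuming NumPy, with class and attribute names of my own choosing rather than the authors' implementation) of the exact O(n log m) evaluation just described: per-coordinate sorting plus precomputed running sums, so each coordinate costs one binary search, two lookups, a multiply and an add.

```python
import numpy as np

class FastExactIKSVM:
    """Exact IKSVM evaluation in O(n log m) per query instead of O(n m).

    For each dimension i, stores the sorted support-vector values and two
    running sums so that h_i(s) = sum_l alpha_l * min(s, x_l(i)) becomes
    two lookups, a multiply and an add after a binary search for the rank r.
    """

    def __init__(self, support_vectors, alphas, bias):
        sv = np.asarray(support_vectors, dtype=float)      # (m, n)
        alphas = np.asarray(alphas, dtype=float)           # (m,)
        self.bias = bias
        m, n = sv.shape
        order = np.argsort(sv, axis=0)                     # sort each coordinate
        self.sorted_vals = np.take_along_axis(sv, order, axis=0)   # (m, n)
        sorted_alphas = alphas[order]                               # (m, n)
        # A[r, i] = sum of alpha_l * x_l(i) over the r smallest values in coordinate i
        self.A = np.vstack([np.zeros(n),
                            np.cumsum(sorted_alphas * self.sorted_vals, axis=0)])
        # B[r, i] = sum of alpha_l over the remaining (larger) values in coordinate i
        self.B = sorted_alphas.sum(axis=0) - np.vstack(
            [np.zeros(n), np.cumsum(sorted_alphas, axis=0)])

    def decision(self, x):
        x = np.asarray(x, dtype=float)                     # (n,)
        h = 0.0
        for i in range(x.shape[0]):
            # rank r: number of support-vector values in coordinate i not above x[i]
            r = np.searchsorted(self.sorted_vals[:, i], x[i], side="right")
            h += self.A[r, i] + x[i] * self.B[r, i]        # h_i(x[i])
        return h + self.bias
```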
The Approximation
• The h_i are piecewise linear functions with # pieces = (number of support vectors) - 1
• Approximate with a piecewise linear function, but use uniform spacing of the breakpoints, so the rank lookup (binary search) is avoided (see the sketch below)
• Can also get away with many fewer segments than the number of support vectors -> BIG memory savings
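One possible realization of the uniform-spacing approximation, building on the FastExactIKSVM sketch above: each h_i is sampled at evenly spaced knots, so at query time the segment index comes from simple arithmetic rather than a binary search, and the table can use far fewer segments than there are support vectors. The assumed feature range [0, 1] (normalized histograms), the default of 30 segments, and the class name are illustrative assumptions.

```python
import numpy as np

class ApproxIKSVM:
    """Approximate IKSVM: each per-coordinate function h_i is tabulated on a uniform grid."""

    def __init__(self, exact_model, num_segments=30, lo=0.0, hi=1.0):
        # Sample the exact h_i at uniformly spaced knots in the assumed feature range.
        self.lo, self.hi = lo, hi
        self.step = (hi - lo) / num_segments
        self.knots = np.linspace(lo, hi, num_segments + 1)
        n = exact_model.sorted_vals.shape[1]
        self.table = np.empty((num_segments + 1, n))
        for i in range(n):
            for k, s in enumerate(self.knots):
                r = np.searchsorted(exact_model.sorted_vals[:, i], s, side="right")
                self.table[k, i] = exact_model.A[r, i] + s * exact_model.B[r, i]
        self.bias = exact_model.bias

    def decision(self, x):
        x = np.clip(np.asarray(x, dtype=float), self.lo, self.hi)
        t = (x - self.lo) / self.step                      # segment index by arithmetic
        k = np.minimum(t.astype(int), len(self.knots) - 2)
        frac = t - k                                       # position within the segment
        dims = np.arange(x.shape[0])
        lo_vals, hi_vals = self.table[k, dims], self.table[k + 1, dims]
        return float(np.sum((1 - frac) * lo_vals + frac * hi_vals)) + self.bias
```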
INRIA Pedestrian Detection Benchmark
We introduce a novel feature, somewhat simpler than HOG: a multi-level histogram of oriented energy. Used with an IKSVM it matches the best published results. Using a linear SVM instead does quite a bit worse than anything shown here; Dalal & Triggs use a linear SVM with a more carefully designed feature.
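The slide does not spell out how the feature is computed, so the following is only a rough sketch of what a multi-level histogram of oriented energy could look like: gradient energy binned by orientation and pooled over grids at several spatial resolutions. The number of orientation bins, the pyramid levels and the normalization are guesses for illustration, not the talk's actual parameters.

```python
import numpy as np

def multilevel_oriented_energy(image, num_orientations=9, levels=(1, 2, 4)):
    """Rough sketch of a multi-level histogram of oriented (gradient) energy."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    orientation = np.mod(np.arctan2(gy, gx), np.pi)        # unsigned gradient orientation
    bins = np.minimum((orientation / np.pi * num_orientations).astype(int),
                      num_orientations - 1)

    features = []
    h, w = img.shape
    for cells in levels:                                    # e.g. 1x1, 2x2, 4x4 grids
        for cy in range(cells):
            for cx in range(cells):
                ys = slice(cy * h // cells, (cy + 1) * h // cells)
                xs = slice(cx * w // cells, (cx + 1) * w // cells)
                hist = np.bincount(bins[ys, xs].ravel(),
                                   weights=magnitude[ys, xs].ravel(),
                                   minlength=num_orientations)
                features.append(hist / (hist.sum() + 1e-8))   # L1-normalize each cell
    return np.concatenate(features)
```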
Timing Results
Time to evaluate 10,000 feature vectors (timing plots not reproduced here)
What is the piecewise linear approximate IKSVM?
• The piecewise linear approximate IKSVM is a strict generalization of a linear classifier: a sum of piecewise linear functions of the coordinates instead of linear functions.
• Here it is obtained by SVM training of an IKSVM, but it could also be trained directly. The added flexibility (the weights are not tied together by support vectors) might be useful for very large training sets.
Conclusions
• Exactly evaluate an IKSVM in O(n log m) as opposed to O(nm)
• Makes SV cascades or other ordering schemes irrelevant for the intersection kernel
• Can approximate any function that decomposes by coordinate (including chi-squared); see the sketch after this list
• In experiments it is sufficient to use 30-50 linear segments vs. one per support vector
• Verified that IKSVM can offer classification performance advantages over a linear SVM
• Showed that relatively simple features with an IKSVM beat Dalal & Triggs and match the best part-based approach to pedestrian detection
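As a sketch of the "decomposes by coordinate" point: any additive kernel yields per-coordinate functions h_i(s) = sum_l alpha_l k(s, x_l(i)) that can be tabulated and interpolated exactly as above; only the piecewise-linear structure is specific to the intersection kernel. The helper and the chi-squared form below are illustrative, not code from the paper.

```python
import numpy as np

def tabulate_additive_kernel(support_vals_i, alphas, per_coord_kernel, knots):
    """Tabulate h_i(s) = sum_l alpha_l * k(s, x_l(i)) for any per-coordinate kernel k."""
    return np.array([np.sum(alphas * per_coord_kernel(s, support_vals_i))
                     for s in knots])

# Illustrative per-coordinate chi-squared kernel: k(a, b) = 2ab / (a + b).
chi2 = lambda a, b: 2.0 * a * b / (a + b + 1e-12)
```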
Advertisement for some work at Yahoo! Research with Tamara Berg
• Automatically find representative images of object categories: query Flickr for 10^4 to 10^5 images with a tag, and automatically find iconic images