Fast IKSVM, or: How to make an intersection kernel SVM about as fast as a linear SVM
Alex Berg, Yahoo! Research & U.C. Berkeley
This is work at U.C. Berkeley with Subhransu Maji and Jitendra Malik
As D.A.F. just said: "Everyone sticks things into bins…"
Classification Methods for Detection/Recognition

Boosted decision trees / cascades
• PRO: (relatively) fast training, very fast evaluation
• Viola & Jones: faces, pedestrians, etc.
• Torralba et al.: multi-class with shared features (very fast evaluation)

SVM
• PRO: fast evaluation for a linear SVM; non-linear decision boundary with a kernel
• CON: slow training; slow evaluation with a kernel
• Linear SVM: Dalal & Triggs (pedestrian detection), Ramanan et al. (excellent on Pascal); both quite fast
• Kernelized SVM: Varma et al. (excellent Caltech 101/256 classification), Chum et al. (excellent Pascal VOC 2007 detection); both based partially on chi-squared kernels over histograms from Bosch et al., among other kernels, which perform very similarly to the intersection kernel used by Grauman and Lazebnik
Why kernelized SVMs are slow to evaluate

The decision value for a feature vector x to evaluate is a sum over all support vectors:
h(x) = sum_l alpha_l K(x, x_l) + b
where x_l is the feature corresponding to support vector l and each term requires a kernel evaluation.

Cost: (# support vectors) x (cost of one kernel computation)
For the intersection kernel, K(x, z) = sum_i min(x(i), z(i)), the cost of the standard implementation is (# support vectors m) x (dimension n) = m x n.
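To make the cost concrete, here is a minimal sketch of the naive evaluation in Python/NumPy; the function and variable names are illustrative, not the talk's released code. Each query pays one full intersection-kernel computation per support vector, i.e. roughly m x n operations.

```python
import numpy as np

def intersection_kernel(x, z):
    """Histogram intersection kernel: sum_i min(x(i), z(i))."""
    return np.minimum(x, z).sum()

def iksvm_decision_naive(x, support_vectors, alphas, bias):
    """Naive IKSVM decision value: one kernel evaluation per support vector.

    support_vectors: (m, n) array; alphas: (m,) signed coefficients.
    Cost is O(m * n): every query touches all m support vectors in all n dimensions.
    """
    return sum(a * intersection_kernel(x, sv)
               for a, sv in zip(alphas, support_vectors)) + bias
```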
The Trick

For the intersection kernel the sums can be exchanged:
h(x) = sum_l alpha_l sum_i min(x(i), x_l(i)) + b = sum_i h_i(x(i)) + b, where h_i(s) = sum_l alpha_l min(s, x_l(i))

By sorting the values x_l(i) in each coordinate, evaluating h_i(s) reduces to finding the index r of the largest x_l(i) less than s. With prefix sums of alpha_l x_l(i) and of alpha_l precomputed, the cost after finding r by binary search is 2 lookups, a multiply and an add.

Cost: n log m instead of n x m
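The following is a hedged sketch (assuming NumPy, with class and attribute names of my own choosing rather than the authors' implementation) of the exact O(n log m) evaluation just described: per-coordinate sorting plus precomputed running sums, so each coordinate costs one binary search, two lookups, a multiply and an add.

```python
import numpy as np

class FastExactIKSVM:
    """Exact IKSVM evaluation in O(n log m) per query instead of O(n m).

    For each dimension i, stores the sorted support-vector values and two
    running sums so that h_i(s) = sum_l alpha_l * min(s, x_l(i)) becomes
    two lookups, a multiply and an add after a binary search for the rank r.
    """

    def __init__(self, support_vectors, alphas, bias):
        sv = np.asarray(support_vectors, dtype=float)      # (m, n)
        alphas = np.asarray(alphas, dtype=float)           # (m,)
        self.bias = bias
        m, n = sv.shape
        order = np.argsort(sv, axis=0)                     # sort each coordinate
        self.sorted_vals = np.take_along_axis(sv, order, axis=0)   # (m, n)
        sorted_alphas = alphas[order]                               # (m, n)
        # A[r, i] = sum of alpha_l * x_l(i) over the r smallest values in coordinate i
        self.A = np.vstack([np.zeros(n),
                            np.cumsum(sorted_alphas * self.sorted_vals, axis=0)])
        # B[r, i] = sum of alpha_l over the remaining (larger) values in coordinate i
        self.B = sorted_alphas.sum(axis=0) - np.vstack(
            [np.zeros(n), np.cumsum(sorted_alphas, axis=0)])

    def decision(self, x):
        x = np.asarray(x, dtype=float)                     # (n,)
        h = 0.0
        for i in range(x.shape[0]):
            # rank r: number of support-vector values in coordinate i not above x[i]
            r = np.searchsorted(self.sorted_vals[:, i], x[i], side="right")
            h += self.A[r, i] + x[i] * self.B[r, i]        # h_i(x[i])
        return h + self.bias
```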
The Approximation
• The h_i are piecewise linear functions with # pieces = (number of support vectors) - 1
• Approximate with a piecewise linear function, but use uniform spacing of the breakpoints, so the rank lookup (binary search) is avoided (see the sketch below)
• Can also get away with many fewer segments than the number of support vectors -> BIG memory savings
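One possible realization of the uniform-spacing approximation, building on the FastExactIKSVM sketch above: each h_i is sampled at evenly spaced knots, so at query time the segment index comes from simple arithmetic rather than a binary search, and the table can use far fewer segments than there are support vectors. The assumed feature range [0, 1] (normalized histograms), the default of 30 segments, and the class name are illustrative assumptions.

```python
import numpy as np

class ApproxIKSVM:
    """Approximate IKSVM: each per-coordinate function h_i is tabulated on a uniform grid."""

    def __init__(self, exact_model, num_segments=30, lo=0.0, hi=1.0):
        # Sample the exact h_i at uniformly spaced knots in the assumed feature range.
        self.lo, self.hi = lo, hi
        self.step = (hi - lo) / num_segments
        self.knots = np.linspace(lo, hi, num_segments + 1)
        n = exact_model.sorted_vals.shape[1]
        self.table = np.empty((num_segments + 1, n))
        for i in range(n):
            for k, s in enumerate(self.knots):
                r = np.searchsorted(exact_model.sorted_vals[:, i], s, side="right")
                self.table[k, i] = exact_model.A[r, i] + s * exact_model.B[r, i]
        self.bias = exact_model.bias

    def decision(self, x):
        x = np.clip(np.asarray(x, dtype=float), self.lo, self.hi)
        t = (x - self.lo) / self.step                      # segment index by arithmetic
        k = np.minimum(t.astype(int), len(self.knots) - 2)
        frac = t - k                                       # position within the segment
        dims = np.arange(x.shape[0])
        lo_vals, hi_vals = self.table[k, dims], self.table[k + 1, dims]
        return float(np.sum((1 - frac) * lo_vals + frac * hi_vals)) + self.bias
```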
INRIA Pedestrian Detection Benchmark
We introduce a novel feature, somewhat simpler than HOG: a multi-level histogram of oriented energy. Used with an IKSVM it matches the best published results. Using a linear SVM instead does quite a bit worse than anything shown here; Dalal & Triggs use a linear SVM with a more carefully designed feature.
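The slide does not spell out how the feature is computed, so the following is only a rough sketch of what a multi-level histogram of oriented energy could look like: gradient energy binned by orientation and pooled over grids at several spatial resolutions. The number of orientation bins, the pyramid levels and the normalization are guesses for illustration, not the talk's actual parameters.

```python
import numpy as np

def multilevel_oriented_energy(image, num_orientations=9, levels=(1, 2, 4)):
    """Rough sketch of a multi-level histogram of oriented (gradient) energy."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    orientation = np.mod(np.arctan2(gy, gx), np.pi)        # unsigned gradient orientation
    bins = np.minimum((orientation / np.pi * num_orientations).astype(int),
                      num_orientations - 1)

    features = []
    h, w = img.shape
    for cells in levels:                                    # e.g. 1x1, 2x2, 4x4 grids
        for cy in range(cells):
            for cx in range(cells):
                ys = slice(cy * h // cells, (cy + 1) * h // cells)
                xs = slice(cx * w // cells, (cx + 1) * w // cells)
                hist = np.bincount(bins[ys, xs].ravel(),
                                   weights=magnitude[ys, xs].ravel(),
                                   minlength=num_orientations)
                features.append(hist / (hist.sum() + 1e-8))   # L1-normalize each cell
    return np.concatenate(features)
```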
Timing Results
Time to evaluate 10,000 feature vectors (timing plots not reproduced here)
What is the piecewise linear approximate IKSVM?
• The piecewise linear approximate IKSVM is a strict generalization of a linear classifier: a sum of piecewise linear functions of the coordinates instead of linear functions.
• Here it is obtained by SVM training of an IKSVM, but it could also be trained directly. The added flexibility (the weights are not tied together by support vectors) might be useful for very large training sets.
Conclusions
• Exactly evaluate an IKSVM in O(n log m) as opposed to O(nm)
• Makes SV cascades or other ordering schemes irrelevant for the intersection kernel
• Can approximate any function that decomposes by coordinate (including chi-squared); see the sketch after this list
• In experiments it is sufficient to use 30-50 linear segments vs. one per support vector
• Verified that IKSVM can offer classification performance advantages over a linear SVM
• Showed that relatively simple features with an IKSVM beat Dalal & Triggs and match the best part-based approach to pedestrian detection
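As a sketch of the "decomposes by coordinate" point: any additive kernel yields per-coordinate functions h_i(s) = sum_l alpha_l k(s, x_l(i)) that can be tabulated and interpolated exactly as above; only the piecewise-linear structure is specific to the intersection kernel. The helper and the chi-squared form below are illustrative, not code from the paper.

```python
import numpy as np

def tabulate_additive_kernel(support_vals_i, alphas, per_coord_kernel, knots):
    """Tabulate h_i(s) = sum_l alpha_l * k(s, x_l(i)) for any per-coordinate kernel k."""
    return np.array([np.sum(alphas * per_coord_kernel(s, support_vals_i))
                     for s in knots])

# Illustrative per-coordinate chi-squared kernel: k(a, b) = 2ab / (a + b).
chi2 = lambda a, b: 2.0 * a * b / (a + b + 1e-12)
```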
Advertisement for some work at Yahoo! Research with Tamara Berg
• Automatically find representative images of object categories: query Flickr for 10^4 to 10^5 images with a tag, and automatically find iconic images