
Fast IKSVM or How to make an intersection kernel SVM about as fast as a linear SVM

This talk shows how to evaluate an intersection kernel SVM (IKSVM) about as fast as a linear SVM. The work was done at U.C. Berkeley by Alex Berg (Yahoo! Research and U.C. Berkeley) with Subhransu Maji and Jitendra Malik. It compares boosted decision trees/cascades with SVMs for detection and recognition, citing work by Viola and Jones, Torralba et al., and Dalal & Triggs; explains why kernelized SVMs are slow to evaluate; presents the trick that speeds up evaluation; and describes the piecewise linear approximate IKSVM and its results.


Presentation Transcript


  1. Fast IKSVM or How to make an intersection kernel SVM about as fast as a linear SVM. This is work at U.C. Berkeley with Subhransu Maji and Jitendra Malik. Alex Berg, Yahoo! Research & U.C. Berkeley

  2. As D.A.F. just said: “Everyone sticks things into bins…” Fast IKSVM or How to make an intersection kernel SVM about as fast as a linear SVM. This is work at U.C. Berkeley with Subhransu Maji and Jitendra Malik. Alex Berg, Yahoo! Research & U.C. Berkeley

  3. Classification Methods for Detection/Recognition. Boosted Decision Tree/Cascade: PRO very fast evaluation; CON slow training. SVM: PRO fast training, (relatively) fast evaluation for linear SVM, non-linear decision boundary (with a kernel); CON slow evaluation (with a kernel).

  4. Classification Methods for Detection/Recognition. Boosted Decision Tree/Cascade: PRO very fast evaluation; CON slow training. Examples: Viola, Jones (faces, pedestrians, etc.); Torralba et al (multi-class w/ shared features); very fast eval. SVM: PRO fast training, (relatively) fast evaluation for linear SVM, non-linear decision boundary (with a kernel); CON slow evaluation (with a kernel). Examples: Dalal & Triggs (ped. detection); Ramanan et al (excellent on Pascal*); both quite fast.

  5. Classification Methods for Detection/Recognition. Boosted Decision Tree/Cascade: PRO very fast evaluation; CON slow training. Examples: Viola, Jones (faces, pedestrians, etc.); Torralba et al (multi-class w/ shared features); very fast eval. SVM: PRO fast training, (relatively) fast evaluation for linear SVM, non-linear decision boundary (with a kernel); CON slow evaluation (with a kernel). Examples: Dalal & Triggs (ped. detection); Ramanan et al (excellent on Pascal*); both quite fast. Varma et al: excellent Caltech 101/256 classification. Chum et al: excellent Pascal VOC 2007 detection. Both are based partially on Chi squared on histograms, from Bosch et al, among other kernels; very similar in performance to the Intersection Kernel used by Grauman and Lazebnik.

  6. Why kernelized SVMs are slow to evaluate

  7. Why kernelized SVMs are slow to evaluate. Feature vector to evaluate

  8. Why kernelized SVMs are slow to evaluate. Feature vector to evaluate; Sum over all support vectors

  9. Why kernelized SVMs are slow to evaluate. Feature vector to evaluate; Sum over all support vectors; Kernel Evaluation

  10. Why kernelized SVMs are slow to evaluate. Feature corresponding to a support vector l; Feature vector to evaluate; Sum over all support vectors; Kernel Evaluation

  11. Why kernelized SVMs are slow to evaluate. Feature corresponding to a support vector l; Feature vector to evaluate; Sum over all support vectors; Kernel Evaluation. Cost is: # Support Vectors x Cost of kernel computation

  12. Why kernelized SVMs are slow to evaluate for the Intersection Kernel. Cost of standard implementation: # Support Vectors (m) x Dimension (n)
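
The equation images from these slides did not survive in the transcript. As a minimal sketch of the standard implementation the slide labels describe, assuming the usual SVM decision function h(x) = sum over l of a_l K(x, x_l) + b with the intersection kernel K(x, z) = sum over i of min(x(i), z(i)): the names `iksvm_decision_naive`, `coeffs` (standing for alpha_l times y_l) and the toy data are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def iksvm_decision_naive(x, support_vectors, coeffs, bias=0.0):
    """Evaluate h(x) = sum_l coeffs[l] * K(x, x_l) + bias.

    K is the intersection kernel: K(x, z) = sum_i min(x[i], z[i]).
    coeffs[l] stands for alpha_l * y_l (an assumed naming).
    Cost: (# support vectors m) x (dimension n).
    """
    # (m, n) array of elementwise minima, summed over dimensions -> m kernel values
    kernel_values = np.minimum(support_vectors, x).sum(axis=1)
    return float(coeffs @ kernel_values + bias)

# Toy data, made up for illustration: m = 3 support vectors, n = 2 dimensions
support_vectors = np.array([[0.2, 0.5], [0.6, 0.1], [0.4, 0.4]])
coeffs = np.array([1.0, -0.5, 0.25])
x = np.array([0.3, 0.3])

# Kernel values are 0.5, 0.4 and 0.6, so h(x) is approximately 0.45
print(iksvm_decision_naive(x, support_vectors, coeffs))
```

Every call touches all m x n entries of the support vector matrix, which is the cost the slide refers to.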

  13. The Trick

  14. The Trick Can exchange sums

  15. The Trick. Can exchange sums. By sorting the x_l(i), find the index r of the largest x_l less than x

  16. The Trick. Can exchange sums. By sorting the x_l(i), find the index r of the largest x_l less than x. Cost of finding r using binary search, plus 2 lookups, a multiply and an add…

  17. The Trick. Can exchange sums. By sorting the x_l(i), find the index r of the largest x_l less than x. Cost of finding r using binary search, plus 2 lookups, a multiply and an add… The cost goes from n x m to n log m
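
The trick can be sketched in code as follows. Exchanging the sums gives h(x) = sum over i of h_i(x(i)) + b, where h_i(s) = sum over l of a_l min(s, x_l(i)). With the values in dimension i sorted, the support vectors below s contribute a precomputed sum of a_l x_l(i), and the rest contribute s times a precomputed sum of a_l. This is a sketch under those assumptions; the function names and the prefix/suffix table layout are hypothetical, not the authors' code.

```python
import numpy as np

def precompute_exact(support_vectors, coeffs):
    """Sort each dimension once and build the two lookup tables."""
    m, n = support_vectors.shape
    order = np.argsort(support_vectors, axis=0)                    # (m, n)
    sorted_vals = np.take_along_axis(support_vectors, order, axis=0)
    sorted_coeffs = coeffs[order]                                  # (m, n)
    zeros = np.zeros((1, n))
    # prefix[r, i] = sum over the r smallest values in dim i of a_l * x_l(i)
    prefix = np.vstack([zeros, np.cumsum(sorted_coeffs * sorted_vals, axis=0)])
    # suffix[r, i] = sum of a_l over the remaining m - r values in dim i
    suffix = sorted_coeffs.sum(axis=0) - np.vstack([zeros, np.cumsum(sorted_coeffs, axis=0)])
    return sorted_vals, prefix, suffix

def iksvm_decision_fast(x, sorted_vals, prefix, suffix, bias=0.0):
    """Exact evaluation: per dimension, one binary search, 2 lookups, a multiply and an add."""
    result = bias
    for i in range(x.shape[0]):
        # r = number of support vector values in dimension i strictly below x[i]
        r = np.searchsorted(sorted_vals[:, i], x[i], side="left")
        result += prefix[r, i] + x[i] * suffix[r, i]
    return float(result)

# Check against the direct m x n evaluation on random data (made up for illustration)
rng = np.random.default_rng(0)
support_vectors = rng.random((200, 16))
coeffs = rng.normal(size=200)
x = rng.random(16)

tables = precompute_exact(support_vectors, coeffs)
fast = iksvm_decision_fast(x, *tables)
naive = float(coeffs @ np.minimum(support_vectors, x).sum(axis=1))
print(np.isclose(fast, naive))
```

The sorting is done once, after training, so each evaluation only pays for n binary searches over m values.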

  18. The Approximation • The h_i are piecewise linear functions with # pieces = number of support vectors - 1 • Approximate with piecewise linear functions, but use uniform spacing to avoid looking up the rank • Can also get away with many fewer segments than the number of support vectors -> BIG memory savings
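
A minimal sketch of the uniform-spacing idea, assuming features are scaled to a known range [lo, hi] (an assumption; the slides do not state the range). Each h_i is tabulated at equally spaced knots, so the segment index comes from arithmetic instead of a search. `build_tables` and `iksvm_decision_approx` are hypothetical names.

```python
import numpy as np

def build_tables(support_vectors, coeffs, num_segments=30, lo=0.0, hi=1.0):
    """Tabulate h_i(s) = sum_l coeffs[l] * min(s, x_l(i)) at uniformly spaced knots."""
    grid = np.linspace(lo, hi, num_segments + 1)                           # (K+1,)
    mins = np.minimum(grid[:, None, None], support_vectors[None, :, :])    # (K+1, m, n)
    return np.einsum("l,kli->ki", coeffs, mins)                            # (K+1, n)

def iksvm_decision_approx(x, tables, bias=0.0, lo=0.0, hi=1.0):
    """Linear interpolation between knots; no sorting or rank lookup is needed."""
    num_segments = tables.shape[0] - 1
    # Position of each feature in knot units, clipped to the tabulated range
    t = (np.clip(x, lo, hi) - lo) / (hi - lo) * num_segments
    k = np.minimum(t.astype(int), num_segments - 1)   # segment index per dimension
    frac = t - k                                      # position inside the segment
    dims = np.arange(x.shape[0])
    values = (1.0 - frac) * tables[k, dims] + frac * tables[k + 1, dims]
    return float(values.sum() + bias)

# Random data, made up for illustration
rng = np.random.default_rng(0)
support_vectors = rng.random((100, 8))
coeffs = rng.normal(size=100)
x = rng.random(8)

tables = build_tables(support_vectors, coeffs, num_segments=30)
approx = iksvm_decision_approx(x, tables)
exact = float(coeffs @ np.minimum(support_vectors, x).sum(axis=1))
print(approx, exact)
```

The table holds (segments + 1) x n numbers regardless of how many support vectors there are, which is where the memory saving comes from.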

  19. INRIA Pedestrian Detection Benchmark. We introduce a novel feature (somewhat simpler than HOG): multi-level histogram of oriented energy. Used with IKSVM, it matches the best. Using a linear SVM does quite a bit worse than anything shown here. Dalal uses a linear SVM with a more carefully designed feature

  20. Timing Results Time to evaluate 10,000 feature vectors

  21. Results Time to evaluate 10,000 feature vectors

  22. Results Time to evaluate 10,000 feature vectors

  23. Results Time to evaluate 10,000 feature vectors

  24. What is the piecewise linear approximate IKSVM? • The piecewise linear approximate IKSVM is a strict generalization of a linear classifier: a sum of piecewise linear functions instead of linear functions. • We are training with SVM training for an IKSVM, but it could also be trained directly. This added flexibility (weights are not tied together by support vectors) might be useful for very large training sets.

  25. Conclusions • Exactly evaluate IKSVM in O(n log m) as opposed to O(nm) • Makes SV cascade or other ordering schemes irrelevant for intersection kernel • Approximate any function that decomposes by coordinate (including Chi Squared) • In experiments, sufficient to use 30-50 linear segments vs one for each SV • Verified that IKSVM can offer classification performance advantages over linear • Showed that relatively simple features with IKSVM beat Dalal & Triggs and match the best part-based approach to pedestrian detection.
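
The point about functions that decompose by coordinate can be sketched for a Chi squared kernel. This assumes one common additive form, k(s, t) = 2st / (s + t) per coordinate; the slides do not say which form was used, so treat that choice, the [0, 1] feature range and all names below as assumptions.

```python
import numpy as np

def chi2_scalar(s, t):
    """Assumed per-coordinate chi-squared kernel: 2*s*t / (s + t), and 0 when s + t == 0."""
    denom = s + t
    safe = np.where(denom > 0, denom, 1.0)   # avoid dividing by zero
    return np.where(denom > 0, 2.0 * s * t / safe, 0.0)

def tabulate_additive(support_vectors, coeffs, scalar_kernel, num_segments=50, lo=0.0, hi=1.0):
    """Tabulate h_i(s) = sum_l coeffs[l] * k(s, x_l(i)) on a uniform grid."""
    grid = np.linspace(lo, hi, num_segments + 1)
    vals = scalar_kernel(grid[:, None, None], support_vectors[None, :, :])  # (K+1, m, n)
    return grid, np.einsum("l,kli->ki", coeffs, vals)

def evaluate_tables(x, grid, tables, bias=0.0):
    """Sum the per-dimension piecewise linear approximations (np.interp interpolates linearly)."""
    return float(sum(np.interp(x[i], grid, tables[:, i]) for i in range(x.shape[0])) + bias)

# Random data, made up for illustration
rng = np.random.default_rng(1)
support_vectors = rng.random((100, 8))
coeffs = rng.normal(size=100)
x = rng.random(8)

grid, tables = tabulate_additive(support_vectors, coeffs, chi2_scalar)
approx = evaluate_tables(x, grid, tables)
exact = float(coeffs @ chi2_scalar(x[None, :], support_vectors).sum(axis=1))
print(approx, exact)
```

Only the scalar kernel changes relative to the intersection case; the tabulate-then-interpolate evaluation is the same.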

  26. Advertisement for some work at Yahoo! Research with Tamara Berg • Automatically find representative images of object categories: • Query flickr for 10^4 to 10^5 images with a tag, automatically find iconic images
