
Similarity Metrics for Categorization: From Monolithic to Category Specific

This presentation traces similarity metrics for categorization from monolithic to category-specific approaches. It motivates the need for a meaningful similarity metric, introduces Multiple Similarity Learning (MuSL), which groups categories and learns one metric per group, and reviews boosting-based similarity learning and the MuSL boosting algorithm, which trains the metrics and groups the categories at the same time. It also reports that training more metrics can overfit.


Presentation Transcript


  1. Similarity Metrics for Categorization: From Monolithic to Category Specific • Boris Babenko, Steve Branson, Serge Belongie • University of California, San Diego • ICCV 2009, Kyoto, Japan

  2. Similarity Metrics for Categorization • Recognizing multiple categories • Need meaningful similarity metric / feature space

  3. Similarity Metrics for Categorization • Recognizing multiple categories • Need meaningful similarity metric / feature space • Idea: use training data to learn metric, plug into kNN • Goes by many names: • metric learning • cue combination/weighting • kernel combination/learning • feature selection
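The "learn a metric, plug into kNN" idea on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: `knn_classify` and `weighted_l1_similarity` are hypothetical names, and the hand-set feature weights stand in for whatever a metric-learning method would produce.

```python
from collections import Counter


def weighted_l1_similarity(weights):
    """Build a similarity function: negative weighted L1 distance.

    The weights play the role of a 'learned' metric (assumed here, set by hand).
    """
    def similarity(a, b):
        return -sum(w * abs(x - y) for w, x, y in zip(weights, a, b))
    return similarity


def knn_classify(query, dataset, similarity, k=1):
    """Label a query by majority vote among its k most similar labeled items.

    dataset is a list of (features, label); higher similarity means closer.
    """
    ranked = sorted(dataset, key=lambda item: similarity(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: feature 0 is a nuisance dimension, feature 1 carries the category.
dataset = [((0.0, 0.0), "a"), ((5.0, 0.1), "a"), ((0.1, 1.0), "b"), ((5.1, 0.9), "b")]
query = (5.0, 0.95)

informative = weighted_l1_similarity((0.0, 1.0))  # weights the useful feature
nuisance = weighted_l1_similarity((1.0, 0.0))     # weights the nuisance feature

label_with_good_metric = knn_classify(query, dataset, informative)
label_with_bad_metric = knn_classify(query, dataset, nuisance)
```

The same kNN routine gives different answers depending only on the metric passed in, which is why the choice of metric matters.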

  4. Similarity Metrics for Categorization • Learn a single global similarity metric • [Figure: a query image is compared against a labeled dataset (Category 1 to Category 4) using one monolithic similarity metric] • [Jones et al. ’03, Chopra et al. ’05, Goldberger et al. ’05, Shakhnarovich et al. ’05, Torralba et al. ’08]

  5. Similarity Metrics for Categorization • Learn similarity metric for each category (1-vs-all) • [Figure: a query image is compared against a labeled dataset (Category 1 to Category 4); approaches shown: Monolithic, Category Specific] • [Varma et al. ’07, Frome et al. ’07, Weinberger et al. ’08, Nilsback et al. ’08]

  6. How many should we train? • Per category: • More powerful • Do we really need thousands of metrics? • Have to train for new categories • Global/Monolithic: • Less powerful • Can generalize to new categories

  7. Multiple Similarity Learning (MuSL) • Would like to explore space between two extremes • Idea: • Group categories together • Learn a few similarity metrics, one for each super-category

  8. Multiple Similarity Learning (MuSL) • Learn a few good similarity metrics • [Figure: a query image is compared against a labeled dataset (Category 1 to Category 4); approaches shown: Monolithic, MuSL, Category Specific]

  9. Review of Boosting Similarity • Need some framework to work with… • Boosting has many advantages: • Feature selection • Easy implementation • Performs well • Can treat metric learning as binary classification

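  10. Notation • Training data: images and their category labels • Generate pairs: (image, image), 1 for a pair from the same category; (image, image), 0 for a pair from different categories • Sample negative pairs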
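A minimal sketch of the pair-generation step, assuming label 1 marks a same-category pair and label 0 a different-category pair, and assuming negatives are sampled uniformly at random. The function name `make_pairs` and the `neg_per_pos` ratio are illustrative choices, not taken from the slides.

```python
import random
from itertools import combinations


def make_pairs(labels, neg_per_pos=1, seed=0):
    """Return (i, j, same) triples over item indices.

    All same-category pairs are kept; different-category pairs are
    subsampled to at most neg_per_pos negatives per positive.
    """
    rng = random.Random(seed)
    positives, negatives = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            positives.append((i, j))
        else:
            negatives.append((i, j))
    n_neg = min(len(negatives), neg_per_pos * len(positives))
    sampled = rng.sample(negatives, n_neg)
    return [(i, j, 1) for i, j in positives] + [(i, j, 0) for i, j in sampled]


labels = ["cat", "cat", "dog", "dog", "bird"]
pairs = make_pairs(labels, neg_per_pos=1)
```

Subsampling matters because the number of different-category pairs grows much faster than the number of same-category pairs as categories are added.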

  11. Boosting Similarity • Train similarity metric/classifier:

  12. Boosting Similarity • Choose to be binary -- i.e. • = L1 distance over binary vectors • Can pre-compute for training data • Efficient to compute (XOR and sum) • For convenience: • [Shakhnarovich et al. ’05, Fergus et al. ’08]
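The "XOR and sum" remark can be illustrated as follows. Assuming each image has already been mapped to a vector of binary outputs (packing that vector into a Python integer is an implementation choice made here for illustration), the L1 distance between two such vectors equals the number of differing bits:

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into a single integer code."""
    code = 0
    for b in bits:
        code = (code << 1) | (1 if b else 0)
    return code


def binary_l1_distance(code_a, code_b):
    """L1 distance between two packed binary vectors: XOR, then count set bits."""
    return bin(code_a ^ code_b).count("1")


a = [1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 0]

# Codes can be computed once per training image and reused for every comparison.
code_a, code_b = pack_bits(a), pack_bits(b)
distance = binary_l1_distance(code_a, code_b)
```

Because the codes depend only on the image, they can be pre-computed for the whole training set, leaving just one XOR and one bit count per comparison.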

  13. Gradient Boosting • Given some objective function • Boosting = gradient ascent in function space • Gradient = example weights for boosting • [Figure: function space, showing the current strong classifier, the chosen weak classifier, and other weak classifiers] • [Friedman ’01, Mason et al. ’00]
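To make "gradient = example weights" concrete, here is a sketch under an assumed objective: the Bernoulli log-likelihood of the pair labels under a sigmoid of the current strong classifier's score. The transcript does not show the objective at this point, so this choice is an assumption. For it, the derivative with respect to each example's score is `label - sigmoid(score)`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(scores, labels):
    """Assumed objective: sum of log-probabilities of the correct labels."""
    total = 0.0
    for f, y in zip(scores, labels):
        p = sigmoid(f)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def example_weights(scores, labels):
    """Gradient of the objective with respect to each example's score.

    The sign says which way the score should move; the magnitude says
    how much the next weak classifier should care about this example.
    """
    return [y - sigmoid(f) for f, y in zip(scores, labels)]


scores = [0.5, -1.0, 2.0]  # current strong-classifier outputs (toy values)
labels = [1, 0, 0]
weights = example_weights(scores, labels)
```

The third example is badly misclassified (high score, label 0), so it receives the largest weight in magnitude, matching the usual boosting intuition that hard examples get more attention.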

  14. MuSL Boosting • Goal: train and recover mapping • At runtime • To compute similarity of query image to use • [Figure: Category 1, Category 2, Category 3, Category 4]
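One reading of the runtime step, sketched with hypothetical names: each category is mapped to one of a few learned metrics, and the similarity between a query and a labeled image is computed with the metric assigned to that image's category. The dictionary `assignment` and the two toy metrics below are placeholders for illustration, not the trained classifiers from the talk.

```python
def metric_first_feature(a, b):
    """Toy metric 0: compares only the first feature."""
    return -abs(a[0] - b[0])


def metric_second_feature(a, b):
    """Toy metric 1: compares only the second feature."""
    return -abs(a[1] - b[1])


def grouped_similarity(query, example, example_label, assignment, metrics):
    """Look up the metric assigned to the example's category, then apply it."""
    return metrics[assignment[example_label]](query, example)


metrics = [metric_first_feature, metric_second_feature]
# Hypothetical mapping: two flower categories share metric 0, a texture uses metric 1.
assignment = {"rose": 0, "daisy": 0, "brick": 1}

query = (1.0, 5.0)
example = (1.0, 9.0)
sim_as_rose = grouped_similarity(query, example, "rose", assignment, metrics)
sim_as_brick = grouped_similarity(query, example, "brick", assignment, metrics)
```

Only the labeled image's category is needed for the lookup, so the query's category does not have to be known in advance.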

  15. Naïve Solution • Run pre-processing to group categories (e.g. k-means), then train as usual • Drawbacks: • Hacky / not elegant • Not optimal: pre-processing not informed by class confusions, etc. • How can we train & group simultaneously?

  16. MuSL Boosting • Definitions: Sigmoid Function Parameter

  17. MuSL Boosting • Definitions:

  18. MuSL Boosting • Definitions: How well works with category

  19. MuSL Boosting • Objective function: • Each category “assigned” to classifier

  20. Approximating Max • Replace max with differentiable approx. where is a scalar parameter
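The formula on this slide is not preserved in the transcript. One common differentiable approximation with a single scalar parameter is the scaled log-sum-exp, shown here as an assumption rather than as the authors' exact choice; the name `eta` for the scalar is also assumed.

```python
import math


def smooth_max(values, eta=10.0):
    """Scaled log-sum-exp: (1/eta) * log(sum(exp(eta * v))).

    Computed relative to the true max for numerical stability.
    Larger eta gives a tighter approximation of max(values).
    """
    m = max(values)
    return m + math.log(sum(math.exp(eta * (v - m)) for v in values)) / eta


values = [0.2, 0.9, 0.5]
loose = smooth_max(values, eta=1.0)
tight = smooth_max(values, eta=100.0)
```

For n values this approximation always lies between the true max and the true max plus log(n)/eta, so the scalar trades smoothness against accuracy.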

  21. Pair Weights • Each training pair has weights

  22. Pair Weights • Intuition: Approximation of Difficulty of pair (like regular boosting)

  23. Evolution of Weights • [Figure: two panels, "Difficult Pair" and "Easy Pair", each annotated "Assigned to"; horizontal axis: boosting iteration]

  24. MuSL Boosting Algorithm for for - Compute weights - Train on weighted pairs end end Assign

  25. MuSL Results • Created dataset with hierarchical structure of categories • Merged categories from: • Caltech 101 [Griffin et al.] • Oxford Flowers [Nilsback et al.] • UIUC Textures [Lazebnik et al.]

  26. Recovered Super-categories • [Figure: super-categories recovered by MuSL and by k-means]

  27. Generalizing to New Categories • [Figure: two panels, "New categories only" and "Both new and old categories"] • Training more metrics overfits!

  28. Conclusions • Studied categorization performance vs number of learned metrics • Presented boosting algorithm to simultaneously group categories and train metrics • Observed overfitting behavior for novel categories

  29. Thank you! • Supported by • NSF CAREER Grant #0448615 • NSF IGERT Grant DGE-0333451 • ONR MURI Grant #N00014-08-1-0638 • UCSD FWGrid Project (NSF Infrastructure Grant no. EIA-0303622)
