
Sharing features for multi-class object detection

This study by Antonio Torralba and colleagues aims to enable a machine to recognize and detect thousands of different object classes as it looks around the world. The challenge lies in discriminating sparsely occurring objects from the background across many viewpoints. To address this, the authors propose sharing features across object classes, improving efficiency, accuracy, and generalization. The algorithm finds a shared vocabulary of parts, allowing computational cost to scale sub-linearly with the number of classes while generalizing better. Built on boosting and additive classification models, it lets features be shared naturally among different objects, paving the way for more effective multi-class object detection.



Presentation Transcript


  1. Sharing features for multi-class object detection Antonio Torralba, Kevin Murphy and Bill Freeman MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) Sept. 1, 2004

  2. The lead author: Antonio Torralba

  3. Goal • We want a machine to be able to identify thousands of different objects as it looks around the world.

  4. Desired detector outputs for multi-class object detection: local features vp are computed for one image patch, and a bank of per-class classifiers P(car | vp), P(person | vp), P(cow | vp), … decides for each class (bookshelf, screen, desk, …) whether the object is present ("no car", "no person", "no cow").

  5. Need to detect as well as to recognize. Recognition: name each object; here the problem is to discriminate between objects. Detection: localize screens, keyboards and bottles in the image; here the big problem is to differentiate the sparse objects against the background.

  6. There are efficient solutions for detecting a single object category and view (Viola & Jones 2001; Papageorgiou & Poggio 2000, …), detecting particular objects (Lowe, 1999), and detecting objects in isolation (Leibe & Schiele, 2003). But the problem of multi-class and multi-view object detection is still largely unsolved.

  7. Why multi-object detection is a hard problem: lots of variability within object classes, and across viewpoints, styles, lighting conditions, etc. We need to detect Nclasses * Nviews * Nstyles combinations.

  8. Existing approaches for multiclass object detection (vision community). Using a set of independent binary classifiers is the dominant strategy: a) one detector for each class (Schneiderman-Kanade multiclass object detection); b) the Viola-Jones extension for dealing with rotations: two cascades for each view.

  9. Promising approaches (Fei-Fei, Fergus, & Perona, 2003; Krempp, Geman, & Amit, 2002): look for a vocabulary of edges that reduces the number of features.

  10. Characteristics of one-vs-all multiclass approaches: cost. Computational cost grows linearly with Nclasses * Nviews * Nstyles. Surely, this will not scale well to 30,000 object classes.

  11. Characteristics of one-vs-all multiclass approaches: representation. What is the best representation to detect a traffic sign? It is a very regular object, so template matching will do the job (~100% detection rate with 0 false alarms). The parts are derived from training a binary classifier, and some of these parts can only be used for this object.

  12. Meaningful parts, in one-vs-all approaches. Part-based object representation (looking for meaningful parts): A. Agarwal and D. Roth; M. Weber, M. Welling and P. Perona; … Ullman, Vidal-Naquet, and Sali, 2004: features of intermediate complexity are most informative for (single-object) classification.

  13. Multi-class classifiers (machine learning community) • Error correcting output codes (Dietterich & Bakiri, 1995; …) • But only use classification decisions (1/-1), not real values. • Reducing multi-class to binary (Allwein et al, 2000) • Showed that the best code matrix is problem-dependent; don’t address how to design code matrix. • Bunching algorithm (Dekel and Singer, 2002) • Also learns code matrix and classifies, but more complicated than our algorithm and not applied to object detection. • Multitask learning (Caruana, 1997; …) • Train tasks in parallel to improve generalization, share features. But not applied to object detection, nor in a boosting framework.

  14. Our approach • Share features across objects, automatically selecting the best sharing pattern. • Benefits of shared features: • Efficiency • Accuracy • Generalization ability

  15. Algorithm goals, for object recognition. We want to find the vocabulary of parts that can be shared. We want to share across different objects generic knowledge about detecting objects (e.g., discriminating them from the background). We want to share computations across classes so that computational cost < O(number of classes).

  16. Independent features (object classes 1-4). Total number of hyperplanes (features): 4 x 6 = 24. Scales linearly with the number of classes.

  17. Shared features Total number of shared hyperplanes (features): 8 May scale sub-linearly with number of classes, and may generalize better.

  18. Objects: {R, b, 3}. Note: sharing is a graph, not a tree. This defines a vocabulary of parts shared across objects.

  19. At the algorithmic level • Our approach is a variation on boosting that allows for sharing features in a natural way. • So let's review boosting (AdaBoost demo).

  20. Boosting demo
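The gentle-boosting loop being reviewed here can be sketched in a few lines: at each round, fit a regression stump to the labels by weighted least squares, add it to the additive classifier, and reweight the examples with the exponential loss. This is a minimal illustrative sketch for a single 1-D feature, not the authors' implementation; the helper names are my own.

```python
import numpy as np

def fit_stump(x, y, w):
    # Weighted least-squares regression stump h(x) = a*[x > theta] + b;
    # for each candidate threshold, a and b have closed-form solutions
    # (weighted means of y on each side of the split).
    best = None
    for theta in np.unique(x):
        gt = x > theta
        b = np.average(y[~gt], weights=w[~gt])
        a = (np.average(y[gt], weights=w[gt]) if gt.any() else 0.0) - b
        err = np.sum(w * (y - (a * gt + b)) ** 2)
        if best is None or err < best[0]:
            best = (err, theta, a, b)
    return best[1:]

def gentle_boost(x, y, rounds=20):
    # y in {-1, +1}; additive model H(x) = sum_m h_m(x).
    H = np.zeros_like(x, dtype=float)
    stumps = []
    for _ in range(rounds):
        w = np.exp(-y * H)        # reweighting from the exponential loss
        w /= w.sum()
        theta, a, b = fit_stump(x, y, w)
        H += a * (x > theta) + b
        stumps.append((theta, a, b))
    return stumps, H
```

On linearly separable 1-D data the loop drives the training error to zero after the first round and then keeps increasing the margin.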

  21. Joint boosting, outside of the context of images. Additive models for classification: given feature responses, an additive classifier outputs a +1/-1 classification for each of the classes.

  22. Feature sharing in additive models • It is simple to have sharing between additive models, e.g. H1 = G1,2 + G1 and H2 = G1,2 + G2, where G1,2 collects the terms shared by both classes. • Each term hm can be mapped to a single feature.

  23. Flavors of boosting • Different boosting algorithms use different loss functions or minimization procedures (Freund & Schapire, 1995; Friedman, Hastie, & Tibshirani, 1998). • We base our approach on gentle boosting: it learns faster than the others (Friedman, Hastie, & Tibshirani, 1998; Lienhart, Kuranov, & Pisarevsky, 2003).

  24. JointBoosting. At each boosting round, we add a function to the classifier. We use the exponential multi-class cost function, summed over the classes, where the label (+1/-1) gives membership of each example in class c and the classifier output for class c is H(v, c).
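Written out, the multi-class exponential cost summed over classes and examples is (notation follows the CVPR 2004 paper: $z_i^c \in \{-1,+1\}$ is the membership of example $i$ in class $c$, and $H(v,c)$ is the additive classifier for class $c$):

```latex
J = \sum_{c=1}^{C} \sum_{i=1}^{N} \exp\!\left(-z_i^c \, H(v_i, c)\right),
\qquad
H(v, c) = \sum_{m=1}^{M} h_m(v, c)
```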

  25. Newton's method: treat hm as a perturbation, and expand the loss J to second order in hm. The classifier with the perturbation is then fit by a squared error with a reweighting of the training examples.
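Expanding $J$ to second order in the perturbation $h_m$, as the slide describes, replaces the exponential cost by a weighted squared error; this is the standard gentle-boosting step, with the weights coming from the current classifier:

```latex
J_{\mathrm{wse}} = \sum_{c=1}^{C} \sum_{i=1}^{N}
  w_i^c \left(z_i^c - h_m(v_i, c)\right)^2,
\qquad
w_i^c = e^{-z_i^c H(v_i, c)}
```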

  26. Joint Boosting: weighted squared error over the training data.

  27. Given a sharing pattern, the decision stump parameters are obtained analytically: for a trial sharing pattern, set the weak learner hm(v, c), a stump on the feature output v that steps from b to a + b at a threshold, to optimize the overall classification.
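The analytic solution on this slide can be written directly: given a threshold and a trial sharing set, the stump levels are weighted averages of the labels on each side of the split, and the classes outside the sharing set get a constant. This sketch follows the paper's closed forms, but the array layout and function name are my own.

```python
import numpy as np

def shared_stump_params(v, z, w, theta, shared):
    # v: (N,) feature outputs; z, w: (C, N) labels and weights.
    # For classes in `shared`, fit h_m(v, c) = a*[v > theta] + b by
    # weighted least squares; other classes get a constant k_c.
    gt = v > theta
    ws, zs = w[shared], z[shared]
    a_plus_b = np.sum(ws[:, gt] * zs[:, gt]) / np.sum(ws[:, gt])
    b = np.sum(ws[:, ~gt] * zs[:, ~gt]) / np.sum(ws[:, ~gt])
    a = a_plus_b - b
    k = {}
    for c in range(z.shape[0]):
        if c not in shared:
            k[c] = np.sum(w[c] * z[c]) / np.sum(w[c])
    return a, b, k
```

With uniform weights the levels reduce to plain label means on each side of the threshold, which makes the closed form easy to sanity-check by hand.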

  28. Joint Boosting: select the sharing pattern and weak learner to minimize the cost. The constants kc prevent sharing features just due to an asymmetry between positive and negative examples for each class; they only appear during training. (Figure: response histograms for background (blue) and class members (red), with stump outputs hm(v, c) and constants kc=1, kc=2, kc=5.) Algorithm details in CVPR 2004, Torralba, Murphy & Freeman.

  29. Approximate best sharing • Exhaustive search requires exploring 2^C - 1 possible sharing patterns • Instead we use a first-best search: • S = [] • 1) fit stumps for each class independently • 2) take the best class ci; S = [S ci]; fit stumps for [S ci] for each ci not in S; go to 2, until length(S) = Nclasses • 3) select the sharing with the smallest WLS error
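The first-best search above can be sketched as a greedy forward selection over sharing patterns. Here `wls_error` is a hypothetical callback standing in for the joint stump fit of the paper: it returns the weighted least-squares error of the best weak learner fit to a given subset of classes.

```python
def first_best_sharing(classes, wls_error):
    # Greedy forward (first-best) search: grow the sharing set one class
    # at a time, keeping one candidate per nesting level, then return the
    # candidate with the smallest WLS error.
    S, candidates = [], []
    remaining = list(classes)
    while remaining:
        best_c = min(remaining, key=lambda c: wls_error(S + [c]))
        S = S + [best_c]
        remaining.remove(best_c)
        candidates.append(S)
    return min(candidates, key=wls_error)
```

This explores O(C^2) subsets per round instead of the 2^C - 1 required by exhaustive search.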

  30. Effect of pattern of feature sharing on number of features required (synthetic example)

  31. Effect of pattern of feature sharing on number of features required (synthetic example)

  32. 2-d synthetic example 3 classes + 1 background class

  33. No feature sharing Three one-vs-all binary classifiers This is with only 8 separation lines

  34. With feature sharing Some lines can be shared across classes. This is with only 8 separation lines

  35. The shared features

  36. Comparison of the classifiers Shared features: note better isolation of individual classes. Non-shared features.

  37. Now, apply this to images. Image features (weak learners): from a 32x32 training image of an object, take a 12x12 patch gf(x), normalized to mean = 0 and energy = 1, together with a binary mask wf(x) encoding the location of that patch within the 32x32 object.
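The feature computation on this slide can be sketched as: correlate the image with the normalized patch, rectify the response, and pool it within the binary position mask. This is a naive numpy sketch; the exponent `p` and the mask pooling are assumptions about details not spelled out on the slide.

```python
import numpy as np

def normalize_patch(g):
    # Mean = 0, energy = 1, as on the slide.
    g = g - g.mean()
    return g / np.sqrt((g ** 2).sum())

def feature_response(image, patch, mask, p=2):
    # Naive valid-mode correlation of the image with the normalized
    # patch, rectified by |.|**p, then averaged inside the binary mask.
    g = normalize_patch(patch)
    H, W = image.shape
    h, w = g.shape
    resp = np.zeros((H - h + 1, W - w + 1))
    for i in range(resp.shape[0]):
        for j in range(resp.shape[1]):
            resp[i, j] = np.abs((image[i:i+h, j:j+w] * g).sum()) ** p
    m = mask[:resp.shape[0], :resp.shape[1]]   # mask over valid positions
    return (resp * m).sum() / max(m.sum(), 1)
```

A real detector would use FFT-based filtering rather than the explicit double loop; the loop is only there to make the operation explicit.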

  38. The candidate features: a patch template plus a position mask.

  39. The candidate features (patch template and position mask): a dictionary of 2000 candidate patches and position masks, randomly sampled from the training images.

  40. Database of 2500 images, with annotated instances.

  41. Multiclass object detection: 21 objects. We use [20, 50] training samples per object, and about 20 times as many background examples as object examples.

  42. Feature sharing at each boosting round during training

  43. Feature sharing at each boosting round during training

  44. Example shared feature (weak classifier). At each round of running joint boosting on the training set we get a feature and a sharing pattern. (Figure: response histograms for background (blue) and class members (red).)

  45. Shared feature Non-shared feature

  46. Shared feature Non-shared feature

  47. How the features were shared across objects (features sorted left-to-right from generic to specific)

  48. Performance evaluation: ROC curves plot the correct detection rate against the false alarm rate; we report the area under the ROC (shown: 0.9).
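The area under the ROC used for evaluation equals the probability that a randomly chosen positive example outscores a randomly chosen negative one, which gives a compact way to compute it. A sketch, not the authors' evaluation code:

```python
import numpy as np

def roc_auc(scores_pos, scores_neg):
    # AUC as the rank statistic: P(score_pos > score_neg),
    # counting ties as one half.
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))
```

An AUC of 1.0 means perfect separation; 0.5 is chance level, and the 0.9 shown on the slide means 90% of positive/negative pairs are ranked correctly.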

  49. Performance improvement over training: there is a significant benefit to sharing features using joint boosting.

  50. ROC curves for our 21 object database • How will this work under training-starved or feature-starved conditions? • Presumably, in the real world, we will always be starved for training data and for features.
