Object Recognition with Informative Features and Linear Classification

Object Recognition with Informative Features and Linear Classification Authors: Vidal-Naquet & Ullman Presenter: David Bradley

Vs. • Image fragments make good features • especially when training data is limited • Image fragments contain more information than wavelets • allows for simpler classifiers • Information theory framework for feature selection

Intermediate complexity

What’s in a feature? • You and your favorite learning algorithm settle down for a nice game of 20 questions • Except since it is a learning algorithm it can’t talk, and the game really becomes 20 answers: • Have you asked the right questions? • What information are you really giving it? • How easy will it be for it to say “Aha, you are thinking of the side view of a car!” 10110010110000111001

“Pseudo-Inverse” • In general image reconstruction from features provides a good intuition of what information they are providing

Wavelet coefficients • Asks the question “how much is the current block of pixels like my wavelet pattern?” • This set of wavelets can entirely represent a 2x2 pixel block: • So if you give your learning algorithm all of the wavelet coefficients then you have given it all of the information it could possibly need, right?

Initial 2-feature Classifier Sometimes wavelets work well • Viola and Jones Face Detector • Trained on 24x24 pixel windows • Cascade Structure (32 classifiers total): • Initial 2-feature classifier rejects 60% of non-faces • Second, 5-feature classifier rejects 80% of non-faces

But they can require a lot of training data to use correctly • Rest of the Viola and Jones Face Detector • 3 20-feature classifiers • 2 50-feature classifiers • 20 200-feature classifiers • In the later stages it is tough to learn what combinations of wavelet questions to ask. • Surely there must be an easier way…

Image fragments • Represent the opposite extreme • Wavelets are basic image building blocks. • Fragments are highly specific to the patterns they come from • Present in the image if cross-correlation > threshold • Ideally if one could label all possible images (and search them quickly): • Use whole images as fragments • All vision problems become easy • Just look for the match

Dealing with the non-ideal world • Want to find fragments that: • Generalize well • Are specific to the class • Add information that other fragments haven’t already given us. • What metric should we use to find the best fragments?

Information Theory Review • Entropy: the minimum # of bits required to encode a signal Shannon Entropy Conditional Entropy

Mutual Information Entropy Class Conditional Entropy • I(C, F) = H(C) – H(C|F) • High mutual information means that knowing the feature value reduces the number of bits needed to encode the class Feature

Picking features with Mutual Information • Not practical to exhaustively search for the combination of features with the highest mutual information. • Instead do a greedy search for the feature whose minimum pair-wise information gain with the feature set already chosen is the highest.

Picking features with Mutual Information X2 X1 Low pair-wise information gain indicates variables are dependent Pick the most pair-wise independent variable X3 X4

Features picked for cars

The Details • Image Database • 573 14x21 pixel car side-view images • Cars occupied approx 10x15 pixels • 461 14x21 pixel non-car images • 4 classifiers were trained for 20 cross-validation iterations to generate results • 200 car and 200 non-car images in the training set • 100 car images to extract fragments from

Features • Extracted 59200 fragments from the first 100 images • 4x4 to 10x14 pixel image patches • Taken from the 10x15 pixel region containing the car. • Location restricted to a 5x5 area around original location • Used 2 scales of wavelets from the 10x15 region • Selected 168 features total

Classifiers • Linear SVM • Tree Augmented Network (TAN) • Models feature’s class dependency and biggest pairwise feature dependency • Quadratic decision surface in feature space

Occasional information loss due to overfitting

More Information About Fragments • Torralba et al. Sharing Visual Features for Multiclass and Multiview Object Detection. CVPR 2004. • http://web.mit.edu/torralba/www/extendedCVPR2004.pdf • ICCV Short Course (great matlab demo) • http://people.csail.mit.edu/torralba/iccv2005/

Objections • Wavelet features chosen are very weak • Images were very low resolution, maybe too low-res for more complicated wavelets • Data set is too easy • Side-views of cars have low intra-class variability • Cars and faces have very stable and predictable appearances • not hard enough to stress the fragment + linear SVM classifier, so TAN shows no improvement. • Didn’t compare fragments against successful wavelet application • Schneiderman & Kanade car detector • Do the fragment-based classifiers effectively get 100 more training images?

Object Recognition with Informative Features and Linear Classification

Object Recognition with Informative Features and Linear Classification

Presentation Transcript

OBJECT RECOGNITION

Object Recognition from Local Scale-Invariant Features

Object recognition

Object Recognition with Informative Features and Linear Classification

Object Tracking/Recognition using Invariant Local Features

Building text features for object image classification

Linear Classification with Perceptrons

Object Recognition

Object Recognition

Object Recognition

Insight 2+: Representations for object recognition and classification

Object Recognition with Invariant Features

Object recognition

Object Recognition

Object recognition

Object Recognition

Object recognition

Object Recognition with Invariant Features

Object Class Recognition Using Discriminative Local Features

Object Recognition

Insight 2+: Representations for object recognition and classification