Statistical Object Recognition

CS 636 Computer Vision • Statistical Object Recognition • Nathan Jacobs • Slides adapted from Lazebnik

Administrivia • Project 4 • Final Project • Next Class is Cancelled

Overview • Statistical Recognition • Generative vs. Discriminative Learning • Bag of Features Models

Statistical Recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Kristen Grauman

Steps for statistical recognition • Representation • Specify the model for an object category • Bag of features, part-based, global, etc. • Learning • Given a training set, find the parameters of the model • Generative vs. discriminative • Recognition • Apply the model to a new test image

Object categorization: the statistical viewpoint • MAP decision: $P(\text{zebra} \mid \text{image})$ vs. $P(\text{no zebra} \mid \text{image})$

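Object categorization: the statistical viewpoint • MAP decision: $P(\text{zebra} \mid \text{image})$ vs. $P(\text{no zebra} \mid \text{image})$ • Bayes rule: $P(\text{zebra} \mid \text{image}) \propto P(\text{image} \mid \text{zebra}) \, P(\text{zebra})$ (posterior is proportional to likelihood times prior)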

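Object categorization: the statistical viewpoint • Bayes rule: $P(\text{zebra} \mid \text{image}) \propto P(\text{image} \mid \text{zebra}) \, P(\text{zebra})$ (posterior is proportional to likelihood times prior) • Discriminative methods: model the posterior • Generative methods: model the likelihood and the prior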
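As a rough illustration of the MAP decision and Bayes rule, the sketch below turns per-class likelihoods and priors into posteriors and picks the class with the largest posterior. The function names and every probability value are hypothetical, chosen only to show a case where a strong prior outweighs a larger likelihood.

```python
def posterior(likelihoods, priors):
    """Apply Bayes rule: P(class | image) is proportional to
    P(image | class) * P(class), normalized over all classes."""
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # P(image), the normalizing constant
    return {c: joint[c] / evidence for c in joint}


def map_decision(likelihoods, priors):
    """Return the class with the maximum a posteriori probability."""
    post = posterior(likelihoods, priors)
    return max(post, key=post.get)


# Hypothetical numbers: the image is 4x more likely under "zebra",
# but zebras are assumed to be rare a priori.
likelihoods = {"zebra": 0.02, "no zebra": 0.005}
priors = {"zebra": 0.1, "no zebra": 0.9}

post = posterior(likelihoods, priors)
decision = map_decision(likelihoods, priors)
print(post, decision)
```

With these assumed values the prior dominates, so the MAP decision is "no zebra" even though the likelihood favors "zebra".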

Discriminative methods • Direct modeling of the posterior $P(\text{zebra} \mid \text{image})$ • [Figure: decision boundary separating zebra from non-zebra examples]

Generative methods • Model the likelihoods $P(\text{image} \mid \text{zebra})$ and $P(\text{image} \mid \text{no zebra})$

Generative vs. discriminative learning • [Figure: generative learning models the class densities; discriminative learning models the posterior probabilities]

Generative vs. discriminative methods • Generative methods + Can sample from them / compute how probable any given model instance is + Can be learned using images from just a single category – Sometimes we don’t need to model the likelihood when all we want is to make a decision • Discriminative methods + Efficient + Often produce better classification rates – Require positive and negative training data – Can be hard to interpret
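The generative advantages listed above (sampling, scoring how probable an instance is, learning from one category at a time) can be sketched with a very small model. This example assumes each image is summarized by a single scalar feature and that each class density is Gaussian; both assumptions, the class name, and the feature values are illustrative and do not come from the slides.

```python
import math
import random
import statistics


class GaussianClassModel:
    """Generative model of one class: a 1-D Gaussian density plus a prior.
    It is fit using only the samples of its own class."""

    def __init__(self, samples, prior):
        self.mean = statistics.fmean(samples)
        self.std = statistics.stdev(samples)
        self.prior = prior

    def likelihood(self, x):
        """Density p(x | class) under the fitted Gaussian."""
        z = (x - self.mean) / self.std
        return math.exp(-0.5 * z * z) / (self.std * math.sqrt(2 * math.pi))

    def sample(self, rng):
        """Draw a new feature value from the class density."""
        return rng.gauss(self.mean, self.std)


def classify(x, models):
    """Pick the class maximizing likelihood * prior (the MAP decision)."""
    return max(models, key=lambda c: models[c].likelihood(x) * models[c].prior)


# Hypothetical scalar features for two classes.
zebra_feats = [4.8, 5.1, 5.3, 4.9, 5.4]
other_feats = [1.9, 2.4, 2.1, 1.7, 2.2]
models = {
    "zebra": GaussianClassModel(zebra_feats, prior=0.5),
    "no zebra": GaussianClassModel(other_feats, prior=0.5),
}

label = classify(4.7, models)
new_feature = models["zebra"].sample(random.Random(0))
print(label, new_feature)
```

A discriminative method would skip the per-class densities and fit the decision rule directly, which is why it needs both positive and negative examples.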

Generalization • How well does a learned model generalize from the data it was trained on to a new test set? • Underfitting: model is too “simple” to represent all the relevant class characteristics • High training error and high test error • Overfitting: model is too “complex” and fits irrelevant characteristics (noise) in the data • Low training error and high test error • Occam’s razor: given two models that represent the data equally well, the simpler one should be preferred

Occam’s razor: why is it a useful heuristic? • [Figure panels: 1NN; 5NN; logistic regression on features (x, y, x^2, y^2); logistic regression on features (x, y, sqrt(x^2 + y^2))]
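To make the overfitting pattern concrete, here is a minimal k-nearest-neighbor sketch comparing 1NN and 5NN on a toy 1-D problem. The data are invented for illustration: class 0 lies at low values, class 1 at high values, and one training sample (at 4.0) is deliberately mislabeled to act as noise.

```python
from collections import Counter


def knn_predict(x, train, k):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def error_rate(data, train, k):
    """Fraction of (x, label) pairs in data that k-NN misclassifies."""
    wrong = sum(1 for x, y in data if knn_predict(x, train, k) != y)
    return wrong / len(data)


# Toy training set; the point at 4.0 carries a noisy label.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 0), (6.0, 0),
         (7.0, 1), (8.0, 1), (9.0, 1), (10.0, 1), (11.0, 1), (12.0, 1)]
# Test labels follow the underlying rule, not the noise.
test = [(3.9, 0), (4.2, 0), (5.4, 0), (6.8, 1), (9.5, 1)]

for k in (1, 5):
    print(k, error_rate(train, train, k), error_rate(test, train, k))
```

On this toy data 1NN memorizes the noisy sample: its training error is zero, yet it misclassifies the test points next to the mislabeled one. 5NN makes some training errors but smooths over the noise on the test points.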

Supervision • Images in the training set must be annotated with the “correct answer” that the model is expected to produce • [Figure: example image annotated “Contains a motorbike”]

Fully supervised • “Weakly” supervised • Unsupervised • Definition depends on task

Face Recognition • N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, "Attribute and Simile Classifiers for Face Verification," ICCV 2009.

Face Recognition • Attributes for training • Similes for training • N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, "Attribute and Simile Classifiers for Face Verification," ICCV 2009.

Face Recognition • Results on Labeled Faces in the Wild Dataset • N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, "Attribute and Simile Classifiers for Face Verification," ICCV 2009.

What task? • Classification • Object present/absent in image • Background may be correlated with object • Localization / Detection • Localize object within the frame • Bounding box or pixel-level segmentation

Datasets • Circa 2001: 5 categories, 100s of images per category • Circa 2004: 101 categories • Today: thousands of categories, tens of thousands of images

Caltech 101 & 256 • Caltech-101: http://www.vision.caltech.edu/Image_Datasets/Caltech101/ (Fei-Fei, Fergus, Perona, 2004) • Caltech-256: http://www.vision.caltech.edu/Image_Datasets/Caltech256/ (Griffin, Holub, Perona, 2007)

The PASCAL Visual Object Classes Challenge (2005-2009) http://pascallin.ecs.soton.ac.uk/challenges/VOC/ • 2008 Challenge classes: • Person: person • Animal: bird, cat, cow, dog, horse, sheep • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

The PASCAL Visual Object Classes Challenge (2005-2009) • Main competitions • Classification: For each of the twenty classes, predicting presence/absence of an example of that class in the test image • Detection: Predicting the bounding box and label of each object from the twenty target classes in the test image http://pascallin.ecs.soton.ac.uk/challenges/VOC/

The PASCAL Visual Object Classes Challenge (2005-2009) • “Taster” challenges • Segmentation: Generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise • Person layout: Predicting the bounding box and label of each part of a person (head, hands, feet) http://pascallin.ecs.soton.ac.uk/challenges/VOC/

Lotus Hill Research Institute image corpus http://www.imageparsing.com/ Z.Y. Yao, X. Yang, and S.C. Zhu, 2007

Labeling with games http://www.gwap.com/gwap/ L. von Ahn, L. Dabbish, 2004; L. von Ahn, R. Liu and M. Blum, 2006

LabelMe http://labelme.csail.mit.edu/ Russell, Torralba, Murphy, Freeman, 2008

80 Million Tiny Images http://people.csail.mit.edu/torralba/tinyimages/

Dataset issues • How large is the degree of intra-class variability? • How “confusable” are the classes? • Is there bias introduced by the background? I.e., can we “cheat” just by looking at the background and not the object?

Caltech-101

Steps for statistical recognition • Representation • Specify the model for an object category • Bag of features, part-based, global, etc. • Learning • Given a training set, find the parameters of the model • Generative vs. discriminative • Recognition • Apply the model to a new test image

Bag-of-features models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Overview: Bag-of-features models • Origins and motivation • Image representation • Discriminative methods • Nearest-neighbor classification • Support vector machines • Generative methods • Naïve Bayes • Probabilistic Latent Semantic Analysis • Extensions: incorporating spatial information

Origin 1: Texture recognition • Texture is characterized by the repetition of basic elements or textons • For stochastic textures, it is the identity of the textons, not their spatial arrangement, that matters • Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Origin 1: Texture recognition • [Figure: each texture represented as a histogram over a universal texton dictionary] • Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Origin 2: Bag-of-words models • Unordered document representation: frequencies of words from a dictionary Salton & McGill (1983)

Origin 2: Bag-of-words models • Unordered document representation: frequencies of words from a dictionary Salton & McGill (1983) • Example: US Presidential Speeches Tag Cloud http://chir.ag/phernalia/preztags/
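The unordered document representation can be sketched in a few lines: count how often each dictionary word occurs and discard word order. The dictionary and the sample sentences below are made up for illustration.

```python
import re
from collections import Counter


def bag_of_words(text, dictionary):
    """Return the frequency of each dictionary word in text, ignoring order."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in dictionary)
    return [counts[w] for w in dictionary]


# Hypothetical dictionary and documents.
dictionary = ["freedom", "nation", "people", "war"]
doc = "The people of this nation value freedom; freedom unites the people."
shuffled = "Freedom unites the people; the people of this nation value freedom."

vector = bag_of_words(doc, dictionary)
print(vector)  # → [2, 1, 2, 0]
```

Because only frequencies are kept, `bag_of_words(shuffled, dictionary)` gives the same vector as the original sentence.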

Bags of features for image classification • Extract features

Bags of features for image classification • Extract features • Learn “visual vocabulary”

Bags of features for image classification • Extract features • Learn “visual vocabulary” • Quantize features using visual vocabulary

Bags of features for image classification • Extract features • Learn “visual vocabulary” • Quantize features using visual vocabulary • Represent images by frequencies of “visual words”
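The last two steps above (quantize, then represent by frequencies) can be sketched as follows, assuming the visual vocabulary has already been learned. The 2-D descriptors and the three-word vocabulary are toy values; real descriptors such as SIFT are much higher-dimensional.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def bag_of_features(descriptors, vocabulary):
    """Normalized histogram of visual-word frequencies for one image."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical vocabulary of three visual words and four image descriptors.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.2), (1.1, -0.1), (0.2, 0.8)]

histogram = bag_of_features(descriptors, vocabulary)
print(histogram)  # → [0.25, 0.5, 0.25]
```

Normalizing by the number of descriptors is one possible choice; it makes histograms comparable across images with different feature counts.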

1. Feature extraction • Regular grid • Vogel & Schiele, 2003 • Fei-Fei & Perona, 2005

1. Feature extraction • Regular grid • Vogel & Schiele, 2003 • Fei-Fei & Perona, 2005 • Interest point detector • Csurka et al. 2004 • Fei-Fei & Perona, 2005 • Sivic et al. 2005

1. Feature extraction • Regular grid • Vogel & Schiele, 2003 • Fei-Fei & Perona, 2005 • Interest point detector • Csurka et al. 2004 • Fei-Fei & Perona, 2005 • Sivic et al. 2005 • Other methods • Random sampling (Vidal-Naquet & Ullman, 2002) • Segmentation-based patches (Barnard et al. 2003)
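Of the sampling strategies listed, the regular grid is the simplest to sketch. The function below assumes a grayscale image stored as a list of rows; the function name, patch size, and stride are illustrative choices rather than values from the cited papers.

```python
def grid_patches(image, patch_size, stride):
    """Extract square patches on a regular grid.

    Returns a list of ((top, left), patch) pairs, where patch is a
    patch_size x patch_size list of rows copied from the image.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(((top, left), patch))
    return patches


# Toy 4 x 6 "image" whose pixel value encodes its position.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
patches = grid_patches(image, patch_size=2, stride=2)
print(len(patches), patches[0])
```

A stride smaller than the patch size would give overlapping patches; each patch would then be passed to a descriptor before quantization.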

1. Feature extraction • Detect patches [Mikolajczyk and Schmid ’02] [Matas, Chum, Urban & Pajdla ’02] [Sivic & Zisserman ’03] • Normalize patch • Compute SIFT descriptor [Lowe ’99] • Slide credit: Josef Sivic

… 1. Feature extraction

… 2. Learning the visual vocabulary
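The slide text does not name the clustering method, so the sketch below assumes k-means, a common choice for building a visual vocabulary: cluster the training descriptors and use the cluster centers as visual words. The deterministic initialization (evenly spaced training points) is a simplification for reproducibility; random initialization is more typical. The 2-D points are toy data.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns k cluster centers (the visual words)."""
    # Simplified deterministic init: k evenly spaced training points.
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers


# Toy descriptors forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
vocabulary = kmeans(points, k=2)
print(vocabulary)
```

Each returned center plays the role of one visual word; new descriptors are then quantized to their nearest center.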
