Object Recognizing

Object Classes

Individual Recognition

Object parts / Full Interpretation
Labeled parts: window, mirror, window, door knob, headlight, back wheel, bumper, front wheel, headlight

Action recognition (except 2)

Class Non-class

Is this an airplane?

Features and Classifiers
• Same features with different classifiers
• Same classifier with different features

Generic Features
• Simple (wavelets)
• Complex (Geons)

Marr-Nishihara

Mental Rotation

3-D Parts
• Implementations: poor results
• View-specific recognition
• fMRI studies
• Instead: using image patches

Class-specific Features: Common Building Blocks

Optimal Class Components?
• Large features are too rare
• Small features are found everywhere
Find features that carry the highest amount of information

Mutual information
[Figure: H(C), split by the feature into H(C) when F=0 and H(C) when F=1]
I(C;F) = H(C) - H(C|F)

Mutual Information I(C;F)
Class:   1 1 0 1 0 1 0 0
Feature: 1 0 0 1 1 1 0 0
I(C;F) = H(C) - H(C|F)
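
The class and feature sequences above can be plugged directly into the formula. The following is a minimal sketch, not part of the original slides; the function names `entropy` and `mutual_information` are illustrative, and entropy is measured in bits (log base 2), which is an assumption since the slide does not state the unit.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def mutual_information(c, f):
    """I(C;F) = H(C) - H(C|F) for two equal-length label sequences."""
    n = len(c)
    h_c_given_f = 0.0
    for value in set(f):
        # Class labels of the examples where the feature takes this value.
        subset = [ci for ci, fi in zip(c, f) if fi == value]
        h_c_given_f += (len(subset) / n) * entropy(subset)
    return entropy(c) - h_c_given_f

# The eight examples from the slide.
C = [1, 1, 0, 1, 0, 1, 0, 0]
F = [1, 0, 0, 1, 1, 1, 0, 0]
mi = mutual_information(C, F)
print(f"H(C) = {entropy(C):.3f} bits, I(C;F) = {mi:.3f} bits")
# prints: H(C) = 1.000 bits, I(C;F) = 0.189 bits
```

Here the feature agrees with the class in 6 of the 8 examples, so knowing F removes only about 0.19 of the 1 bit of class uncertainty.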

Horse-class features Car-class features Pictorial features Learned from examples

Star model
• Detected fragments ‘vote’ for the center location
• Find the location with the maximal vote
• In its variations, a popular state-of-the-art scheme
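
The voting step can be sketched as follows. This is an illustrative toy, not the implementation behind the slides: it assumes each detected fragment comes with a learned offset to the object center, that votes are unweighted, and that votes land on exact integer positions. The name `vote_for_center` and the sample detections are hypothetical.

```python
from collections import Counter

def vote_for_center(detections):
    """Each detection is (x, y, dx, dy): the fragment was found at (x, y)
    and its learned offset to the object center is (dx, dy).
    Returns the center with the most votes and its vote count."""
    votes = Counter()
    for x, y, dx, dy in detections:
        votes[(x + dx, y + dy)] += 1
    center, count = votes.most_common(1)[0]
    return center, count

# Three fragments agree on the center (15, 15); one outlier votes elsewhere.
detections = [(10, 12, 5, 3), (20, 15, -5, 0), (14, 20, 1, -5), (3, 3, 2, 2)]
center, count = vote_for_center(detections)
print(center, count)  # prints: (15, 15) 3
```

A practical scheme would typically smooth or bin the votes so that nearby, slightly inconsistent votes still reinforce each other.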

Recognition Features in the Brain

fMRI Functional Magnetic Resonance Imaging

Images of brain activity

LO: object recognition
V1: early processing

Class-fragments and Activation (Malach et al., 2008)

Bag of words

Bag of visual words
– A large collection of image patches
– Each class has its own histogram of words
– Limited or no geometry
– Simple and popular
Visual words are used, but not as a full recognition model
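
As a rough illustration of the word histogram, the sketch below assigns each patch descriptor to its nearest visual word and counts the assignments. It assumes a vocabulary is already given (in practice it is usually obtained by clustering patch descriptors); the function name `bow_histogram` and the toy 2-D descriptors are hypothetical.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Assign each patch descriptor to its nearest visual word and
    return the normalized word-count histogram."""
    # Squared Euclidean distance from every descriptor to every word.
    diffs = descriptors[:, None, :] - vocabulary[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    counts = np.bincount(words, minlength=len(vocabulary))
    return counts / counts.sum()

# Toy vocabulary of three visual words and four patch descriptors.
vocabulary = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]])
hist = bow_histogram(descriptors, vocabulary)
print(hist)  # word frequencies 0.25, 0.5, 0.25
```

The histogram discards where each patch was found, which is the "limited or no geometry" property noted above.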

HoG Descriptor
Dalal, N. & Triggs, B. Histograms of Oriented Gradients for Human Detection

• SIFT: Scale-invariant Feature Transform
• MSER: Maximally Stable Extremal Regions
• SURF: Speeded-up Robust Features
• Cross correlation
• …
HoG and SIFT are the most widely used.

DPM (Felzenszwalb)
• Felzenszwalb, McAllester, Ramanan. A Discriminatively Trained, Multiscale, Deformable Part Model. CVPR 2008.
• There are many implementation details; only the main points are described here.

HoG descriptor

Using patches with HoG descriptors and classification by SVM
Person model: HoG

Object model using HoG
A bicycle and its ‘root filter’
• The root filter is a patch of HoG descriptors
• The image is partitioned into 8x8 pixel cells
• In each cell we compute a histogram of gradient orientations
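
The per-cell histogram can be sketched as below. This is a simplified illustration, not the full descriptor: the 8x8 cell size comes from the slide, while the 9 orientation bins, the unsigned 0 to 180 degree range, and the magnitude weighting are assumptions, and block normalization and bin interpolation are omitted. The name `hog_cell_histograms` is hypothetical.

```python
import numpy as np

def hog_cell_histograms(image, cell=8, bins=9):
    """Per-cell histograms of unsigned gradient orientation, weighted by
    gradient magnitude. Returns an array of shape (rows, cols, bins)."""
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)  # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    # Clamp guards against an angle that rounds up to exactly 180.
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_index[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    return hist

# A 16x16 image with a vertical step edge: left half dark, right half bright.
image = np.zeros((16, 16))
image[:, 8:] = 1.0
hist = hog_cell_histograms(image)
print(hist.shape, hist[0, 0].argmax())  # prints: (2, 2, 9) 0
```

The vertical edge produces purely horizontal gradients, so all the weight in each cell falls into the first orientation bin.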

Dealing with scale: multi-scale analysis
The filter is searched over a pyramid of HoG descriptors to deal with unknown scale
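
The idea of searching one fixed-size filter over a pyramid can be sketched as follows. To stay short, this toy swaps in raw pixel intensities and a sum-of-squared-differences match in place of HoG features and a learned filter response, and it halves the resolution at each level; all of these are simplifying assumptions, and the names `build_pyramid` and `best_match` are hypothetical.

```python
import numpy as np

def build_pyramid(image, levels=3):
    """Halve the resolution at each level by averaging 2x2 pixel blocks."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2
        prev = prev[:h, :w]  # crop to even size before pooling
        pyramid.append(prev.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyramid

def best_match(pyramid, template):
    """Slide the template over every level; return (score, level, row, col)
    of the window with the smallest sum of squared differences."""
    th, tw = template.shape
    best = None
    for level, img in enumerate(pyramid):
        for r in range(img.shape[0] - th + 1):
            for c in range(img.shape[1] - tw + 1):
                score = ((img[r:r + th, c:c + tw] - template) ** 2).sum()
                if best is None or score < best[0]:
                    best = (score, level, r, c)
    return best

# The object appears at twice the template's size in the full-resolution image.
template = np.arange(1.0, 17.0).reshape(4, 4)
image = np.zeros((32, 32))
image[8:16, 16:24] = np.kron(template, np.ones((2, 2)))
best = best_match(build_pyramid(image), template)
print(best[1:])  # prints: (1, 4, 8)
```

The 4x4 template never matches at full resolution, but it matches exactly at level 1, where the object has shrunk to the template's size.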

Adding Parts
A part Pi = (Fi, vi, si, ai, bi):
• Fi is the filter for the i-th part
• vi is the center of a box of possible positions for part i, relative to the root position
• si is the size of this box
• ai and bi are two-dimensional vectors specifying the coefficients of a quadratic function that scores each possible placement of the i-th part
That is, ai and bi hold two numbers each, and the penalty for a deviation (∆x, ∆y) from the expected location is
a1 ∆x + a2 ∆y + b1 ∆x² + b2 ∆y²
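
The quadratic penalty and its use in placing a part can be sketched as below. This is an illustrative toy under stated assumptions: the penalty is subtracted from the part's filter response, the box is an s-by-s square centered at v, and filter responses are supplied as a dictionary keyed by position relative to the root. The names `deformation_penalty` and `best_part_placement` and the numeric values are hypothetical.

```python
def deformation_penalty(dx, dy, a, b):
    """a1*dx + a2*dy + b1*dx**2 + b2*dy**2, with a = (a1, a2), b = (b1, b2)."""
    return a[0] * dx + a[1] * dy + b[0] * dx ** 2 + b[1] * dy ** 2

def best_part_placement(response, v, s, a, b):
    """response maps (x, y) to the part filter's response at that position.
    Searches the s-by-s box centered at v = (vx, vy) and returns
    (score, position) maximizing response minus deformation penalty."""
    half = s // 2
    best = None
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            pos = (v[0] + dx, v[1] + dy)
            if pos not in response:
                continue  # no filter response available here
            score = response[pos] - deformation_penalty(dx, dy, a, b)
            if best is None or score > best[0]:
                best = (score, pos)
    return best

a, b = (0.0, 0.0), (0.5, 0.5)
print(deformation_penalty(2, -1, a, b))  # 0.5*4 + 0.5*1 = 2.5

response = {(10, 10): 1.0, (11, 10): 1.2, (13, 10): 3.0}
placement = best_part_placement(response, v=(10, 10), s=5, a=a, b=b)
print(placement)  # prints: (1.0, (10, 10))
```

In this example the strongest response, at (13, 10), lies outside the allowed box, and the response at (11, 10) is not large enough to pay for moving one pixel away from the expected location.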

Bicycle model: root, parts, spatial map
Person model
