Object Classes
Most recent work is at the object level. We want in addition:
1. Individual recognition
2. Object parts and sub-parts (called: full interpretation)
[Figure: car image annotated with parts: window, mirror, door knob, headlight, back wheel, bumper, front wheel]
4. Agent interactions
[Figure: scene with interacting agents numbered 1-6]
Features and Classifiers
In a DNN, the net itself produces the features at its top layer.
Previous work explored a broad range of hand-designed features.
Features used in the past: generic features, from simple (wavelets) to complex (Geons).
MarrNet (2017): reconstructs 3D shape, yielding rotated versions of the object in the image.
Optimal Features: Mutual Information I(C;F)
Class:   1 1 0 1 0 1 0 0
Feature: 1 0 0 1 1 1 0 0
I(F;C) = H(C) - H(C|F)
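As an illustration, a minimal sketch (plain Python, not from the slides) that estimates I(C;F) = H(C) - H(C|F) from the binary class/feature example above:

```python
import numpy as np
from collections import Counter

def entropy(values):
    """Empirical entropy H(V) in bits from a sequence of discrete values."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mutual_information(c, f):
    """I(C;F) = H(C) - H(C|F), estimated from paired samples."""
    c, f = np.asarray(c), np.asarray(f)
    # H(C|F) = sum over feature values v of p(F=v) * H(C | F=v)
    h_c_given_f = 0.0
    for v in np.unique(f):
        mask = (f == v)
        h_c_given_f += mask.mean() * entropy(c[mask])
    return entropy(c) - h_c_given_f

# The binary example from the slide: class labels and feature detections
C = [1, 1, 0, 1, 0, 1, 0, 0]
F = [1, 0, 0, 1, 1, 1, 0, 0]
print(mutual_information(C, F))  # ~0.19 bits for this example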
Star Model
Detected fragments 'vote' for the object's center location; find the location with the maximal vote.
In its variations, a popular state-of-the-art scheme.
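A minimal sketch of the voting step; the fragment locations and offsets below are illustrative placeholders, not values from the slides:

```python
import numpy as np

# Each detected fragment carries a learned offset to the object center;
# every detection casts a vote at (detection location + offset), and the
# predicted center is the cell with the maximal accumulated vote.
H, W = 100, 100
votes = np.zeros((H, W))

# (y, x) detection locations paired with each fragment's center offset
detections = [((30, 40), (10, 5)),   # e.g. a "wheel" fragment
              ((35, 60), (5, -15)),  # e.g. a "headlight" fragment
              ((20, 50), (20, -5))]  # e.g. a "window" fragment

for (y, x), (dy, dx) in detections:
    cy, cx = y + dy, x + dx
    if 0 <= cy < H and 0 <= cx < W:
        votes[cy, cx] += 1

center = np.unravel_index(np.argmax(votes), votes.shape)
print("Predicted object center:", center)
```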
Hierarchies of sub-fragments (a 'deep net')
Detect the part itself by simpler sub-parts.
Repeat at multiple levels to obtain a hierarchy of parts and sub-parts.
Classification by a Features Hierarchy
[Figure: tree-structured model with class node c at the root and part nodes X1-X5 below it]
p(c, X, F) = p(c) Π p(xi | xi-) p(Fi | xi), where xi- denotes the parent of xi in the tree.
The global optimum can be found by max-sum message passing (a two-pass computation):
X* = argmax over X of p(c, X, F) = p(c) Π p(xi | xi-) p(Fi | xi)
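A minimal sketch of the two-pass max-sum computation on a star-shaped version of the tree (class c at the root, parts as its children); the probability tables below are illustrative placeholders, not learned values:

```python
import numpy as np

K = 3                  # number of parts
C_VALS, X_VALS = 2, 4  # class values and candidate part locations

rng = np.random.default_rng(0)
log_p_c = np.log(np.array([0.5, 0.5]))                                       # p(c)
log_p_x_given_c = np.log(rng.dirichlet(np.ones(X_VALS), size=(K, C_VALS)))   # p(xi|c)
log_p_f_given_x = np.log(rng.dirichlet(np.ones(2), size=(K, X_VALS)))        # p(Fi|xi)
F = [1, 0, 1]          # observed feature detections, one per part

# Upward pass: each part sends max over xi of log p(xi|c) + log p(Fi|xi)
msgs = np.zeros((K, C_VALS))
for i in range(K):
    scores = log_p_x_given_c[i] + log_p_f_given_x[i][:, F[i]]  # (C_VALS, X_VALS)
    msgs[i] = scores.max(axis=1)

# Root decision, then downward pass to recover the maximizing part locations
c_star = np.argmax(log_p_c + msgs.sum(axis=0))
x_star = [np.argmax(log_p_x_given_c[i][c_star] + log_p_f_given_x[i][:, F[i]])
          for i in range(K)]
print("c* =", c_star, " X* =", x_star)
```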
HoG Descriptor
Dalal, N. & Triggs, B., Histograms of Oriented Gradients for Human Detection (2005).
SIFT is similar, with different details, and is multi-scale.
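For concreteness, a sketch of computing a HoG descriptor with scikit-image, using roughly the Dalal-Triggs parameters (9 orientation bins, 8x8-pixel cells, block-normalized):

```python
from skimage.feature import hog
from skimage import data

# A sample grayscale image (one channel of a built-in test image)
image = data.astronaut()[:, :, 0]

descriptor = hog(image,
                 orientations=9,          # gradient orientation bins
                 pixels_per_cell=(8, 8),  # histogram computed per cell
                 cells_per_block=(2, 2),  # blocks of cells, normalized together
                 block_norm='L2-Hys')
print(descriptor.shape)  # one long vector of block-normalized histograms
```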
Optimal Separation
SVM: Vapnik, The Nature of Statistical Learning Theory, 1995.
Perceptron: Rosenblatt, Principles of Neurodynamics, 1962.
Find a separating plane such that the closest points are as far from it as possible.
The Margin
[Figure: two point classes with the separating line (0) and the two margin lines (+1, -1)]
Separating line: w ∙ x + b = 0
Far lines: w ∙ x + b = ±1
Their distance: w ∙ ∆x = +1; since ∆x is parallel to w, |∆x| = 1/|w|
Margin: 2/|w|
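A small sketch verifying the margin formula 2/|w| with scikit-learn's linear SVM; the two toy point clusters below are illustrative:

```python
import numpy as np
from sklearn import svm

X = np.array([[0.0, 0.0], [0.5, 0.5], [3.0, 3.0], [3.5, 2.5]])
y = np.array([-1, -1, +1, +1])

clf = svm.SVC(kernel='linear', C=1e6)  # large C approximates a hard margin
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("margin 2/|w| =", 2.0 / np.linalg.norm(w))
# Support vectors lie on the far lines w.x + b = ±1 (up to numerical tolerance)
print("w.x + b at support vectors:", X[clf.support_] @ w + b)
```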
Using patches with HoG descriptors and classification by SVM.
[Figure: person model visualized as a HoG template]
Bicycle model: root, parts, spatial map.
[Figure: bicycle and person models]
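A hedged sketch of the sliding-window scoring this implies: the HoG descriptor of each window is scored by a linear SVM. The weights `w` and bias `b` stand in for a trained person/bicycle model and are random placeholders here:

```python
import numpy as np
from skimage.feature import hog

def score_windows(image, w, b, win=(128, 64), stride=32):
    """Slide a window over the image; score each window's HoG descriptor."""
    best_score, best_loc = -np.inf, None
    H, W = image.shape
    for y in range(0, H - win[0] + 1, stride):
        for x in range(0, W - win[1] + 1, stride):
            patch = image[y:y + win[0], x:x + win[1]]
            d = hog(patch, orientations=9, pixels_per_cell=(8, 8),
                    cells_per_block=(2, 2))
            s = w @ d + b                      # linear SVM score
            if s > best_score:
                best_score, best_loc = s, (y, x)
    return best_loc, best_score

rng = np.random.default_rng(0)
img = rng.random((240, 320))
d_len = hog(img[:128, :64], orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2)).size       # descriptor length per window
print(score_windows(img, rng.normal(size=d_len), 0.0))
```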
A Neural Network Model
A network of 'neurons' with multiple layers.
A repeating structure of linear and non-linear stages.
Automatic learning of the weights between units.
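A minimal sketch of this repeating linear/non-linear structure as a forward pass; the random weights below stand in for learned ones:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b):
    # One repeating stage: a linear map followed by a non-linearity
    return np.maximum(0.0, W @ x + b)

x = rng.random(8)                      # input 'neurons'
W1, b1 = rng.normal(size=(16, 8)), np.zeros(16)
W2, b2 = rng.normal(size=(4, 16)), np.zeros(4)

h = layer(x, W1, b1)                   # hidden layer
y = layer(h, W2, b2)                   # output layer
print(y)
```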
Perceptron learning
yj = f(xj), where xj is the summed weighted input to unit j and f is a threshold non-linearity.
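A sketch of the classic perceptron learning rule with a threshold non-linearity f; the AND-gate data below is illustrative:

```python
import numpy as np

f = lambda a: 1.0 if a > 0 else 0.0      # threshold non-linearity

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)  # targets: logical AND

w, b, eta = np.zeros(2), 0.0, 0.1
for epoch in range(20):
    for x, target in zip(X, t):
        y = f(w @ x + b)                 # yj = f(sum_i w_i x_i + b)
        w += eta * (target - y) * x      # update only when the output is wrong
        b += eta * (target - y)

print([f(w @ x + b) for x in X])         # learns AND: [0.0, 0.0, 0.0, 1.0]
```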
LeNet (1998): essentially the same architecture as the current generation of CNNs.
Hinton, Trends in Cognitive Sciences, 2007.
The goal: unsupervised learning with Restricted Boltzmann Machines, combining a generative model and inference.
CNNs, in contrast, are feed-forward and massively supervised.
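A minimal sketch of the RBM's alternating Gibbs sampling between visible and hidden units; the weights below are random and untrained (training would use e.g. contrastive divergence):

```python
import numpy as np

# RBM energy: E(v, h) = -v.W.h - a.v - b.h over binary units
rng = np.random.default_rng(0)
nv, nh = 6, 4
W = rng.normal(scale=0.1, size=(nv, nh))
a, b = np.zeros(nv), np.zeros(nh)

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

v = rng.integers(0, 2, size=nv).astype(float)  # initial visible state
for _ in range(10):                            # alternating Gibbs steps
    ph = sigmoid(v @ W + b)                    # p(h = 1 | v)
    h = (rng.random(nh) < ph).astype(float)
    pv = sigmoid(W @ h + a)                    # p(v = 1 | h)
    v = (rng.random(nv) < pv).astype(float)

print("sampled visible vector:", v)
```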
Basic structure of deep nets: not detailed here, but make sure you know the layer structure and the repeating three-stage arrangement (convolution, non-linearity, pooling).
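A minimal PyTorch sketch of one such repeating block, convolution → non-linearity → pooling, stacked three times; the channel sizes are illustrative:

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),  # linear: convolution
        nn.ReLU(),                                         # non-linear activation
        nn.MaxPool2d(2),                                   # pooling / subsampling
    )

net = nn.Sequential(block(3, 16), block(16, 32), block(32, 64))
x = torch.randn(1, 3, 64, 64)   # a batch with one RGB image
print(net(x).shape)             # torch.Size([1, 64, 8, 8])
```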