Recursive Composition in Computer Vision. Leo Zhu CSAIL MIT Joint work with Chen, Yuille, Freeman and Torralba. Ideas behind Recursive Composition. Pattern Theory. Grenander 94 Compositionality. Geman 02, 06 Stochastic Grammar. Zhu and Mumford 06. How to deal with image complexity
Recursive Composition in Computer Vision Leo Zhu CSAIL MIT Joint work with Chen, Yuille, Freeman and Torralba
Ideas behind Recursive Composition Pattern Theory. Grenander 94 Compositionality. Geman 02, 06 Stochastic Grammar. Zhu and Mumford 06 • How to deal with image complexity • A general framework for different vision tasks • Rich representation and tractable computation
Recursive Composition • Representation • Recursive Compositional Models (RCMs) • Inference • Recursive Optimization • Learning • Supervised Parameter Estimation • Unsupervised Recursive Dictionary Learning • RCM-1: Deformable Object • RCM-2: Articulated Object • RCM-3: Scene (Entire Image)
Model Deformable Object • Flat MRF • Nodes: object parts • Edges: spatial relations • Limitations: • Short range interaction • Sparse
Recursive Compositional Models: RCM-1 • x: image • y: (position, scale, orientation) • graph = (nodes, edges) • a: index of a node • b: a child of a • f: appearance potential on node a • g: potential on edge (a, b)
RCM-1: the Recursive Formula • Recursion • x: image • y: (position, scale, orientation) • Vertical independence • Self-similarity
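The recursive form of the score can be sketched in Python. This is a minimal illustration, assuming a tree-structured graph in which the score of a node is its appearance potential f plus, for each child, the edge potential g and the child's own score. The names `Node`, `score`, `toy_f` and `toy_g` are hypothetical, the state is reduced to a 2-D position (scale and orientation are omitted), and the toy potentials are placeholders, not the potentials used in the talk.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: a state is only a 2-D position; scale and orientation are omitted.
State = Tuple[float, float]


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def score(node: Node,
          states: Dict[str, State],
          f: Callable[[str, State], float],
          g: Callable[[State, State], float]) -> float:
    """Score of the subtree rooted at `node` for a fixed assignment of states.

    The same rule is applied at every level: appearance potential of the node,
    plus edge potential and recursive score for each child.
    """
    total = f(node.name, states[node.name])
    for child in node.children:
        total += g(states[node.name], states[child.name])
        total += score(child, states, f, g)
    return total


def toy_f(name: str, state: State) -> float:
    # Hypothetical appearance potential: a constant reward per node.
    return 1.0


def toy_g(parent: State, child: State) -> float:
    # Hypothetical edge potential: penalize squared distance between parent and child.
    return -((parent[0] - child[0]) ** 2 + (parent[1] - child[1]) ** 2)


tree = Node("root", [Node("a"), Node("b")])
states = {"root": (0.0, 0.0), "a": (1.0, 0.0), "b": (0.0, 2.0)}
print(score(tree, states, toy_f, toy_g))
```

Because the same function is called on every subtree, deeper hierarchies need no extra code: adding children to `a` or `b` extends the model by one level.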
Recursive Composition • Representation • Recursive Compositional Models (RCMs) • Inference • Recursive Optimization • Learning • Supervised Parameter Estimation • Unsupervised Recursive Dictionary Learning
Polynomial-time Inference • Recursion • Polynomial-time Complexity • Inference task • Recursive Optimization
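One way to see why inference on a tree is polynomial is a bottom-up dynamic program: each node keeps, for every candidate state, the best score of its subtree, and a parent only needs its children's tables. The sketch below assumes a tree, a finite candidate list per node and 1-D integer positions as states. The names `best_scores`, `infer`, `f`, `g`, `targets` and the toy data are hypothetical. With K candidates per node the cost is on the order of (number of edges) × K², instead of K to the power of the number of nodes for exhaustive search.

```python
def best_scores(node, candidates, f, g):
    """Return {state: (best subtree score, best assignment)} for `node`."""
    child_tables = [best_scores(c, candidates, f, g) for c in node["children"]]
    table = {}
    for s in candidates[node["name"]]:
        total = f(node["name"], s)
        assignment = {node["name"]: s}
        for ctable in child_tables:
            # Best child state given that the parent is in state s.
            cs, (cscore, cassign) = max(
                ctable.items(), key=lambda kv: g(s, kv[0]) + kv[1][0])
            total += g(s, cs) + cscore
            assignment.update(cassign)
        table[s] = (total, assignment)
    return table


def infer(root, candidates, f, g):
    """Return (best score, best assignment) over all states of all nodes."""
    table = best_scores(root, candidates, f, g)
    return max(table.values(), key=lambda v: v[0])


# Toy problem (all values are illustrative assumptions).
tree = {"name": "root", "children": [
    {"name": "a", "children": []},
    {"name": "b", "children": []},
]}
candidates = {"root": [0, 5], "a": [1, 4], "b": [2, 9]}
targets = {"root": 5, "a": 4, "b": 2}


def f(name, s):
    # Hypothetical appearance potential: prefer states near a target position.
    return -abs(s - targets[name])


def g(parent_state, child_state):
    # Hypothetical edge potential: prefer children close to their parent.
    return -0.5 * abs(parent_state - child_state)


best_score, best_assignment = infer(tree, candidates, f, g)
print(best_score, best_assignment)
```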
Supervised Learning • Collins 02. Taskar et al. 04 • Supervised learning • Perceptron algorithm (MLE, max-margin SVM) • Parameter estimation needs fast inference.
Supervised learning by Perceptron Algorithm, where inference is critical for learning • Goal: learn the parameters • Input: a set of training images with ground truth. Initialize the parameter vector. • Training algorithm (Collins 02): Loop over training samples, i = 1 to N: Step 1: find the best y using inference. Step 2: update the parameters. End of loop.
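The two steps of the loop can be sketched as a structured perceptron in the style of Collins 02. This is a minimal sketch, assuming a linear score `w @ phi(x, y)` and a small finite set of candidate outputs that stands in for the recursive optimization. The names `perceptron_train`, `predict`, `phi`, `LABELS` and the toy data are hypothetical and not taken from the talk.

```python
import numpy as np

LABELS = [0, 1, 2]  # assumption: a tiny output space replaces the parse space


def phi(x, y):
    """Hypothetical joint feature map: features [x, 1] placed in the slot of label y."""
    v = np.zeros(2 * len(LABELS))
    v[2 * y: 2 * y + 2] = [x, 1.0]
    return v


def predict(w, x):
    """Step 1: inference, here by exhaustive search over the candidate outputs."""
    return max(LABELS, key=lambda y: float(w @ phi(x, y)))


def perceptron_train(samples, epochs=10):
    w = np.zeros(2 * len(LABELS))  # initialize the parameter vector
    for _ in range(epochs):
        for x, y_true in samples:  # loop over training samples
            y_hat = predict(w, x)
            if y_hat != y_true:
                # Step 2: move toward the ground-truth features and away from
                # the features of the current best (wrong) output.
                w += phi(x, y_true) - phi(x, y_hat)
    return w


samples = [(-2.0, 0), (0.0, 1), (2.0, 2)]
w = perceptron_train(samples)
print([predict(w, x) for x, _ in samples])
```

Inference runs once per sample per pass, which is why a fast inference routine matters for training time.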
Recursive Composition • Representation • Recursive Compositional Models (RCMs) • Inference • Recursive Optimization (Polynomial-time) • Learning • Supervised Parameter Estimation • RCM-1: Deformable Object
RCM-1: Multi-level Potentials • Potentials for appearance: a weighted combination of image features [Gabor, Edge, …]
RCM-1: Multi-level Potentials (position, scale, orientation) • Potentials for shape: triplet descriptors
Evaluations: Segmentation and Parsing • Segmentation (Accuracy of pixel labeling) • The proportion of the correct pixel labels (object or non-object) • Parsing (Average Position Error of matching) • The average distance between the positions of leaf nodes of the ground truth and those estimated in the parse tree
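Both measures are simple to compute once predictions and ground truth are aligned. The sketch below assumes binary masks of equal shape (object vs. non-object) and leaf-node positions given as matched (x, y) pairs in the same order. The function names `pixel_accuracy` and `average_position_error` are hypothetical.

```python
import numpy as np


def pixel_accuracy(pred_mask, gt_mask):
    """Proportion of pixels whose object / non-object label is correct."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    return float(np.mean(pred == gt))


def average_position_error(pred_points, gt_points):
    """Mean Euclidean distance between estimated and ground-truth leaf positions."""
    pred = np.asarray(pred_points, dtype=float)
    gt = np.asarray(gt_points, dtype=float)
    return float(np.mean(np.linalg.norm(pred - gt, axis=1)))


# Toy example (illustrative values only).
pred_mask = [[1, 0], [1, 1]]
gt_mask = [[1, 0], [0, 1]]
pred_points = [[0.0, 0.0], [3.0, 4.0]]
gt_points = [[0.0, 0.0], [0.0, 0.0]]
print(pixel_accuracy(pred_mask, gt_mask))
print(average_position_error(pred_points, gt_points))
```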
Recursive Composition • Modeling: (Representation) • Recursive Compositional Models (RCMs) • Inference: (Computing) • Recursive Optimization (Polynomial-time) • Learning: • Supervised Parameter Estimation • Unsupervised Recursive Learning • RCM-1: deformable object
Unsupervised Learning • Correspondence is unknown • Combinatorial explosion problem • Task: given 10 training images, with no labeling, no alignment, and highly ambiguous features: • Induce the structure (nodes and edges) • Estimate the parameters.
Recursive Dictionary Learning Recursion Barlow 94. • Multi-level dictionary (layer-wise greedy) • Bottom-Up and Top-Down recursive procedure • Three Principles: • Recursive Composition • Suspicious Coincidence • Competitive Exclusion
Bottom-up Learning Suspicious Coincidence Clustering Composition Competitive Exclusion
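The suspicious-coincidence principle can be illustrated with a simple co-occurrence test: two parts are proposed as a composition when they appear together more often than independence would predict. This is a heavily simplified sketch, assuming each training image is reduced to a set of detected part labels. It ignores spatial relations, clustering and competitive exclusion, and the name `suspicious_pairs`, the `ratio` threshold and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def suspicious_pairs(observations, ratio=1.2):
    """Return {(a, b): P(a, b) / (P(a) * P(b))} for pairs whose ratio >= `ratio`."""
    n = len(observations)
    single = Counter()
    pair = Counter()
    for parts in observations:
        parts = set(parts)
        single.update(parts)
        pair.update(combinations(sorted(parts), 2))
    kept = {}
    for (a, b), count in pair.items():
        p_joint = count / n
        p_independent = (single[a] / n) * (single[b] / n)
        lift = p_joint / p_independent
        if lift >= ratio:
            kept[(a, b)] = lift
    return kept


# Toy data: parts A and B co-occur more often than chance, A and C do not.
observations = [["A", "B"], ["A", "B"], ["C"], ["A", "C"]]
print(suspicious_pairs(observations))
```

In a layer-wise scheme, the kept pairs would become the dictionary entries of the next level and the same test would be repeated on them.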
The Dictionary: From Generic Parts to Object Structures Unified representation (RCMs) and learning Bridge the gap between the generic features and specific object structures
Dictionary Size, Part Sharing and Computational Complexity More Sharing
Top-down refinement Fill in missing parts Examine every node from top to bottom
Scale up the System: Issue I More classes/viewpoints -> more training/detection cost
Scale up the System: Issue II Not enough data for rare viewpoints/classes
Our Strategy Joint multi-class multi-view learning Appearance sharing Part sharing
Joint Multi-Class Multi-View Learning 120 templates: 5 viewpoints & 26 classes
The more classes/viewpoints, the more part sharing
Recursive Composition • Representation • Recursive Compositional Models (RCMs) • Inference • Recursive Optimization (Polynomial-time) • Learning • Supervised Parameter Estimation • RCM-1: Deformable Object • RCM-2: Articulated Object
RCM-2 for Articulated Object: Horses • Multiple poses • y = (switch, position, scale, orientation) • Composition • Switch
Recursive Composition • Representation • Recursive Compositional Models (RCMs) • Inference • Recursive Optimization (Polynomial-time) • Learning • Supervised Parameter Estimation • RCM-1: Deformable Object • RCM-2: Articulated Object • RCM-3: Scene (Entire Image)
Image Scene Parsing Task: Image Segmentation and Labeling