Combined Multi-View Object Class Recognition and Meta-Data Annotation
Alexander Thomas, Vittorio Ferrari, Bastian Leibe, Tinne Tuytelaars, Luc Van Gool
Goal:
• Recognize unseen instances of an object class (e.g. cars, motorbikes, …)
• Infer high-level information (meta-data) about this previously unseen object
• Recognize objects from multiple viewpoints, not just the single fixed view assumed by many current systems
Starting point: the ISM system (Leibe & Schiele ’03)
• Core idea: instances of a class can be described by combining parts from other instances
• Implicit Shape Model (ISM)
• An object is recognized if sufficient evidence is present in an image, in a plausible configuration
ISM: training
• Extract local features from the training images
• Cluster the features into a codebook
• Match features back to the codebook to record occurrences (codebook + occurrences)
• (new) Attach a local annotation mask to each occurrence and derive a segmentation
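A minimal sketch of the training step, assuming local feature descriptors have already been extracted into a NumPy array; the clustering parameters and the occurrence data layout are illustrative, not taken from the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def build_codebook(descriptors, distance_threshold=0.7):
    """Cluster local feature descriptors (N x D array) into a codebook
    via agglomerative clustering, as in the ISM line of work.
    Each codebook entry is the mean descriptor of its cluster."""
    Z = linkage(descriptors, method="average", metric="euclidean")
    raw = fcluster(Z, t=distance_threshold, criterion="distance")
    # Remap cluster ids to 0..K-1 so they index the codebook rows.
    uniq, labels = np.unique(raw, return_inverse=True)
    codebook = np.stack([descriptors[labels == k].mean(axis=0)
                         for k in range(len(uniq))])
    return codebook, labels

def record_occurrences(features, labels, object_center):
    """For each training feature matched to a codebook entry, store its
    offset to the object center, its scale, and its local annotation
    mask, so the occurrence can later cast votes and transfer meta-data."""
    occurrences = {}
    for (x, y, scale, mask), k in zip(features, labels):
        occurrences.setdefault(k, []).append(
            dict(offset=(object_center[0] - x, object_center[1] - y),
                 scale=scale, annotation=mask))
    return occurrences
```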
ISM: recognition
• Match features from the test image to the codebook
• Each matching codebook entry casts votes in a Hough space (x, y, scale), one vote per occurrence
• Hypotheses = local peaks in the Hough space
• Peaks are refined using mean-shift
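A sketch of the voting stage under the same illustrative data layout as above; the matching threshold and the soft weighting scheme are assumptions:

```python
import numpy as np

def cast_votes(test_features, codebook, occurrences, match_thresh=0.7):
    """Match each test feature to the codebook; every occurrence of a
    matching entry casts one weighted vote for an object center in the
    (x, y, scale) Hough space."""
    votes = []
    for (x, y, scale, desc) in test_features:
        dists = np.linalg.norm(codebook - desc, axis=1)
        matched = np.flatnonzero(dists < match_thresh)
        for k in matched:
            occs = occurrences.get(k, [])
            if not occs:
                continue
            w = 1.0 / (len(matched) * len(occs))  # spread weight over matches
            for occ in occs:
                s = scale / occ["scale"]
                votes.append((x + s * occ["offset"][0],
                              y + s * occ["offset"][1], s, w))
    return np.asarray(votes)

def coarse_peaks(votes, bins=(20.0, 20.0, 0.2), min_score=2.0):
    """Coarse local maxima via binning; in the full system these peaks
    are then refined with mean-shift in (x, y, scale)."""
    acc = {}
    for vx, vy, vs, w in votes:
        key = (int(vx // bins[0]), int(vy // bins[1]), int(vs // bins[2]))
        acc[key] = acc.get(key, 0.0) + w
    return sorted(((s, k) for k, s in acc.items() if s >= min_score),
                  reverse=True)
```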
Types of meta-data
• Discrete: e.g. labeling different object ‘parts’, material types, interest areas
• Real-valued: e.g. depth or heat maps
• Vector-valued: e.g. orientations, colors, 3D points, motion vectors
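As a concrete picture of the three types, one pixel-aligned channel per annotation patch (shapes and names are illustrative):

```python
import numpy as np

H, W = 64, 64  # size of one annotation patch

part_labels = np.zeros((H, W), dtype=np.int32)       # discrete: part/material id per pixel
depth_map   = np.zeros((H, W), dtype=np.float32)     # real-valued: one depth per pixel
normals     = np.zeros((H, W, 3), dtype=np.float32)  # vector-valued: surface orientation per pixel
```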
Recognition: discrete meta-data
• Use the annotation patches to compute a probability P_j at each pixel for every label j
• Final pixel label = argmax_j(P_j)
[Figure: training annotations, test image, and output labeling]
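A sketch of the discrete fusion step: the annotation patches of all votes supporting a hypothesis are accumulated into per-pixel label scores P_j, then argmaxed; the patch/offset layout is an assumption:

```python
import numpy as np

def fuse_discrete_labels(vote_patches, num_labels, out_shape):
    """vote_patches: iterable of (weight, labels, (x0, y0)), where
    `labels` is an integer annotation patch pasted at offset (x0, y0).
    Returns the argmax label per output pixel."""
    P = np.zeros(out_shape + (num_labels,), dtype=np.float64)
    for weight, labels, (x0, y0) in vote_patches:
        ph, pw = labels.shape
        # One-hot encode the patch so each pixel adds `weight` to its P_j.
        P[y0:y0 + ph, x0:x0 + pw] += weight * np.eye(num_labels)[labels]
    return P.argmax(axis=-1)
```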
Real-valued meta-data
• For real-valued (including vector-valued) input, estimate the mode of each pixel’s vote distribution by mean-shift
• For quantized input, real-valued output can be obtained through interpolation
[Figure: vote distribution P(a_j) over quantized values j = 0…6]
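A minimal mean-shift mode estimator for one pixel’s votes; it handles scalar (depth) and vector-valued (orientation, 3D point) samples alike. Bandwidth and iteration count are illustrative:

```python
import numpy as np

def meanshift_mode(samples, weights, bandwidth=1.0, iters=30):
    """Find the dominant mode of a weighted vote distribution with
    Gaussian-kernel mean-shift, starting from the heaviest vote."""
    X = np.asarray(samples, dtype=float).reshape(len(samples), -1)
    w = np.asarray(weights, dtype=float)
    m = X[np.argmax(w)].copy()
    for _ in range(iters):
        d2 = ((X - m) ** 2).sum(axis=1)
        k = w * np.exp(-0.5 * d2 / bandwidth ** 2)
        m = (k[:, None] * X).sum(axis=0) / k.sum()
    return m

# e.g. depth votes for one pixel; the outlier at 5.0 barely shifts the mode:
# meanshift_mode([2.1, 2.0, 2.2, 5.0], [1.0, 1.0, 1.0, 0.2])  ->  ~2.1
```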
Avoiding holes in annotation
• Unmatched areas lead to holes in the annotation
• Resampling step: use the spatial occurrence distribution to find additional matches
• New matches cast votes like the original interest points, and provide better coverage for the annotation
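One way the resampling step could look, given a detected hypothesis center and a hypothetical helper `descriptor_at(x, y, scale)` that extracts a descriptor at an arbitrary image location; both the helper and the threshold are assumptions for illustration:

```python
import numpy as np

def resample_matches(center, codebook, occurrences, descriptor_at,
                     match_thresh=0.7):
    """Invert each occurrence's voting offset to predict where its
    codebook entry should appear relative to the detected object
    center, and try to match there even if no interest point fired.
    Accepted matches then vote (and annotate) like ordinary ones."""
    cx, cy, cs = center
    new_matches = []
    for k, occs in occurrences.items():
        for occ in occs:
            px = cx - cs * occ["offset"][0]   # predicted feature x
            py = cy - cs * occ["offset"][1]   # predicted feature y
            desc = descriptor_at(px, py, cs * occ["scale"])
            if np.linalg.norm(desc - codebook[k]) < match_thresh:
                new_matches.append((px, py, k, occ))
    return new_matches
```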
Multi-View Recognition
• Find corresponding regions in multiple views of the training instances
• Use these correspondences to link the ISMs of the different views together
Multi-View Recognition (continued)
• During recognition, use the relations (activation links) between views to cast additional votes
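A sketch of how activation links might transfer evidence between the per-view ISMs; the data structures (a `links` map from an activated entry to linked entries in neighbouring views) are illustrative:

```python
def cross_view_votes(activations, links, occurrences_by_view):
    """For every codebook entry activated in one view, also activate
    its linked entries in neighbouring views; their occurrences cast
    additional, link-weighted votes in those views' Hough spaces.

    activations: (view, entry, x, y, scale, weight) tuples
    links: {(view, entry): [(other_view, other_entry, link_weight), ...]}
    """
    extra = []
    for view, k, x, y, scale, weight in activations:
        for other_view, k2, lw in links.get((view, k), []):
            for occ in occurrences_by_view[other_view].get(k2, []):
                s = scale / occ["scale"]
                extra.append((other_view,
                              x + s * occ["offset"][0],
                              y + s * occ["offset"][1],
                              s, weight * lw))
    return extra
```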
Experiment 3: car 3D shape
• Same car training set
• Depth maps and surface orientations obtained by manually aligning 3D models
• Alternative acquisition methods: stereo, laser scanner, active lighting
Experiment 3: Results
[Figure: test images with ground-truth and output annotations, side by side]
3D information from a single image!
Experiment 4: motorbikes
• Multi-view: 16 views around the object
• No problem if an instance lacks some views
• Parts annotation
Experiment 4: results
• Tested on PASCAL VOC 2005 & 2007
Conclusion
• A method to simultaneously recognize and annotate objects in arbitrary poses with meta-data:
• Part labels
• Depth maps
• Surface orientations