Combined Multi-View Object Class Recognition and Meta-Data Annotation Alexander Thomas, Vittorio Ferrari, Bastian Leibe, Tinne Tuytelaars, Luc Van Gool
Goal: • Recognize unseen instances of an object class (e.g. cars, motorbikes, …) • Infer high-level information (meta-data) about the previously unseen object • Recognize objects from multiple viewpoints, rather than the single fixed viewpoint assumed by many current systems
Starting point: the Implicit Shape Model (ISM) system (Leibe & Schiele ’03) • Core idea: instances of a class can be described by combining parts from other instances • An object is recognized if sufficient evidence is present in the image, in a plausible configuration
ISM: training • Extract features from the training images • Cluster the features into a codebook • Match the features back to the codebook to learn each entry’s occurrences (codebook + occurrences) • New in this work: attach a local annotation mask to each occurrence, and derive a segmentation from it
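A minimal Python sketch of this training stage, assuming precomputed feature descriptors, interest-point positions, the object center of each training image, and per-feature annotation patches; the clustering choice, parameter values, and all names are illustrative, not the authors’ exact pipeline:

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    def build_codebook(descriptors, positions, object_centers, patches, n_clusters=200):
        """Cluster training features into a codebook; for each codebook entry,
        store its occurrences: the offset from feature to object center, plus
        the local annotation patch attached to that occurrence."""
        labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(descriptors)
        centers = np.vstack([descriptors[labels == k].mean(axis=0) for k in range(n_clusters)])
        occurrences = {k: [] for k in range(n_clusters)}
        for lbl, pos, ctr, patch in zip(labels, positions, object_centers, patches):
            occurrences[lbl].append((ctr - pos, patch))  # occurrence = (offset, annotation)
        return centers, occurrences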
ISM: recognition • Match features from the test image to the codebook • Each matching codebook entry casts votes in a (x, y, scale) Hough space, one vote per occurrence • Hypotheses = local peaks in the Hough space • Peaks are refined using mean-shift
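A sketch of the voting step, reusing the hypothetical codebook and occurrence layout from the training sketch above; the matching threshold and the uniform per-occurrence weighting are simplifying assumptions:

    import numpy as np

    def cast_votes(descriptors, positions, scales, centers, occurrences, match_thresh=0.7):
        """Match test features to codebook entries; each match casts one vote
        per stored occurrence in the (x, y, scale) Hough space."""
        votes = []
        for desc, pos, scale in zip(descriptors, positions, scales):
            dists = np.linalg.norm(centers - desc, axis=1)
            for k in np.flatnonzero(dists < match_thresh):
                w = 1.0 / len(occurrences[k])           # spread weight over occurrences
                for offset, _patch in occurrences[k]:
                    x, y = pos + scale * offset         # predicted object center
                    votes.append((x, y, scale, w))
        return np.array(votes)                          # columns: x, y, scale, weight

Hypotheses would then be the local maxima of this weighted vote cloud, each refined by iterating a mean-shift window over (x, y, scale), as on the slide.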
Types of meta-data • Discrete: e.g. labels for different object ‘parts’, material types, interest areas • Real-valued: e.g. depth or heat maps • Vector-valued: e.g. orientations, colors, 3D points, motion vectors
Recognition: discrete meta-data • Use the annotation patches to calculate P_j for each separate label j • Final pixel label = argmax_j P_j [Figure: training annotations, test image, output labeling]
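A sketch of the per-pixel label decision, assuming each accepted hypothesis backprojects the annotation patches of its matched occurrences into the image; the vote format below is hypothetical:

    import numpy as np

    def annotate_discrete(patch_votes, image_shape, n_labels):
        """Accumulate weighted annotation votes per label at every pixel, then
        take the per-pixel argmax: final label = argmax_j P_j.
        patch_votes: iterable of (y_slice, x_slice, label_patch, weight)."""
        P = np.zeros((n_labels,) + image_shape)
        for ys, xs, patch, w in patch_votes:
            for j in range(n_labels):
                P[j, ys, xs] += w * (patch == j)
        total = P.sum(axis=0, keepdims=True)
        P = np.divide(P, total, out=np.zeros_like(P), where=total > 0)
        return P.argmax(axis=0)     # final per-pixel label map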
Real-valued meta-data • For real-valued (including vector-valued) input, estimate the mode of each pixel’s vote distribution by mean-shift • For quantized input, real-valued output can be obtained through interpolation [Figure: vote distribution P(a_j) over quantized values j = 0…6]
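A sketch of the per-pixel mode estimation for a scalar quantity such as depth, using weighted mean-shift with a flat kernel; the bandwidth, kernel, and starting point are illustrative choices:

    import numpy as np

    def meanshift_mode(values, weights, bandwidth=0.1, iters=20):
        """Estimate the mode of one pixel's vote distribution (e.g. depth votes).
        Starts at the strongest vote and iterates a weighted flat-kernel mean-shift."""
        x = values[np.argmax(weights)]
        for _ in range(iters):
            in_win = np.abs(values - x) < bandwidth
            if not in_win.any():
                break
            x_new = np.average(values[in_win], weights=weights[in_win])
            if abs(x_new - x) < 1e-6:       # converged
                break
            x = x_new
        return x

For vector-valued data (orientations, 3D points, motion vectors), the same iteration would run with a window in the vector space rather than on a scalar axis.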
Avoiding holes in annotation • Unmatched areas lead to holes • Resampling step • Use the spatial occurrence distribution to find additional matches • New matches cast votes like the original interest points, and provide better coverage for annotation
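A sketch of this resampling step, inverting the stored occurrence offsets of a verified hypothesis to predict where additional patches should be sampled; the simple nearest-point coverage test and the minimum distance are simplifying assumptions:

    import numpy as np

    def resample_points(hypothesis, occurrences, existing_pts, min_dist=8.0):
        """For a hypothesis (cx, cy, scale), predict feature locations from the
        occurrence offsets and keep those not yet covered by an interest point;
        patches sampled there can then vote like the original features."""
        cx, cy, scale = hypothesis
        new_pts = []
        for occs in occurrences.values():
            for offset, _patch in occs:
                p = np.array([cx, cy]) - scale * np.asarray(offset)
                if (len(existing_pts) == 0
                        or np.linalg.norm(existing_pts - p, axis=1).min() > min_dist):
                    new_pts.append(p)
        return np.array(new_pts)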
Multi-View Recognition • Find corresponding regions in multiple views of training instances • Use these correspondences to link the ISMs of the different views together
Multi-View Recognition • During recognition, use relations (activation links) between views to cast additional votes
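A sketch of how these activation links might propagate support between the per-view ISMs; the identity geometry transfer and the fixed down-weighting factor are simplifying assumptions, not the authors’ formulation:

    def cast_linked_votes(view_votes, links, transfer_weight=0.5):
        """Let votes accumulated in one view also support the views linked to it.
        view_votes: {view_id: [(x, y, scale, w), ...]}
        links: pairs (view_a, view_b) of views linked during training."""
        extra = {v: [] for v in view_votes}
        for a, b in links:
            # each view's votes reinforce the linked view at reduced weight
            extra[b] += [(x, y, s, w * transfer_weight) for x, y, s, w in view_votes[a]]
            extra[a] += [(x, y, s, w * transfer_weight) for x, y, s, w in view_votes[b]]
        for v, new_votes in extra.items():
            view_votes[v].extend(new_votes)
        return view_votes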
Experiment 3: car 3D shape • Same car training set • Ground-truth depth maps and surface orientations obtained by manually aligning 3D models to the training images • Alternative acquisition methods: stereo, laser scanner, active lighting
Experiment 3: results [Figure: test image, ground-truth vs. output depth maps, ground-truth vs. output surface orientations] • 3D information from a single image!
Experiment 4: motorbikes • Multi-view: 16 views around the object • Instances lacking some views pose no problem • Parts annotation
Experiment 4: results • Tested on PASCAL VOC 2005 & 2007
Conclusion • A method to simultaneously recognize objects in arbitrary poses and annotate them with meta-data: • Part labels • Depth maps • Surface orientations