3D Object Recognition Pipeline

3D Object Recognition Pipeline Morgan Quigley, Stephen Gould Stanford Marius Muja UBC Kurt Konolige, RaduRusu, Victor Eruhmov, SuatGedikli Willow Garage Stefan Holzer, Stefan Hinterstoisser TUM

3D and Object Recognition • Provides more info than just visual texture • Good for scale and segmentation • Verification Need a good device for 3D info

3D Cameras • Tabletop manipulation: • Short range • High resolution • High range accuracy • Real-time

WG Projected Texture Stereo Device • Paint the scene with texture from a projector • vs. single camera with structured light • Advantages: • Simple projector • Standard algorithms • Full frame rates (640x480) • Dynamic scenes

WG project texture device • Projector • Red LED • Eye safe • Synchronized to cameras 3D Fly-thru

Object Recognition Pipeline Pre-filter Detect Verify • Textured objects via keypoints[Victor Eruhimov, SuatGedikli] • Untextured objects via DOT [Stefan Holzer, Stefan Hinterstoisser] • Simple 3D model matching [Marius Muja] • STAIR 2D/3D features [Stephen Gould]

MOPED – Textured object recognition with pose • Model: Stereo view of an object at a known pose • Extract keypoints and features • For a new scene, match keypoints to each model • Run SfM geometric check to verify and recover pose Torres, Romea, Srinivasa ICRA 2010

Need texture • Need high res camera

Dominant Orientation Templates (DOT) Stefan Hinterstoisser, Stefan Holzer (TUM; CVPR 2010, ECCV 2010) DOT is a template matching based approach template current scene - Template is slid over the image to compute the response for each image position - If response is above a threshold it is considered as detection of the template

DOT – Basic Principle DOT uses gradients instead of color or gray values template current scene - Gradients are less sensitive to illumination changes - Gradients have orientation and magnitude

Offline Learning Good learning is necessary to reduce false-positive rate We try to use all available information to segment the object: Point cloud from narrow stereo is used to detect the table and segment the point cloud of the object Object point cloud is used to create an initial mask Mask is refined using GrabCut (see OpenCV)

False-Positive Rejection Two more precise templates for validation: more precise and not discretized gradient template disparity template to compare expected with real disparities

False-Positive Rejection Compute error between reference point cloud and point cloud at detected position Optimize initial 3D point cloud pose given from the detection Directly gives object pose if model is associated with learned point clouds

STAIR Vision Library (SVL)Stanford STAIR project [Andrew Ng, Stephen Gould] • Initially developed to support the Stanford AI Robot (STAIR) project • Builds on top of OpenCV computer vision library and Eigen matrix library • Provides a range of software infrastructure for • computer vision • machine learning • probabilistic graphical models • Hosted on SourceForge

Sliding-window object detector Features are extracted from a local window Learned boosted decision-tree classifier scores each window Image is scanned at multiple resolutions to detect objects at different scales Object Detection in SVL

Image decomposed into multiple channels Depth at each pixel, obtained from a laser scanner, can be thought of as an additional channel [Quigley et al., ICRA 2009] intensity image edge map depth map Image Channels

Learn a “patch” dictionary over intensity, edge and depth channels Patches encode localized templates for matching Depth patches capture shape; intensity and edge patches capture appearance Patch responses (over entire dictionary) are combined to form the feature vector [Quigley et al., ICRA 2009] Object Detection Features

150 images of cluttered indoor scenes 5-fold cross-validation Depth information provides significant improvement in area under precision-recall curve [Quigley et al., ICRA 2009] Results 8% improvement 3% improvement 38% improvement

Conclusions • Realtime, accurate 3D devices are becoming available • 3D can help in object detection for untextured objects • Combo of visual and 3D features best • 3D is useful for verification • Check out the PR2 Grasping Demo!

3D Object Recognition Pipeline

3D Object Recognition Pipeline

Presentation Transcript

3D Graphics Pipeline

The 3D Pipeline

OBJECT RECOGNITION

Algebraic Functions of Views for 3D Object Recognition

Object recognition

Object Recognition

Object Recognition

Visual Object Recognition

Object Recognition

Visual Object Recognition

Object recognition

Object Recognition

Object Recognition

3D Object Recognition U sing Computer Vision

Object Recognition

Object recognition

Object Recognition

Object recognition

3D Viewing Pipeline

The 3D Pipeline

Object Recognition