Camera Calibration & Stereo Reconstruction

Jinxiang Chai

3D Computer Vision • The main goal here is to reconstruct the geometry of 3D worlds.

How can we estimate the camera parameters? - Where is the camera located? - In which direction is the camera looking? - What are the focal length, projection center, and aspect ratio?

Stereo reconstruction • Given two or more images of the same scene or object, compute a representation of its shape • How can we estimate camera parameters? [Figure label: known camera viewpoints]

Camera calibration • Augmented pin-hole camera • - focal point, orientation • - focal length, aspect ratio, center, lens distortion • Classical calibration • - known 3D-to-2D correspondence • Camera calibration online resources

Camera and calibration target

Classical camera calibration • Known 3D coordinates and 2D coordinates • - known 3D points on calibration targets • - find the corresponding 2D points in the image using a feature detection algorithm

Camera parameters • Known 3D coords (X, Y, Z) and 2D coords (u, v):

$$
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \simeq
\underbrace{\begin{bmatrix} s_x & a & u_0 \\ 0 & -s_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}}_{\text{Viewport proj.}}
\cdot (\text{Perspective proj.}) \cdot (\text{View trans.}) \cdot
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
$$

• Intrinsic camera parameters (5 parameters) • Extrinsic camera parameters (6 parameters)

Camera matrix • Fold intrinsic calibration matrix K and extrinsic pose parameters (R,t) together into a camera matrix • M = K [R | t ] • (put 1 in the lower right-hand corner, leaving 11 d.o.f.)
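As a minimal sketch of this folding step, the snippet below builds M = K [R | t] with NumPy and normalizes the lower right-hand entry to 1. All numeric values (s_x, s_y, a, u_0, v_0, the rotation angle, t) are made-up illustrative assumptions, and `project` is a hypothetical helper name, not something defined in these slides.

```python
import numpy as np

# Assumed intrinsics (illustrative values only): scales sx, sy, skew a, center (u0, v0).
sx, sy, a, u0, v0 = 800.0, 800.0, 0.0, 320.0, 240.0
K = np.array([[sx,  a,   u0],
              [0.0, -sy, v0],
              [0.0, 0.0, 1.0]])

# Assumed extrinsics: a 10 degree rotation about the Y axis and a translation t.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta),  0.0, np.sin(theta)],
              [0.0,            1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.1, -0.2, 5.0])

# Fold K and (R, t) into a single 3x4 camera matrix.
M = K @ np.hstack([R, t[:, None]])
# M is only defined up to scale, so fix the lower right-hand entry to 1.
M = M / M[2, 3]

def project(M, X):
    """Project a 3D point X with camera matrix M (hypothetical helper)."""
    x = M @ np.append(X, 1.0)      # homogeneous image coordinates
    return x[:2] / x[2]            # perspective divide

uv_origin = project(M, np.zeros(3))   # image of the world origin
```

After normalization M has 12 entries with one fixed, which is where the 11 degrees of freedom come from.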

Camera matrix calibration • Directly estimate 11 unknowns in the M matrix using known 3D points (Xi,Yi,Zi) and measured feature positions (ui,vi)
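The projection equations behind this slide can be written out as follows. This is a reconstruction under the assumption that the entries of M are indexed m_{00} ... m_{23} and that the lower right-hand entry is fixed to 1, as stated on the camera matrix slide; the index convention is not taken from the slides.

$$
u_i = \frac{m_{00} X_i + m_{01} Y_i + m_{02} Z_i + m_{03}}{m_{20} X_i + m_{21} Y_i + m_{22} Z_i + 1},
\qquad
v_i = \frac{m_{10} X_i + m_{11} Y_i + m_{12} Z_i + m_{13}}{m_{20} X_i + m_{21} Y_i + m_{22} Z_i + 1}
$$

Each measured point therefore contributes two equations in the 11 unknown entries.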

Camera matrix calibration • Linear regression: • Bring denominator over, solve set of (over-determined) linear equations. How? • Least squares (pseudo-inverse) - 11 unknowns (up to scale) - 2 equations per point (homogeneous coordinates) - 6 points are sufficient
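The linear regression step can be sketched as below. Multiplying each projection equation by its denominator gives two equations per point that are linear in the 11 unknown entries of M, and the stacked system is solved by least squares. The function name `calibrate_linear`, the synthetic `M_true`, and the random test points are assumptions for illustration only.

```python
import numpy as np

def calibrate_linear(points_3d, points_2d):
    """Estimate the 3x4 camera matrix M (lower right entry fixed to 1)."""
    A, b = [], []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        # u * (m20 X + m21 Y + m22 Z + 1) = m00 X + m01 Y + m02 Z + m03
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        # v * (m20 X + m21 Y + m22 Z + 1) = m10 X + m11 Y + m12 Z + m13
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        b.extend([u, v])
    # Least-squares (pseudo-inverse) solution of the over-determined system.
    m, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(m, 1.0).reshape(3, 4)

# Synthetic check with an assumed ground-truth matrix and 8 random 3D points.
M_true = np.array([[160.0,    0.0, 64.0, 336.0],
                   [  0.0, -160.0, 48.0, 272.0],
                   [  0.0,    0.0,  0.2,   1.0]])
rng = np.random.default_rng(0)
pts3d = rng.uniform(-1.0, 1.0, size=(8, 3))
homog = (M_true @ np.hstack([pts3d, np.ones((8, 1))]).T).T
pts2d = homog[:, :2] / homog[:, 2:3]

M_est = calibrate_linear(pts3d, pts2d)
```

With noise-free, non-coplanar points the estimate should match `M_true` up to numerical precision; with real measurements it is usually treated as an initialization for a nonlinear refinement.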

Nonlinear camera calibration • Perspective projection: • The 2D coordinates of a point are just a nonlinear function of its 3D coordinates P and the camera parameters K, R, T
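One way to write this nonlinear function explicitly, assuming T plays the role of the translation t in M = K [R | t] and that P is given in world coordinates (the tilde symbols are introduced here for illustration):

$$
\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix} = K \, (R \, P + T),
\qquad
u = \frac{\tilde{u}}{\tilde{w}}, \quad v = \frac{\tilde{v}}{\tilde{w}}
$$

The division by \tilde{w} is what makes (u, v) nonlinear in both P and the camera parameters.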

Multiple calibration images • Find camera parameters which satisfy the constraints from M images, N points: • for j=1,…,M • for i=1,…,N • This can be formulated as a nonlinear optimization problem • Solve the optimization using nonlinear optimization techniques: - Gauss-Newton - Levenberg-Marquardt
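A minimal sketch of such a nonlinear refinement is given below. To keep it short it makes strong simplifying assumptions that are not from the slides: a single image, rotation fixed to the identity, image center at (0, 0), and only four unknowns (focal length f and translation tx, ty, tz). It minimizes the sum of squared reprojection errors with a basic Levenberg-Marquardt loop and a finite-difference Jacobian; all function names are hypothetical.

```python
import numpy as np

def residuals(theta, P, uv):
    """Reprojection residuals for theta = (f, tx, ty, tz), assuming R = I."""
    f, tx, ty, tz = theta
    depth = P[:, 2] + tz
    u = f * (P[:, 0] + tx) / depth
    v = f * (P[:, 1] + ty) / depth
    return np.concatenate([u - uv[:, 0], v - uv[:, 1]])

def numerical_jacobian(fun, theta, eps=1e-6):
    """Forward-difference Jacobian of fun at theta."""
    r0 = fun(theta)
    J = np.empty((r0.size, theta.size))
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps * max(1.0, abs(theta[k]))
        J[:, k] = (fun(theta + step) - r0) / step[k]
    return J

def levenberg_marquardt(fun, theta0, iters=50, lam=1e-3):
    """Basic Levenberg-Marquardt with Marquardt's diagonal scaling."""
    theta = np.asarray(theta0, dtype=float)
    cost = np.sum(fun(theta) ** 2)
    for _ in range(iters):
        r = fun(theta)
        J = numerical_jacobian(fun, theta)
        H = J.T @ J
        delta = np.linalg.solve(H + lam * np.diag(np.diag(H)), -J.T @ r)
        new_cost = np.sum(fun(theta + delta) ** 2)
        if new_cost < cost:
            # Accept the step and move toward Gauss-Newton behaviour.
            theta, cost, lam = theta + delta, new_cost, max(lam * 0.1, 1e-12)
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
    return theta

# Synthetic data: assumed true parameters and 20 random 3D points.
rng = np.random.default_rng(0)
P = rng.uniform(-1.0, 1.0, size=(20, 3))
theta_true = np.array([800.0, 0.1, -0.2, 5.0])
uv = residuals(theta_true, P, np.zeros((20, 2))).reshape(2, -1).T

theta_hat = levenberg_marquardt(lambda th: residuals(th, P, uv),
                                [700.0, 0.0, 0.0, 4.0])
```

For M images the same idea applies with one pose per image and shared intrinsics, so the residual vector simply stacks the errors over all j and i.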

Nonlinear approach • Advantages: • can solve for more than one camera pose at a time • fewer degrees of freedom than linear approach • Standard technique in photogrammetry, computer vision, computer graphics - [Tsai 87] also estimates lens distortions (freeware @ CMU) http://www.cs.cmu.edu/afs/cs/project/cil/ftp/html/v-source.html • Disadvantages: • more complex update rules • need a good initialization (recover K [R | t] from M)

How can we estimate the camera parameters?

Application: camera calibration for sports video images • Court model [Farin et al.]

Stereo matching • Given two or more images of the same scene or object as well as their camera parameters, how to compute a representation of its shape? • What are some possible representations for shapes? • depth maps • volumetric models • 3D surface models • planar (or offset) layers

Outline • Stereo matching • - Traditional stereo • - Active stereo • Volumetric stereo • - Visual hull • - Voxel coloring • - Space carving

Readings • Stereo matching • 11.1, 11.2, 11.3, 11.5 in the Szeliski book • D. Scharstein and R. Szeliski. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47(1/2/3):7-42, April-June 2002.

Stereo [Figure labels: scene point, image plane, optical center]

Stereo • Basic Principle: Triangulation • Gives reconstruction as intersection of two rays • Requires • calibration • point correspondence
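The triangulation principle can be sketched as a small linear solve. With measurement noise the two rays rarely intersect exactly, so the sketch below uses the common linear least-squares formulation (an assumption, not necessarily the method intended on the slide): each view contributes two linear constraints on the homogeneous 3D point, and the SVD gives the best solution. The camera matrices and the test point are made-up values.

```python
import numpy as np

def triangulate(M1, M2, x1, x2):
    """Linear triangulation of one point from two 3x4 camera matrices."""
    A = np.vstack([
        x1[0] * M1[2] - M1[0],   # u1 * (row 3) - (row 1) of camera 1
        x1[1] * M1[2] - M1[1],   # v1 * (row 3) - (row 2) of camera 1
        x2[0] * M2[2] - M2[0],   # same two constraints for camera 2
        x2[1] * M2[2] - M2[1],
    ])
    # The homogeneous point is the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(M, X):
    x = M @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two assumed calibrated cameras separated by a horizontal baseline of 0.5.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 4.0])
X_est = triangulate(M1, M2, project(M1, X_true), project(M2, X_true))
```

Both requirements on the slide show up directly: the camera matrices come from calibration, and (x1, x2) is the point correspondence.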

Stereo correspondence • Determine pixel correspondence • Pairs of points that correspond to the same scene point • Epipolar constraint • Reduces the correspondence problem to a 1D search along conjugate epipolar lines • Java demo: http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html [Figure labels: epipolar plane, epipolar line]

Stereo image rectification

Stereo image rectification • reproject image planes onto a common plane parallel to the line between optical centers • pixel motion is horizontal after this transformation • two homographies (3x3 transforms), one for each input image reprojection • C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.

Rectification [Figure: original image pairs; rectified image pairs]

Stereo matching algorithms • Match Pixels in Conjugate Epipolar Lines • Assume brightness constancy • This is a tough problem • Numerous approaches • A good survey and evaluation: http://www.middlebury.edu/stereo/

Your basic stereo algorithm • For each epipolar line • For each pixel in the left image • compare with every pixel on the same epipolar line in the right image • pick the pixel with minimum matching cost • Improvement: match windows • This should look familiar (cross correlation or SSD) • Can use Lucas-Kanade or discrete search (latter more common)
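A minimal sketch of this window-based discrete search is shown below, assuming rectified grayscale images (so epipolar lines are image rows) and the convention that a left pixel at column x matches the right pixel at column x - d. The function name `block_match` and its parameters are illustrative assumptions; the cost is SSD over a square window.

```python
import numpy as np

def block_match(left, right, max_disp, half_win):
    """Brute-force SSD window matching along rows of a rectified pair."""
    left = np.asarray(left, dtype=float)     # avoid integer overflow in SSD
    right = np.asarray(right, dtype=float)
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.int64)
    for y in range(half_win, h - half_win):            # each epipolar line
        for x in range(half_win, w - half_win):        # each left pixel
            patch = left[y - half_win:y + half_win + 1,
                         x - half_win:x + half_win + 1]
            best_cost, best_d = np.inf, 0
            # Only disparities whose window stays inside the right image.
            for d in range(0, min(max_disp, x - half_win) + 1):
                cand = right[y - half_win:y + half_win + 1,
                             x - d - half_win:x - d + half_win + 1]
                cost = np.sum((patch - cand) ** 2)     # SSD matching cost
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d                        # minimum-cost match
    return disp

# Synthetic check: the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right_img = rng.random((20, 40))
left_img = np.roll(right_img, 3, axis=1)
disparity = block_match(left_img, right_img, max_disp=8, half_win=2)
```

The triple loop is deliberately naive; practical implementations vectorize the cost computation, but the search structure is the same.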

Window size • Effect of window size (W = 3 vs. W = 20) • Smaller window: more detail, but more noise • Larger window: less noise, but less detail

More constraints? • We can enforce more constraints to reduce matching ambiguity • - smoothness constraints: computed disparity at a pixel should be consistent with neighbors in a surrounding window. • - uniqueness constraints: the matching needs to be bijective • - ordering constraints: e.g., computed disparity at a pixel should not be larger than the disparity of its right neighbor pixel by more than one pixel.

Stereo results • Data from University of Tsukuba • Similar results on other images without ground truth [Figure: scene; ground truth]

Results with window search [Figure: window-based matching (best window size); ground truth]

Better methods exist... • A better method: Boykov et al., Fast Approximate Energy Minimization via Graph Cuts, International Conference on Computer Vision, September 1999. [Figure label: ground truth]

More recent development • High-Quality Single-Shot Capture of Facial Geometry [SIGGRAPH 2010, project website] • - capture high-fidelity facial geometry from multiple cameras • - pairwise stereo reconstruction between neighboring cameras • - hallucinate facial details

More recent development • High Resolution Passive Facial Performance Capture [SIGGRAPH 2010, project website] • - capture dynamic facial geometry from multiple video cameras • - spatial stereo reconstruction for every frame • - build temporal correspondences across the entire sequence

Stereo reconstruction pipeline • Steps • Calibrate cameras • Rectify images • Compute disparity • Estimate depth • What will cause errors? • Camera calibration errors • Poor image resolution • Occlusions • Violations of brightness constancy (specular reflections) • Large motions • Low-contrast image regions
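The last pipeline step can be sketched with the standard relation for a rectified pair, Z = f · B / d, where f is the focal length in pixels, B the baseline between the optical centers, and d the disparity. The function name and the sample numbers are illustrative assumptions.

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline):
    """Convert a disparity map to depth for a rectified stereo pair."""
    disparity = np.asarray(disparity, dtype=float)
    depth = np.full(disparity.shape, np.inf)   # zero disparity: point at infinity
    valid = disparity > 0
    depth[valid] = focal_px * baseline / disparity[valid]
    return depth

# Assumed example: f = 500 px, baseline = 0.1 (in scene units).
depth = depth_from_disparity([0, 10, 20], focal_px=500.0, baseline=0.1)
```

Because depth is inversely proportional to disparity, a one-pixel disparity error matters far more for distant points than for nearby ones, which is one reason poor image resolution appears in the error list.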

Active stereo with structured light • Project “structured” light patterns onto the object • simplifies the correspondence problem [Figure labels: camera 1, camera 2, projector; Li Zhang’s one-shot stereo]

Active stereo with structured light

Laser scanning • Optical triangulation • Project a single stripe of laser light • Scan it across the surface of the object • This is a very precise version of structured light scanning • Digital Michelangelo Project • http://graphics.stanford.edu/projects/mich/
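Optical triangulation with a laser stripe can be sketched as a ray-plane intersection: every lit pixel defines a viewing ray, and the 3D point is where that ray meets the known laser plane. The sketch assumes a calibrated camera at the origin with intrinsics K and a laser plane n · X = c expressed in camera coordinates; the function name and all values are illustrative assumptions.

```python
import numpy as np

def intersect_ray_plane(K, pixel, plane_n, plane_c):
    """Intersect the viewing ray of a pixel with the plane n . X = c."""
    # Ray direction through the pixel (camera at the origin).
    d = np.linalg.solve(K, np.array([pixel[0], pixel[1], 1.0]))
    # Points on the ray are s * d; solve n . (s * d) = c for s.
    s = plane_c / (plane_n @ d)
    return s * d

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
# Assumed laser plane x = 0.3 and a lit pixel at (395, 265).
point = intersect_ray_plane(K, (395.0, 265.0),
                            plane_n=np.array([1.0, 0.0, 0.0]), plane_c=0.3)
```

Scanning the stripe across the surface changes the plane for each frame, so each frame yields one profile of 3D points.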

Laser scanned models The Digital Michelangelo Project, Levoy et al.

Recent development • Capturing dynamic facial movement using active stereo [project website] • - use synchronized video cameras and structured light projectors to capture dynamic facial geometry • - use a generic 3D model to build temporal correspondences across the entire sequence
