This presentation covers concepts of calibration, triangulation, and structure from motion for robotic perception, including epipolar geometry, stereo matching, camera calibration methods, and more. Learn about camera parameters, depth information, and stereo vision applications.
Calibration, Triangulation and Structure from Motion Thursday, 10th October 2019 Samyak Datta Disclaimer: These slides have been borrowed from Derek Hoiem, Kristen Grauman and the Coursera course on Robotic Perception (by Jianbo Shi and Kostas Danilidis). Derek adapted many slides from Lana Lazebnik, Silvio Saverese, Steve Seitz, and took many figures from Hartley & Zisserman.
Reminder • PS3 is due on 10/16 • Project proposals were due last night • 2 project late days • We’ll assign project teams to TAs
Topics overview • Intro • Features & filters • Grouping & fitting • Multiple views and motion • Homography and image warping • Image formation • Epipolar geometry and stereo • Structure from motion • Recognition • Video processing Slide credit: Kristen Grauman
Multi-View Geometry • Last time: Epipolar geometry • Relates cameras from two positions • Last time: Scan-line stereo • Finding corresponding points in a pair of stereo images • Structure from motion • Multiple views
Recap: Epipolar geometry • Baseline – line connecting the two camera centers • Epipoles – intersections of the baseline with the image planes; projections of the other camera center • Epipolar plane – plane containing the baseline (1D family) • Epipolar lines – intersections of an epipolar plane with the image planes (always come in corresponding pairs)
Key idea: Epipolar constraint • Potential matches for x have to lie on the corresponding epipolar line l′ • Potential matches for x′ have to lie on the corresponding epipolar line l
Basic stereo matching algorithm • For each pixel in the first image • Find corresponding epipolar line in the right image • Search along epipolar line and pick the best match • Triangulate the matches to get depth information • Simplest case: epipolar lines are scanlines • When does this happen?
Simplest Case: Parallel images • Image planes of cameras are parallel to each other and to the baseline • Camera centers are at same height • Focal lengths are the same • Then, epipolar lines fall along the horizontal scan lines of the images
Simplest Case: Parallel images • Epipolar constraint: R = I, t = (T, 0, 0) • The y-coordinates of corresponding points are the same
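In this parallel setup, R = I and t = (T, 0, 0) give an essential matrix E = [t]× R whose epipolar constraint x′ᵀ E x = 0 reduces to y = y′. A minimal pure-Python check; the baseline length T and the image points are made-up values:

```python
T = 0.5  # hypothetical baseline length
# E = [t]_x R with R = I and t = (T, 0, 0)
E = [[0, 0, 0],
     [0, 0, -T],
     [0, T, 0]]

def epipolar_residual(x, xp):
    """Evaluate the epipolar constraint x'^T E x for homogeneous points."""
    Ex = [sum(E[i][j] * x[j] for j in range(3)) for i in range(3)]
    return sum(xp[i] * Ex[i] for i in range(3))

# Same y-coordinate: constraint satisfied (residual 0)
print(epipolar_residual([2.0, 1.0, 1.0], [1.3, 1.0, 1.0]))  # 0.0
# Different y-coordinates: constraint violated
print(epipolar_residual([2.0, 1.0, 1.0], [1.3, 1.5, 1.0]))  # -0.25
```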
Basic stereo matching algorithm • If necessary, rectify the two stereo images to transform epipolar lines into scanlines • For each pixel x in the first image • Find corresponding epipolar scanline in the right image • Search the scanline and pick the best match x’
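The scanline search above can be sketched as a brute-force SSD comparison. This toy version works on 1-D intensity rows with a made-up window size; real matchers add rectification, cost aggregation, and sub-pixel refinement:

```python
def best_match(left_row, right_row, x, w=3):
    """Find the column in right_row whose (2w+1)-wide window best
    matches the window centred at x in left_row, by minimising the
    SSD (sum of squared differences) along the scanline."""
    ref = left_row[x - w : x + w + 1]
    best_x, best_cost = None, float("inf")
    for cx in range(w, len(right_row) - w):
        cand = right_row[cx - w : cx + w + 1]
        cost = sum((a - b) ** 2 for a, b in zip(ref, cand))
        if cost < best_cost:
            best_x, best_cost = cx, cost
    return best_x

# Toy rows: the bright feature sits at x=5 on the left, x=3 on the
# right, i.e. disparity x - x' = 2
left_row  = [0, 0, 0, 0, 10, 50, 10, 0, 0, 0]
right_row = [0, 0, 10, 50, 10, 0, 0, 0, 0, 0]
print(best_match(left_row, right_row, 5, w=2))  # 3
```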
Correspondence search • Slide a window along the right epipolar scanline and compare its contents with the reference window in the left image • Matching cost: SSD or normalized correlation; pick the disparity with the best cost
Correspondence search (figure): SSD matching cost along the right scanline
Correspondence search (figure): normalized correlation matching cost along the right scanline
Failures of correspondence search • Occlusions and repetition • Textureless surfaces
Depth from disparity • Similar triangles relating the scene point X (at depth z), the camera centers O and O′, the focal length f, and the baseline B give: disparity = x − x′ = B · f / z • Disparity is inversely proportional to depth
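The relation disparity = B·f/z can be inverted to read off depth; the focal length and baseline below are hypothetical values:

```python
def depth_from_disparity(f_pixels, baseline, disparity):
    """Invert disparity = B * f / z to recover depth z."""
    return f_pixels * baseline / disparity

# Hypothetical rig: f = 700 px, baseline B = 0.1 m
print(depth_from_disparity(700, 0.1, 35))  # large disparity -> near (2 m)
print(depth_from_disparity(700, 0.1, 7))   # small disparity -> far (10 m)
```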
Topics overview • Class Intro • Features & filters • Grouping & fitting • Multiple views and motion • Homography and image warping • Image formation • Epipolar geometry and stereo • Calibration, Triangulation and Structure from motion • Recognition • Video processing Slide credit: Kristen Grauman
General Case: Stereo with calibrated cameras Given image pair, R, T, and camera parameters K, K′: • Detect some features • Compute essential matrix E • Match features using the epipolar and other constraints • Triangulate for 3D structure Not yet discussed: • How to calibrate a camera? • How to triangulate a 3D point?
Today • Calibration • Triangulation • Structure from Motion • Example stereo and SfM applications
Calibrating the Camera Method 1: Weak calibration • Use corresponding points between two images to estimate the fundamental matrix F (e.g. with the 8-point algorithm) • The only available approach if the camera is not accessible, e.g. for internet images
Calibrating the Camera Method 2: Use an object (calibration grid) with known geometry • Correspond image points to 3D points • Get least squares solution (or non-linear solution)
Linear method • Rearrange the projection equations into homogeneous Ax = 0 form • Solve using linear least squares (the SVD of A)
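As a sketch, this linear (DLT) calibration can be written with an SVD: each 3D↔2D correspondence contributes two rows of A, and the solution of Ax = 0 is the right singular vector with the smallest singular value. The intrinsics in the synthetic check are made-up values:

```python
import numpy as np

def calibrate_dlt(pts3d, pts2d):
    """Estimate the 3x4 projection matrix P (up to scale) from n >= 6
    3D-2D correspondences by solving the homogeneous system A p = 0."""
    A = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    # Least-squares solution: right singular vector of A with the
    # smallest singular value
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)

# Synthetic check with a hypothetical camera: project known 3D points,
# then recover P from the correspondences
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
P_true = K @ np.hstack([np.eye(3), [[0.], [0.], [5.]]])
pts3d = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
         (1, 1, 0), (1, 0, 1), (0, 1, 1)]
pts2d = []
for X in pts3d:
    p = P_true @ np.array([*X, 1.0])
    pts2d.append((p[0] / p[2], p[1] / p[2]))
P_est = calibrate_dlt(pts3d, pts2d)
```

In practice the 2D points are measured corners of a calibration grid; here the recovered P agrees with the true one up to the overall scale ambiguity.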
Calibrating the Camera • We have matrix P, which also depends on the camera pose. How do we get K? • The first 3x3 submatrix M of P is the product M = KR of an upper-triangular matrix K and a rotation matrix R • Factor M into KR using an RQ decomposition (commonly implemented via QR) • t = K⁻¹(p₁₄, p₂₄, p₃₄)ᵀ
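A sketch of this factorization, assuming M is the left 3x3 of P. NumPy has no RQ routine, so one can be built from QR on a row/column-reversed matrix; the intrinsics and rotation in the check are invented values:

```python
import numpy as np

def rq_decompose(M):
    """Factor M = K R with K upper triangular (positive diagonal)
    and R a rotation, via QR of a row/column-reversed matrix."""
    Prev = np.flipud(np.eye(3))          # permutation reversing row order
    Q1, R1 = np.linalg.qr((Prev @ M).T)
    K = Prev @ R1.T @ Prev               # upper triangular
    R = Prev @ Q1.T                      # orthogonal
    D = np.diag(np.sign(np.diag(K)))     # force positive diagonal on K
    return K @ D, D @ R                  # (K D)(D R) = K R = M since D^2 = I

# Hypothetical intrinsics and a rotation about the z-axis
K_true = np.array([[800., 2., 320.], [0., 790., 240.], [0., 0., 1.]])
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])
K, R = rq_decompose(K_true @ R_true)
```

Because the RQ factorization with a positive-diagonal triangular factor is unique, K and R match the matrices used to build M.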
Calibration with linear method • Advantages: easy to formulate and solve • Disadvantages: • Doesn't directly give the individual camera parameters • Can't impose constraints, such as a known focal length • Doesn't minimize reprojection error • Non-linear methods are preferred: • Define error as the difference between projected and measured points • Minimize error using Newton's method or another non-linear optimizer • Use the linear method for initialization
Calibrating the Camera Method 3: Use vanishing points • Find vanishing points corresponding to orthogonal directions (figure: two vanishing points on the vanishing line, plus a vertical vanishing point at infinity)
Today • Calibration • Triangulation • Structure from Motion • Example stereo and SfM applications
General Case: Stereo with calibrated cameras Given image pair, R, T, and camera parameters K, K′: • Detect some features • Compute essential matrix E • Match features using the epipolar and other constraints • Triangulate for 3D structure
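The triangulation step can be sketched with the standard linear (DLT) method: each image observation contributes two rows, and the homogeneous 3D point is the null vector of the stacked system. The camera matrices and test point below are made-up values:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear triangulation: recover the 3D point X from its
    projections x1, x2 under cameras P1, P2 by solving A X = 0."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Hypothetical rectified pair: identical intrinsics, second camera
# translated along the x-axis
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), [[-0.5], [0.], [0.]]])
X_true = np.array([0.2, -0.1, 3.0])
p1 = P1 @ np.append(X_true, 1.0)
p2 = P2 @ np.append(X_true, 1.0)
X_est = triangulate(P1, P2, p1[:2] / p1[2], p2[:2] / p2[2])
```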
Projective structure from motion • Given: m images of n fixed 3D points: xᵢⱼ = Pᵢ Xⱼ , i = 1, …, m, j = 1, …, n • Problem: estimate the m projection matrices Pᵢ and the n 3D points Xⱼ from the mn corresponding 2D points xᵢⱼ Slides from Lana Lazebnik
Reconstruction ambiguity • If the two cameras are calibrated, then reconstruction is possible up to a similarity transform • With no calibration info, we face projective ambiguity: a projective transformation Q of the structure and camera matrices does not change the measured points, i.e., x = P X = (P Q⁻¹)(Q X) Further reading: Hartley & Zisserman p. 264-266
Projective structure from motion • Given: m images of n fixed 3D points: xᵢⱼ = Pᵢ Xⱼ , i = 1, …, m, j = 1, …, n • Problem: estimate m projection matrices Pᵢ and n 3D points Xⱼ from the mn corresponding points xᵢⱼ • With no calibration info, cameras and points can only be recovered up to a 4x4 projective transformation Q: X → QX, P → PQ⁻¹ • We can solve for structure and motion when 2mn ≥ 11m + 3n − 15 • For two cameras, at least 7 points are needed
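The counting argument can be checked mechanically: each of the mn image points supplies 2 equations, each camera has 11 degrees of freedom (P up to scale), each point has 3, and the projective ambiguity removes 15. A small sketch:

```python
def min_points(m):
    """Smallest n satisfying 2*m*n >= 11*m + 3*n - 15, i.e. enough
    observations (2 per image point) to cover the unknowns
    (11 per camera + 3 per point - 15 for projective ambiguity)."""
    n = 1
    while 2 * m * n < 11 * m + 3 * n - 15:
        n += 1
    return n

print(min_points(2))  # 7 -- two cameras need at least 7 points
print(min_points(3))  # 6
```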
Sequential structure from motion • Initialize motion from two images using the fundamental matrix • Initialize structure by triangulation • For each additional view: • Determine the projection matrix of the new camera using all the known 3D points that are visible in its image – calibration • Refine and extend structure: compute new 3D points, re-optimize existing points that are also seen by this camera – triangulation