Shape(Structure) From X

Shape (Structure) From X • Addresses the recovery of 2.5D surface shape (scene depth) from 2D images • Shape from motion • Shape from stereo • Shape from monocular cues (shading, vanishing points, defocus, texture, ...)

Chapter 7: Scene Reconstruction from Motion

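3D motion estimation • 3D motion estimation means estimating an object's 3D motion parameters and its 3D structure from a sequence of 2D images. • SFM (Structure From Motion)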

3D rigid-body motion
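
For reference, 3D rigid-body motion is usually written as a rotation followed by a translation. The symbols below ($R$, $T$, $\mathbf{X}_k$) are assumed notation and may differ from the symbols used on the original slide:

```latex
\mathbf{X}_{k+1} = R\,\mathbf{X}_k + T,
\qquad \mathbf{X}_k = (X_k, Y_k, Z_k)^{\top},
\qquad R^{\top}R = I,\quad \det R = 1
```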

Small-angle rotation • The small-angle rotation matrix
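
A common first-order form of the small-angle rotation matrix is sketched below. The angle names $\omega_x, \omega_y, \omega_z$ (rotations about the three axes) are an assumption made for illustration. For small angles, $\cos\omega \approx 1$ and $\sin\omega \approx \omega$, and products of angles are neglected:

```latex
R \;\approx\; I + [\boldsymbol{\omega}]_{\times} \;=\;
\begin{pmatrix}
 1         & -\omega_z &  \omega_y \\
 \omega_z  &  1        & -\omega_x \\
-\omega_y  &  \omega_x &  1
\end{pmatrix}
```

The linearized matrix has only three free parameters and makes the projected motion linear in the unknowns.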

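1. 3D motion estimation based on orthographic projection • With the small-angle rotation matrix there are 6 unknowns, so 3 point correspondences are needed (each correspondence gives two equations)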

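3D motion estimation based on orthographic projection • Aizawa, 1989 1. From the point correspondences and the depth estimates, compute the motion parameters. 2. From the motion parameters and the point correspondences, re-estimate the depth. Alternate the two steps until the estimates stabilize.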
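
One pass of the two alternating steps can be sketched as follows. This is a minimal illustration, not Aizawa's implementation. It assumes an orthographic camera and the small-angle model x' = x - wz*y + wy*Z + Tx, y' = y + wz*x - wx*Z + Ty; the parameter names (wx, wy, wz, Tx, Ty) and the synthetic data are hypothetical. The depth is exact here, so a single pass recovers the motion; in practice the depth starts as a rough guess and the two steps are repeated.

```matlab
% Synthetic correspondences (illustrative values only).
N  = 20;
x  = linspace(-1, 1, N)';            % image coordinates in frame k
y  = sin(3*x);
Z  = 2 + 0.5*cos(5*x);               % current depth estimates
p_true = [0.01; -0.02; 0.03; 0.1; -0.05];   % [wx wy wz Tx Ty]'

% Frame k+1 under the assumed small-angle orthographic model.
xp = x - p_true(3)*y + p_true(2)*Z + p_true(4);
yp = y + p_true(3)*x - p_true(1)*Z + p_true(5);

% Step 1: motion parameters from correspondences and depth
% (linear least squares, two equations per point).
A = [zeros(N,1)  Z           -y  ones(N,1)   zeros(N,1);
     -Z          zeros(N,1)   x  zeros(N,1)  ones(N,1)];
b = [xp - x; yp - y];
p_est = A \ b;

% Step 2: depth from motion parameters and correspondences
% (least squares per point over the two equations).
a1 = xp - x + p_est(3)*y - p_est(4);     % equals  wy * Z
a2 = yp - y - p_est(3)*x - p_est(5);     % equals -wx * Z
Z_est = (p_est(2)*a1 - p_est(1)*a2) / (p_est(1)^2 + p_est(2)^2);
```

Under orthographic projection the depth and the rotation angles wx, wy appear only as products, so in practice they are recovered only up to a common scale.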

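3D motion estimation based on orthographic projection • Bozdagi, 1994: randomly perturb the depth estimates to escape local optima 1. From the point correspondences and the depth estimates, compute the motion parameters. 2. From the motion parameters and the depth estimates, estimate the coordinates of the corresponding points. 3. Compute the estimation error.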

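3D motion estimation based on orthographic projection (continued) 4. Randomly perturb the depth estimates. 5. Repeat the steps above. Experiments show that this improved iterative algorithm converges well to the correct motion parameters even when the initial depth values are off by 50%.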

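2. 3D motion estimation based on the perspective projection model • Normalize the focal length to F = 1, then divide the numerator and the denominator by Z_k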
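
The step described above can be written out as follows. The notation is assumed: $r_{ij}$ are the entries of the rotation matrix, $T = (T_x, T_y, T_z)^{\top}$ is the translation, and $(x, y) = (X_k/Z_k,\ Y_k/Z_k)$ are the image coordinates for $F = 1$:

```latex
x' = \frac{X_{k+1}}{Z_{k+1}}
   = \frac{r_{11}x + r_{12}y + r_{13} + T_x / Z_k}
          {r_{31}x + r_{32}y + r_{33} + T_z / Z_k},
\qquad
y' = \frac{Y_{k+1}}{Z_{k+1}}
   = \frac{r_{21}x + r_{22}y + r_{23} + T_y / Z_k}
          {r_{31}x + r_{32}y + r_{33} + T_z / Z_k}
```

The translation and the depth enter only through the ratios $T/Z_k$, so scaling both by the same factor leaves the images unchanged.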

3. 3D motion estimation based on epipolar lines • Geometric meaning of the epipolar equation

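3D motion estimation based on epipolar lines • Epipolar equation: starting from the 3D rigid-body motion, introduce a skew-symmetric matrix: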
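
A standard way to carry out this derivation is sketched below. The notation is assumed: $\mathbf{X}' = R\mathbf{X} + T$ is the rigid motion, and $\mathbf{x}, \mathbf{x}'$ are the image points for $F = 1$.

```latex
[T]_{\times} =
\begin{pmatrix}
 0   & -T_z &  T_y \\
 T_z &  0   & -T_x \\
-T_y &  T_x &  0
\end{pmatrix},
\qquad [T]_{\times}\mathbf{v} = T \times \mathbf{v}

[T]_{\times}\mathbf{X}' = [T]_{\times}R\,\mathbf{X}
\quad (\text{since } T \times T = 0)

\mathbf{X}'^{\top}[T]_{\times}R\,\mathbf{X} = 0
\;\;\Longrightarrow\;\;
\mathbf{x}'^{\top} E\,\mathbf{x} = 0,
\qquad E = [T]_{\times}R
```

The last step divides by the depths $Z'$ and $Z$, which is why the constraint holds directly on image coordinates.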

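3D motion estimation based on epipolar lines • Essential matrix: multiplying the translation vector by any nonzero scalar leaves the epipolar equation satisfied, so the recovered motion parameters are determined only up to a scale factor.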
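
The scale ambiguity can be checked numerically. The sketch below assumes the usual construction E = [T]x * R and normalized image coordinates (F = 1); the rotation, translation and 3D points are made-up values, and `skew` is a helper defined here for illustration.

```matlab
skew = @(t) [0 -t(3) t(2); t(3) 0 -t(1); -t(2) t(1) 0];

th = 0.1;
R  = [cos(th) -sin(th) 0; sin(th) cos(th) 0; 0 0 1];   % rotation about the Z axis
T  = [0.5; -0.2; 0.1];
E  = skew(T) * R;

% Four 3D points in the first view and their positions after the motion.
X1 = [0.3 -0.4 0.1 0.7; 0.2 0.5 -0.6 -0.1; 4 5 6 3];
X2 = R*X1 + repmat(T, 1, size(X1, 2));

% Normalized image coordinates (divide by depth).
x1 = X1 ./ repmat(X1(3,:), 3, 1);
x2 = X2 ./ repmat(X2(3,:), 3, 1);

% Epipolar residual x2' * E * x1 for each correspondence.
res   = sum(x2 .* (E * x1), 1);
% Same residual with the translation scaled by 3: still zero.
res_s = sum(x2 .* ((skew(3*T) * R) * x1), 1);

s = svd(E);   % two equal singular values and one zero
```

Both residual vectors vanish up to rounding, which is the numerical counterpart of the statement that only the direction of the translation can be recovered.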

Applications of the essential matrix • It can be used to • simplify the matching problem • detect false matches

3D motion estimation based on epipolar lines • Epipolar equation

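3D motion estimation based on epipolar lines • Properties of the essential matrix • Unknowns of the epipolar equation: 5 independent unknown parameters, which matches the number of degrees of freedom of the motion, namely three rotational and two translational (equivalently, three translational components defined up to a scale factor).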

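(1) Estimating motion from the essential matrix 1. Compute the essential matrix: use 8 or more point correspondences to obtain a stable solution (in practice the RANSAC algorithm is often used)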

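(1) Estimating motion from the essential matrix 1. Compute the essential matrix • In reality, instead of solving $A\mathbf{e} = 0$, we seek the $\mathbf{e}$ (the 9 stacked entries of E) that minimizes $\|A\mathbf{e}\|$ subject to $\|\mathbf{e}\| = 1$, i.e. the eigenvector of $A^{\top}A$ with the smallest eigenvalue.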

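8-point algorithm • To enforce that E is of rank 2, E is replaced by the E' that minimizes $\|E - E'\|$ subject to $\det E' = 0$. • This is achieved by SVD. Let $E = U \Sigma V^{\top}$, where $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$, and let $\Sigma' = \mathrm{diag}(\sigma_1, \sigma_2, 0)$; then $E' = U \Sigma' V^{\top}$ is the solution.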

8-point algorithm

    % x1, x2: 3xN homogeneous image points with x2' * E * x1 = 0
    npts = size(x1, 2);
    % Build the constraint matrix
    A = [x2(1,:)'.*x1(1,:)'   x2(1,:)'.*x1(2,:)'   x2(1,:)' ...
         x2(2,:)'.*x1(1,:)'   x2(2,:)'.*x1(2,:)'   x2(2,:)' ...
         x1(1,:)'             x1(2,:)'             ones(npts,1)];
    [U, D, V] = svd(A);
    % Extract the matrix from the column of V
    % corresponding to the smallest singular value.
    E = reshape(V(:,9), 3, 3)';
    % Enforce the rank-2 constraint
    [U, D, V] = svd(E);
    E = U*diag([D(1,1) D(2,2) 0])*V';

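Problem with the 8-point algorithm • For pixel coordinates of order ~100, the product columns of the data matrix are of order ~10000, the linear columns of order ~100, and the last column is 1. • Orders of magnitude difference between the columns of the data matrix → least squares yields poor results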

Normalized 8-point algorithm • Transform the image coordinates to ~[-1,1]x[-1,1]; for a 700x500 image: (0,0) → (-1,-1), (700,0) → (1,-1), (0,500) → (-1,1), (700,500) → (1,1), image centre → (0,0) • Normalized least squares yields good results

Normalized 8-point algorithm

    % Normalise each point set first (x1, x2: 3xN homogeneous points)
    [x1, T1] = normalise2dpts(x1);
    [x2, T2] = normalise2dpts(x2);
    npts = size(x1, 2);
    A = [x2(1,:)'.*x1(1,:)'   x2(1,:)'.*x1(2,:)'   x2(1,:)' ...
         x2(2,:)'.*x1(1,:)'   x2(2,:)'.*x1(2,:)'   x2(2,:)' ...
         x1(1,:)'             x1(2,:)'             ones(npts,1)];
    [U, D, V] = svd(A);
    E = reshape(V(:,9), 3, 3)';
    [U, D, V] = svd(E);
    E = U*diag([D(1,1) D(2,2) 0])*V';
    % Denormalise
    E = T2'*E*T1;

Normalization

    function [newpts, T] = normalise2dpts(pts)
        % pts: 3xN homogeneous points with pts(3,:) == 1
        c = mean(pts(1:2,:), 2);             % Centroid
        newp(1,:) = pts(1,:) - c(1);         % Shift origin to centroid.
        newp(2,:) = pts(2,:) - c(2);
        meandist = mean(sqrt(newp(1,:).^2 + newp(2,:).^2));
        scale = sqrt(2)/meandist;            % Mean distance from origin becomes sqrt(2)
        T = [scale   0      -scale*c(1)
             0       scale  -scale*c(2)
             0       0       1         ];
        newpts = T*pts;
    end

RANSAC

    repeat
        select a minimal sample (8 matches)
        compute solution(s) for E
        determine inliers
    until the confidence computed from (#inliers, #samples) reaches 95%, or too many iterations
    compute E based on all inliers
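
The 95% stopping rule is commonly turned into a bound on the number of samples. The sketch below uses the standard RANSAC formula; the inlier ratio `w` is an assumed value chosen for illustration, and in a real loop it would be re-estimated from the best inlier count found so far.

```matlab
% Number of samples N such that, with probability p, at least one
% sample of s matches contains no outlier, given inlier ratio w.
p = 0.95;   % desired confidence (the 95% used in the loop above)
s = 8;      % minimal sample size for the 8-point algorithm
w = 0.5;    % assumed inlier ratio (illustrative)
N = ceil(log(1 - p) / log(1 - w^s));
```

Because w is raised to the power s, the required number of samples grows quickly as the inlier ratio drops, which is why minimal samples are preferred inside the loop.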

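Estimating motion from the essential matrix 2. Estimate the motion parameters • T: from the properties of the essential matrix • R: from $E = [T]_{\times}R$, where $[T]_{\times}$ is the skew-symmetric matrix of T, once T is known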
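
One property that yields T is that T' * E = 0 whenever E = [T]x * R, so T spans the left null space of E. The sketch below illustrates this on synthetic data; the rotation, the translation and the `skew` helper are assumptions made for the example, and it is not necessarily the procedure shown on the original slide.

```matlab
skew = @(t) [0 -t(3) t(2); t(3) 0 -t(1); -t(2) t(1) 0];

th = 0.2;
R  = [1 0 0; 0 cos(th) -sin(th); 0 sin(th) cos(th)];   % rotation about the X axis
T  = [1; 2; -2] / 3;                                   % unit-length translation
E  = skew(T) * R;

% The left singular vector for the zero singular value spans the
% left null space of E, i.e. the direction of T.
[U, S, V] = svd(E);
T_est = U(:, 3);

% The sign is ambiguous; it is fixed here with the ground truth purely
% for the demonstration.
if T_est' * T < 0
    T_est = -T_est;
end
```

With real data the sign (and the choice among the candidate rotations) is usually resolved by requiring reconstructed points to have positive depth in both views.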

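(2) Estimating motion directly from the epipolar equation • Ideally: $\mathbf{x}_k'^{\top} [T]_{\times} R\, \mathbf{x}_k = 0$ for every correspondence $k$. • Because of errors, solve instead: $\min_{R,T} \sum_k \left(\mathbf{x}_k'^{\top} [T]_{\times} R\, \mathbf{x}_k\right)^2$ subject to $\|T\| = 1$.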

Structure from motion

Structure from motion • Structure from motion: automatic recovery of camera motion and scene structure from two or more images. • It is a self-calibration technique, also called automatic camera tracking or matchmoving. • The camera viewpoints are unknown.

Coordinate transformation • Model-view Transformation: World Coordinate System → Camera Coordinate System

World coordinate system → camera coordinate system • Camera parameters: the Camera Projection Matrix combines the Intrinsic and Extrinsic parameters
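
In the usual pinhole notation, the projection matrix factors into intrinsic and extrinsic parts as sketched below. The symbols $K$, $f_x$, $f_y$, $s$, $c_x$, $c_y$ and $\lambda$ are assumed here and are not taken from the slide:

```latex
\lambda
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
= \underbrace{K}_{\text{intrinsic}}\;
  \underbrace{\begin{pmatrix} R & \mathbf{t} \end{pmatrix}}_{\text{extrinsic}}
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix},
\qquad
K =
\begin{pmatrix}
f_x & s   & c_x \\
0   & f_y & c_y \\
0   & 0   & 1
\end{pmatrix}
```

Here $(R, \mathbf{t})$ maps world coordinates to camera coordinates, $(c_x, c_y)$ is the principal point, $s$ is the skew, and $\lambda$ is the depth of the point in the camera frame.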

For the same scene point, take one image

For the same scene point, take two images with the same camera settings

For the same scene point, take three images with the same camera settings

Figure: three views of the scene, Image 1 (R1,t1), Image 2 (R2,t2), Image 3 (R3,t3)

Figure: Point 1, Point 2, Point 3 observed in Image 1, Image 2, Image 3 • Same Camera, Same Setting = Same

Triangulation • Figure: three views of the scene, Image 1 (R1,t1), Image 2 (R2,t2), Image 3 (R3,t3)
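
A minimal sketch of linear triangulation from two views is given below. It uses the common homogeneous (DLT-style) formulation, which may differ from the method intended on the slide. The camera matrices and the 3D point are made-up values, and the intrinsics are taken as the identity for simplicity.

```matlab
% Two assumed cameras P = [R t] in normalized coordinates (K = I).
th = 0.1;
R2 = [cos(th) 0 sin(th); 0 1 0; -sin(th) 0 cos(th)];   % rotation about the Y axis
P1 = [eye(3) zeros(3,1)];
P2 = [R2 [-1; 0; 0.1]];

% Ground-truth homogeneous 3D point and its two projections.
X_true = [0.4; -0.3; 5; 1];
x1 = P1 * X_true;  x1 = x1 / x1(3);
x2 = P2 * X_true;  x2 = x2 / x2(3);

% Each view contributes two linear equations in X:
%   u * (P(3,:) * X) - P(1,:) * X = 0
%   v * (P(3,:) * X) - P(2,:) * X = 0
A = [x1(1)*P1(3,:) - P1(1,:);
     x1(2)*P1(3,:) - P1(2,:);
     x2(1)*P2(3,:) - P2(1,:);
     x2(2)*P2(3,:) - P2(2,:)];

% The solution is the right singular vector of the smallest singular value.
[U, S, V] = svd(A);
X_est = V(:, 4) / V(4, 4);
```

A third view would simply add two more rows to A; with noisy measurements the same SVD then gives a least-squares estimate.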

Camera intrinsic parameter matrix • Principal point offset • especially when images are cropped (Internet) • Skew • Radial distortion (due to optics of the lens)

Steps • Images → Points: Structure from Motion • Points → More points: Multiple View Stereo • Points → Meshes: Model Fitting • Meshes → Models: Texture Mapping • Images → Models: Image-based Modeling


Pipeline • Structure from Motion (SFM) → Multi-view Stereo (MVS)

Two-view Reconstruction
