Learn how SVD reveals matrix structure in digital image processing and its applications. Discover insights on condition numbers, norms, least squares, enforcing constraints, and geometric interpretation. Dive into Structure from Motion (SFM) concepts. Explore SFM under perspective and orthographic projection, factorization rotation, and rank theorems. Be guided on Euclidean constraints, finding Q, and the role of Cholesky decomposition.
Structure from Motion ECE 847: Digital Image Processing Stan Birchfield Clemson University
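SVD • Any m×n matrix $A$ can be decomposed as $A = U \Sigma V^T$ • where $U$ is an m×m orthogonal matrix, $\Sigma$ is an m×n diagonal matrix holding the singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_p \ge 0$ with $p = \min(m, n)$, and $V$ is an n×n orthogonal matrix • This is the singular value decomposition (SVD)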
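The decomposition can be checked numerically. The following is a minimal sketch using NumPy's `numpy.linalg.svd`; the matrix size and random seed are arbitrary choices for illustration, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # an arbitrary 5x3 (tall) matrix

# full_matrices=True returns U (m x m) and Vt (n x n);
# s holds the p = min(m, n) singular values in decreasing order
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the singular values in an m x n diagonal matrix Sigma
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

A_rebuilt = U @ Sigma @ Vt
print(U.shape, Sigma.shape, Vt.shape)
print(np.allclose(A, A_rebuilt))
```

NumPy returns the transpose of V directly, so the product is `U @ Sigma @ Vt` rather than `U @ Sigma @ V.T`.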
Tall and short matrices • Tall matrix (m > n, p = n): $A$ (m×n) $= U$ (m×m) $\Sigma$ (m×n) $V^T$ (n×n) • Short matrix (m < n, p = m): $A$ (m×n) $= U$ (m×m) $\Sigma$ (m×n) $V^T$ (n×n)
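Compact version • Tall matrix (m > n, p = n): in $A = U \Sigma V^T$ with sizes m×m, m×n, n×n, the last m − n rows of $\Sigma$ are zero, so the last m − n columns of $U$ are not needed • Short matrix (m < n, p = m): the last n − m columns of $\Sigma$ are zero, so the last n − m rows of $V^T$ are not needed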
Compact version (cont.) • Tall matrix (m > n, p = n): $A = U \Sigma V^T$ with $U$ m×n, $\Sigma$ n×n, $V^T$ n×n • Short matrix (m < n, p = m): $A = U \Sigma V^T$ with $U$ m×m, $\Sigma$ m×m, $V^T$ m×n
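The compact shapes can be confirmed with NumPy, where `full_matrices=False` requests the compact form. This is a small sketch; the helper name `compact_svd_shapes` and the matrix sizes are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
tall = rng.standard_normal((6, 4))    # m > n, so p = n = 4
short = rng.standard_normal((3, 7))   # m < n, so p = m = 3

def compact_svd_shapes(A):
    """Return the shapes of U, Sigma, V^T in the compact SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # The compact factors still reproduce A exactly (up to round-off)
    assert np.allclose(A, U @ np.diag(s) @ Vt)
    return U.shape, (len(s), len(s)), Vt.shape

tall_shapes = compact_svd_shapes(tall)
short_shapes = compact_svd_shapes(short)
print(tall_shapes)
print(short_shapes)
```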
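SVD reveals structure • Let r be the index of the smallest non-zero singular value, with the singular values sorted in decreasing order: $\sigma_1 \ge \dots \ge \sigma_r > 0 = \sigma_{r+1} = \dots = \sigma_p$ • Then $\operatorname{rank}(A) = r$ • Easy to show: $\operatorname{null}(A) = \operatorname{span}\{v_{r+1}, \dots, v_n\}$ and $\operatorname{range}(A) = \operatorname{span}\{u_1, \dots, u_r\}$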
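As an illustration, the sketch below assumes the standard reading of this slide: r equals the rank of A, the right singular vectors beyond index r span the null space, and the first r left singular vectors span the range. The example matrix and the tolerance rule are illustrative choices.

```python
import numpy as np

# Rank-2 example: row 3 is row 1 + row 2, and row 4 is (4*row 1 - row 2) / 3
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0],
              [0.0, 1.0, 2.0]])

U, s, Vt = np.linalg.svd(A)

# Treat singular values below a round-off-sized threshold as zero
tol = max(A.shape) * np.finfo(float).eps * s[0]
r = int(np.sum(s > tol))

range_basis = U[:, :r]     # orthonormal basis of the column space
null_basis = Vt[r:].T      # orthonormal basis of the null space

print(r)
print(np.allclose(A @ null_basis, 0.0, atol=1e-10))
```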
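Eigen / singular • Singular values and singular vectors work like eigenvalues and eigenvectors: $A v_i = \sigma_i u_i$ and $A^T u_i = \sigma_i v_i$ • First p eigenvalues of the Gramian $A^T A$ (or $A A^T$) are squares of the singular values of A: $\lambda_i = \sigma_i^2$, $i = 1, \dots, p$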
Eigen / singular If A is real, then • right singular vector of A is eigenvector of $A^T A$ • left singular vector of A is eigenvector of $A A^T$ (If A is complex, then replace T with *, conjugate transpose) http://en.wikipedia.org/wiki/Singular_value_decomposition
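The relationship between the singular values of A and the eigenvalues of the Gramian can be verified numerically. This is a sketch on a random real matrix (so the singular values are distinct); the sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending order,
# so reverse them to match the descending order of the singular values
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
eigvals = eigvals[::-1]
eigvecs = eigvecs[:, ::-1]

squares_match = np.allclose(eigvals, s**2)

# Eigenvectors are defined only up to sign, so compare |v_i . e_i| with 1
alignment = np.abs(np.sum(Vt.T * eigvecs, axis=0))
vectors_match = np.allclose(alignment, 1.0)

print(squares_match, vectors_match)
```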
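Condition number • A square matrix A is non-singular if and only if all of its singular values are non-zero: $\sigma_n > 0$ • In practice, noise and round-off mean a matrix is almost never exactly singular, so the useful question is how close to singular it is. • The condition number of A is $C = \sigma_1 / \sigma_n$ • If 1/C is near the machine’s precision, then A is ill-conditioned. It is dangerous to invert A.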
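A small sketch of the condition-number test follows. The function name `condition_number` and the two example matrices are illustrative; `np.finfo(float).eps` is used here as the machine precision.

```python
import numpy as np

def condition_number(A):
    """Ratio of the largest to the smallest singular value of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.inf if s[-1] == 0 else s[0] / s[-1]

well = np.array([[2.0, 0.0],
                 [0.0, 1.0]])
# Nearly singular: the two rows differ only in the 14th decimal place
ill = np.array([[1.0, 1.0],
                [1.0, 1.0 + 1e-14]])

eps = np.finfo(float).eps
C_well = condition_number(well)
C_ill = condition_number(ill)

print(C_well, 1.0 / C_well, eps)
print(C_ill, 1.0 / C_ill, eps)
```

For the second matrix, 1/C is within a couple of orders of magnitude of `eps`, which is the situation the slide warns about.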
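Norms Singular values readily yield norms: • Induced Euclidean norm: $\|A\|_2 = \max_{\|x\| = 1} \|Ax\| = \sigma_1$ • Frobenius norm: $\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2} = \sqrt{\sigma_1^2 + \dots + \sigma_p^2}$ (Euclidean norm, treating matrix as vector)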
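Both norms can be read off the singular values and compared with NumPy's built-in `numpy.linalg.norm`. This is a brief sketch on an arbitrary random matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 6))

s = np.linalg.svd(A, compute_uv=False)

induced_2_norm = s[0]                      # largest singular value
frobenius_norm = np.sqrt(np.sum(s**2))     # root of the sum of squared singular values

print(induced_2_norm, np.linalg.norm(A, 2))
print(frobenius_norm, np.linalg.norm(A, 'fro'))
```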
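Least squares • Find $x$ that minimizes $\|Ax - b\|^2$, where $A$ is m×n with m > n (more equations than unknowns) • The set of equations is solved as $A^T A x = A^T b$ (the normal equations), or $x = (A^T A)^{-1} A^T b$ when $A$ has full column rank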
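Least squares (cont.) • Minimum norm least squares solution to Ax=b, i.e., the shortest vector x that achieves $\min_x \|Ax - b\|$, is unique and is given by $x = V \Sigma^{+} U^T b$ • where the pseudoinverse $\Sigma^{+}$ inverts all nonzero singular values: its diagonal entries are $1/\sigma_i$ if $\sigma_i > 0$, and 0 otherwise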
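The minimum-norm least squares solution can be sketched with the compact SVD as below. The function name `min_norm_least_squares`, the relative tolerance used to decide which singular values count as nonzero, and the rank-deficient example are all assumptions made for illustration.

```python
import numpy as np

def min_norm_least_squares(A, b, rel_tol=1e-12):
    """Shortest x minimizing ||Ax - b||, via the SVD pseudoinverse."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Invert only the singular values that are numerically nonzero
    cutoff = rel_tol * s[0]
    s_inv = np.zeros_like(s)
    nonzero = s > cutoff
    s_inv[nonzero] = 1.0 / s[nonzero]
    return Vt.T @ (s_inv * (U.T @ b))

# Rank-deficient example: the third column is the sum of the first two,
# so infinitely many x give the same residual
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])

x = min_norm_least_squares(A, b)
null_vector = np.array([1.0, 1.0, -1.0])   # A @ null_vector == 0

print(x)
print(A.T @ (A @ x - b))       # residual is orthogonal to the columns of A
print(x @ null_vector)         # minimum-norm solution has no null-space component
```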
Homogeneous system • What if b is all zeros? • Then the minimum-norm solution is not interesting, because it will always be x=0 • Instead, find the unit-norm solution: $\min_x \|Ax\|$ subject to $\|x\| = 1$ • Solution is given by $x = v_n$ (the right singular vector associated with the smallest singular value)
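As a concrete case, the sketch below fits a line a·x + b·y + c = 0 to points by solving a homogeneous system. The line-fitting example and the function name `unit_norm_null_solution` are illustrative and not taken from the slides.

```python
import numpy as np

def unit_norm_null_solution(A):
    """Unit vector x minimizing ||Ax||: the last right singular vector."""
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]

# Points lying exactly on y = 2x + 1, i.e. on the line 2x - y + 1 = 0
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = 2.0 * xs + 1.0
A = np.column_stack([xs, ys, np.ones_like(xs)])   # each row is [x, y, 1]

line = unit_norm_null_solution(A)

print(line)                    # proportional to (2, -1, 1), up to sign
print(np.linalg.norm(line))    # unit norm by construction
print(A @ line)                # close to zero for every point
```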
Enforcing constraints Find closest matrix to A in the sense of Frobenius norm that satisfies constraints exactly: • Factorize $A = U \Sigma V^T$ • Change $\Sigma$ to $\Sigma'$ to satisfy constraints • Put back together: $A' = U \Sigma' V^T$ Example: Enforce rank of A by setting small singular values to zero
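The rank example can be sketched as follows. The function name `closest_rank_k` and the test matrix are illustrative; the check on the approximation error uses the standard result that the Frobenius error equals the root sum of squares of the discarded singular values.

```python
import numpy as np

def closest_rank_k(A, k):
    """Closest matrix to A (Frobenius norm) with rank at most k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_trunc = s.copy()
    s_trunc[k:] = 0.0                  # zero out the small singular values
    return U @ np.diag(s_trunc) @ Vt, s

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 5))

A_k, s = closest_rank_k(A, 2)
error = np.linalg.norm(A - A_k, 'fro')
expected_error = np.sqrt(np.sum(s[2:]**2))

print(np.linalg.matrix_rank(A_k))
print(error, expected_error)
```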
Structure from motion • Structure from motion (SFM) recovers • scene geometry • camera motion from a sequence of images • Could be called structure (or shape) and motion from video (SAMV), but nobody does this
SFM preliminaries • Collect F frames of P points (with correspondence) • Camera coordinate system: centered at focal point and aligned with image axes (x and y in image, positive z along optical axis) • World coordinate system is coincident with first camera (arbitrary)
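SFM under perspective projection [Figure: the pth point $x_p$ in the world coord. sys., the fth camera coord. sys. with origin $t_f$ and image axes $i_f$, $j_f$, and the vector $x_p - t_f$ from the camera to the point] • Perspective imaging (focal length normalized to 1, $k_f$ along the optical axis): $u_{fp} = \dfrac{i_f^T (x_p - t_f)}{k_f^T (x_p - t_f)}$, $v_{fp} = \dfrac{j_f^T (x_p - t_f)}{k_f^T (x_p - t_f)}$ • Equation counting: • 2FP+1 equations (extra equation from scale ambiguity) • 3P + 6(F-1) unknowns • Required: 2FP+1 >= 3P + 6(F-1). With 2 frames, need at least 5 points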
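The two-frame requirement follows directly from the counting inequality. The worked steps below only rearrange the inequality stated on the slide with F = 2.

```latex
\begin{align*}
2FP + 1 &\ge 3P + 6(F - 1) \\
F = 2:\quad 4P + 1 &\ge 3P + 6 \\
P &\ge 5
\end{align*}
```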
Perspective: 2 frames of 5 points • Show graphically that with fewer than 5 points, there is always wiggle room between camera frames
8-point algorithm • Longuet-Higgins • Hartley normalization
SFM under orthographic projection • Orthographic imaging ignores depth: $u_{fp} = i_f^T (x_p - t_f)$, $v_{fp} = j_f^T (x_p - t_f)$ • Equation counting: • 2FP+F equations (extra eqn. for each frame: set z motion to 0) • 3P + 6(F-1) unknowns (same as perspective) • But equations are not independent (complicated proof omitted) • 2 frames are not enough • With 3 frames, need at least 4 points
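The statement that orthographic imaging ignores depth can be illustrated with a short sketch. It assumes the usual orthographic model in which each image coordinate is the dot product of a camera image axis with the point position relative to the camera; the function name `orthographic_project` and all numeric values are made up for illustration.

```python
import numpy as np

def orthographic_project(X, i_f, j_f, t_f):
    """Project 3D points (columns of X) onto the image axes i_f, j_f."""
    D = X - t_f[:, None]          # points relative to the camera origin
    return i_f @ D, j_f @ D

rng = np.random.default_rng(6)
X = rng.standard_normal((3, 8))   # eight 3D points as columns

# Camera aligned with the world axes; k_f is the optical axis
i_f = np.array([1.0, 0.0, 0.0])
j_f = np.array([0.0, 1.0, 0.0])
k_f = np.array([0.0, 0.0, 1.0])
t_f = np.array([0.5, -0.2, -4.0])

u1, v1 = orthographic_project(X, i_f, j_f, t_f)
# Move the camera along its optical axis: the image does not change
u2, v2 = orthographic_project(X, i_f, j_f, t_f + 3.0 * k_f)

depth_ignored = np.allclose(u1, u2) and np.allclose(v1, v2)
print(depth_ignored)
```

This is why the equation count adds one constraint per frame: motion along the optical axis is unobservable, so it is fixed to zero.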
Orthography: 3 frames of 4 points • Show graphically the wiggle room with < 3 frames or < 4 points
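Factorization rotation • Recall: $u_{fp} = i_f^T (x_p - t_f)$, $v_{fp} = j_f^T (x_p - t_f)$ • Stack into measurement matrix: $W = M \bar{S}$, where $W$ (2F×P) holds the image measurements, $M$ (2F×4) holds the camera motion, and $\bar{S}$ (4×P) holds the points in homogeneous coordinates (Tomasi and Kanade 1992) measurement = motion x shape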
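Subtracting centroid • Place world origin at centroid of points: $\frac{1}{P} \sum_{p=1}^{P} x_p = 0$ • Then subtract centroid of image coordinates per frame: $\tilde{u}_{fp} = u_{fp} - \frac{1}{P} \sum_{q=1}^{P} u_{fq} = i_f^T x_p$ and $\tilde{v}_{fp} = v_{fp} - \frac{1}{P} \sum_{q=1}^{P} v_{fq} = j_f^T x_p$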
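Registered measurements • This leads to the registered measurement matrix: $\tilde{W} = R S$, where $\tilde{W}$ is 2F×P, the rotation matrix $R$ is 2F×3 (rows $i_f^T$ and $j_f^T$), and the shape matrix $S$ is 3×P (columns $x_p$) registered measurement = rotation x shape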
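The registration step can be checked on synthetic data. The sketch below assumes the orthographic model (image coordinates are dot products of the image axes with the point position relative to the camera) and stacks all u rows above all v rows, as in Tomasi and Kanade; the helper `random_rotation` and all sizes are illustrative.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation matrix from the QR factorization of a random matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]        # flip one axis to get determinant +1
    return Q

rng = np.random.default_rng(4)
F, P = 5, 12
X = rng.standard_normal((3, P))
X -= X.mean(axis=1, keepdims=True)        # world origin at the centroid of the points

u_rows, v_rows, i_rows, j_rows = [], [], [], []
for f in range(F):
    Rf = random_rotation(rng)
    i_f, j_f = Rf[0], Rf[1]               # image axes of frame f
    t_f = rng.standard_normal(3)          # camera position of frame f
    u_rows.append(i_f @ (X - t_f[:, None]))
    v_rows.append(j_f @ (X - t_f[:, None]))
    i_rows.append(i_f)
    j_rows.append(j_f)

W = np.vstack(u_rows + v_rows)                 # 2F x P measurement matrix
W_reg = W - W.mean(axis=1, keepdims=True)      # subtract per-frame image centroid
R = np.vstack(i_rows + j_rows)                 # 2F x 3 rotation matrix

print(np.allclose(W_reg, R @ X))
print(np.linalg.matrix_rank(W), np.linalg.matrix_rank(W_reg))
```

Subtracting the centroid removes the translation, which is why the registered matrix factors into a rotation part and a shape part alone.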
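Rank theorem • Without noise, the registered measurement matrix $\tilde{W} = R S$ has rank at most 3, because $R$ is 2F×3 and $S$ is 3×P • Similarly, the unregistered measurement matrix has rank at most 4 • Use SVD to enforce rank constraint: $\tilde{W} \approx U' \Sigma' V'^T$, keeping only the 3 largest singular values and setting the rest to 0, then $\hat{R} = U' \Sigma'^{1/2}$ and $\hat{S} = \Sigma'^{1/2} V'^T$ • This reduces the effects of noise in a robust, stable way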
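The rank-3 truncation can be sketched on noisy synthetic data. This example assumes the split of the singular values as a square root on each factor, following Tomasi and Kanade; the noise level, sizes, and helper names are illustrative.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation matrix from the QR factorization of a random matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

def rank3_factorization(W_reg):
    """Factor a registered measurement matrix into R_hat (2F x 3) and S_hat (3 x P)."""
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    root = np.sqrt(s[:3])                 # keep the 3 largest singular values
    R_hat = U[:, :3] * root               # U' Sigma'^(1/2)
    S_hat = root[:, None] * Vt[:3]        # Sigma'^(1/2) V'^T
    return R_hat, S_hat

rng = np.random.default_rng(7)
F, P = 6, 20
X = rng.standard_normal((3, P))
X -= X.mean(axis=1, keepdims=True)
rots = [random_rotation(rng) for _ in range(F)]
R = np.vstack([Rf[0] for Rf in rots] + [Rf[1] for Rf in rots])

W_clean = R @ X                                   # exact rank-3 registered matrix
noise = 0.01 * rng.standard_normal(W_clean.shape)
W_noisy = W_clean + noise

R_hat, S_hat = rank3_factorization(W_noisy)
W_rank3 = R_hat @ S_hat

print(np.linalg.matrix_rank(W_noisy), np.linalg.matrix_rank(W_rank3))
print(np.linalg.norm(W_rank3 - W_clean), np.linalg.norm(noise))
```

The recovered factors generally differ from the true rotation and shape by an invertible 3×3 matrix, which is the ambiguity addressed on the next slide.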
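Euclidean constraints • But our choice was arbitrary: $\tilde{W} = \hat{R} \hat{S} = (\hat{R} Q)(Q^{-1} \hat{S})$ for any invertible 3×3 matrix Q • Solution is unique only up to affine transformation • Impose metric constraints to solve for Q: $\hat{i}_f^T Q Q^T \hat{i}_f = 1$, $\hat{j}_f^T Q Q^T \hat{j}_f = 1$, $\hat{i}_f^T Q Q^T \hat{j}_f = 0$ • Use least squares to find the 6 parameters of the symmetric matrix $C = Q Q^T$, then SVD to get Q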
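Finding Q Note: C is symmetric (C has 6 DOF because the overall orientation of the world coord. sys. is arbitrary) Solve for C: each frame gives three equations that are linear in the entries of C, namely $\hat{i}_f^T C \hat{i}_f = 1$, $\hat{j}_f^T C \hat{j}_f = 1$, and $\hat{i}_f^T C \hat{j}_f = 0$; stack them over all frames and solve by least squares. Then use SVD to get Q: $C = U \Sigma U^T$, $Q = U \Sigma^{1/2}$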
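One way to carry out this step is sketched below. It assumes the metric constraints are that each frame's two image axes have unit length and are mutually orthogonal after applying Q, and that the affine rotation estimate stores the i rows in its first F rows and the j rows in its last F rows. The names `sym_row` and `find_Q` and the synthetic test setup are hypothetical.

```python
import numpy as np

def sym_row(a, b):
    """Coefficients of (c11, c12, c13, c22, c23, c33) in a^T C b for symmetric C."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def find_Q(R_hat):
    """Solve the metric constraints for C = Q Q^T, then take Q = U sqrt(Sigma)."""
    F = R_hat.shape[0] // 2
    G, d = [], []
    for i_f, j_f in zip(R_hat[:F], R_hat[F:]):
        G += [sym_row(i_f, i_f), sym_row(j_f, j_f), sym_row(i_f, j_f)]
        d += [1.0, 1.0, 0.0]
    c, *_ = np.linalg.lstsq(np.array(G), np.array(d), rcond=None)
    C = np.array([[c[0], c[1], c[2]],
                  [c[1], c[3], c[4]],
                  [c[2], c[4], c[5]]])
    U, s, _ = np.linalg.svd(C)     # for symmetric positive definite C, C = U diag(s) U^T
    return U @ np.diag(np.sqrt(s)), C

# Synthetic check: distort true rotation rows by a known invertible matrix A_true
rng = np.random.default_rng(8)
F = 6
rows_i, rows_j = [], []
for _ in range(F):
    Rf, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal rows
    rows_i.append(Rf[0])
    rows_j.append(Rf[1])
R_true = np.vstack(rows_i + rows_j)
A_true = np.eye(3) + 0.3 * rng.standard_normal((3, 3))
R_hat = R_true @ np.linalg.inv(A_true)

Q, C = find_Q(R_hat)
R_metric = R_hat @ Q               # equals R_true up to a global rotation

print(np.allclose(C, A_true @ A_true.T))
print(np.sum(R_metric**2, axis=1))                      # all close to 1
print(np.sum(R_metric[:F] * R_metric[F:], axis=1))      # all close to 0
```

With noisy data the least squares estimate of C is not guaranteed to be positive definite, which is the concern raised about Cholesky on the next slide.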
Cholesky decomposition? • Some suggest using Cholesky decomposition to get Q • Problem: • Cholesky requires C to be positive definite, but there is no guarantee that it is • In return, Cholesky finds a lower triangular Q, but we don’t care • Some say to use Higham’s eigendecomposition approach (Higham, Computing a nearest symmetric positive semidefinite matrix, 1988), but after Higham’s method there is no need to compute Cholesky anyway; so Higham’s method is basically no different from just using the SVD, which is much simpler • Solution: Use SVD instead
Algorithm summary Tomasi-Kanade factorization for SFM:
Handling occlusion Unknown image measurement pair (ufp,vfp) in frame f can be reconstructed if • p is visible in 3 image frames • 3 other points are visible in 4 frames
Occlusion results ping pong ball rotated 450 degrees 84% of data hallucinated from 16%
Factorization extensions • Poelman and Kanade (1994): Paraperspective • Costeira and Kanade (1995): Multibody factorization • Sturm and Triggs (1996): Perspective, fixed rank algorithm to speed computation [Figure: multibody (Costeira and Kanade) results]
Non-rigid reconstruction Lorenzo Torresani and Christoph Bregler http://movement.stanford.edu/nonrig
Live Dense Reconstruction with a Single Moving Camera Richard A. Newcombe and Andrew J. Davison http://www.doc.ic.ac.uk/~rnewcomb/CVPR2010
Building Rome in a Day http://grail.cs.washington.edu/rome
PMVS / CMVS http://grail.cs.washington.edu/software/cmvs
BMVS http://sites.google.com/site/leeplus/bmvs
Interactive 3D Architectural Modeling from Unordered Photo Collections http://cs.unc.edu/~ssinha/Research/sigasia08/index.html http://www.photocitygame.com/
Reconstructing Building Interiors from Images http://grail.cs.washington.edu/projects/interior/
KinectFusion http://research.microsoft.com/en-us/projects/surfacerecon/
Debevec’s Campanile http://www.pauldebevec.com/Campanile/
3D representations Note the different 3D representations: • point clouds (Tomasi-Kanade factorization, building Rome in a day) • planes (reconstructing building interiors) • geometric primitives (Debevec’s Campanile) • voxels (KinectFusion)
PTAM http://www.robots.ox.ac.uk/~gk
More Non-Rigid Structure from Motion http://pages.cs.wisc.edu/~lizhang/projects/mevolve
Planar parallax • See Irani
Using dynamics • We have looked at batch methods. Now incremental methods. • A. Davison real-time reconstruction
Texture mapping Depth image Texture image Triangle mesh • Pollefeys Textured 3D Wireframe model