
Structure from Motion

ECE 847: Digital Image Processing. Stan Birchfield, Clemson University.


Presentation Transcript


  1. Structure from Motion ECE 847:Digital Image Processing Stan Birchfield Clemson University

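  2. SVD • Any $m \times n$ matrix $A$ can be decomposed as $A = U S V^T$ • where $U$ is $m \times m$ with orthonormal columns, $S$ is $m \times n$ and diagonal with entries $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_p \ge 0$, $p = \min(m, n)$, and $V$ is $n \times n$ with orthonormal columns • This is the singular value decomposition (SVD)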
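As a quick numerical illustration of the decomposition $A = U S V^T$, the following minimal sketch uses NumPy's `numpy.linalg.svd`. The matrix size and the random seed are arbitrary choices made for this example, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # m = 5, n = 3 (arbitrary example sizes)

# Full SVD: U is m x m, Vt is n x n, s holds the p = min(m, n) singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the singular values in an m x n "diagonal" matrix S.
S = np.zeros(A.shape)
S[:len(s), :len(s)] = np.diag(s)

# Multiplying the three factors back together recovers A up to rounding.
A_rebuilt = U @ S @ Vt
```

NumPy returns the singular values as a vector sorted in decreasing order and returns $V^T$ rather than $V$.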

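  3. Tall and short matrices • Tall matrix ($m > n$, $p = n$): $A = U S V^T$ with $U$ $m \times m$, $S$ $m \times n$, $V^T$ $n \times n$ • Short matrix ($m < n$, $p = m$): $A = U S V^T$ with the same factor sizes, $m \times m$, $m \times n$, $n \times n$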

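  4. Compact version • Tall matrix ($m > n$, $p = n$): in the full factorization ($m \times m$, $m \times n$, $n \times n$) the last $m - n$ columns of $U$ multiply zero rows of $S$ • Short matrix ($m < n$, $p = m$): in the full factorization ($m \times m$, $m \times n$, $n \times n$) the last $n - m$ rows of $V^T$ multiply zero columns of $S$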

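  5. Compact version (cont.) • Tall matrix ($m > n$, $p = n$): $A = U S V^T$ with $U$ $m \times n$, $S$ $n \times n$, $V^T$ $n \times n$ • Short matrix ($m < n$, $p = m$): $A = U S V^T$ with $U$ $m \times m$, $S$ $m \times m$, $V^T$ $m \times n$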
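The factor sizes of the compact version can be checked with NumPy, where `full_matrices=False` requests the compact form. The helper name `compact_svd_shapes` is made up for this sketch.

```python
import numpy as np

def compact_svd_shapes(m, n):
    """Return the shapes of U, the singular-value vector, and V^T in the compact SVD."""
    A = np.ones((m, n))                  # the contents do not matter for the shapes
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U.shape, s.shape, Vt.shape

tall = compact_svd_shapes(6, 2)    # m > n, p = n
short = compact_svd_shapes(2, 6)   # m < n, p = m
```

For the tall case the factors are $m \times n$, $n \times n$, $n \times n$ (with the middle factor stored as a length-$n$ vector); for the short case they are $m \times m$, $m \times m$, $m \times n$.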

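  6. SVD reveals structure • Let $r$ be the index of the smallest non-zero singular value, so that $\sigma_1 \ge \dots \ge \sigma_r > 0$ and $\sigma_{r+1} = \dots = \sigma_p = 0$ • Then $\operatorname{rank}(A) = r$ • Easy to show: $\operatorname{range}(A) = \operatorname{span}\{u_1, \dots, u_r\}$ and $\operatorname{null}(A) = \operatorname{span}\{v_{r+1}, \dots, v_n\}$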
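A minimal sketch of reading the rank, a basis of the range, and a basis of the null space off the SVD. The function name `rank_range_null` and the relative tolerance `tol` are assumptions of this example; in floating point a threshold has to stand in for "non-zero".

```python
import numpy as np

def rank_range_null(A, tol=1e-10):
    """Numerical rank plus orthonormal bases of range(A) and null(A)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    # Count singular values that are non-zero relative to the largest one.
    r = int(np.sum(s > tol * s[0]))
    range_basis = U[:, :r]       # first r left singular vectors
    null_basis = Vt[r:, :].T     # remaining right singular vectors
    return r, range_basis, null_basis

# The second row is twice the first, so this 3 x 3 matrix has rank 2.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])
r, range_basis, null_basis = rank_range_null(A)
```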

  7. Eigen / singular • Singular values and singular vectors work like eigenvalues and eigenvectors: $A v_i = \sigma_i u_i$ and $A^T u_i = \sigma_i v_i$ • First $p$ eigenvalues of the Gramian $A^T A$ (or $A A^T$) are squares of the singular values of $A$: $\lambda_i = \sigma_i^2$

  8. Eigen / singular If $A$ is real, then • right singular vector of $A$ is eigenvector of $A^T A$ • left singular vector of $A$ is eigenvector of $A A^T$ (If $A$ is complex, then replace $T$ with $*$, the conjugate transpose) http://en.wikipedia.org/wiki/Singular_value_decomposition
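The link between the SVD of $A$ and the eigendecomposition of $A^T A$ can be verified numerically. This is a sketch on a random real matrix (seed and sizes are arbitrary); eigenvectors are only defined up to sign, so the comparison uses the absolute value of the dot product.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending order,
# so reverse them to match the descending order of the singular values.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
eigvals = eigvals[::-1]
eigvecs = eigvecs[:, ::-1]

# |<eigenvector_i, right singular vector_i>| should be 1 for every i.
alignment = np.abs(np.sum(eigvecs * Vt.T, axis=0))
```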

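  9. Condition number • A square matrix $A$ is non-singular if and only if all of its singular values are non-zero, i.e. $\sigma_n > 0$ • In real life, noise and rounding error mean that matrices are almost never exactly singular, so testing for an exact zero is not useful • The condition number of $A$ is $C = \sigma_1 / \sigma_n$ • If $1/C$ is near the machine’s precision, then $A$ is ill-conditioned. It is dangerous to invert $A$.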
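A small sketch of the condition-number test, taking $C$ as the ratio of the largest to the smallest singular value. The function name, the two example matrices, and the factor of 100 on machine epsilon are illustrative assumptions; the slides only say "near the machine's precision".

```python
import numpy as np

def condition_number(A):
    """Ratio of the largest to the smallest singular value (inf if the smallest is 0)."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.inf if s[-1] == 0 else s[0] / s[-1]

well = np.array([[2., 0.],
                 [0., 1.]])
ill = np.array([[1., 1.],
                [1., 1. + 1e-14]])      # nearly rank 1

C_well = condition_number(well)
C_ill = condition_number(ill)

# Assumed rule of thumb: flag the matrix when 1/C is within 100x of machine epsilon.
is_ill = 1.0 / C_ill < 100 * np.finfo(float).eps
```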

  10. Norms Singular values readily yield norms: • Induced Euclidean norm: $\|A\|_2 = \sigma_1$ • Frobenius norm: $\|A\|_F = \sqrt{\sigma_1^2 + \dots + \sigma_p^2}$ (Euclidean norm, treating matrix as vector)
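Both norms can be computed from the singular values alone and compared against NumPy's built-in `numpy.linalg.norm`. The example matrix is arbitrary.

```python
import numpy as np

A = np.array([[3., 0.],
              [4., 5.]])
s = np.linalg.svd(A, compute_uv=False)

two_norm = s[0]                        # induced Euclidean norm = largest singular value
fro_norm = np.sqrt(np.sum(s ** 2))     # Frobenius norm = root of the sum of squares
```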

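  11. Least squares • Overdetermined system $A x = b$, where $A$ is $m \times n$ with $m > n$ • The set of equations is solved as $x = (A^T A)^{-1} A^T b$ (normal equations, valid when $A$ has full column rank) or $x = A^+ b$ (pseudoinverse)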

  12. Least squares (cont.) • Minimum norm least squares solution to $A x = b$, i.e., the shortest vector $x$ that achieves $\min_x \|A x - b\|$, is unique and is given by $x = A^+ b = V S^+ U^T b$ • where the pseudoinverse $S^+$ inverts all nonzero singular values and leaves the zero ones at zero
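A minimal sketch of the minimum-norm least-squares solution via the SVD, inverting only the non-zero singular values. The function name `pinv_solve`, the tolerance, and both test problems (a line fit and a rank-deficient system) are assumptions made for illustration.

```python
import numpy as np

def pinv_solve(A, b, tol=1e-10):
    """Minimum-norm least-squares solution x = V S^+ U^T b."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > tol * s.max()           # treat tiny singular values as zero
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv * (U.T @ b))

# Overdetermined example: fit y = c0 + c1 * t to four points on y = 1 + 2t.
t = np.array([0., 1., 2., 3.])
y = np.array([1., 3., 5., 7.])
A = np.column_stack([np.ones_like(t), t])
x = pinv_solve(A, y)

# Rank-deficient example: x1 + x2 = 2 has many solutions; the shortest is (1, 1).
A_def = np.array([[1., 1.],
                  [1., 1.]])
x_def = pinv_solve(A_def, np.array([2., 2.]))
```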

  13. Homogeneous system • What if $b$ is all zeros? • Then the minimum-norm solution is not interesting, because it will be $x = 0$ always • Instead, find the unit-norm solution: minimize $\|A x\|$ subject to $\|x\| = 1$ • Solution is given by $x = v_n$ (the right singular vector associated with the smallest singular value)
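A sketch of the unit-norm solution of a homogeneous system, taken as the last right singular vector. The function name and the line-fitting example (finding $a, b, c$ with $a x + b y + c = 0$ through three collinear points) are assumptions chosen for illustration.

```python
import numpy as np

def solve_homogeneous(A):
    """Unit vector x minimizing ||A x||: the right singular vector of the smallest singular value."""
    _, _, Vt = np.linalg.svd(A)        # full SVD, so Vt is n x n
    return Vt[-1]

# Points on the line y = 2x + 1, i.e. 2x - y + 1 = 0; each row is [x, y, 1].
A = np.array([[0., 1., 1.],
              [1., 3., 1.],
              [2., 5., 1.]])
x = solve_homogeneous(A)
```

The result is proportional to $(2, -1, 1)$; its overall sign is arbitrary.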

  14. Enforcing constraints Find closest matrix to $A$ in the sense of Frobenius norm that satisfies constraints exactly: • Factorize $A = U S V^T$ • Change $S$ to $S'$ to satisfy constraints • Put back together: $A' = U S' V^T$ Example: Enforce rank of $A$ by setting small singular values to zero
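The rank-enforcement example can be sketched in a few lines: factorize, zero the trailing singular values, and multiply back. The function name `enforce_rank` and the random test matrix are assumptions of this example.

```python
import numpy as np

def enforce_rank(A, k):
    """Closest matrix to A (Frobenius norm) with rank at most k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_prime = s.copy()
    s_prime[k:] = 0.0                  # change S to S' by zeroing the small singular values
    return (U * s_prime) @ Vt          # put back together: A' = U S' V^T

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))
s = np.linalg.svd(A, compute_uv=False)
A2 = enforce_rank(A, 2)
```

The Frobenius distance between `A` and `A2` equals the root of the sum of the squared discarded singular values.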

  15. Geometric interpretation of SVD

  16. Structure from motion • Structure from motion (SFM) recovers scene geometry and camera motion from a sequence of images • Could be called structure (or shape) and motion from video (SAMV), but nobody does this

  17. SFM preliminaries • Collect F frames of P points (with correspondence) • Camera coordinate system: centered at focal point and aligned with image axes (x and y in image, positive z along optical axis) • World coordinate system is coincident with first camera (arbitrary)

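  18. SFM under perspective projection • Perspective imaging (unit focal length assumed): $u_{fp} = \frac{i_f^T (x_p - t_f)}{k_f^T (x_p - t_f)}$, $v_{fp} = \frac{j_f^T (x_p - t_f)}{k_f^T (x_p - t_f)}$, where $k_f = i_f \times j_f$ is the optical axis • Equation counting: • $2FP + 1$ equations (extra equation from scale ambiguity) • $3P + 6(F-1)$ unknowns • Required: $2FP + 1 \ge 3P + 6(F-1)$. With 2 frames, need at least 5 points [Figure: world coordinate system and $f$th camera coordinate system, showing camera axes $i_f$, $j_f$, camera position $t_f$, the $p$th point $x_p$, and the vector $x_p - t_f$]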
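The counting argument can be checked by brute force: for a given number of frames $F$, find the smallest $P$ with $2FP + 1 \ge 3P + 6(F-1)$. The function name is made up for this sketch, and the count says nothing about whether the equations are independent.

```python
def min_points_perspective(F):
    """Smallest P satisfying 2FP + 1 >= 3P + 6(F - 1) for F >= 2 frames."""
    if F < 2:
        raise ValueError("need at least 2 frames")
    P = 1
    # For F >= 2 the left side grows faster in P than the right side, so this terminates.
    while 2 * F * P + 1 < 3 * P + 6 * (F - 1):
        P += 1
    return P

points_for_two_frames = min_points_perspective(2)
```

With $F = 2$ the inequality reads $4P + 1 \ge 3P + 6$, which gives $P \ge 5$.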

  19. Perspective: 2 frames of 5 points • Show graphically that with fewer than 5 points, there is always wiggle room between camera frames

  20. 8-point algorithm • Longuet-Higgins • Hartley normalization

  21. SFM under orthographic projection • Orthographic imaging ignores depth: $u_{fp} = i_f^T (x_p - t_f)$, $v_{fp} = j_f^T (x_p - t_f)$ • Equation counting: • $2FP + F$ equations (extra equation for each frame: set $z$ motion to 0) • $3P + 6(F-1)$ unknowns (same as perspective) • But equations are not independent (complicated proof omitted) • 2 frames are not enough • With 3 frames, need at least 4 points
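A short sketch of why orthographic imaging ignores depth, assuming the usual model in which the image coordinates are the dot products of the camera axes $i_f$, $j_f$ with $x_p - t_f$. The function name and all numeric values are illustrative assumptions.

```python
import numpy as np

def project_orthographic(x_p, i_f, j_f, t_f):
    """Orthographic image coordinates (u, v) of world point x_p in a camera at t_f."""
    d = x_p - t_f
    return np.array([i_f @ d, j_f @ d])

i_f = np.array([1., 0., 0.])
j_f = np.array([0., 1., 0.])
k_f = np.cross(i_f, j_f)               # optical axis
t_f = np.array([0.5, -1., 2.])
x_p = np.array([2., 3., 10.])

uv = project_orthographic(x_p, i_f, j_f, t_f)
# Sliding the point along the optical axis leaves the image coordinates unchanged.
uv_deeper = project_orthographic(x_p + 4.0 * k_f, i_f, j_f, t_f)
```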

  22. Orthography: 3 frames of 4 points • Show graphically the wiggle room with < 3 frames or < 4 points

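  23. Factorization • Recall: $u_{fp} = i_f^T (x_p - t_f)$, $v_{fp} = j_f^T (x_p - t_f)$ • Stack into measurement matrix: $W = M \bar{X}$, with $W$ $2F \times P$, $M$ $2F \times 4$ (rotation and translation of each frame), and $\bar{X}$ $4 \times P$ (points in homogeneous coordinates) • measurement = motion x shape (Tomasi and Kanade 1992)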

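  24. Subtracting centroid • Place world origin at centroid of points: $\frac{1}{P} \sum_{p=1}^{P} x_p = 0$ • Then subtract centroid of image coordinates per frame: $\tilde{u}_{fp} = u_{fp} - \frac{1}{P} \sum_{q=1}^{P} u_{fq}$, $\tilde{v}_{fp} = v_{fp} - \frac{1}{P} \sum_{q=1}^{P} v_{fq}$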

  25. Registered measurements • This leads to the registered measurement matrix: $\tilde{W} = R X$, with $\tilde{W}$ $2F \times P$, $R$ $2F \times 3$, and $X$ $3 \times P$ • registered measurement = rotation x shape
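The statement "registered measurement = rotation x shape" can be checked on synthetic data. This sketch assumes orthographic projection and stacks all $u$ rows above all $v$ rows; the variable names, sizes, and random seed are choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
F, P = 4, 6

# Shape: P points with the world origin placed at their centroid.
X = rng.standard_normal((3, P))
X -= X.mean(axis=1, keepdims=True)

i_rows, j_rows, u_rows, v_rows = [], [], [], []
for _ in range(F):
    # A random orthogonal matrix; its first two rows serve as camera axes i_f and j_f.
    axes, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    i_f, j_f = axes[0], axes[1]
    t_f = rng.standard_normal(3)
    d = X - t_f[:, None]               # x_p - t_f for every point
    i_rows.append(i_f)
    j_rows.append(j_f)
    u_rows.append(i_f @ d)
    v_rows.append(j_f @ d)

W = np.array(u_rows + v_rows)                  # 2F x P measurement matrix
W_reg = W - W.mean(axis=1, keepdims=True)      # subtract the per-frame image centroid
R = np.array(i_rows + j_rows)                  # 2F x 3 rotation matrix
```

After registration the translations drop out and `W_reg` equals `R @ X` up to rounding.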

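  26. Rank theorem • Without noise, the registered measurement matrix $\tilde{W}$ (a $2F \times 3$ matrix times a $3 \times P$ matrix) has rank at most 3 • Similarly, the unregistered measurement matrix has rank at most 4 • Use SVD to enforce rank constraint: factorize $\tilde{W} = U S V^T$, keep only the 3 largest singular values, and set $\hat{R} = U' S'^{1/2}$, $\hat{X} = S'^{1/2} V'^T$, where $U'$ ($2F \times 3$), $S'$ ($3 \times 3$), and $V'^T$ ($3 \times P$) are the retained blocks • This reduces effects of noise in a robust, stable way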
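A minimal sketch of enforcing the rank-3 constraint on a noisy measurement matrix. The data here is a synthetic rank-3 matrix plus Gaussian noise rather than real tracked features, and splitting the singular values evenly between the two factors is one common convention, assumed here.

```python
import numpy as np

rng = np.random.default_rng(3)
F, P = 5, 8

# Synthetic stand-in for a registered measurement matrix: exact rank 3 plus small noise.
W_clean = rng.standard_normal((2 * F, 3)) @ rng.standard_normal((3, P))
W_noisy = W_clean + 1e-3 * rng.standard_normal(W_clean.shape)

U, s, Vt = np.linalg.svd(W_noisy, full_matrices=False)

# Keep the three largest singular values and split them between the two factors.
R_hat = U[:, :3] * np.sqrt(s[:3])              # 2F x 3
X_hat = np.sqrt(s[:3])[:, None] * Vt[:3]       # 3 x P
W_rank3 = R_hat @ X_hat
```

The fourth and later singular values are at the noise level, far below the first three, which is what makes the truncation well defined.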

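  27. Euclidean constraints • But our choice of factorization $\tilde{W} = \hat{R} \hat{X}$ was arbitrary: $\tilde{W} = (\hat{R} Q)(Q^{-1} \hat{X})$ for any invertible $3 \times 3$ matrix $Q$ • Solution is unique only up to affine transformation • Impose metric constraints to solve for $Q$: $\hat{i}_f^T Q Q^T \hat{i}_f = 1$, $\hat{j}_f^T Q Q^T \hat{j}_f = 1$, $\hat{i}_f^T Q Q^T \hat{j}_f = 0$, where $\hat{i}_f^T$ and $\hat{j}_f^T$ are the rows of $\hat{R}$ for frame $f$ • Use least squares to find the 6 parameters of the symmetric matrix $C = Q Q^T$, then the SVD to get $Q$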

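  28. Finding Q • Note: $C$ is symmetric ($C$ has 6 DOF because overall orientation of world coordinate system is arbitrary) • Solve for $C$: each frame gives 3 linear equations in the 6 unknown entries of $C$, so stack the $3F$ equations and solve by least squares • Then use SVD to get $Q$: $C = U S U^T$, $Q = U S^{1/2}$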
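A sketch of solving for $C$ by least squares and then taking $Q$ from the SVD of $C$. It assumes the metric constraints are that each frame's two rows of $\hat{R} Q$ have unit length and are mutually orthogonal; the helper names `sym_row` and `solve_metric_upgrade` and the synthetic test data are made up for this example.

```python
import numpy as np

def sym_row(a, b):
    """Coefficients of a^T C b in the unknowns [c11, c12, c13, c22, c23, c33]."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def solve_metric_upgrade(R_hat):
    """Least-squares C = Q Q^T from the metric constraints, then Q via the SVD of C."""
    F = R_hat.shape[0] // 2
    rows, rhs = [], []
    for i, j in zip(R_hat[:F], R_hat[F:]):
        rows += [sym_row(i, i), sym_row(j, j), sym_row(i, j)]
        rhs += [1.0, 1.0, 0.0]
    c = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    C = np.array([[c[0], c[1], c[2]],
                  [c[1], c[3], c[4]],
                  [c[2], c[4], c[5]]])
    # For a symmetric positive semidefinite C, the SVD gives C = U S U^T.
    U, s, _ = np.linalg.svd(C)
    return C, U * np.sqrt(s)

# Synthetic check: distort true rotation rows by an invertible matrix, then undo it.
rng = np.random.default_rng(4)
F = 5
rots = [np.linalg.qr(rng.standard_normal((3, 3)))[0] for _ in range(F)]
R_true = np.array([r[0] for r in rots] + [r[1] for r in rots])
A_true = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
R_hat = R_true @ np.linalg.inv(A_true)

C, Q = solve_metric_upgrade(R_hat)
R_metric = R_hat @ Q
```

The recovered $Q$ matches the distortion only up to a rotation, which is why the check is made on the rows of `R_metric` rather than on $Q$ itself.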

  29. Cholesky decomposition? • Some suggest using Cholesky decomposition to get $Q$ • Problem: • Cholesky requires $C$ to be positive definite, but there is no guarantee that it is • In return, Cholesky finds a lower triangular $Q$, but we don’t care • Some say to use Higham’s eigendecomposition approach (Higham, Computing a nearest symmetric positive semidefinite matrix, 1988), but after Higham’s method there is no need to compute Cholesky anyway, so Higham’s method is basically no different from just using the SVD, which is much simpler • Solution: Use SVD instead

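  30. Algorithm summary Tomasi-Kanade factorization for SFM: • Subtract the per-frame image centroids to form the registered measurement matrix • Compute its SVD and keep the 3 largest singular values to get affine motion and shape • Solve for the symmetric matrix $C$ by least squares from the metric constraints • Get $Q$ from the SVD of $C$ • Rotation is the affine motion times $Q$; shape is $Q^{-1}$ times the affine shape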
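The whole pipeline can be sketched end to end on noise-free synthetic data. This is an illustrative sketch under stated assumptions (orthographic cameras, all points visible in all frames, $u$ rows stacked above $v$ rows), not the authors' implementation; the function names and test scene are made up.

```python
import numpy as np

def _sym_row(a, b):
    """Coefficients of a^T C b in the unknowns [c11, c12, c13, c22, c23, c33]."""
    return np.array([a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1], a[1] * b[2] + a[2] * b[1], a[2] * b[2]])

def factorize(W):
    """Factor a 2F x P measurement matrix into rotation (2F x 3) and shape (3 x P)."""
    F = W.shape[0] // 2
    # 1. Register: subtract the centroid of the image coordinates in each row.
    W_reg = W - W.mean(axis=1, keepdims=True)
    # 2. Rank-3 factorization from the SVD.
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    R_hat = U[:, :3] * np.sqrt(s[:3])
    X_hat = np.sqrt(s[:3])[:, None] * Vt[:3]
    # 3. Least squares for the 6 parameters of C = Q Q^T from the metric constraints.
    rows, rhs = [], []
    for i, j in zip(R_hat[:F], R_hat[F:]):
        rows += [_sym_row(i, i), _sym_row(j, j), _sym_row(i, j)]
        rhs += [1.0, 1.0, 0.0]
    c = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    C = np.array([[c[0], c[1], c[2]], [c[1], c[3], c[4]], [c[2], c[4], c[5]]])
    # 4. Q from the SVD of C, then apply it to the motion and its inverse to the shape.
    Uc, sc, _ = np.linalg.svd(C)
    Q = Uc * np.sqrt(sc)
    return R_hat @ Q, np.linalg.solve(Q, X_hat)

def pairwise_distances(X):
    """P x P matrix of distances between the columns of X."""
    return np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)

# Synthetic scene: P random points seen by F random orthographic cameras.
rng = np.random.default_rng(5)
F, P = 6, 10
X_true = rng.standard_normal((3, P))
u_rows, v_rows = [], []
for _ in range(F):
    axes = np.linalg.qr(rng.standard_normal((3, 3)))[0]
    d = X_true - rng.standard_normal(3)[:, None]
    u_rows.append(axes[0] @ d)
    v_rows.append(axes[1] @ d)
W = np.array(u_rows + v_rows)

R, X = factorize(W)
```

The shape is recovered only up to a rotation or reflection of the world coordinate system, so the sketch compares point-to-point distances, which are invariant to that ambiguity, instead of comparing coordinates directly.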

  31. Results

  32. More results

  33. Handling occlusion Unknown image measurement pair $(u_{fp}, v_{fp})$ in frame $f$ can be reconstructed if • point $p$ is visible in 3 other image frames • 3 other points are visible in all 4 frames

  34. Occlusion results • ping pong ball rotated 450 degrees • 84% of data hallucinated from 16%

  35. Factorization extensions • Poelman and Kanade (1994): Paraperspective • Costeira and Kanade (1995): Multibody factorization • Sturm and Triggs (1996): Perspective, fixed rank algorithm to speed computation [Figure: multibody (Costeira and Kanade) results]

  36. Non-rigid reconstruction Lorenzo Torresani and Christoph Bregler http://movement.stanford.edu/nonrig

  37. Live Dense Reconstruction with a Single Moving Camera Richard A. Newcombe and Andrew J. Davison http://www.doc.ic.ac.uk/~rnewcomb/CVPR2010

  38. Building Rome in a Day http://grail.cs.washington.edu/rome

  39. PMVS / CMVS http://grail.cs.washington.edu/software/cmvs

  40. BMVS http://sites.google.com/site/leeplus/bmvs

  41. Interactive 3D Architectural Modeling from Unordered Photo Collections http://cs.unc.edu/~ssinha/Research/sigasia08/index.html http://www.photocitygame.com/

  42. Reconstructing Building Interiors from Images http://grail.cs.washington.edu/projects/interior/

  43. KinectFusion http://research.microsoft.com/en-us/projects/surfacerecon/

  44. Debevec’s Campanile http://www.pauldebevec.com/Campanile/

  45. 3D representations Note the different 3D representations: • point clouds (Tomasi-Kanade factorization, building Rome in a day) • planes (reconstructing building interiors) • geometric primitives (Debevec’s Campanile) • voxels (KinectFusion)

  46. PTAM http://www.robots.ox.ac.uk/~gk

  47. More Non-Rigid Structure from Motion http://pages.cs.wisc.edu/~lizhang/projects/mevolve

  48. Planar parallax • See Irani

  49. Using dynamics • We have looked at batch methods. Now incremental methods. • A. Davison real-time reconstruction

  50. Texture mapping • Pollefeys [Figure panels: depth image, texture image, triangle mesh, textured 3D wireframe model]
