Rigid Structure from Video Pedro M. Q. Aguiar
Outline • Other methods - limitations • Proposed approach • Problem formulation • Algorithms • Experiments • Motivation • Segmentation of 2D rigid moving objects • Inference of 3D rigid structure
Motivation • Content-based video representation (apps: compression, non-linear editing, virtual reality, etc.) • Video • Generative Video (GV) [Jasinschi & Moura, 95] • flat scenario • flat moving objects • PROBLEM: Segmentation of 2D rigid moving objects • 3D content-based representation • 3D rigid shape • 3D motion • PROBLEM: Inference of 3D rigid structure (shape and motion)
Motion segmentation in low texture: with low texture, segmentation fails! • Two-frame motion-based segmentation • No prior knowledge about shape, texture • [Diehl, 91] time-consuming algorithms! • Possible solution - smoothing • Statistical regularization [Dubuisson & Jain, 95] • Combine motion with other attributes [Bouthemy & François, 93] • Proposed approach - exploit rigidity over a set of frames • Explicit modeling of occlusion • Feasible implementation of MLE
Observation model • background • camera window • camera position • object template (modeling of occlusion) • object position • object texture • noise
Maximum Likelihood estimation • Given • set of F frames • Estimate • background texture • object texture • object template • camera motion • object motion • ML cost function over all frames and pixels • ML estimate
Minimization procedure • ML estimation: cost function quadratic in O and B, linear in T • Object and background estimates: • object: average of the observations, after registration • background: average of the observations, in the regions not occluded by the object (nonlinear in T) • Decouple the estimation of the position vectors • Motion is estimated on a frame by frame basis [Bergen et al, 92]
Minimization procedure - two-step iterative method • Replacing the object and background estimates in the ML cost function: nonlinear minimization! • Replacing only the object estimate in the ML cost function • minimize using a two-step iterative method: • solve for the background B with the template T fixed (quadratic, closed-form solution) • solve for the template T with the background B fixed (linear, closed-form solution)
Minimization procedure - segmentation matrix • Template estimate • Replacing only the object estimate in the ML cost function: linear in T! • Segmentation matrix compares two terms: • accumulated differences between each pair of co-registered frames • accumulated differences between each frame and the background • regions where the test is inconclusive with the available F frames
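The two-step iteration on the preceding slides can be sketched on synthetic data. This is a minimal illustration, not the author's implementation. It assumes the motion is already known and given as integer shifts, the camera is static, shifts wrap around, and the object positions never overlap across frames. The per-pixel median used to start the background estimate is also an assumption of this sketch. All names (`shifts`, `Q1`, `Q2`, `T_hat`, `B_hat`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 16
shifts = [(0, 0), (0, 6), (6, 0), (6, 6)]    # assumed-known object position per frame

# Synthetic ground truth: background B, object texture O, binary template T.
B = rng.uniform(0.0, 1.0, (H, W))
O = rng.uniform(0.0, 1.0, (H, W))
T = np.zeros((H, W))
T[2:6, 2:6] = 1.0

def shift(img, q):
    """Move img by q = (rows, cols), with wrap-around (a simplification)."""
    return np.roll(img, q, axis=(0, 1))

def unshift(img, q):
    """Inverse of shift: register a frame back to object coordinates."""
    return np.roll(img, (-q[0], -q[1]), axis=(0, 1))

# Observation model: the object occludes the background where the moved template is 1.
frames = [shift(T, q) * shift(O, q) + (1 - shift(T, q)) * B
          + 1e-3 * rng.standard_normal((H, W)) for q in shifts]

# Object estimate: average of the observations after registration.
registered = np.stack([unshift(I, q) for I, q in zip(frames, shifts)])
O_hat = registered.mean(axis=0)

# Initial background guess (assumption of this sketch): per-pixel median.
B_hat = np.median(np.stack(frames), axis=0)

for _ in range(3):
    # Template step, background fixed: compare the misfit to the object
    # estimate (Q1) with the misfit to the background (Q2), pixel by pixel.
    Q1 = ((registered - O_hat) ** 2).sum(axis=0)
    Q2 = sum(unshift(I - B_hat, q) ** 2 for I, q in zip(frames, shifts))
    T_hat = (Q1 < Q2).astype(float)
    # Background step, template fixed: average over the non-occluded observations.
    visible = np.stack([1 - shift(T_hat, q) for q in shifts])
    B_hat = (visible * np.stack(frames)).sum(axis=0) / np.maximum(visible.sum(axis=0), 1)
```

Here `Q1` is the sum of squared deviations from the registered average, which equals the accumulated pairwise differences between co-registered frames up to a constant factor. Pixels where `Q1` and `Q2` are nearly equal are the ones the slide calls inconclusive.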
Experiment moving object three frames from the image sequence background
Experiment background estimate Two-step method template estimate
Experiment background estimate moving objects four frames from a video sequence
3D structure from 2D video • Motivation: 3D content-based video representation (application areas go well beyond digital video) • Key step: recovery of 3D shape and 3D motion from an image sequence • Strongest cue: motion of the brightness pattern • Structure From Motion: • Step 1. Compute the 2D motion on the image plane • Step 2. Recover the 3D motion and the depth
Two-frame SFM - common problem • Two-frame SFM fails when object is far from camera • Solution: exploit rigidity - multi-frame SFM • Multi-frame Structure From Motion: • step 1. track feature points across a set of frames • step 2. recover relative depth and set of 3D positions
Factorization method • Factorization [Tomasi & Kanade, 92] (expedite method): • uses linear subspace constraints • 3D structure is estimated by factorizing a measurement matrix R whose entries are the trajectories of feature point projections • without noise, R is rank 3. An SVD is used to factorize matrix R • Multi-frame SFM - hard problem: • non-linear • large set of unknowns (due to the entire set of 3D positions) • Problems: • track a large set of features: computationally very heavy, if possible • cost of SVD: high for large number of features or frames
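The rank 3 property of the measurement matrix can be checked numerically. The sketch below assumes orthographic projection, noise-free synthetic trajectories, and translation removed by subtracting each row's mean. The point cloud, the camera axes, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
F, N = 10, 40                                # frames, feature points
S = rng.standard_normal((N, 3))              # hypothetical 3D points

rows = []
for f in range(F):
    # Random orthonormal camera axes for frame f (may include a reflection,
    # which does not matter for the rank argument).
    A, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    t = rng.standard_normal(2)               # image-plane translation
    rows.append(A[:2] @ S.T + t[:, None])    # orthographic projection, 2 x N
R = np.vstack(rows)                          # measurement matrix, 2F x N

# Remove the translation: subtract the mean trajectory value of each row.
R0 = R - R.mean(axis=1, keepdims=True)

sv = np.linalg.svd(R0, compute_uv=False)

# Factorize with the three dominant singular triplets.
U, s_, Vt = np.linalg.svd(R0, full_matrices=False)
M_hat = U[:, :3] * np.sqrt(s_[:3])           # motion factor, 2F x 3
S_hat = Vt[:3].T * np.sqrt(s_[:3])           # shape factor, N x 3
```

The factors are defined only up to an invertible 3 x 3 matrix. The metric constraints on the rotation rows, which resolve that ambiguity, are not shown here.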
Proposed approach: surface-based factorization • Describe the 3D shape by a local parameterization • Induces a parametric description for the 2D motion in the image plane • Recover the 3D shape and 3D motion parameters from the 2D motion parameters by further exploiting linear subspace constraints: • surface-based factorization • rank 1 factorization: uses a fast algorithm to compute only the largest singular value • weighted factorization: computes the weighted estimate without additional computational cost
Maximum Likelihood formulation • Local 2D motion estimation is ill-posed - aperture problem. Direct methods (rather than the two components of the motion, local depth is a single unknown): • Infer 3D structure by using the brightness change constraint between two frames [Horn & Weldon, 88] • Kalman filter to update estimates over time [J. Heel, 90] • Observations: the images in the sequence. Unknowns: object texture, 3D shape, 3D motion • Through ML, 3D structure is recovered: • Exploiting object rigidity over a set of frames • Directly from the image intensity values • So, where do SFM and factorization come from? • Minimization procedure: • Minimize with respect to the texture in terms of 3D shape and 3D motion • After replacing the texture estimate, the ML cost function depends on the 3D structure only through the 2D motion in the image plane • Estimate 3D motion by inferring SFM (factorization). Plug-in the 3D motion estimates • Minimize the ML cost function with respect to the relative depth
Observation model • Observation model texture shape 3D position • Unknowns:
Texture estimate • Texture estimate - weighted average • ML estimate
SFM as an approximation to MLE • The ML cost function depends on the 3D structure only through the 2D motion induced in the image plane (no approximations involved) • Insert the texture estimate into the cost function • 3D structure estimation: • 3D motion estimation: • Compute 2D motion • SFM: rank 1 surface-based factorization • 3D shape estimation: • Plug-in the 3D motion estimate into the ML cost function • Then, minimize with respect to the shape • (The estimates can be refined by minimizing the ML cost function in two alternate steps, • but initialization is the key problem)
Feature-based SFM Translation estimate: Define:
Rank 1 factorization • Decomposition (minimize without constraints) Define: • Normalization (computes by approximating the constraints) Define:
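The formulas on this slide did not survive extraction, so the following is only a sketch of one way a rank 1 problem can arise, not a reconstruction of the slide. It assumes orthographic projection, translation already removed, and image-plane coordinates x and y of each feature known (for example from the first frame). Projecting those known coordinates out of the measurement matrix leaves a matrix that is rank 1 without noise, and a power iteration then recovers its dominant singular triplet. The projector `P`, the function `rank1_power`, and all other names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
F, N = 8, 30
S = rng.standard_normal((N, 3))
S -= S.mean(axis=0)                          # centred coordinates x, y, z

# Stack the first two rows of an orthonormal matrix per frame (2F x 3).
M = np.vstack([np.linalg.qr(rng.standard_normal((3, 3)))[0][:2] for _ in range(F)])
R = M @ S.T                                  # measurements, translation removed

# Project out the known x, y coordinates (assumption of this sketch).
S0 = S[:, :2]
P = np.eye(N) - S0 @ np.linalg.solve(S0.T @ S0, S0.T)
R_tilde = R @ P                              # rank 1 in the noise-free case

def rank1_power(A, iters=50, seed=0):
    """Dominant singular triplet of A by power iteration."""
    v = np.random.default_rng(seed).standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        u = A @ v
        u /= np.linalg.norm(u)
        v = A.T @ u
        sigma = np.linalg.norm(v)
        v /= sigma
    return u, sigma, v

u, sigma, v = rank1_power(R_tilde)
z_proj = P @ S[:, 2]                         # depth component left after projection
cosine = abs(v @ z_proj) / np.linalg.norm(z_proj)
sv = np.linalg.svd(R_tilde, compute_uv=False)
```

The right singular vector `v` matches the projected relative depth up to sign and scale. The normalization stage on the slide, which would fix the scale from the rotation constraints, is not reproduced.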
Rank 1 factorization - experiment • three largest singular values of R • matrix is well described by its largest singular value
Rank 1 factorization - experiment • all trajectories have equal shape - it depends only on the 3D motion • the scaling factor depends on the 3D shape (relative depth) • 3D shape and 3D motion are observed in a coupled way through the feature trajectories
Surface-based factorization • Orthographic projection (easily extended to scaled-orthographic and para-perspective projections) • 2D motion in the image plane is affine Relation between the parameters: • Rank 1 factorization Multi-frame SFM: • Piecewise planar 3D shapes
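The claim that a planar patch induces affine image motion under orthography can be verified in one line. The notation below is an assumption, since the slide's own relation between the parameters was lost: $(i_{xf}, i_{yf}, i_{zf})$ and $(j_{xf}, j_{yf}, j_{zf})$ denote the first two rows of the rotation at frame $f$, $(t_{uf}, t_{vf})$ the translation, and the patch is written $z = z_0 + a x + b y$.

```latex
\begin{aligned}
u_f &= i_{xf}\,x + i_{yf}\,y + i_{zf}\,z + t_{uf}
     = (i_{xf} + a\,i_{zf})\,x + (i_{yf} + b\,i_{zf})\,y + (t_{uf} + z_0\,i_{zf}),\\
v_f &= j_{xf}\,x + j_{yf}\,y + j_{zf}\,z + t_{vf}
     = (j_{xf} + a\,j_{zf})\,x + (j_{yf} + b\,j_{zf})\,y + (t_{vf} + z_0\,j_{zf}).
\end{aligned}
```

Both coordinates are affine in $(x, y)$. The six affine parameters of each frame are linear in the shape parameters $(z_0, a, b)$ once the 3D motion is fixed, which is the kind of bilinear coupling that a factorization can exploit.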
Surface-based factorization - experiment smooth texture image motion parameters image sequence
Surface-based factorization - experiment motion shape
Weighted factorization observation noise • rank 1 factorization
Weighted factorization - experiment • feature trajectories • non-weighted estimates vs. weighted estimates: • two components of translation • six entries of the rotation matrix
ML estimate of the 3D shape • Motivation for the minimization procedure • Plug-in the 3D motion estimate into the ML cost function • Image motion: known motion parameters, unknown relative depth • Define a sequence: affine mapping that depends only on the 3D motion • Motion of the affine mapped sequence: • shape of the trajectory of s (known from 3D motion) • magnitude of the trajectory of s (unknown relative depth) • Estimating the relative depth after plugging-in the 3D motion is more constrained than estimating the image motion
Minimization procedure - multiresolution • Multiresolution continuation-type method • coarse-to-fine as more images are being taken into account • each stage minimizes the ML cost function by using a Gauss-Newton method components of the image gradient • Region R - constant relative depth z
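A one-dimensional toy version of the depth estimation step illustrates the Gauss-Newton update. It is a sketch under strong assumptions: a single region with constant relative depth `z`, a periodic smooth signal in place of image texture, and a displacement in frame f equal to `z * a[f]`, where `a[f]` stands in for the trajectory shape known from the 3D motion. The continuation is mimicked by adding one frame per stage, and no image pyramid is built. All names are hypothetical.

```python
import numpy as np

N = 128
x = np.arange(N, dtype=float)

def s(t):
    """Smooth periodic 1D 'texture' (hypothetical test signal)."""
    return np.sin(2 * np.pi * t / 128) + 0.5 * np.sin(2 * np.pi * t / 64 + 0.3)

z_true = 1.5
a = np.array([1.0, 2.0, 3.0])                # trajectory shape, assumed known
I0 = s(x)                                    # reference frame
frames = [s(x - z_true * af) for af in a]    # later frames, shifted by z * a[f]

# Periodic central difference: the image gradient of the reference frame.
grad0 = (np.roll(I0, -1) - np.roll(I0, 1)) / 2.0

def warp(img, d):
    """Return img evaluated at x - d (periodic, linear interpolation)."""
    return np.interp(x - d, x, img, period=N)

z = 0.0
for n_frames in range(1, len(a) + 1):        # continuation: one more frame per stage
    for _ in range(10):                      # Gauss-Newton iterations
        num = den = 0.0
        for af, If in zip(a[:n_frames], frames[:n_frames]):
            r = If - warp(I0, z * af)        # residual for the current depth
            J = af * warp(grad0, z * af)     # derivative of the residual w.r.t. z
            num += (J * r).sum()
            den += (J * J).sum()
        z -= num / den                       # scalar Gauss-Newton update
```

Starting each stage from the previous estimate keeps the linearization valid as frames with larger displacements are added. With a single unknown per region, the normal equations reduce to the scalar ratio `num / den`.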
Experiment • Image sequence: • and motion: • Shape
Experiment Affine mapped image sequence: • Shape:
Experiment without smoothing Multiresolution continuation-type method. Shape estimate:
Experiment • Synthesizing different views:
Application - video compression Original Compressed 317:1 Compressed 575:1 Texture patches JPEG compressed
Major contributions and extensions • Explicit modeling of occlusion • Multiframe motion segmentation algorithm (two-step) • Surface-based factorization • Rank 1 factorization • Weighted factorization • extension: contour model • extensions: • other projection models • multibody • occlusion • 3D deformable shape from a set of cameras • subspace constraints for image motion estimation • Multiresolution algorithm for direct inference of 3D shape • extension: parameterized surface model
Experiment Multiresolution continuation-type method. Shape estimate: