Rigid Structure from Video

Pedro M. Q. Aguiar

Outline
• Motivation
• Segmentation of 2D rigid moving objects
• Inference of 3D rigid structure
• Other methods - limitations
• Proposed approach
• Problem formulation
• Algorithms
• Experiments

Motivation
Content-based video representation. Applications: compression, non-linear editing, virtual reality, etc.
• Video
• Generative Video (GV) [Jasinschi & Moura, 95]
  • flat scenario
  • flat moving objects
  • PROBLEM: segmentation of 2D rigid moving objects
• 3D content-based representation
  • 3D rigid shape
  • 3D motion
  • PROBLEM: inference of 3D rigid structure (shape and motion)

Motion segmentation in low texture
With low texture, segmentation fails!
• Two-frame motion-based segmentation
• No prior knowledge about shape, texture
• [Diehl, 91]: time-consuming algorithms!
• Possible solution - smoothing
  • Statistical regularization [Dubuisson & Jain, 95]
  • Combine motion with other attributes [Bouthemy & François, 93]
• Proposed approach - exploit rigidity over a set of frames
  • Explicit modeling of occlusion
  • Feasible implementation of MLE

Observation model
[Equation with annotated terms: background, camera window, camera position, object template (modeling of occlusion), object position, object texture, noise]
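The annotated terms suggest a layered image formation: the object texture, cut out by the template, occludes the background, and noise is added. The sketch below is a minimal illustration of that idea, not the slide's equation. It assumes integer translations applied as cyclic shifts and a camera window as large as the background; `synthesize_frame` and all of its parameter names are hypothetical.

```python
import numpy as np

def synthesize_frame(background, obj_texture, template, obj_shift, cam_shift,
                     noise_std, rng):
    """Layered frame: object pixels occlude the background, then noise is added.

    Assumptions (not from the slides): integer translations implemented as
    cyclic shifts, and a camera window that covers the whole background.
    """
    # Camera motion moves the background in the image.
    bg = np.roll(background, cam_shift, axis=(0, 1))
    # Object texture and template (1 on the object, 0 elsewhere) move together.
    tex = np.roll(obj_texture, obj_shift, axis=(0, 1))
    mask = np.roll(template, obj_shift, axis=(0, 1))
    clean = mask * tex + (1 - mask) * bg
    return clean + noise_std * rng.standard_normal(clean.shape)

rng = np.random.default_rng(0)
B = rng.uniform(size=(16, 32))          # background texture
O = rng.uniform(size=(16, 32))          # object texture
T = np.zeros((16, 32))
T[5:11, 2:8] = 1.0                      # object template
frame = synthesize_frame(B, O, T, obj_shift=(0, 10), cam_shift=(0, 0),
                         noise_std=0.0, rng=rng)
```

With zero noise, the shifted object region of `frame` reproduces the object texture and every other pixel reproduces the background.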

Maximum Likelihood estimation
• Given:
  • set of F frames
• Estimate:
  • background texture
  • object texture
  • object template
  • camera motion
  • object motion
• ML cost function over all frames and pixels
• ML estimate

Minimization procedure
• ML estimation: the cost function is quadratic in O and B
• Object and background estimates:
  • object texture: average of the observations, after registration (linear in T)
  • background texture: average of the observations, in the regions not occluded by the object (nonlinear in T)
• Decouple the estimation of the position vectors
• Motion is estimated on a frame-by-frame basis [Bergen et al, 92]

Minimization procedure - two-step iterative method
• Replacing both the object and the background estimates in the ML cost function: nonlinear minimization!
• Replacing only the object estimate in the ML cost function:
  • minimize using a two-step iterative method:
    • solve for the background with the template fixed (quadratic, closed-form solution)
    • solve for the template with the background fixed (linear, closed-form solution)

Minimization procedure - segmentation matrix
• Replacing only the object estimate in the ML cost function: linear in T!
• Segmentation matrix, built from two terms:
  • accumulated differences between each pair of co-registered frames
  • accumulated differences between each frame and the background
• Template estimate: a pixel-wise test on the segmentation matrix
• There are regions where the test is inconclusive with the available F frames
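The two-step alternation and the segmentation-matrix test can be sketched as follows. This is an illustration under strong simplifying assumptions that are not in the slides: a static camera, known integer object translations, and cyclic image borders. The function `two_step_segmentation` and its variable names are hypothetical. The term `q1` is the spread of the co-registered frames around their mean, which equals the accumulated pairwise differences up to a factor of 1/(2F).

```python
import numpy as np

def two_step_segmentation(frames, shifts, n_iter=4):
    """Alternate background and template estimates for a translating object.

    frames: array-like (F, H, W); shifts: integer (dy, dx) object positions.
    Assumptions (not from the slides): static camera, known object motion,
    cyclic image borders.
    """
    frames = np.asarray(frames, dtype=float)
    undo = [(-dy, -dx) for dy, dx in shifts]
    # Co-register every frame on the object.
    reg = np.stack([np.roll(I, u, axis=(0, 1)) for I, u in zip(frames, undo)])
    obj = reg.mean(axis=0)                  # object texture estimate
    q1 = ((reg - obj) ** 2).sum(axis=0)     # spread among co-registered frames
    template = np.zeros(frames.shape[1:], dtype=bool)
    for _ in range(n_iter):
        # Step 1 (template fixed): average each pixel over the frames in
        # which the current template does not occlude it.
        visible = np.stack([~np.roll(template, s, axis=(0, 1)) for s in shifts])
        count = visible.sum(axis=0)
        background = (frames * visible).sum(axis=0) / np.maximum(count, 1)
        # Step 2 (background fixed): pixel-wise test on the two terms.
        bg_reg = np.stack([np.roll(background, u, axis=(0, 1)) for u in undo])
        q2 = ((reg - bg_reg) ** 2).sum(axis=0)  # frame-to-background differences
        template = q1 < q2
    return template, background, obj

# Synthetic sequence: a 6x6 textured square moving over a textured background.
rng = np.random.default_rng(0)
H, W = 16, 48
B_true = rng.uniform(size=(H, W))
O_true = rng.uniform(size=(H, W))
T_true = np.zeros((H, W), dtype=bool)
T_true[5:11, 2:8] = True
shifts = [(0, 0), (0, 10), (0, 20), (0, 30)]
frames = []
for s in shifts:
    mask = np.roll(T_true, s, axis=(0, 1))
    clean = np.where(mask, np.roll(O_true, s, axis=(0, 1)), B_true)
    frames.append(clean + 0.01 * rng.standard_normal((H, W)))

T_est, B_est, O_est = two_step_segmentation(frames, shifts)
```

The first pass starts from an empty template, so the background is the plain average of all frames and carries a ghost of the object; later passes exclude the pixels currently labeled as object and the ghost fades. Both textures here are random, so the test is rarely inconclusive; with low texture the two terms would be close and more frames would be needed.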

Experiment
[Figures: background; moving object; three frames from the image sequence]

Experiment - two-step method
[Figures: background estimate; template estimate]

Experiment
[Figures: four frames from a video sequence; background estimate; moving objects]

3D structure from 2D video
• Motivation: 3D content-based video representation (application areas go well beyond digital video)
• Key step: recovery of 3D shape and 3D motion from an image sequence
• Strongest cue: motion of the brightness pattern
• Structure From Motion:
  • Step 1. Compute the 2D motion on the image plane
  • Step 2. Recover the 3D motion and the depth

Two-frame SFM - common problem
• Two-frame SFM fails when the object is far from the camera
• Solution: exploit rigidity - multi-frame SFM
• Multi-frame Structure From Motion:
  • Step 1. Track feature points across a set of frames
  • Step 2. Recover the relative depth and the set of 3D positions

Factorization method
• Multi-frame SFM - hard problem:
  • non-linear
  • large set of unknowns (due to the entire set of 3D positions)
• Factorization [Tomasi & Kanade, 92] - expeditious method:
  • uses linear subspace constraints
  • 3D structure is estimated by factorizing a measurement matrix R whose entries are the trajectories of feature point projections
  • without noise, R is rank 3; an SVD is used to factorize matrix R
• Problems:
  • tracking a large set of features: computationally very heavy, if possible at all
  • cost of the SVD: high for a large number of features or frames
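The rank 3 property can be checked numerically. The sketch below builds a noiseless measurement matrix under orthographic projection and factorizes it with an SVD. The synthetic data, the helper `random_rotation`, and the row ordering of R are assumptions made for illustration; the metric upgrade that turns the factors into true rotations and shape is omitted.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]          # flip one axis to get determinant +1
    return q

rng = np.random.default_rng(0)
F, P = 8, 30                         # frames, feature points
shape = rng.standard_normal((3, P))  # 3D positions of the features

rows = []
for _ in range(F):
    rot = random_rotation(rng)
    trans = rng.standard_normal((2, 1))
    rows.append(rot[:2] @ shape + trans)   # orthographic projection
R = np.vstack(rows)                         # 2F x P measurement matrix

# Subtracting each row's mean removes the translation.
R_centered = R - R.mean(axis=1, keepdims=True)

U, s, Vt = np.linalg.svd(R_centered, full_matrices=False)
motion = U[:, :3] * np.sqrt(s[:3])               # 2F x 3 factor
structure = np.sqrt(s[:3])[:, None] * Vt[:3]     # 3 x P factor
```

Without noise, `s[3:]` is zero to machine precision and `motion @ structure` reproduces `R_centered`. The factors are only defined up to an invertible 3x3 matrix. The full SVD of a 2F x P matrix is the cost the slide refers to.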

Proposed approach: surface-based factorization
• Describe the 3D shape by a local parameterization
• This induces a parametric description for the 2D motion in the image plane
• Recover the 3D shape and 3D motion parameters from the 2D motion parameters by further exploiting linear subspace constraints:
  • surface-based factorization
  • rank 1 factorization: uses a fast algorithm to compute only the largest singular value
  • weighted factorization: computes the weighted estimate without additional computational cost

Maximum Likelihood formulation
• Local 2D motion estimation is ill-posed - aperture problem
• Direct methods (rather than the two components of the motion, local depth is a single unknown):
  • Infer 3D structure by using the brightness change constraint between two frames [Horn & Weldon, 88]
  • Kalman filter to update estimates over time [J. Heel, 90]
• Observations: the images in the sequence. Unknowns: object texture, 3D shape, 3D motion
• Through ML, 3D structure is recovered:
  • exploiting object rigidity over a set of frames
  • directly from the image intensity values
• So, where do SFM and factorization come from? Minimization procedure:
  • Minimize with respect to the texture in terms of 3D shape and 3D motion
  • After replacing the texture estimate, the ML cost function depends on the 3D structure only through the 2D motion in the image plane
  • Estimate the 3D motion by SFM (factorization), then plug the 3D motion estimates into the ML cost function
  • Minimize the ML cost function with respect to the relative depth

Observation model
• Observation model: [equation with annotated terms: texture, shape, 3D position]
• Unknowns: [equation]

Texture estimate
• ML estimate: [equation]
• Texture estimate - weighted average

SFM as an approximation to MLE
• Insert the texture estimate into the cost function
• The ML cost function depends on the 3D structure only through the 2D motion induced in the image plane (no approximations involved)
• 3D structure estimation:
  • 3D motion estimation:
    • Compute 2D motion
    • SFM: rank 1 surface-based factorization
  • 3D shape estimation:
    • Plug the 3D motion estimate into the ML cost function
    • Then, minimize with respect to the shape
• The estimates can be refined by minimizing the ML cost function in two alternating steps, but initialization is the key problem

Feature-based SFM
• Translation estimate: [equation]
• Define: [equation]

Rank 1 factorization
• Define: [equation]
• Decomposition (minimize without constraints): [equation]
• Normalization (computed by approximating the constraints): [equation]
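The slide's definitions were lost, so the following is one plausible reading rather than the author's exact construction. It assumes orthographic projection, translation already removed, and an object coordinate system aligned with the first frame, so that the x and y coordinates of the features are known. Projecting the measurement matrix onto the orthogonal complement of those known coordinates leaves a rank 1 matrix, whose dominant singular triplet can be found by power iteration instead of a full SVD. All names (`R_tilde`, `dominant_singular_triplet`, `rotation`) are hypothetical.

```python
import numpy as np

def rotation(ax, ay):
    """Rotation about the x axis by ax, then about the y axis by ay (radians)."""
    cx, sx, cy, sy = np.cos(ax), np.sin(ax), np.cos(ay), np.sin(ay)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    return ry @ rx

def dominant_singular_triplet(A, rng, n_iter=50):
    """Power iteration for the largest singular value and its vectors."""
    v = rng.standard_normal(A.shape[1])   # random start avoids a zero product
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = A @ v
        u /= np.linalg.norm(u)
        v = A.T @ u
        v /= np.linalg.norm(v)
    return u @ A @ v, u, v

rng = np.random.default_rng(1)
P = 40
x, y, z = rng.standard_normal((3, P))     # x, y assumed known; z is the depth
S = np.vstack([x, y, z])
angles = [(0.1 * f, -0.07 * f) for f in range(1, 7)]
M = np.vstack([rotation(ax, ay)[:2] for ax, ay in angles])   # 2F x 3 motion
R = M @ S                                  # noiseless trajectories

# Project out the part of R explained by the known x and y coordinates.
S0 = S[:2]
proj = np.eye(P) - S0.T @ np.linalg.solve(S0 @ S0.T, S0)
R_tilde = R @ proj                         # equals outer(M[:, 2], proj @ z)

sigma, u, v = dominant_singular_triplet(R_tilde, rng)
singular_values = np.linalg.svd(R_tilde, compute_uv=False)
```

In this noiseless setting `singular_values[1]` is zero to machine precision, `u` is parallel to the third column of `M` (3D motion), and `v` is parallel to the projected depth `proj @ z` (3D shape). Each power iteration costs two matrix-vector products.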

Rank 1 factorization - experiment
[Figure: three largest singular values of R]
The matrix is well described by its largest singular value.

Rank 1 factorization - experiment
• All trajectories have equal shape, which depends only on the 3D motion
• The scaling factor depends on the 3D shape (relative depth)
• 3D shape and 3D motion are observed in a coupled way through the feature trajectories

Surface-based factorization
• Piecewise planar 3D shapes
• Orthographic projection (easily extended to scaled-orthographic and para-perspective projections)
• 2D motion in the image plane is affine
• Relation between the parameters: [equation]
• Multi-frame SFM: rank 1 factorization
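The slide's relation between the parameters did not survive, but the claim that a planar patch moves affinely under orthography is easy to verify. The sketch below assumes a patch written as z = a0 + a1 x + a2 y in object coordinates and a rigid motion given by a rotation and a 2D translation; `affine_from_plane` and this particular parameterization are assumptions for illustration.

```python
import numpy as np

def affine_from_plane(rot, trans, plane):
    """Affine image motion of the plane z = a0 + a1*x + a2*y under orthography.

    rot: 3x3 rotation; trans: 2-vector; plane: (a0, a1, a2).
    Returns A (2x2) and b (2,) such that the image position is A @ [x, y] + b.
    """
    a0, a1, a2 = plane
    i, j = rot[0], rot[1]            # first two rows of the rotation
    A = np.array([[i[0] + i[2] * a1, i[1] + i[2] * a2],
                  [j[0] + j[2] * a1, j[1] + j[2] * a2]])
    b = np.array([i[2] * a0 + trans[0], j[2] * a0 + trans[1]])
    return A, b

ang = 0.3
rot = np.array([[np.cos(ang), 0.0, np.sin(ang)],
                [0.0, 1.0, 0.0],
                [-np.sin(ang), 0.0, np.cos(ang)]])
trans = np.array([0.5, -0.2])
plane = (2.0, 0.3, -0.1)

rng = np.random.default_rng(0)
xy = rng.standard_normal((2, 20))
z = plane[0] + plane[1] * xy[0] + plane[2] * xy[1]

projected = rot[:2] @ np.vstack([xy, z]) + trans[:, None]   # direct projection
A, b = affine_from_plane(rot, trans, plane)
predicted = A @ xy + b[:, None]                              # affine model
```

The two results agree exactly. The six affine parameters mix the 3D motion (rows of the rotation, translation) with the 3D shape (plane coefficients), which is the coupling a factorization over many frames can untangle.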

Surface-based factorization - experiment
[Figures: image sequence (smooth texture); image motion parameters]

Surface-based factorization - experiment
[Figures: motion; shape]

Weighted factorization
• Observation noise: [equation]
• Rank 1 factorization
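One way a weighted rank 1 estimate can come at no extra cost is sketched below, under the assumption (not stated in the surviving slide text) that the noise standard deviation differs per feature, i.e. per column of the matrix. Scaling each column by the inverse of its noise level turns the weighted problem into an ordinary rank 1 factorization. The function names are hypothetical.

```python
import numpy as np

def rank1(A):
    """Best rank 1 approximation A ~ outer(m, a) in the least-squares sense."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, 0] * s[0], Vt[0]

def weighted_rank1(A, sigma):
    """Minimize sum over columns p of ||A[:, p] - m * a[p]||^2 / sigma[p]^2."""
    m, a_scaled = rank1(A / sigma)     # column p is divided by sigma[p]
    return m, a_scaled * sigma         # undo the scaling on the shape factor

def weighted_cost(A, m, a, sigma):
    return (((A - np.outer(m, a)) / sigma) ** 2).sum()

rng = np.random.default_rng(0)
n_rows, P = 12, 30
m_true = rng.standard_normal(n_rows)
a_true = rng.standard_normal(P)
sigma = rng.uniform(0.01, 0.5, size=P)          # per-feature noise level
A = np.outer(m_true, a_true) + sigma * rng.standard_normal((n_rows, P))

m_w, a_w = weighted_rank1(A, sigma)             # weighted estimate
m_u, a_u = rank1(A)                             # non-weighted estimate
cost_w = weighted_cost(A, m_w, a_w, sigma)
cost_u = weighted_cost(A, m_u, a_u, sigma)
```

By construction `cost_w` never exceeds `cost_u`: the weighted estimate is the exact minimizer of the weighted cost, and the only extra work is one elementwise division and one elementwise multiplication.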

Weighted factorization - experiment
[Figures: feature trajectories; non-weighted and weighted estimates of the two components of translation and of the six entries of the rotation matrix]

Feature trajectories

Non-weighted factorization - reconstruction

Weighted factorization - reconstruction

ML estimate of the 3D shape
• Plug the 3D motion estimate into the ML cost function
• Image motion: [equation with annotated terms: known motion parameters; affine mapping that depends only on the 3D motion; unknown relative depth]
• Define a sequence: [equation]
• Motion of the affine mapped sequence: [equation with annotated terms: shape of the trajectory of s (known from 3D motion); magnitude of the trajectory of s (unknown relative depth)]
• Estimating the relative depth after plugging in the 3D motion is more constrained than estimating the image motion
• Motivation for the minimization procedure

Minimization procedure - multiresolution
• Multiresolution continuation-type method
  • coarse-to-fine as more images are being taken into account
  • each stage minimizes the ML cost function by using a Gauss-Newton method
• Region R - constant relative depth z
• [Equation with annotated terms: components of the image gradient]
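A one-dimensional toy version of the Gauss-Newton step for a region of constant relative depth is sketched below. It is not the slide's algorithm. It assumes that frame f shows a reference signal displaced by z times a known amount d[f] (the trajectory shape from the 3D motion), and it mimics the continuation idea only loosely by adding one frame per stage, without changing the resolution. `estimate_depth` and the test signal are hypothetical.

```python
import numpy as np

def estimate_depth(frames, ref, d, x, z0=0.0, n_iter=5):
    """Gauss-Newton estimate of a single relative depth z for the region x.

    Model (an assumption for this sketch): frames[f] sampled at x + z*d[f]
    equals ref sampled at x. x is an array of integer sample indices.
    """
    grid = np.arange(ref.size)
    grads = [np.gradient(I) for I in frames]     # image gradient per frame
    z = z0
    for n_frames in range(1, len(frames) + 1):   # one more frame per stage
        for _ in range(n_iter):
            num = den = 0.0
            for I, g, df in zip(frames[:n_frames], grads, d):
                pos = x + z * df
                resid = np.interp(pos, grid, I) - ref[x]
                jac = np.interp(pos, grid, g) * df   # d(resid)/dz
                num += (jac * resid).sum()
                den += (jac * jac).sum()
            z -= num / den                       # Gauss-Newton update
    return z

def signal(t):
    return np.sin(2 * np.pi * t / 50) + 0.5 * np.cos(2 * np.pi * t / 80)

t = np.arange(200, dtype=float)
z_true = 1.2
d = [1.0, 2.0, 3.0]                     # known trajectory shape
ref = signal(t)
frames = [signal(t - z_true * df) for df in d]
region = np.arange(20, 180)             # stay away from the borders
z_hat = estimate_depth(frames, ref, d, region)
```

The estimate `z_hat` lands close to `z_true`; the small residual error comes from linear interpolation and finite-difference gradients. Only one scalar is estimated for the whole region, which is what makes the problem better constrained than estimating a free 2D displacement.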

Experiment
• Image sequence: [figure]
• Shape and motion: [figure]

Experiment
• Affine mapped image sequence: [figure]
• Shape: [figure]

Experiment
Multiresolution continuation-type method, without smoothing. Shape estimate: [figure]

Experiment

Experiment • Synthesizing different views:

Application - video compression
[Figure panels: Original; Compressed 317:1; Compressed 575:1. Texture patches JPEG compressed.]

Major contributions and extensions
• Multiframe motion segmentation algorithm (two-step)
  • Explicit modeling of occlusion
  • extension: contour model
• Surface-based factorization
  • Rank 1 factorization
  • Weighted factorization
  • extensions:
    • other projection models
    • multibody
    • occlusion
    • 3D deformable shape from a set of cameras
    • subspace constraints for image motion estimation
• Multiresolution algorithm for direct inference of 3D shape
  • extension: parameterized surface model

Experiment
Multiresolution continuation-type method. Shape estimate: [figure]

Experiment

Rank 1 factorization - computational cost
