Segmentation of Dynamic Scenes and Textures via Generalized Principal Component Analysis René Vidal Center for Imaging Science Institute for Computational Medicine Johns Hopkins University
Data segmentation and clustering • Given a set of points, separate them into multiple groups • Discriminative methods: learn boundary • Generative methods: learn mixture model, using, e.g. Expectation Maximization
Dimensionality reduction and clustering • In many problems data is high-dimensional: can reduce dimensionality using, e.g. Principal Component Analysis • Image compression • Recognition • Faces (Eigenfaces) • Image segmentation • Intensity (black-white) • Texture
Segmentation problems in dynamic vision • Segmentation of video and dynamic textures • Segmentation of rigid-body motions
Clustering data on non-Euclidean spaces • Mixtures of linear spaces • Mixtures of algebraic varieties • Mixtures of Lie groups • “Chicken-and-egg” problems: given the segmentation, estimate the models; given the models, segment the data; how to initialize? • Need to combine algebra/geometry, dynamics, and statistics
Previous and ongoing work • Nonlinear/Kernel PCA (Scholkopf-Smola-Muller ’98): apply PCA to data embedded into a higher-dimensional space • Clustering linear subspaces: Mixtures of PCA (Tipping et al. ’99), Generalized PCA (Vidal et al. ’03) • Nonlinear dimensionality reduction: LLE (Roweis and Saul ’00), Isomap (Tenenbaum ’00) • Multilinear clustering: bilinear (Vidal-Ma ’04, Rao et al. ’05), trilinear (Hartley-Vidal ’04)
Talk outline • Generalized Principal Component Analysis (GPCA) • Segmentation of Dynamic Scenes • Segmentation of Dynamic Textures
Generalized Principal Component Analysis • Given points on multiple subspaces, identify • The number of subspaces and their dimensions • A basis for each subspace • The segmentation of the data points • A union of n subspaces can be represented with a set of homogeneous polynomials of degree n
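As a minimal illustration of this representation (the two normal vectors below are hypothetical values, not from the talk): the union of two lines through the origin in R^2 is exactly the zero set of one homogeneous polynomial of degree 2, the product of the two linear forms.

```python
import numpy as np

# Hypothetical normals of two lines (1-D subspaces of R^2), for illustration only.
b1 = np.array([1.0, -2.0])
b2 = np.array([3.0, 1.0])

def p(x):
    # Degree-2 homogeneous polynomial vanishing exactly on the union of the lines:
    # p(x) = (b1 . x) * (b2 . x)
    return (b1 @ x) * (b2 @ x)

x_on_line1 = np.array([2.0, 1.0])    # orthogonal to b1, so on line 1
x_on_line2 = np.array([1.0, -3.0])   # orthogonal to b2, so on line 2
x_off = np.array([1.0, 1.0])         # on neither line

print(p(x_on_line1), p(x_on_line2), p(x_off))  # 0.0 0.0 -4.0
```

A point lies on the union if and only if at least one factor, and hence the product, vanishes; this is what lets n subspaces be encoded by degree-n polynomials.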
Generalized Principal Component Analysis • Polynomials can be expressed linearly in terms of a set of coefficients by using a polynomial embedding called the Veronese map
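A sketch of the Veronese embedding, which stacks all degree-n monomials of a point (the monomial ordering here is one common choice, not necessarily the one used in the talk):

```python
import numpy as np
from itertools import combinations_with_replacement

def veronese(x, n):
    """Degree-n Veronese map: all monomials of degree n in the entries of x."""
    return np.array([np.prod([x[i] for i in idx])
                     for idx in combinations_with_replacement(range(len(x)), n)])

# For x = (x1, x2) and n = 2 this gives [x1^2, x1*x2, x2^2].
v = veronese(np.array([2.0, 3.0]), 2)
print(v)  # [4. 6. 9.]
```

Any degree-n homogeneous polynomial p(x) then becomes the linear expression c . veronese(x, n), so fitting p reduces to linear algebra on the embedded data.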
Generalized Principal Component Analysis • The coefficients of the polynomial can be computed from the null space of the embedded data matrix: solve using least squares (N = #data points) • The number of subspaces can be computed from the rank of the embedded data matrix • The normals to the subspaces can be computed from the derivatives of the polynomial
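A numerical sketch of this step on an illustrative example, two lines in R^2 with hypothetical normals (1, -2) and (3, 1): the polynomial coefficients are recovered as the null vector of the embedded data matrix via SVD.

```python
import numpy as np
from itertools import combinations_with_replacement

def veronese(x, n):
    return np.array([np.prod([x[i] for i in idx])
                     for idx in combinations_with_replacement(range(len(x)), n)])

rng = np.random.default_rng(0)
# Sample 50 points from each of two lines (directions orthogonal to the normals).
d1, d2 = np.array([2.0, 1.0]), np.array([1.0, -3.0])
X = np.vstack([rng.standard_normal(50)[:, None] * d1,
               rng.standard_normal(50)[:, None] * d2])

# Embedded data matrix: one degree-2 Veronese-embedded point per row.
L = np.vstack([veronese(x, 2) for x in X])

# Coefficients = right singular vector for the smallest singular value.
_, s, Vt = np.linalg.svd(L)
c = Vt[-1]
# c is proportional to the coefficients of (x1 - 2*x2)(3*x1 + x2)
# = 3*x1^2 - 5*x1*x2 - 2*x2^2.
print(c / np.linalg.norm(c))
```

With noise, the smallest singular vector is the least-squares estimate of the coefficients, which is the sense in which the slide says "solve using least squares".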
Generalized Principal Component Analysis • With noise and outliers, the polynomials may not define a perfect union of subspaces • The normals can be estimated correctly by choosing points optimally • How to compute the distance to the closest subspace without knowing the segmentation?
Subspaces of different dimensions • There are multiple polynomials fitting the data • The derivative of each polynomial gives a different normal vector • A basis for each subspace is obtained by applying PCA to the normal vectors
Dealing with high-dimensional data • Minimum number of points: K = dimension of the ambient space, n = number of subspaces • In practice the dimension of each subspace ki is much smaller than K • The number and dimensions of the subspaces are preserved by a generic linear projection onto a lower-dimensional subspace • Outliers can be removed by robustly fitting the subspace • Open problem: how to choose the projection? PCA?
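In the GPCA literature a generic projection onto dimension (max_i ki) + 1 is known to suffice; the sketch below (all dimensions chosen for illustration) checks numerically that a random projection preserves the subspace dimensions.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 50              # ambient dimension (illustrative)
k1, k2 = 2, 3       # subspace dimensions, both much smaller than K

# 100 points on each subspace, stored as rows.
X1 = rng.standard_normal((100, k1)) @ rng.standard_normal((k1, K))
X2 = rng.standard_normal((100, k2)) @ rng.standard_normal((k2, K))

# Generic linear projection onto dimension max(k1, k2) + 1 = 4.
Pproj = rng.standard_normal((K, max(k1, k2) + 1))

# Subspace dimensions survive the projection.
print(np.linalg.matrix_rank(X1 @ Pproj), np.linalg.matrix_rank(X2 @ Pproj))  # 2 3
```

A random projection works with probability one, but the slide's open question remains: a data-dependent choice (e.g. PCA) may be preferable under noise.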
GPCA: Algorithm summary • Apply the polynomial embedding to the projected data • Obtain a multiple-subspace model by polynomial fitting: solve a linear system on the embedded data for the polynomial coefficients (requires knowing the number of subspaces) • Obtain bases and dimensions by polynomial differentiation • Optimally choose one point per subspace using the distance to the subspaces
3-D motion segmentation problem • Given a set of point correspondences in multiple views, determine • Number of motion models • Motion model: affine, homography, fundamental matrix, trifocal tensor • Segmentation: model to which each pixel belongs • Mathematics of the problem depends on • Number of frames (2, 3, multiple) • Projection model (affine, perspective) • Motion model (affine, translational, homography, fundamental matrix, etc.) • 3-D structure (planar or not) • All cases can be solved using real or complex GPCA (ECCV’04)
Motion segmentation: multiple affine views • Affine camera model: the image of point p in frame f is an affine projection of its 3-D position (structure = 3-D surface; motion = camera position and orientation) • Stacking the F images of all P points into a 2F x P matrix, the trajectories of one rigid body live in a subspace of dimension at most 4 (Boult and Brown ’91, Tomasi and Kanade ’92)
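The rank-4 claim is easy to verify numerically. The sketch below (random 3-D structure and random affine cameras, all values illustrative) stacks the 2F image coordinates of P points and checks the rank of the resulting measurement matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
P, F = 30, 10                      # points and frames (illustrative)
X = rng.standard_normal((3, P))    # 3-D structure
Xh = np.vstack([X, np.ones(P)])    # homogeneous coordinates, 4 x P

# One 2x4 affine camera per frame; stack the projections into W (2F x P).
W = np.vstack([rng.standard_normal((2, 4)) @ Xh for _ in range(F)])

# Trajectories of a single rigid body span a subspace of dimension at most 4.
print(np.linalg.matrix_rank(W))  # 4
```

With several independently moving bodies, the columns of W lie on a union of (at most) 4-D subspaces, which is exactly the input GPCA expects.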
Motion segmentation: multiple affine views • We tested GPCA on • Sequences with up to 30% missing data • Sequences with full and degenerate motions (full, linear, and planar) • Sequences with perspective effects • Sequences with transparent motions • Sequences with the minimum number of frames
Motion segmentation: multiple affine views • Percentage of correct classification on Sequences A, B, and C (results table not reproduced)
Modeling non-moving dynamic textures • Examples of dynamic textures • Model the temporal evolution as the output of a linear dynamical system (LDS) (Soatto et al. ’01): z_{t+1} = A z_t + v_t (dynamics), y_t = C z_t + w_t (images), where A captures the dynamics and C the appearance
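A sketch of this generative model, z_{t+1} = A z_t + v_t and y_t = C z_t + w_t (state dimension, image size, and noise levels are all illustrative choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(3)
n_state, n_pix, T = 5, 100, 50   # state dim, pixels per frame, frames (illustrative)

# Stable dynamics A (scaled orthogonal matrix) and appearance matrix C.
A = 0.9 * np.linalg.qr(rng.standard_normal((n_state, n_state)))[0]
C = rng.standard_normal((n_pix, n_state))

z = rng.standard_normal(n_state)
frames = []
for _ in range(T):
    frames.append(C @ z + 0.01 * rng.standard_normal(n_pix))   # y_t = C z_t + w_t
    z = A @ z + 0.1 * rng.standard_normal(n_state)             # z_{t+1} = A z_t + v_t
Y = np.array(frames)   # T x n_pix: frames nearly confined to an n_state-dim subspace

s = np.linalg.svd(Y, compute_uv=False)
print(s[:7])  # singular values drop sharply after the 5th
```

Because y_t = C z_t + w_t, every frame is (up to observation noise) a point in the n_state-dimensional column space of C; this low-dimensional structure is what the segmentation methods on the following slides exploit.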
Modeling moving dynamic textures • Can we recover the rigid motion of a camera looking at a moving dynamic texture? (DTCC: Dynamic Texture Constancy Constraint) • Can we segment a scene containing multiple dynamic textures? (GPCA: Generalized Principal Component Analysis) • Can we segment a scene containing multiple moving dynamic textures? (GPCA + DTCC)
Modeling moving dynamic textures • A time-invariant model cannot capture camera motion • A rigid scene is a 1st order LDS: A = 1, z_t = 1, y_t = C = constant • Camera motion can be modeled with a time-varying LDS: z_{t+1} = A z_t + v_t (dynamics), y_t = C(t) z_t + w_t (images), where A models the nonrigid motion and C(t) models the rigid motion
Optical flow of a moving dynamic texture • Static textures: optical flow from the brightness constancy constraint (BCC): I_x u + I_y v + I_t = 0 • Dynamic textures: optical flow from the dynamic texture constancy constraint (DTCC)
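For the static-texture case, the BCC gives one linear equation per pixel in the two unknowns (u, v), so the flow of a patch can be estimated by least squares (a Lucas-Kanade-style sketch on synthetic gradients; all data here is illustrative):

```python
import numpy as np

# Brightness constancy: I_x*u + I_y*v + I_t = 0 at each pixel.
# Solve for one flow vector (u, v) over a patch by least squares.
def flow_from_bcc(Ix, Iy, It):
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    uv, *_ = np.linalg.lstsq(A, b, rcond=None)
    return uv

# Synthetic 5x5 patch translating by (1, 0.5) pixels per frame (illustrative).
rng = np.random.default_rng(4)
Ix = rng.standard_normal((5, 5))
Iy = rng.standard_normal((5, 5))
true_uv = np.array([1.0, 0.5])
It = -(Ix * true_uv[0] + Iy * true_uv[1])   # temporal derivative consistent with BCC

print(flow_from_bcc(Ix, Iy, It))  # recovers approximately [1.0, 0.5]
```

The DTCC plays the same role for dynamic textures: it replaces the constant-brightness assumption with constancy of the texture's LDS dynamics, again yielding linear constraints on the flow.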
Segmenting non-moving dynamic textures • One dynamic texture (e.g. water or steam) with model z_{t+1} = A z_t + v_t, y_t = C z_t + w_t lives in the observability subspace: [y_1; y_2; ...; y_F] = [C; CA; ...; CA^{F-1}] z_1 • Multiple textures live in multiple subspaces • Cluster the data using GPCA
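The subspace structure behind this slide can be checked with a small simulation (the helper function and all dimensions below are illustrative, not the talk's implementation): noise-free pixel trajectories of one LDS texture span a subspace of dimension equal to the state dimension, and two textures give data on a union of two such subspaces.

```python
import numpy as np

n, F, P = 3, 20, 40   # state dim, frames, pixels per texture (illustrative)

def texture_trajectories(seed):
    """Noise-free pixel intensity trajectories of one simulated LDS texture."""
    rng = np.random.default_rng(seed)
    A = 0.95 * np.linalg.qr(rng.standard_normal((n, n)))[0]   # dynamics
    C = rng.standard_normal((P, n))                           # appearance
    z = rng.standard_normal(n)
    states = []
    for _ in range(F):
        states.append(z)
        z = A @ z + 0.1 * rng.standard_normal(n)
    return C @ np.array(states).T   # P x F: one trajectory per pixel row

Y1 = texture_trajectories(10)   # pixels of texture 1 (e.g. "water")
Y2 = texture_trajectories(11)   # pixels of texture 2 (e.g. "steam")

# Each texture's trajectories span an n-D subspace of R^F; stacked together
# they lie on a union of two subspaces -- exactly the setting GPCA segments.
print(np.linalg.matrix_rank(Y1), np.linalg.matrix_rank(np.vstack([Y1, Y2])))  # 3 6
```

Each pixel's trajectory is a linear combination (its row of C) of the state trajectories, which is why all pixels of one texture share a low-dimensional subspace.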
Segmenting non-moving dynamic textures • Doretto et al. (ICCV’03) vs. Ghoreyshi & Vidal (ECCV’06) • Sequences: Ocean-dynamics, Ocean-appearance, Ocean-smoke
Segmenting moving dynamic textures Ocean-bird
Segmenting moving dynamic textures Doretto et al. (ICCV’03) Ghoreyshi & Vidal (WDV’06) Ocean-fire
Segmenting moving dynamic textures • Results on a real sequence: Raccoon on River
Conclusions and Future Work • Many problems in computer vision can be posed as subspace clustering problems • Temporal video segmentation • 2-D and 3-D motion segmentation • Dynamic texture segmentation • Nonrigid motion segmentation • These problems can be solved using GPCA: an algorithm for clustering subspaces • Deals with unknown and possibly different dimensions • Deals with arbitrary intersections among the subspaces • GPCA is based on • Projecting data onto a low-dimensional subspace • Recursively fitting polynomials to projected subspaces • Differentiating polynomials to obtain a basis
Collaborators • Johns Hopkins University: Atiyeh Ghoreyshi, Avinash Ravichandran, Dheeraj Singaraju • Australian National University: Richard Hartley • Université des Sciences et Technologies de Lille: Camille Izard • Università di Siena: Jacopo Piazzi • University of Illinois: Yi Ma • UCLA: Stefano Soatto • UC Berkeley: Shankar Sastry
Thanks • 2005 NSF CAREER "Recognition of Dynamic Activities in Unstructured Environments" • 2005 NSF "An Algebraic Geometric Approach to Hybrid Systems Identification" • 2005 ONR "Segmenting Rigid Motions from Dynamic Textures" • 2005 JHU "Advanced Video Exploitation for Unmanned Aerial Vehicles" • 2004 NIH "Magnetic Resonance Guided Electrophysiology Intervention"
For more information, visit the Vision Lab @ Johns Hopkins University. Thank you!