MODELING AND RECOGNITION OF DYNAMIC VISUAL PROCESSES
Rene Vidal (UC Berkeley) and Stefano Soatto (UCLA)
Overview
• Motivation: modeling dynamic visual processes for recognition
• Review of results for stationary processes
• Dynamic textures: modeling, synthesis, classification
• Human gaits: modeling with HOS (higher-order statistics), recognition
• Extension to hybrid models
• Jump-Markov systems
• Lack of uniqueness guarantees (inference)
• Analysis of the observability and identifiability of jump-linear systems
Motivation: modeling dynamic visual processes
• Visual data alone are not sufficient to recover the correct (Euclidean) geometry, arbitrary (non-Lambertian) photometry, and (non-linear) dynamics.
• In vision, one makes assumptions on some unknowns in order to recover the others (e.g. photometric invariants to recover geometric invariants, such as the shape of rigid objects). These assumptions cannot be validated.
• When the assumptions are violated, what kind of model can we retrieve? The REPRESENTATION depends on what TASK the model is used for.
Modeling dynamic visual processes for classification
• Images: realizations of a stochastic process
• HP: the image process is (second-order) stationary (simplest case)
• Recover a model from the data
• The model should be
  • Generative (reproduce the statistics)
  • Predictive (allow extrapolation)
DYNAMIC IMAGE MODELS
• Response of spatial filters:
  • Receptive fields
  • ICA/PCA
  • Wavelets
  • …
• The filter response is the output of a dynamical system driven by an IID process
DYNAMIC IMAGE MODELS
• The response process is second-order stationary
• Stochastic realization theory applies (details: spectral factorization, innovation form)
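Concretely, the standard choice in this line of work is a linear dynamical system driven by IID Gaussian noise, using the A, B, C notation that appears later in these slides. A minimal simulation sketch (the model x(t+1) = A x(t) + B v(t), y(t) = C x(t) is assumed; the function name is illustrative):

```python
import numpy as np

def simulate_lds(A, B, C, x0, T, seed=0):
    """Simulate x(t+1) = A x(t) + B v(t), y(t) = C x(t), with v(t) ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    x, ys = np.asarray(x0, dtype=float), []
    for _ in range(T):
        ys.append(C @ x)                                  # observation (e.g. filter responses)
        x = A @ x + B @ rng.standard_normal(B.shape[1])   # state update driven by IID noise
    return np.stack(ys)                                   # (T, p) array of outputs
```

With a stable A (spectral radius below 1), the output process is asymptotically second-order stationary, which is the hypothesis above.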
UNIQUENESS OF REPRESENTATION: “learning” (identification)
• Equivalence class of models (basis of the state space)
• Canonical realization
Learning (ID)
• The problem is nonlinear (bi-linear)
• Typically solved with E-M
• (P) Global convergence is not guaranteed
• (P) Convergence only up to an equivalence class; cannot be used for recognition
• Instead, use subspace methods [Van Overschee & De Moor ’95]
BASIC IDEA
• (P) This does not take the dynamics into account
• The state is the rank-n approximation of the data that makes the future conditionally independent of the past (canonical correlations)
• Look for the “best” (F) n-dimensional subspace of the past that predicts the future (subspace ID)
• HP: the state is reconstructible in one step (WLOG in a dimensionality-reduction scenario)
SUBSPACE IDENTIFICATION
• Closed-form, unique, asymptotically efficient (maximum likelihood)
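A simplified closed-form identification in the spirit of the subspace methods cited above: an SVD of the data matrix yields the output map, then one-step least squares yields the dynamics. This is a toy sketch, not necessarily the exact algorithm used in the talk; the function name is illustrative:

```python
import numpy as np

def identify_lds(Y, n):
    """Y: (p, T) data matrix, one column per frame; n: state dimension.
    Returns C (p, n) with orthonormal columns, A (n, n), state estimate X (n, T)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n]                               # output map: dominant output directions
    X = np.diag(s[:n]) @ Vt[:n]                # state sequence in that basis
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])   # least-squares one-step dynamics
    return C, A, X
```

The result is unique once a canonical basis is fixed (here, the SVD basis), which is what makes model comparison for recognition possible at all.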
WHAT CAN WE DO WITH A MODEL?
• Compression (maximize mutual information)
WHAT CAN WE DO WITH A MODEL?
• Extrapolation
WHAT CAN WE DO WITH A MODEL?
• Synthesis
• Learning = 3 min in Matlab; synthesis = instantaneous
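The learn-then-synthesize pipeline can be sketched end to end: identify (C, A) from the frames, shape the driving noise from the one-step residuals, then run the model forward. The noise-shaping step via the residual covariance is one common choice, assumed here for illustration:

```python
import numpy as np

def learn_and_synthesize(Y, n, T_new, seed=0):
    """Learn (C, A, B) from frames Y (p x T), then synthesize T_new new frames."""
    rng = np.random.default_rng(seed)
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C, X = U[:, :n], np.diag(s[:n]) @ Vt[:n]
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])        # dynamics by least squares
    E = X[:, 1:] - A @ X[:, :-1]                    # one-step residuals
    Q = E @ E.T / max(E.shape[1] - 1, 1)            # residual covariance
    B = np.linalg.cholesky(Q + 1e-8 * np.eye(n))    # noise shaping (jitter for rank deficiency)
    x = X[:, -1].copy()
    frames = []
    for _ in range(T_new):                          # synthesis: run the model forward
        x = A @ x + B @ rng.standard_normal(n)
        frames.append(C @ x)
    return np.stack(frames, axis=1)                 # (p, T_new) synthesized frames
```

Because synthesis is just a forward simulation of a small linear system, it is essentially free once the model is learned, matching the "instantaneous" claim above.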
RECOGNITION?
• Given samples of “water”, “foliage”, “steam”
• Given a new sample, classify it
• What is the “average” model?
• Can “uncertainty” be inferred from the data?
RECOGNITION
• Given samples of “water”, “foliage”, “steam”
• Given a new sample, classify it:
  • What is the “average” model? → probability distribution on the Stiefel manifold
  • What is the “distance” between two models?
Langevin distributions (also Gibbs, Fisher) [See also Jupp & Mardia ’00]
Langevin distributions (also Gibbs, Fisher)
• Likelihood ratios: compute from data (ML)
• Easy in special cases; open in general (??)
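For reference, the matrix Langevin (von Mises-Fisher) density on the Stiefel manifold is proportional to exp(tr(Fᵀ X)), and, following Jupp & Mardia, the ML estimate of the modal orientation is the polar (orthogonal) factor of the sample sum. A sketch under those assumptions (function names are illustrative, and the normalizing constant is omitted):

```python
import numpy as np

def langevin_log_density_unnorm(X, F):
    """Unnormalized log-density of the matrix Langevin distribution: tr(F^T X)."""
    return np.trace(F.T @ X)

def langevin_mode_mle(samples):
    """ML modal orientation: the polar (orthogonal) factor of the sample sum."""
    S = np.sum(samples, axis=0)
    U, _, Vt = np.linalg.svd(S, full_matrices=False)
    return U @ Vt                 # closest matrix with orthonormal columns
```

This gives a concrete meaning to the "average model" above: the mode of a distribution fitted to the learned orthonormal output maps.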
DISTRIBUTION-INDEPENDENT DECISIONS
• Compute distances between models: length of the geodesic connecting them
• Canonical metric
• Geodesic trajectories
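One concrete instance of such a geodesic distance is the arc length computed from the principal angles between the column spans of two models' orthonormal output maps. A sketch, assuming each model is summarized by an orthonormal basis C as in the identification step:

```python
import numpy as np

def subspace_distance(C1, C2):
    """Geodesic (arc-length) distance on the Grassmannian from the principal angles
    between span(C1) and span(C2); C1, C2 are (p, n) with orthonormal columns."""
    s = np.linalg.svd(C1.T @ C2, compute_uv=False)   # cosines of the principal angles
    theta = np.arccos(np.clip(s, -1.0, 1.0))
    return float(np.linalg.norm(theta))              # sqrt of the sum of squared angles
```

The distance is zero for identical subspaces and grows as the subspaces tilt apart, with no distributional assumptions, which is the point of this slide.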
COMPARING MODELS
• Measuring distances, computing statistics/likelihood ratios, uncertainty descriptions: ??? [Martin ’00, De Cock & De Moor ’00]
• Also: robust control techniques [Mazzaro, Camps, Sznaier, Bissacco, Soatto ’02]
Walking: data → learn model (A, B, C, q) → synthesis from x(0) = x0
Running: gait data → learn model (A, B, C, q) → synthesis from x(0) = x0
Limping: data → synthesis
EXTENSIONS
• Nonlinear dynamic textures
• Higher-order statistics (dynam-ICA)
• First step: jump-linear systems