[Figure: data indexed by factor 1, factor 2, …, factor N; for pose data, a grid of subjects 1-3 by gaits (walk, run, stride), with some entries marked "?".]
Multifactor Gaussian Process Models for Style-Content Separation
Jack M. Wang, David J. Fleet, Aaron Hertzmann
Department of Computer Science, University of Toronto, Canada
{jmwang, fleet, hertzman}@cs.toronto.edu
http://www.dgp.toronto.edu/~jmwang/gpsc

Introduction
Using prior models of human motion to constrain the inference of 3D pose sequences is a popular approach to improving monocular people tracking, as well as to simplifying the process of character animation. The availability of motion capture devices in recent years enables such models to be learned from data, and learning models that generalize well to novel motions has become a major challenge.

Gaussian processes (GP)
Suppose we have a 1D output y given by a linear combination of basis functions of the input x. If we assume an isotropic Gaussian prior on the model parameters, then y | x is zero-mean Gaussian. Its covariance is a kernel function, which fully specifies the regression model. Popular choices of kernel function include the linear kernel, which gives the Bayesian linear regression model, and the RBF kernel, which gives RBF regression.

Application: A locomotion model
We focus on periodic human locomotion, and model each pose in a motion sequence as arising from the interaction of three factors: two style factors (s, g) and a content factor (x). Drawing on experience from motion interpolation, we apply linear kernels to the style parameters (s, g) and an RBF kernel to the content (x). In addition, θ_d models the different variance of each output degree of freedom (DOF). This defines one Gaussian process per pose DOF; the DOFs are assumed to be independent conditioned on the inputs. The inputs are learned by maximizing the GP-LVM likelihood function (Lawrence, 2005).

Motion synthesis
Given new style parameters, a Gaussian process predictive distribution is defined with respect to content. Running the dynamics forward in content space generates motions in the new style.
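The Gaussian process regression described in the "Gaussian processes (GP)" section can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the RBF form exp(-γ/2 ||a - b||²), the width `gamma`, the noise level `noise_var`, and the toy sine data are choices made here, not values taken from the poster.

```python
import numpy as np

def linear_kernel(A, B):
    """Linear kernel k(a, b) = a^T b (Bayesian linear regression)."""
    return A @ B.T

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel k(a, b) = exp(-gamma/2 * ||a - b||^2); gamma is an assumed width."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * gamma * np.maximum(sq, 0.0))

def gp_predict(kernel, X_train, y_train, X_test, noise_var=1e-4):
    """Zero-mean GP predictive mean and covariance at X_test."""
    K = kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = kernel(X_test, X_train)
    K_ss = kernel(X_test, X_test)
    L = np.linalg.cholesky(K)                      # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v
    return mean, cov

# Toy 1D data (hypothetical): samples of a sine wave.
X = np.linspace(0.0, 2.0 * np.pi, 12)[:, None]
y = np.sin(X[:, 0])
X_new = np.array([[1.0], [2.5]])
mean, cov = gp_predict(rbf_kernel, X, y, X_new)
```

Swapping `rbf_kernel` for `linear_kernel` in the call changes the regression model without touching `gp_predict`, which is the sense in which the kernel specifies the model.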
s: identity of the subject performing the motion
g: gait of the motion (walk, run, stride)
x: current state of the motion (evolves w.r.t. time)

One of the main difficulties in learning motion models is that the training data and test data typically come from related but distinct distributions. For example, we would often like to learn a prior model of locomotion from the motion capture data of a few individuals performing a few gaits (e.g., walking and running). Such a prior model would then be used to track a new individual, or to generate plausible animations of a related but new gait not included in the training database. Because of the natural variations in how different individuals perform different gaits, which we broadly refer to as style, learning a model that can represent and generalize to the space of human motions is not straightforward. Nonetheless, it has long been observed that interpolating motion capture data yields plausible new motions, and we attempt to build motion models that can generalize in style.

Time-series prediction
Given n frames of motion and a learned dynamical model, predict the next k frames. Without a model of style, the test data must be in the same style as the training data (i.e., same person, moving in the same gait). We tested single-factor and multifactor versions of:
• GPDM (Wang et al., 2006)
• B-GPDM (Urtasun et al., 2006)
Conclusion: multifactor models improve prediction results.

Multifactor GPs
Suppose now that we wish to model different mappings for different styles. We add a latent style vector s alongside x, and define a mapping in which the output depends linearly on the style for fixed x: y = Σ_i s_i g_i(x) + ε, where each g_i(x) is a mapping with weight vector w_i, and ε represents additive i.i.d. Gaussian noise with zero mean and variance β^{-1}. Fixing just the input s specializes the mapping to a specific style.
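The multifactor mapping in the "Multifactor GPs" section can be illustrated with a small sketch. It assumes each g_i(x) is linear in a fixed set of basis functions, g_i(x) = w_i^T φ(x); the particular basis `phi`, the 2D style vectors, and the random weights are hypothetical, and the noise ε is left out so that the linearity check is exact.

```python
import numpy as np

def phi(x):
    """Hypothetical basis functions of a scalar content input x (an assumption)."""
    return np.array([1.0, np.sin(x), np.cos(x)])

def multifactor_output(s, x, W):
    """Noise-free output y = sum_i s_i * g_i(x), with g_i(x) = w_i^T phi(x).

    W holds one weight vector w_i per row, one row per style dimension.
    """
    g = W @ phi(x)          # g[i] = g_i(x)
    return float(s @ g)

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))   # weights drawn from an isotropic Gaussian prior

s_a = np.array([1.0, 0.0])        # hypothetical style vector "a"
s_b = np.array([0.0, 1.0])        # hypothetical style vector "b"
x = 0.7

y_a = multifactor_output(s_a, x, W)
y_b = multifactor_output(s_b, x, W)

# For fixed x, the output is linear in the style vector.
s_mix = 0.3 * s_a + 0.7 * s_b
y_mix = multifactor_output(s_mix, x, W)

# Fixing s yields an ordinary function of x for that style.
style_a = lambda x_: multifactor_output(s_a, x_, W)
```

Here `y_mix` equals `0.3 * y_a + 0.7 * y_b`, which is the linear dependence on style for fixed content that motivates interpolating between styles.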
If we hold x and s fixed, then the output is again zero-mean Gaussian, and its covariance function is a product of kernel functions, one for each set of factors. This generalizes to M factors: suppose we wish to model the effects of the factors {x(1), …, x(M)} on the output. Under Gaussian priors on the model parameters, the covariance function is again a product of kernels, one per factor.

Latent factors
We use 6 training sequences, totaling 314 frames of data, performed by 3 subjects in 3 gaits, with some subject-gait combinations missing. Poses from the same sequence share the same style factors. In particular, with the sequences arranged in a grid, poses from the same row share the same gait vector (g), and poses from the same column share the same subject vector (s). Since the motions are periodic, we constrain the content factor (x) to lie on a circle in 2D. We do not assume that poses are time-warped to match in the content space. Instead, we parameterize each sequence by θ and Δθ, which are learned from data.

Approach
We introduce a multifactor model for learning distributions of styles of human motion. We parameterize the space of human motion styles by a small number of low-dimensional factors, such as identity and gait, where the dependence on each individual factor may be nonlinear. Our multifactor Gaussian process model can be viewed as a special class of the Gaussian process latent variable model (GP-LVM) (Lawrence, 2005), as well as a Bayesian generalization of multilinear models (Vasilescu & Terzopoulos, 2002).
Example factors: gait, phase, identity, gender, etc.

Summary
We proposed a Bayesian multifactor model for style-content separation by generalizing multilinear models using Gaussian processes. The model is specified by a product kernel, where each factor is kernelized separately. We learned a locomotion model from very little data, capturing stylistic variations. Explicit models of stylistic variations improve dynamical prediction results on the GPDM.
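A rough sketch of the product kernel and the circular content parameterization described above is given here. Everything numeric is an assumption for illustration: the style vectors, the phase parameters, the RBF width `gamma`, and the way the per-DOF term `theta_d` enters as a simple scale. The noise term is omitted, and the latent inputs are fixed by hand rather than learned.

```python
import numpy as np

def content_on_circle(theta0, dtheta, n_frames):
    """Content inputs on the unit circle: phase theta0 + t * dtheta for frame t."""
    t = theta0 + dtheta * np.arange(n_frames)
    return np.stack([np.cos(t), np.sin(t)], axis=1)

def product_kernel(S1, G1, X1, S2, G2, X2, gamma=1.0, theta_d=1.0):
    """Product of linear kernels on subject (s) and gait (g) and an RBF kernel on content (x).

    theta_d is treated here as a per-DOF scale (an assumption about its role).
    """
    k_s = S1 @ S2.T
    k_g = G1 @ G2.T
    sq = np.sum(X1**2, axis=1)[:, None] + np.sum(X2**2, axis=1)[None, :] - 2.0 * X1 @ X2.T
    k_x = np.exp(-0.5 * gamma * np.maximum(sq, 0.0))
    return theta_d * k_s * k_g * k_x     # elementwise product of the factor kernels

# Two hypothetical sequences: subject A walking (5 frames), subject B running (4 frames).
s_A, s_B = np.array([1.0, 0.2]), np.array([0.3, 1.0])
g_walk, g_run = np.array([1.0, 0.0]), np.array([0.4, 0.9])

X = np.vstack([content_on_circle(0.0, 0.3, 5), content_on_circle(0.5, 0.6, 4)])
S = np.vstack([np.tile(s_A, (5, 1)), np.tile(s_B, (4, 1))])   # shared within a sequence
G = np.vstack([np.tile(g_walk, (5, 1)), np.tile(g_run, (4, 1))])

K = product_kernel(S, G, X, S, G, X)
```

Because each factor kernel is positive semidefinite, their elementwise product `K` is as well, so it can serve as a GP covariance once a noise term is added on the diagonal.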
Previous work, for example: multilinear analysis (Vasilescu & Terzopoulos, 2002).