Nonparametric Bayesian Learning of Switching Dynamical Processes
Emily Fox, Erik Sudderth, Michael Jordan, and Alan Willsky
Laboratory for Information and Decision Systems
Nonparametric Bayes Workshop 2008, Helsinki, Finland
Switching Dynamical Processes
• Switching linear dynamical processes are useful for describing nonlinear phenomena
• Goal: allow uncertainty in the number of dynamical modes
• Utilize a hierarchical Dirichlet process (HDP) prior
• Cluster based on dynamics
[Figure labels: priors on modes; set of dynamic parameters]
Outline
• Background
  • Switching dynamical processes: SLDS, VAR
  • Prior on dynamic parameters
  • Sticky HDP-HMM
• HDP-AR-HMM and HDP-SLDS
• Sampling Techniques
• Results
  • Synthetic
  • IBOVESPA stock index
  • Dancing honey bee
Linear Dynamical Systems
• State space LTI model:
• Vector autoregressive (VAR) process:
[Figure labels: state space models; VAR processes]
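The equations on this slide did not survive extraction. As a minimal sketch, assuming the textbook forms x_t = A x_{t-1} + e_t, y_t = C x_t + w_t for the state space LTI model and y_t = sum_i A_i y_{t-i} + e_t for a VAR(r) process, the two generative models could be simulated as follows. The function names, the zero initial conditions, and the Gaussian noise covariances `Q`, `R`, and `Sigma` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def simulate_lds(A, C, Q, R, T, rng):
    """Simulate x_t = A x_{t-1} + e_t, y_t = C x_t + w_t,
    with e_t ~ N(0, Q), w_t ~ N(0, R) and x_0 = 0 (assumed)."""
    n, d = A.shape[0], C.shape[0]
    x = np.zeros(n)
    ys = np.empty((T, d))
    for t in range(T):
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)
        ys[t] = C @ x + rng.multivariate_normal(np.zeros(d), R)
    return ys

def simulate_var(A_list, Sigma, T, rng):
    """Simulate y_t = sum_i A_i y_{t-i} + e_t with e_t ~ N(0, Sigma).
    The r initial lags are set to zero (assumed) and then discarded."""
    r, d = len(A_list), Sigma.shape[0]
    ys = np.zeros((T + r, d))
    for t in range(r, T + r):
        mean = sum(A_list[i] @ ys[t - 1 - i] for i in range(r))
        ys[t] = mean + rng.multivariate_normal(np.zeros(d), Sigma)
    return ys[r:]
```

Under these assumed forms, a VAR(1) process is the special case of the state space model in which the state is observed directly (C equal to the identity and no measurement noise).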
Switching Dynamical Systems
• Switching linear dynamical system (SLDS):
• Switching VAR process:
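To make the switching idea concrete, here is a small sketch of a switching VAR(1) process: a Markov chain picks the mode z_t, and the mode selects which dynamic matrix and noise covariance drive the observation. The function name, the starting mode 0, and the zero initial observation are assumptions for illustration; the slide's own equations were not recovered.

```python
import numpy as np

def simulate_switching_var1(P, A, Sigma, T, rng):
    """z_t ~ P[z_{t-1}], y_t = A[z_t] y_{t-1} + e_t, e_t ~ N(0, Sigma[z_t]).
    P is a K x K row-stochastic matrix; A and Sigma are length-K lists."""
    K, d = P.shape[0], A[0].shape[0]
    z = np.empty(T, dtype=int)
    y = np.zeros((T, d))
    z_prev, y_prev = 0, np.zeros(d)   # assumed initial mode and observation
    for t in range(T):
        z[t] = rng.choice(K, p=P[z_prev])
        noise = rng.multivariate_normal(np.zeros(d), Sigma[z[t]])
        y[t] = A[z[t]] @ y_prev + noise
        z_prev, y_prev = z[t], y[t]
    return z, y
```

An SLDS would add one more layer: the switching dynamics act on a hidden state, which is then observed through a noisy linear measurement.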
Prior on Dynamic Parameters
• Rewrite VAR process in matrix form:
• Group all observations assigned to mode k
• Define the following mode-specific matrices
• Results in K decoupled linear regression problems
• Place matrix-normal inverse Wishart prior on:
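The decoupling can be sketched for a switching VAR(1): collect the observations assigned to mode k together with their lagged regressors, form the regression sufficient statistics, and update the matrix-normal inverse-Wishart parameters. The parameterisation below (Sigma ~ IW(n0, S0), A given Sigma ~ MN(M, Sigma, K0^{-1})) is one standard conjugate form and is an assumption here, since the slide's definitions were lost; all names are illustrative.

```python
import numpy as np

def mniw_posterior(y, z, k, M, K0, S0, n0):
    """Posterior MNIW parameters for mode k of a switching VAR(1).

    Assumed prior: Sigma ~ IW(n0, S0), A | Sigma ~ MN(M, Sigma, K0^{-1}).
    Returns (posterior mean of A, posterior K, posterior scale, posterior dof).
    """
    idx = np.flatnonzero(z[1:] == k) + 1   # times t >= 1 assigned to mode k
    Y = y[idx].T                           # d x N_k responses y_t
    Ybar = y[idx - 1].T                    # d x N_k regressors y_{t-1}
    S_bb = Ybar @ Ybar.T + K0
    S_yb = Y @ Ybar.T + M @ K0
    S_yy = Y @ Y.T + M @ K0 @ M.T
    # S_yb S_bb^{-1}, using the symmetry of S_bb
    A_mean = np.linalg.solve(S_bb, S_yb.T).T
    S_res = S_yy - A_mean @ S_yb.T         # residual scatter
    return A_mean, S_bb, S0 + S_res, n0 + idx.size
```

Because each mode only touches its own observations, the K updates are independent and can be run in any order.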
Sticky HDP-HMM
Infinite HMM: Beal et al., NIPS 2002. HDP-HMM: Teh et al., JASA 2006. Sticky HDP-HMM: Fox et al., ICML 2008.
• Dirichlet process (DP):
  • Mode space of unbounded size
  • Model complexity adapts to observations
• Hierarchical:
  • Ties mode transition distributions
  • Shared sparsity
• Sticky: self-transition bias parameter
[Figure axes: time, mode]
Sticky HDP-HMM
• Global transition distribution:
• Mode-specific transition distributions: sparsity of β is shared, with increased probability of self-transition
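A finite sketch of this prior, assuming an order-L weak limit approximation: the global distribution is drawn as β ~ Dir(γ/L, ..., γ/L), and each mode-specific row as π_k ~ Dir(αβ + κδ_k), where κ is the self-transition bias. The symbols γ, α, κ follow common usage for the sticky HDP-HMM but are assumptions here, because the slide's formulas were not recovered; the small numerical floor is also my addition.

```python
import numpy as np

def sample_sticky_prior(L, gamma, alpha, kappa, rng):
    """Draw (beta, pi) from an order-L weak limit sticky HDP-HMM prior (assumed form)."""
    beta = rng.dirichlet(np.full(L, gamma / L))
    # Row k has concentration alpha * beta + kappa * delta_k.
    conc = alpha * beta + kappa * np.eye(L)
    conc = np.maximum(conc, 1e-8)          # numerical floor, not part of the model
    pi = np.vstack([rng.dirichlet(row) for row in conc])
    return beta, pi

# Example: with kappa large relative to alpha, rows concentrate on the diagonal.
beta, pi = sample_sticky_prior(10, 5.0, 2.0, 20.0, np.random.default_rng(0))
```

Every row shares the same β, so modes that are globally rare are rare in every row, while κ adds extra mass only to the self-transition entry.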
HDP-AR-HMM and HDP-SLDS
[Figure panels: HDP-AR-HMM; HDP-SLDS]
Blocked Gibbs Sampler: Sample parameters
• Approximate HDP:
  • Truncate stick-breaking
  • Weak limit approximation:
• Sample transition distributions:
• Sample dynamic parameters using state sequence as VAR(1) pseudo-observations:
Fox et al., ICML 2008
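The transition-distribution step can be sketched under the weak limit approximation: count the transitions in the current mode sequence and draw each row from its conjugate Dirichlet posterior, π_k ~ Dir(αβ + κδ_k + n_k). This is a simplified sketch that treats β as fixed; resampling β itself (which needs auxiliary variables in the full sampler) is omitted, and the names are illustrative.

```python
import numpy as np

def transition_counts(z, L):
    """n[j, k] = number of transitions from mode j to mode k in sequence z."""
    n = np.zeros((L, L))
    np.add.at(n, (z[:-1], z[1:]), 1)
    return n

def sample_transitions(z, beta, alpha, kappa, rng):
    """Draw pi_k ~ Dir(alpha * beta + kappa * delta_k + n_k) for every row k."""
    L = beta.size
    conc = alpha * beta + kappa * np.eye(L) + transition_counts(z, L)
    return np.vstack([rng.dirichlet(row) for row in conc])
```

Modes that never appear in `z` fall back to their prior rows, which is how unused modes remain available to explain new dynamics.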
Blocked Gibbs Sampler: Sample mode sequence
• Use state sequence as pseudo-observations of an HMM
• Compute backwards messages:
• Block sample as:
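The two steps above can be sketched for a generic HMM with L modes: a backward pass computes the messages, then a forward pass samples the whole mode sequence jointly. The array `lik[t, k]`, holding the likelihood of the (pseudo-)observation at time t under mode k, is assumed to be precomputed; the function name and the initial distribution `pi0` are illustrative.

```python
import numpy as np

def sample_mode_sequence(pi0, pi, lik, rng):
    """Block sample z_{1:T} from its joint posterior given pi and likelihoods.

    pi0: initial mode distribution (L,); pi: transition matrix (L, L);
    lik[t, k]: likelihood of the observation at time t under mode k.
    """
    T, L = lik.shape
    msg = np.ones((T, L))                  # backward messages, last one is all ones
    for t in range(T - 2, -1, -1):
        m = pi @ (lik[t + 1] * msg[t + 1])
        msg[t] = m / m.sum()               # normalise for numerical stability
    z = np.empty(T, dtype=int)
    prev = pi0
    for t in range(T):
        p = prev * lik[t] * msg[t]
        z[t] = rng.choice(L, p=p / p.sum())
        prev = pi[z[t]]
    return z
```

Sampling the sequence as one block, rather than one time step at a time, tends to mix faster when neighbouring modes are strongly correlated.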
Blocked Gibbs Sampler: Sample state sequence
• Equivalent to LDS with time-varying dynamic parameters
• Compute backwards messages (backwards information filter):
• Block sample as:
• All Gaussian distributions
Hyperparameters
• Place priors on hyperparameters and learn them from data
• Weakly informative priors
• All results use the same settings
• Hyperparameters can be set using the data
Results: Synthetic VAR(1)
• Data: 5-mode VAR(1)
[Figure panels: HDP-VAR(1)-HMM; HDP-VAR(2)-HMM; HDP-HMM; HDP-SLDS]
Results: Synthetic AR(2)
• Data: 3-mode AR(2)
[Figure panels: HDP-VAR(1)-HMM; HDP-VAR(2)-HMM; HDP-HMM; HDP-SLDS]
Results: Synthetic SLDS
• Data: 3-mode SLDS
[Figure panels: HDP-VAR(1)-HMM; HDP-VAR(2)-HMM; HDP-HMM; HDP-SLDS]
Results: IBOVESPA Daily Returns
• Data: São Paulo stock index
• Goal: detect changes in volatility
• Compare inferred change-points to 10 cited world events
[Figure: ROC curves for sticky HDP-SLDS and non-sticky HDP-SLDS]
Carvalho and Lopes, Comp. Stat. & Data Anal., 2006
Results: Dancing Honey Bee
• 6 bee dance sequences with expert-labeled dances:
  • Turn right (green)
  • Waggle (red)
  • Turn left (blue)
• Observation vector:
  • Head angle (cos θ, sin θ)
  • x-y body position
[Figure panels: Sequence 1 through Sequence 6; traces of x-pos, y-pos, sin θ, and cos θ against time]
Oh et al., IJCV, 2007
Results: Dancing Honey Bee
Nonparametric approach:
• Model: HDP-VAR(1)-HMM
• Set hyperparameters
• Unsupervised training from each sequence
• Infer:
  • Number of modes
  • Dynamic parameters
  • Mode sequence
Supervised approach [Oh:07]:
• Model: SLDS
• Set number of modes to 3
• Leave-one-out training: fixed label sequences on 5 of 6 sequences
• Data-driven MCMC
  • Use learned cues (e.g., head angle) to propose mode sequences
Oh et al., IJCV, 2007
Results: Dancing Honey Bee
• Sequence 4: HDP-AR-HMM 83.2%, SLDS [Oh] 93.4%
• Sequence 5: HDP-AR-HMM 93.2%, SLDS [Oh] 90.2%
• Sequence 6: HDP-AR-HMM 88.7%, SLDS [Oh] 90.4%
Results: Dancing Honey Bee
• Sequence 1: HDP-AR-HMM 46.5%, SLDS [Oh] 74.0%
• Sequence 2: HDP-AR-HMM 44.1%, SLDS [Oh] 86.1%
• Sequence 3: HDP-AR-HMM 45.6%, SLDS [Oh] 81.3%
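The slides do not say how these percentages are computed. One common way to score an unsupervised labeling against expert labels, shown here purely as an assumption, is to take the best agreement over all one-to-one relabelings of the inferred modes. The function name is hypothetical, and the brute-force search over permutations is only practical for a small number of modes such as the three dance types.

```python
import numpy as np
from itertools import permutations

def best_match_accuracy(true, est, K):
    """Highest fraction of matching labels over all relabelings of est.

    Both sequences must use integer labels in range(K); K should be small,
    since all K! permutations are tried.
    """
    true, est = np.asarray(true), np.asarray(est)
    best = 0.0
    for perm in permutations(range(K)):
        mapped = np.asarray(perm)[est]     # relabel inferred modes
        best = max(best, float(np.mean(mapped == true)))
    return best
```

A score of this kind is invariant to how the sampler happens to number its modes, which matters because an unsupervised model has no notion of which index means "waggle".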
Conclusion
• Examined HDP as a prior for nonparametric Bayesian learning of SLDS and switching VAR processes
• Presented an efficient Gibbs sampler
• Demonstrated utility on simulated and real datasets