Learning Dynamic Models from Unsequenced Data • Jeff Schneider, School of Computer Science, Carnegie Mellon University • joint work with Tzu-Kuo Huang, Le Song
Learning Dynamic Models • Hidden Markov Models, e.g. for speech recognition • Dynamic Bayesian Networks, e.g. for protein/gene interaction • System Identification, e.g. for control [Bagnell & Schneider, 2001] • Key Assumption: SEQUENCED observations • What if observations are NOT SEQUENCED? (image: Hubble Ultra Deep Field) [image sources: Wikimedia Commons; SISL ARLUT; UAV ETHZ]
When are Observations not Sequenced? • Galaxy evolution • dynamics are too slow to watch • Slow-developing diseases • Alzheimer's • Parkinson's • Biological processes • measurements are often destructive • How can we learn dynamic models for these? [image sources: STAGES; Getty Images; Bryan Neff Lab, UWO]
Outline • Linear Models • [Huang and Schneider, ICML, 2009] • Nonlinear Models • [Huang, Song, Schneider, AISTATS, 2010] • Combining Sequenced and Unsequenced Data • [Huang and Schneider, NIPS, 2011]
Problem Description • Estimate A from the sample of x_i's
A Maximum Likelihood Approach suppose we knew the dynamic model and the predecessor of each point …
Likelihood (continued) • we don’t know the time either so also integrate out over time • then use the empirical density as an estimate for the resulting marginal distribution
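The formulas on these two slides did not survive extraction. As a sketch only, and under assumptions not stated in the transcript (a linear-Gaussian model x' = A x + ε with ε ~ N(0, σ²I) in p dimensions, and n sample points x_1, …, x_n), one way to write the resulting approximate log-likelihood is as a mixture over all candidate predecessors, with the empirical density of the sample standing in for the unknown marginal:

```latex
\hat{\mathcal{L}}(A, \sigma)
  = \sum_{i=1}^{n} \log \left[ \frac{1}{n-1} \sum_{j \ne i}
    \frac{1}{(2\pi\sigma^{2})^{p/2}}
    \exp\!\left( -\frac{\lVert x_i - A x_j \rVert^{2}}{2\sigma^{2}} \right) \right]
```

Each inner term is the Gaussian density of x_i given that x_j was its predecessor; the uniform weight 1/(n-1) reflects that no ordering information is available.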
Sample Synthetic Result [figure: input and output panels]
Partial Order Approximation (PM) • Perform estimation by alternating maximization • Replace UM's E-step with a maximum spanning tree on the complete graph over data points • weight on each edge is the probability of one point being generated from the other given A and σ • enforces a global consistency on the solution • M-step is unchanged: weighted regression
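The weighted-regression M-step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a linear model x_i ≈ A x_j, and a hypothetical weight matrix `W` where `W[i, j]` is the weight given to "x_j is the predecessor of x_i" (soft weights from an E-step, or 0/1 entries from a spanning tree). The function name and the demo data are made up for the example.

```python
import numpy as np

def weighted_regression_step(X, W):
    """Minimize sum_ij W[i, j] * ||x_i - A x_j||^2 over A.

    X: (n, p) array of points; W: (n, n) non-negative weights.
    Returns the estimated A and the matching noise variance.
    """
    n, p = X.shape
    cross = X.T @ W @ X                       # sum_ij W_ij x_i x_j^T
    col = W.sum(axis=0)                       # total weight of each x_j as a predecessor
    gram = X.T @ (col[:, None] * X)           # sum_ij W_ij x_j x_j^T
    A = np.linalg.solve(gram, cross.T).T      # A = cross @ inv(gram), gram is symmetric
    pred = X @ A.T                            # row j holds A x_j
    sq = ((X[:, None, :] - pred[None, :, :]) ** 2).sum(axis=-1)
    sigma2 = float((W * sq).sum() / (p * W.sum()))
    return A, sigma2

# Demo: a noise-free sequence with its true predecessor links as weights.
A_true = np.array([[0.9, -0.2], [0.1, 0.8]])
xs = [np.array([1.0, 0.0])]
for _ in range(5):
    xs.append(A_true @ xs[-1])
X = np.stack(xs)
W = np.zeros((len(X), len(X)))
for t in range(len(X) - 1):
    W[t + 1, t] = 1.0                         # x_t precedes x_{t+1}
A_hat, sigma2_hat = weighted_regression_step(X, W)
```

With the true links as weights and no noise, the step recovers the generating matrix; with soft weights it returns the weighted least-squares compromise across candidate predecessors.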
Learning Nonlinear Dynamic Models [Huang, Song, Schneider, AISTATS, 2010]
Learning Nonlinear Dynamic Models • An important issue • Linear model provides a severely restricted space of models • we know a model is wrong because the regression yields large residuals and low likelihoods • The nonlinear models are too powerful; they can fit anything! • Solution: restrict the space of nonlinear models • form the full kernel matrix • use a low-rank approximation of the kernel matrix
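A low-rank approximation of a kernel matrix can be sketched as below. The slides do not say which kernel or which approximation was used, so the Gaussian RBF kernel, the bandwidth, and the truncated eigendecomposition here are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(X, bandwidth=1.0):
    """Full Gaussian RBF kernel matrix for the rows of X (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))

def low_rank_approx(K, rank):
    """Keep the `rank` largest eigenpairs of the symmetric matrix K."""
    vals, vecs = np.linalg.eigh(K)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:rank]
    U, lam = vecs[:, top], vals[top]
    return (U * lam) @ U.T                    # U diag(lam) U^T

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
K = rbf_kernel(X)
K5 = low_rank_approx(K, 5)
err5 = np.linalg.norm(K - K5)
err20 = np.linalg.norm(K - low_rank_approx(K, 20))
```

The rank acts as the capacity control: a small rank restricts the model space, while a rank equal to the number of points reproduces the full kernel matrix.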
Synthetic Nonlinear Data: Lorenz Attractor [figure: estimated gradients by kernel UM]
Methods for Real Data • Run k-means to cluster the data • Find an ordering of the cluster centers • TSP on pairwise L1 distances (TSP+L1) • OR • Temporal Smoothing Method (TSM) • Learn a dynamic model for the cluster centers • Initialize UM/PM with the learned model
Cosine score in high dimensions [figure: probability of a random direction achieving a cosine score > 0.5, plotted against dimension]
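The quantity on this slide can be estimated with a short Monte Carlo simulation. This is a sketch under one assumption: a "random direction" is drawn uniformly on the unit sphere and its cosine is taken against a fixed reference direction. The function name, sample count, and choice of dimensions are illustrative.

```python
import numpy as np

def prob_cosine_above(dim, threshold=0.5, n_samples=100_000, seed=0):
    """Estimate P(cosine(random direction, fixed direction) > threshold)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((n_samples, dim))       # isotropic, so directions are uniform
    cos = v[:, 0] / np.linalg.norm(v, axis=1)       # cosine with the first basis vector
    return float(np.mean(cos > threshold))

probs = {d: prob_cosine_above(d) for d in (2, 3, 10, 20)}
```

In two dimensions the exact value is 1/3 (an angle within 60 degrees of the reference on either side), and in three dimensions it is 1/4 because the cosine is uniform on [-1, 1]. The estimate drops quickly as the dimension grows, which is why a cosine score above 0.5 is hard to reach by chance in high dimensions.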
Suppose we have some sequenced data • linear dynamic model: • perform a standard regression: • What if the amount of data is not enough to regress reliably?
Regularization for Regression • add regularization to the regression: • ridge regression: • lasso: • Can the unsequenced data be used in regularization?
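The regression formulas on these slides were lost in extraction. The sketch below assumes a linear model x_{t+1} = A x_t + noise and fits A by least squares, with an optional ridge penalty λ‖A‖²_F on the entries of A; the function names, the penalty weight, and the demo system are illustrative. The lasso has no closed form and is not shown.

```python
import numpy as np

def simulate(A, x0, steps, noise_std=0.0, seed=0):
    """Generate a sequence x_{t+1} = A x_t + noise (assumed model)."""
    rng = np.random.default_rng(seed)
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(A @ xs[-1] + noise_std * rng.standard_normal(len(x0)))
    return np.stack(xs)

def fit_linear_dynamics(seq, lam=0.0):
    """Least-squares (lam = 0) or ridge (lam > 0) estimate of A from one sequence."""
    X0, X1 = seq[:-1], seq[1:]                 # (x_t, x_{t+1}) pairs
    p = seq.shape[1]
    gram = X0.T @ X0 + lam * np.eye(p)
    return np.linalg.solve(gram, X0.T @ X1).T  # A = X1^T X0 (X0^T X0 + lam I)^{-1}

A_true = np.array([[0.9, -0.2], [0.1, 0.8]])
seq = simulate(A_true, [1.0, 0.0], steps=10)
A_ols = fit_linear_dynamics(seq)
A_ridge = fit_linear_dynamics(seq, lam=1.0)
```

On noise-free data the unpenalized fit recovers the generating matrix, while the ridge fit shrinks the estimate toward zero. That shrinkage is generic and knows nothing about the dynamics, which is the opening for a regularizer built from unsequenced data.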
Lyapunov Regularization • Lyapunov equation relates the dynamic model to the steady-state distribution: • Q: covariance of the steady-state distribution • estimate Q from the unsequenced data! • optimize via gradient descent, using the unpenalized or the ridge regression solution as the initial point
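The equation itself is missing from the transcript. For a discrete-time model x_{t+1} = A x_t + ε with ε ~ N(0, σ²I), which is an assumption here, the steady-state covariance satisfies the discrete Lyapunov equation Q = A Q Aᵀ + σ²I. The sketch below solves that equation for a stable A and defines a squared-residual penalty; the penalty form, function names, and demo matrix are illustrative and may differ from the regularizer used in the talk.

```python
import numpy as np

def steady_state_cov(A, sigma):
    """Solve Q = A Q A^T + sigma^2 I by vectorization (A must be stable)."""
    p = A.shape[0]
    lhs = np.eye(p * p) - np.kron(A, A)        # row-major vec(A Q A^T) = kron(A, A) vec(Q)
    rhs = (sigma ** 2 * np.eye(p)).reshape(-1)
    return np.linalg.solve(lhs, rhs).reshape(p, p)

def lyapunov_penalty(A, Q, sigma):
    """Squared Frobenius norm of the Lyapunov residual for a candidate A."""
    p = A.shape[0]
    R = A @ Q @ A.T + sigma ** 2 * np.eye(p) - Q
    return float(np.sum(R ** 2))

A_true = np.array([[0.9, -0.2], [0.1, 0.8]])   # eigenvalue modulus about 0.86, so stable
Q = steady_state_cov(A_true, sigma=1.0)
pen_true = lyapunov_penalty(A_true, Q, sigma=1.0)
pen_wrong = lyapunov_penalty(0.5 * A_true, Q, sigma=1.0)
```

In practice Q would be replaced by the sample covariance of the unsequenced points, and the penalty would be added to the regression loss before running gradient descent from the unpenalized or ridge solution.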
Lyapunov Regularization: Toy Example • A = [[-0.428, 0.572], [-1.043, -0.714]], σ = 1 • 2-d linear system • 2nd column of A fixed at the correct value • given 4 sequence points • given 20 unsequenced points
Results on Synthetic Data • random 200-dimensional sparse (1/8) stable system
Work in Progress … • cell cycle data from [Zhou, Li, Yan, Wong, IEEE Trans on Inf Tech in Biomedicine, 2009] • 49 features on protein subcellular location • 34 sequences having a full cycle and length at least 30 were identified • another 11,556 are unsequenced • use the 34 sequences as ground truth and train on the unsequenced data [figure captions: A set of 100 sequenced images; A tracking algorithm identified 34 sequences]
Preliminary Results: Protein Subcellular Location Dynamics [figure: normalized error and cosine score]
Conclusions and Future Work • Demonstrated ability to learn (non)linear dynamic models from unsequenced data • Demonstrated method to use sequenced and unsequenced data together • Continuing efforts on real scientific data • Can we do this with hidden states?