
Modeling molecular dynamics from simulations

Nina Singhal Hinrichs, Departments of Computer Science and Statistics, University of Chicago. January 28, 2009.


Presentation Transcript


  1. Modeling molecular dynamics from simulations
     Nina Singhal Hinrichs
     Departments of Computer Science and Statistics, University of Chicago
     January 28, 2009

  2. Motivation
     • Proteins are essential parts of living organisms
       • enzymes, cell signaling, membrane transport . . .
     • Composed of chain of amino acids
     • Fold to unique 3-dimensional structure
     • Misfolding can cause diseases
       • Alzheimer’s, Mad cow, Huntington’s . . .
     • How do proteins fold?

  3. Molecular dynamics
     • Represent the atoms of the molecule and solvent
     • Model the forces on the atoms
     • Integrate the laws of motion
     • Integration time step must be small compared to the timescales of the motions
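
The "integrate laws of motion" step can be illustrated with a minimal sketch. The slides do not name an integrator, so velocity Verlet is an assumption here, and the one-dimensional harmonic spring stands in for a real force field. The function name `velocity_verlet` and all parameter values are hypothetical.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations for one particle in 1D (illustrative only)."""
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(n_steps):
        # Advance the position using the current velocity and acceleration.
        x = x + v * dt + 0.5 * a * dt * dt
        # Recompute the force at the new position.
        a_new = force(x) / mass
        # Advance the velocity using the average of old and new accelerations.
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((x, v))
    return trajectory


# Toy stand-in for a force field: a spring with k = 1 and m = 1 (period 2*pi).
k = 1.0
trajectory = velocity_verlet(
    x=1.0, v=0.0, force=lambda x: -k * x, mass=1.0, dt=0.01, n_steps=628
)
x_end, v_end = trajectory[-1]
energy_end = 0.5 * v_end ** 2 + 0.5 * k * x_end ** 2
```

The time step (0.01) is small compared to the oscillation period (about 6.28), which is the condition the last bullet describes; with a much larger step the energy would no longer stay close to its initial value of 0.5.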

  4. Folding@Home: Distributed computing for biomolecular simulation
     • Perform multiple simulations in parallel
     • Total simulation times – hundreds of microseconds (hundreds of CPU-years)
     • Very powerful computational resource
       • ~200 Teraflops sustained performance
       • >1,000,000 total CPUs; 200,000 active

  5. Challenge: How to analyze?
     • Enormous datasets
     • Describe dynamics in microscopic detail
     • Questions we want to answer
       • Rate of folding, mechanism of folding . . .
     • How can we extract these properties from our data?

  6. Outline
     • Markovian state model for molecular motion
       • Model description, uses, examples
     • New algorithms for building these models
       • Defining states and transition probabilities
     • New methods for dealing with finite sampling
       • Model complexity, uncertainty analysis, targeted sampling

  7. Chemical intuition
     • Chemical reactions often exhibit stochastic behavior
     [Figure: n-butane. Chandler, Journal of Chemical Physics (1977)]

  8. Markovian state model
     • Define states in the conformation space
     • Define transition probabilities, or edges, between states
     [Figure: diagrams of states 1 to 5]

  9. Uses of the model
     • Populations of states over time
     • Eigenvalues and eigenvectors – conformational changes
     • Kinetic properties – virtually any kinetic property
     • Mechanistic properties – most likely path, probability of transitions as graph algorithms
     [Figure: population p versus time t. Chodera et al., Multiscale Modeling and Simulation (2006)]
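
As a rough illustration of "populations of states over time": given a row-stochastic transition matrix T, the populations after one lag time are p(t+1) = p(t) T. The two-state matrix below is made up for the example, and the helper name `propagate` is hypothetical.

```python
def propagate(populations, T, n_steps):
    """Repeatedly apply p <- p T for a row-stochastic matrix T."""
    p = list(populations)
    history = [p]
    n = len(p)
    for _ in range(n_steps):
        # New population of state j: inflow from every state i, weighted by T[i][j].
        p = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
        history.append(p)
    return history


# Made-up two-state model: state 0 leaves with probability 0.1, state 1 with 0.2.
T = [[0.9, 0.1],
     [0.2, 0.8]]
history = propagate([1.0, 0.0], T, n_steps=200)
final = history[-1]
```

For this matrix the populations relax toward the stationary distribution (2/3, 1/3), and the relaxation rate is set by the second eigenvalue (0.7 here), which is why the eigenvalues carry the kinetic information listed on the slide.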

  10. Example models
     • Alanine peptide: Chodera et al., Multiscale Modeling and Simulation (2006)
     • Lipid vesicle fusion: Kasson et al., PNAS (2006)
     • Alpha helix: Sorin and Pande, Biophysical Journal (2005)
     • Villin headpiece: Jayachandran et al., Journal of Structural Biology (2006)

  11. Computational and statistical challenges
     • Building Markovian state model
       • Defining states that are Markovian
       • Calculating the transition probabilities
     • Refining Markovian state model
       • Finding the best model
       • Determining model uncertainty
       • Designing new simulations
     [Figure: diagrams of states 1 to 5]

  12. Automatic state decomposition
     • Building Markovian State Model
       • Defining states that are Markovian
       • Calculating the transition probabilities
     • Challenge: Find appropriate states
       • Using individual conformations as states does not scale
       • Group conformations into discrete states
       • Structural clustering is insufficient
     • Basic algorithm – combine structural and kinetic similarity
     J. D. Chodera*, N. Singhal*, V. S. Pande, K. A. Dill, and W. C. Swope. Automatic discovery of metastable states for the construction of Markov models of macromolecular conformational dynamics. Journal of Chemical Physics, 126, 155101 (2007). (*These authors contributed equally to this work)

  13. Comparison of structural and kinetic clustering
     [Figure: trpzip2 (Cochran et al., PNAS 98:5578, 2001); panels: structural clustering, kinetic clustering]

  14. State decomposition – splitting
     • Cluster conformations by root-mean-square deviation (RMSD)

  15. State decomposition – lumping
     • Group states that interconvert quickly

  16. State decomposition – resplitting
     • Cluster conformations, restricted to each state

  17. Blocked alanine peptide
     [Figure: six states, labeled 1 to 6, in the (φ, ψ) torsion plane, with axis ticks at -60 and 60. Chodera et al., Multiscale Modeling and Simulation (2006)]

  18. Automatic state decomposition of alanine peptide
     • Black state sits on top of multiple other states!
     • These conformations had an unusual peptide bond
     • Benefit of automatic algorithm
     [Figure: states in the (φ, ψ) torsion plane]

  19. Stability of decomposition

  20. TrpZip peptide

  21. Transition probabilities
     • Building Markovian State Model
       • Defining states that are Markovian
       • Calculating the transition probabilities
     • Discretize trajectories into series of states, e.g. 1223435
     • Count number of transitions between all pairs of states
     • Normalize transition counts to obtain transition probabilities
     [Figure: diagram of states 1 to 5]
     N. Singhal, C. D. Snow, and V. S. Pande. Using path sampling to build better Markovian state models: Predicting the folding rate and mechanism of a trp zipper beta hairpin. Journal of Chemical Physics, 121(1), 415-425 (2004).
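
The counting and normalizing steps on this slide can be sketched as follows, using the slide's example trajectory 1223435. This is a minimal illustration, not the authors' code: it assumes a lag time of one frame, and the function name `transition_probabilities` is hypothetical.

```python
from collections import defaultdict


def transition_probabilities(trajectories):
    """Count transitions between consecutive states, then normalize each row."""
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        # Each consecutive pair (a, b) is one observed transition a -> b.
        for a, b in zip(traj, traj[1:]):
            counts[a][b] += 1
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        # Divide each count by the number of transitions leaving state a.
        probs[a] = {b: n / total for b, n in row.items()}
    return counts, probs


# The discretized trajectory from the slide, one state label per frame.
traj = [int(c) for c in "1223435"]
counts, probs = transition_probabilities([traj])
```

With only six observed transitions the estimates are very coarse: state 2, for example, is seen to stay in 2 once and move to 3 once, so both probabilities come out as 0.5, and state 5 has no outgoing transitions at all. That coarseness is the finite-sampling problem the later slides address.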

  22. Model selection
     • Refining Markovian State Model
       • Finding the best model
       • Determining model uncertainty
       • Designing new simulations
     • Challenge: How many states should we have?
       • More states are more Markovian
       • More states have more parameters
       • How do we evaluate this tradeoff?
     N. S. Hinrichs and V. S. Pande. Bayesian metrics for validating and improving Markovian state models for molecular dynamics simulations. (In preparation)

  23. Hidden Markov Model formulation
     • Formulate the problem as a Hidden Markov Model structure scoring question
       • Different discretizations of continuous space
     • Benefits of Bayesian scores
       • Naturally handles tradeoff between complexity of model and amount of data
       • Avoids over-fitting of parameters
     [Figure: Hidden Markov Model with States and Observations]

  24. Alanine peptide results
     • Score of Hidden Markov models for different lag times
     • Last model is worse at shorter times but preferred at longer times
     • No previous evaluation methods could distinguish these models

  25. Uncertainty analysis
     • Refining Markovian State Model
       • Finding the best model
       • Determining model uncertainty
       • Designing new simulations
     • Goal: Once we have the states, what is the uncertainty in the model?
     • Uncertainty caused by finite sampling
     • Both models are reasonable but give different transition probabilities → different MFPT (mean first passage time), Pfold, eigenvalues, eigenvectors ...
     [Figure: two diagrams of states 1 to 5]
     N. Singhal and V. S. Pande. Error analysis and efficient sampling in Markovian state models for protein folding. Journal of Chemical Physics, 123, 204909-204921 (2005).
     N. S. Hinrichs and V. S. Pande. Calculation of the distribution of eigenvalues and eigenvectors in Markovian state models for molecular dynamics. Journal of Chemical Physics, 126, 244101 (2007).

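  26. Transition probabilities
     • Recall that we calculate transition probabilities by counting
     [Figure: transitions out of state i to states j and k, with counts 70 and 30 in one example and 700 and 300 in the other]
     • Instead of getting a single value, we can talk about the distribution of transition probabilities
     • Bayes’ Rule: [equation for the distribution of p_ij not recovered in the transcript]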
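
The equation after "Bayes' Rule" did not survive in this transcript. One standard way to fill in the step, assumed here rather than taken from the slide, is to place a Dirichlet prior with parameters \alpha_{ij} on the outgoing probabilities p_i = (p_{i1}, ..., p_{iK}) of state i and combine it with the multinomial likelihood of the observed counts n_{ij}:

```latex
\begin{align*}
P(p_i \mid n_i) &\propto P(n_i \mid p_i)\, P(p_i)
  \propto \prod_j p_{ij}^{\,n_{ij}} \prod_j p_{ij}^{\,\alpha_{ij}-1}
  = \prod_j p_{ij}^{\,n_{ij}+\alpha_{ij}-1}, \\
p_i \mid n_i &\sim \mathrm{Dirichlet}(u_{i1}, \dots, u_{iK}),
  \qquad u_{ij} = n_{ij} + \alpha_{ij}, \quad u_i = \sum_j u_{ij}, \\
\mathrm{E}[p_{ij}] &= \frac{u_{ij}}{u_i}, \qquad
\mathrm{Var}[p_{ij}] = \frac{u_{ij}\,(u_i - u_{ij})}{u_i^{2}\,(u_i + 1)}.
\end{align*}
```

Reading the numbers on this slide as 70 and 30 versus 700 and 300 transitions out of state i, and assuming a uniform prior (\alpha_{ij} = 1) over just those two destinations, the posterior mean of p_{ij} is about 0.70 in both cases (71/102 and 701/1002), while the posterior standard deviation drops from about 0.045 to about 0.014. The point estimate is the same; the certainty is not.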

  27. Sampling approach
     • Possible solution to get distribution of eigenvalues:
       [Figure: each sampled transition matrix [p_ij] → solve for eigenvalue → λ]
     • Problem:
       • Sampling can be expensive
       • Solving per sample can be expensive
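
A minimal sketch of the sampling approach, under stated assumptions: each row of the transition matrix is drawn from a Dirichlet posterior with a uniform prior (the slide does not specify the prior), and a two-state model is used so that the non-unit eigenvalue has the closed form trace(T) - 1 and no eigen-solver is needed. The counts and all function names are hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def second_eigenvalue_2state(T):
    """A 2x2 row-stochastic matrix has eigenvalues 1 and trace(T) - 1."""
    return T[0][0] + T[1][1] - 1.0


def sample_eigenvalues(counts, n_samples, prior=1.0, seed=0):
    rng = random.Random(seed)
    eigenvalues = []
    for _ in range(n_samples):
        # Sample every row of the transition matrix from its own posterior.
        T = [sample_dirichlet([n + prior for n in row], rng) for row in counts]
        # Solve each sampled matrix for the eigenvalue of interest.
        eigenvalues.append(second_eigenvalue_2state(T))
    return eigenvalues


# Made-up transition counts for a two-state model.
counts = [[90, 10], [20, 80]]
samples = sample_eigenvalues(counts, n_samples=2000)
mean = sum(samples) / len(samples)
std = (sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)) ** 0.5
```

For a model with many states, each sample would need a full eigenvalue solve in place of `second_eigenvalue_2state`, which is where the per-sample cost noted on the slide comes from.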

  28. Closed-form solution
     • Idea: trade exact distribution for efficient approximation
     • Eigenvalue equation: [equation not recovered in the transcript]
     • Efficient to calculate using adjoint systems
     • Taylor series expansion: [equation not recovered in the transcript]
     • Multivariate normal approximation of Δp_i* → Closed-form normal distribution for λ
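
The slide's equations were lost in extraction. The following is the standard first-order perturbation argument that matches the outline above; it is a reconstruction under assumptions (a simple, real eigenvalue; independent rows; \bar{T} the mean transition matrix; \Sigma_i the covariance of row i), and the notation may differ from the authors'.

```latex
\begin{align*}
T\psi &= \lambda\,\psi, \qquad \phi^{\top} T = \lambda\,\phi^{\top},
  \qquad T = [p_{ij}], \\
\frac{\partial \lambda}{\partial p_{ij}} &= \frac{\phi_i\,\psi_j}{\phi^{\top}\psi}, \\
\lambda(\bar{T} + \Delta T) &\approx \lambda(\bar{T}) + \sum_i s_i^{\top}\,\Delta p_{i*},
  \qquad (s_i)_j = \left.\frac{\partial \lambda}{\partial p_{ij}}\right|_{\bar{T}}, \\
\Delta p_{i*} \sim \mathcal{N}(0, \Sigma_i)\ \text{independently for each } i
  \;&\Longrightarrow\;
  \lambda \sim \mathcal{N}\!\Big(\lambda(\bar{T}),\ \sum_i s_i^{\top}\,\Sigma_i\, s_i\Big).
\end{align*}
```

Here \psi and \phi are the right and left eigenvectors for \lambda. The variance is a sum of one term per state i, so each state's contribution to the uncertainty can be read off separately.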

  29. Uncertainty results
     • Alanine system: 5000 trajectories from each state
     [Figure: alanine system states 1 to 6; transition counts]
     • Running times (6 states): sampling-based 40 seconds; closed-form < 0.01 seconds
     • Running times (87 states): sampling-based 3600 seconds; closed-form < 0.07 seconds

  30. Sampling strategies
     • Refining Markovian State Model
       • Finding the best model
       • Determining model uncertainty
       • Designing new simulations
     • Problem: Simulations are expensive. Even with Folding@Home, we run simulations for months
     • How to intelligently allocate our resources?
     • Common approaches:
       • equilibrium sampling – sample each conformation from its equilibrium distribution
       • even sampling – sample equally from each state
     • New sequential approaches
     N. Singhal and V. S. Pande. Error analysis and efficient sampling in Markovian state models for protein folding. Journal of Chemical Physics, 123, 204909-204921 (2005).
     N. S. Hinrichs and V. S. Pande. Calculation of the distribution of eigenvalues and eigenvectors in Markovian state models for molecular dynamics. Journal of Chemical Physics, 126, 244101 (2007).

  31. Adaptive sampling
     • Goal: Reduce uncertainty of eigenvalue
     • Uncertainty analysis decomposes by transitions from each state
     • Variance depends on both uncertainty of and sensitivity to transition probabilities
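
A toy sketch of how a per-state variance decomposition could drive the choice of where to simulate next. Several things here are assumptions and not the slides' method: a two-state model (so the non-unit eigenvalue is p_00 + p_11 - 1 and its sensitivity to each self-transition probability is 1), a Dirichlet posterior with uniform prior, and the simple rule "sample from the state with the largest variance contribution". All names and counts are hypothetical.

```python
def posterior_variance(row_counts, index, prior=1.0):
    """Variance of one transition probability under a Dirichlet posterior."""
    u = [n + prior for n in row_counts]
    total = sum(u)
    return u[index] * (total - u[index]) / (total * total * (total + 1.0))


def variance_contributions(counts, prior=1.0):
    """Per-state contribution to the variance of the non-unit eigenvalue.

    For two states the eigenvalue is p_00 + p_11 - 1, so state i contributes
    sensitivity^2 * Var(p_ii) with sensitivity equal to 1.
    """
    return [posterior_variance(counts[i], i, prior) for i in range(2)]


def next_state_to_sample(counts, prior=1.0):
    """Pick the state whose transitions contribute most to the variance."""
    contributions = variance_contributions(counts, prior)
    return max(range(len(contributions)), key=lambda i: contributions[i])


# Made-up counts: state 0 is well sampled, state 1 has only 10 observations.
counts = [[900, 100], [8, 2]]
contributions = variance_contributions(counts)
chosen = next_state_to_sample(counts)
```

In this example nearly all of the eigenvalue variance comes from the poorly sampled state, so new trajectories started there reduce the uncertainty far more than trajectories started from the well-sampled state. In a larger model the sensitivities differ between states, so a state with few samples but low sensitivity may still not be worth sampling.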

  32. Adaptive sampling – alanine
     • On 6-state alanine system, select trajectories randomly for 3 sampling strategies
     [Figure: transition counts]

  33. Adaptive sampling – villin
     • Villin headpiece, 2454 states (Jayachandran, et al., Journal of Chemical Physics (2006))
     • Benefits
       • Very quickly reduce the variance
       • Reduce the total number of simulations
       • Need less computational power
       • Can study more complex systems

  34. Summary
     • Markovian state models are convenient methods to describe molecular motion
     • Automatic state decomposition
       • Scalable to large size systems
     • Model selection
       • Evaluate tradeoff between model complexity and amount of data
     • Uncertainty analysis
       • Efficient and decomposable
     • Adaptive sampling
       • Reduce number of simulations

  35. Acknowledgements
     • Vijay Pande – Stanford University adviser
     • Bill Swope, Jed Pitera – IBM collaborators
     • John Chodera – state decomposition work
