Bioinformatics Data Analysis & Tools: Molecular simulations & sampling techniques
Molecular Simulations: Brief History
Protein flexibility • Even a correctly folded protein is dynamic • A crystal structure yields the average positions of the atoms • Overall ‘breathing’ motion is possible
B-factors • The average motion of an atom around its mean position • (figure labels: alpha helices, beta-sheet)
Peptide folding from simulation • A small (beta-)peptide forms a helical structure according to NMR • Computer simulations of the atomic motions: molecular dynamics
Folding and unfolding in 200 ns • Unfolded structures: all different? How different? $3^{21} \approx 10^{10}$ possibilities! • Folded structures: all the same • (figure labels: unfolded, folded)
Temperature dependence • The folding equilibrium (unfolded / folded) depends on temperature • (figure panels: 360 K, 350 K, 340 K, 320 K, 298 K)
Pressure dependence • The folding equilibrium (unfolded / folded) depends on pressure • (figure panels: 2000 atm, 1000 atm, 1 atm)
Surprising result • The number of relevant non-folded structures is very much smaller than the number of possible non-folded structures • If the number of relevant non-folded structures increases proportionally with the folding time, only $10^{9}$ protein structures need to be simulated instead of $10^{90}$ structures • The folding mechanism is perhaps simpler after all…
Phase Space • Defines the state of a classical system of N particles: • coordinates $q = (x_1, y_1, z_1, x_2, \ldots, z_N)$ • momenta $p = (p_{x_1}, p_{y_1}, p_{z_1}, p_{x_2}, \ldots, p_{z_N})$ • One conformation (+ momenta) is one point $(p, q)$ in phase space • Motion is a curved line in phase space • trajectory: $(p(t), q(t))$
Molecular Motions: Time & Length-scales
Newton Dynamics • Sir Isaac Newton • (figure labels: t, t + Δt)
Classical (Newton) Mechanics • A system has coordinates q and momenta p (= mv): $p = (p_1, p_2, \ldots, p_N)$, $q = (q_1, q_2, \ldots, q_N)$ • Together, p and q span the phase space (q alone spans the configuration space). • The total energy can be split into two components: • kinetic energy (K): $K(p) = \tfrac{1}{2} m v^2 = \tfrac{1}{2} p^2 / m$ • potential energy (V): $V(q)$ depends on the interaction(s) • The potential energy is described by • bonded interactions (e.g. bond stretching, angle bending) • non-bonded interactions (e.g. van der Waals, electrostatic) • Non-bonded interactions determine the conformational variation that we observe, for example, in protein motions.
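The energy terms listed on this slide can be sketched in a few lines of Python. This is only an illustration: the harmonic bond form, the 12-6 Lennard-Jones form, the reduced units, and all function and parameter names are assumptions, not the functional forms of any particular force field.

```python
def kinetic_energy(momenta, masses):
    """K(p) = sum over particles of |p_i|^2 / (2 m_i)."""
    return sum(
        sum(c * c for c in p) / (2.0 * m) for p, m in zip(momenta, masses)
    )

def bond_stretch(r, r0, k_bond):
    """Bonded term (assumed harmonic): V = 1/2 k (r - r0)^2."""
    return 0.5 * k_bond * (r - r0) ** 2

def lennard_jones(r, epsilon, sigma):
    """Non-bonded van der Waals term (assumed 12-6 form)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q_i, q_j, prefactor=1.0):
    """Non-bonded electrostatic term in reduced units: prefactor * q_i q_j / r."""
    return prefactor * q_i * q_j / r

# One particle with momentum (1, 0, 0) and mass 2: K = 1 / (2 * 2)
print(kinetic_energy([(1.0, 0.0, 0.0)], [2.0]))
# The Lennard-Jones minimum lies at r = 2^(1/6) sigma, with depth -epsilon
print(lennard_jones(2.0 ** (1.0 / 6.0), epsilon=1.0, sigma=1.0))
```

The total energy of a configuration would be the kinetic term plus the sum of all bonded and non-bonded terms.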
The Hamilton Function • The Hamiltonian function represents the total energy: $H(p,q) = K(p) + V(q)$ • It is the generalised expression of classical mechanics • In two differential expressions: $\dot{p}_k = \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k}$ and $\dot{q}_k = \frac{dq_k}{dt} = \frac{\partial H}{\partial p_k}$ • These are Newton's equations of motion, but in a very elegant way • Use 'generalised coordinates' (p and q): • can use any coordinate system • e.g., Cartesian coordinates or Euler angles
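As a worked illustration of how Hamilton's equations reduce to Newton's equation of motion, take one particle in one dimension with a harmonic potential. The choice of potential and the symbol $k$ for the force constant are assumptions made for this example only.

```latex
\begin{align}
H(p,q) &= \frac{p^2}{2m} + \tfrac{1}{2} k q^2 \\
\dot{q} &= \frac{\partial H}{\partial p} = \frac{p}{m} \\
\dot{p} &= -\frac{\partial H}{\partial q} = -k q \\
m\,\ddot{q} &= \dot{p} = -k q = F(q)
\end{align}
```

The last line is Newton's $F = m a$ with the force $F(q) = -\partial V / \partial q$.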
Hamilton's Principle • The variation of the integral over $p\,\dot{q} - H(p,q)$ vanishes: $\delta \int_{t_1}^{t_2} \left( p\,\dot{q} - H(p,q) \right) dt = 0$ • Hamilton's principle is most fundamental • Newton's equations of motion are only one set of equations that can be derived from Hamilton's principle. • The integral is called the 'action', meaning: • If we integrate along the trajectory of an object in the phase space given by positions q and momenta p between time points (integration limits) $t_1$ and $t_2$, then the value of the integral (= the 'action') of a 'real' trajectory is a minimum (more precisely, an extremum) compared to all other trajectories. • Example: Why does a thrown stone follow a parabolic trajectory? • If you vary the trajectory and calculate the action, the parabolic trajectory will yield the smallest 'action'.
Harmonic oscillator: • 1-dimensional motion • 2 dimensions in phase space: • position (1-dimensional) • momentum (1-dimensional) • analytical solution for integration: • $q(t) = b \cos\left(\sqrt{k/m}\; t\right)$ • $p(t) = -b \sqrt{mk}\, \sin\left(\sqrt{k/m}\; t\right)$ • (figure axes: q(t), p(t))
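A quick numerical check of the analytical solution: along the trajectory the total energy $H = p^2/2m + \tfrac{1}{2} k q^2$ should stay at $\tfrac{1}{2} k b^2$, so the point (q, p) moves on a closed ellipse in phase space. The function names and the parameter values below are illustrative assumptions.

```python
import math

def harmonic_state(t, b=1.0, k=1.0, m=1.0):
    """Analytical q(t), p(t) of the 1-D harmonic oscillator."""
    omega = math.sqrt(k / m)
    q = b * math.cos(omega * t)
    p = -b * math.sqrt(m * k) * math.sin(omega * t)
    return q, p

def hamiltonian(q, p, k=1.0, m=1.0):
    """Total energy: kinetic plus harmonic potential."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

# The energy is the same at every time point (up to rounding): 1/2 k b^2
for t in (0.0, 0.5, 1.0, 5.0):
    q, p = harmonic_state(t, b=2.0, k=3.0, m=0.5)
    print(t, hamiltonian(q, p, k=3.0, m=0.5))
```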
Calculating Averages • Integration over phase space: • 1 particle, 2 values per coordinate (e.g. up, down): • 1×6 degrees of freedom (dof); $2^{6}$ = 64 points • 2 particles: 2×6 dof; $2^{12}$ = 4,096 points • 3 particles: 3×6 dof; $2^{18}$ = 262,144 points • 4 particles: 4×6 dof; $2^{24}$ = 16,777,216 points • Need the whole of phase space? • only low-energy states are relevant
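The point counts follow from raising the number of values per degree of freedom to the power 6N (three coordinates and three momenta per particle). A minimal sketch, with a hypothetical helper name:

```python
def grid_points(n_particles, values_per_dof=2, dof_per_particle=6):
    """Number of phase-space grid points for a full enumeration."""
    return values_per_dof ** (dof_per_particle * n_particles)

for n in range(1, 5):
    print(n, grid_points(n))

# Even this coarsest possible grid is hopeless for a small molecule:
print(len(str(grid_points(100))), "digits for 100 particles")
```

The exponential growth is why full integration is replaced by sampling of the relevant low-energy states.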
Solving Complex Systems • No analytical solutions • Numerical integration: • by time (Molecular Dynamics) • by ensemble (Monte Carlo) • Molecular Dynamics: numerical integration in time • Euler's approximation: • $q(t + \Delta t) = q(t) + \frac{p(t)}{m} \Delta t$ • $p(t + \Delta t) = p(t) + m\, a(t)\, \Delta t$ • Verlet / Leap-frog
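Euler's approximation can be tried on the harmonic oscillator, where the exact answer is known. The sketch below (units with k = m = 1, the time step, and the function names are assumptions) shows why Euler is rarely used in practice: the total energy grows steadily instead of being conserved.

```python
def euler_step(q, p, force, m, dt):
    """One explicit Euler step: both updates use the state at time t."""
    a = force(q) / m
    q_new = q + p / m * dt
    p_new = p + m * a * dt
    return q_new, p_new

def total_energy(q, p, k, m):
    return p * p / (2.0 * m) + 0.5 * k * q * q

k, m, dt = 1.0, 1.0, 0.01
q, p = 1.0, 0.0
e_start = total_energy(q, p, k, m)
for _ in range(1000):
    q, p = euler_step(q, p, lambda x: -k * x, m, dt)
e_end = total_energy(q, p, k, m)

# For this system each Euler step multiplies the energy by (1 + dt^2)
print(e_start, e_end)
```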
Features of Newton Dynamics • Newton's equations: • Energy conservative • Time reversible • Deterministic • Numerical integration by the Verlet algorithm ('simulation'): $r(t + \Delta t) \approx 2 r(t) - r(t - \Delta t) + \frac{F(t)}{m} \Delta t^2 + O(\Delta t^4)$ • In a 'real' simulation: • Rounding errors (cumulative) → not fully reversible, no full energy conservation • Coupling to a thermal bath (re-scaling) → not fully deterministic • 'Lyapunov' instability → trajectories diverge
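Time reversibility of the Verlet update can be checked directly: step forward, swap the two stored positions, and step back the same number of times. In the sketch below (harmonic force with k = m = 1 and all names are assumptions) the start position is recovered up to cumulative rounding error, which is exactly the limitation the slide mentions.

```python
def verlet_step(q_prev, q_curr, accel, dt):
    """Return (q_curr, q_next) with q_next = 2 q_curr - q_prev + a(q_curr) dt^2."""
    q_next = 2.0 * q_curr - q_prev + accel(q_curr) * dt * dt
    return q_curr, q_next

def accel(q):
    return -q  # harmonic oscillator with k = m = 1 (assumption)

dt, n_steps = 0.01, 1000
q0 = 1.0
q1 = q0 - 0.5 * dt * dt  # Taylor estimate of q(dt) for zero initial velocity

# Forward in time
older, newer = q0, q1
for _ in range(n_steps):
    older, newer = verlet_step(older, newer, accel, dt)

# Swapping the two stored positions reverses the direction of time
older, newer = newer, older
for _ in range(n_steps):
    older, newer = verlet_step(older, newer, accel, dt)

reversal_error = abs(newer - q0)
print(reversal_error)  # tiny but generally not exactly zero: rounding errors
```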
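Derivation: Verlet • Taylor expansion: • $q(t+\Delta t) = q(t) + q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 + \frac{1}{3!} q'''(t)\Delta t^3 + \ldots$ • where $q'(t) = v(t)$ (1st derivative, velocity) • and $q''(t) = a(t)$ (2nd derivative, acceleration) • Add the forward and backward expansions: • $q(t+\Delta t) = q(t) + q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 + \frac{1}{3!} q'''(t)\Delta t^3$ • $q(t-\Delta t) = q(t) - q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 - \frac{1}{3!} q'''(t)\Delta t^3$ • Sum: $q(t+\Delta t) + q(t-\Delta t) = 2q(t) + 2 \cdot \frac{1}{2!} q''(t)\Delta t^2$ • Rearrange: $q(t+\Delta t) = 2q(t) - q(t-\Delta t) + a(t)\Delta t^2$ • Only terms up to 2nd order are used, but the result is accurate through 3rd order because the $\Delta t^3$ terms cancel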
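The rearranged formula translates directly into code. The sketch below is a minimal position-Verlet integrator tested against the analytical harmonic oscillator; the start-up step (a second-order Taylor estimate of $q(\Delta t)$), the force, and all names are assumptions of this example.

```python
import math

def verlet_trajectory(q0, v0, accel, dt, n_steps):
    """Position Verlet: q(t+dt) = 2 q(t) - q(t-dt) + a(t) dt^2."""
    # The recursion needs two positions; take q(dt) from the Taylor expansion.
    q_prev = q0
    q_curr = q0 + v0 * dt + 0.5 * accel(q0) * dt * dt
    positions = [q_prev, q_curr]
    for _ in range(n_steps - 1):
        q_next = 2.0 * q_curr - q_prev + accel(q_curr) * dt * dt
        q_prev, q_curr = q_curr, q_next
        positions.append(q_curr)
    return positions  # positions[i] approximates q(i * dt)

dt, n_steps = 0.01, 1000
traj = verlet_trajectory(1.0, 0.0, lambda q: -q, dt, n_steps)

# Compare with the exact solution q(t) = cos(t) for k = m = 1, b = 1
error = abs(traj[-1] - math.cos(n_steps * dt))
print(error)
```

Note that velocities never appear in the update; if they are needed (for the kinetic energy, say) they have to be estimated from the positions.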
What do we obtain? • Trajectory: $q(t)$ and $p(t)$ • Probability of occurrence: $P(p,q) = \frac{1}{Z} e^{-H(p,q)/kT}$ • Averages along the trajectory: $\langle A(p,q) \rangle_T = \frac{1}{T} \int_0^T A(p(t), q(t))\, dt$ (where T denotes the total time, not the temperature)
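In a simulation the integral becomes a sum over stored frames. A minimal sketch, using the analytical harmonic-oscillator trajectory as stand-in data (the trajectory, the observable $q^2$, and the names are assumptions): over whole periods the average of $q^2$ should be $b^2/2$.

```python
import math

def time_average(observable, trajectory):
    """Discrete version of (1/T) * integral of A(p(t), q(t)) dt."""
    values = [observable(p, q) for p, q in trajectory]
    return sum(values) / len(values)

b, k, m = 1.0, 1.0, 1.0
omega = math.sqrt(k / m)
n_periods, n_samples = 10, 100000
total_time = n_periods * 2.0 * math.pi / omega

# Equally spaced frames (p, q) along the trajectory
trajectory = []
for i in range(n_samples):
    t = total_time * i / n_samples
    q = b * math.cos(omega * t)
    p = -b * math.sqrt(m * k) * math.sin(omega * t)
    trajectory.append((p, q))

mean_q2 = time_average(lambda p, q: q * q, trajectory)
print(mean_q2)  # close to b^2 / 2
```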
Convergence • Amount of phase space covered • “Sampling” • Impossible to prove: you cannot know what you don't know • Energy “landscape” in phase space • there might be a “next valley”
Example: Convergence (1)
Example: Convergence (2)
Example: Convergence (3) • Apparent convergence on all timescales, 100 ps – 10 ns!
Efficiency • Time step limited by vibrational frequencies • heavy-atom–hydrogen bond vibration: $10^{-14}$ s (10 fs) • 10-20 integration steps per vibrational period: • 0.5 fs time step; 2,000,000 steps for 1 ns • Removal of fast vibrations (constraining): • hydrogen atom bond and angle motion • heavy-atom bond motion • out-of-plane motions (e.g. aromatic groups) • In practice: 1-2 fs time step • 5-7 fs maximum
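The step counts on this slide are simple arithmetic, spelled out below. The variable and function names are illustrative; the 2 fs case assumes constrained bonds, as in the "in practice" bullet.

```python
period_fs = 10.0        # heavy-atom to hydrogen bond vibration period
steps_per_period = 20   # upper end of the 10-20 range
dt_fs = period_fs / steps_per_period
fs_per_ns = 1_000_000

def steps_for(duration_ns, time_step_fs):
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * fs_per_ns / time_step_fs)

print(dt_fs)                # 0.5
print(steps_for(1, dt_fs))  # 2000000
print(steps_for(1, 2.0))    # 500000
```

Going from 0.5 fs to 2 fs therefore cuts the cost of a given simulation length by a factor of four.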
Constraining • to remove degrees of freedom, e.g.: • bond i-j vibrations → keep distance i-j constant • angle i-j-k vibrations → keep distance i-k constant • Constraint algorithms • SHAKE • iterative adjustment of Lagrange multipliers • LINCS • Taylor expansion of matrix inversion • non-iterative (more stable) • not for highly connected constraints • SETTLE • analytical solution • for symmetric 3-atom molecules (like water)
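To make the idea of SHAKE concrete, here is a minimal sketch for a single bond constraint. It is a simplified textbook-style version, not the implementation of any MD package; the function name, the tolerance, and the test coordinates are assumptions. After an unconstrained step has changed the i-j distance, the positions are shifted along the old bond vector, weighted by inverse mass, until the distance is restored.

```python
import math

def shake_bond(old_i, old_j, new_i, new_j, m_i, m_j, d, tol=1e-10, max_iter=50):
    """Iteratively correct new_i, new_j so that |new_i - new_j| = d."""
    r_old = [a - b for a, b in zip(old_i, old_j)]  # bond vector before the step
    new_i, new_j = list(new_i), list(new_j)
    inv_mass = 1.0 / m_i + 1.0 / m_j
    for _ in range(max_iter):
        s = [a - b for a, b in zip(new_i, new_j)]
        diff = sum(c * c for c in s) - d * d
        if abs(diff) < tol:
            break
        # Lagrange-multiplier-like factor, from linearising the constraint
        g = diff / (2.0 * inv_mass * sum(a * b for a, b in zip(s, r_old)))
        new_i = [x - g * r / m_i for x, r in zip(new_i, r_old)]
        new_j = [x + g * r / m_j for x, r in zip(new_j, r_old)]
    return new_i, new_j

pos_i, pos_j = shake_bond(
    old_i=(0.0, 0.0, 0.0), old_j=(1.0, 0.0, 0.0),
    new_i=(0.0, 0.1, 0.0), new_j=(1.2, 0.0, 0.0),
    m_i=1.0, m_j=1.0, d=1.0,
)
print(math.dist(pos_i, pos_j))  # restored bond length, close to d
```

With several coupled constraints the same correction is applied to each constraint in turn and the loop is repeated until all are satisfied, which is where the iterative character of SHAKE comes from.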
Improving Performance • Pairwise potential: $F_{ij} = -F_{ji}$ • Potential $E(r) \approx 0$ at large r: cut-off • Coulomb: ~ $1/r$ • Lennard-Jones: ~ $1/r^6$ • Atoms move little in one step: pair-list • Evaluating r is expensive (square root): $r = \sqrt{|\mathbf{r}_j - \mathbf{r}_i|^2}$ • Large distances change less: twin-range • short range each step; long range less often • Multiple time-step methods • Many processor/compiler/language-specific optimizations: • use of Fortran vs. C • optimize cache performance • arrays of positions, velocities, forces, parameters are very large • compiler optimizations
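Three of these ideas (each pair only once, a cut-off, and no square root) fit in a short pair-list sketch. It assumes a non-periodic system and a plain O(N²) search; the function name and coordinates are illustrative, and production codes use cell or grid searches instead.

```python
def build_pair_list(positions, cutoff):
    """Return all pairs (i, j), i < j, closer than the cut-off."""
    cutoff_sq = cutoff * cutoff  # compare squared distances: no sqrt needed
    pairs = []
    n = len(positions)
    for i in range(n - 1):
        for j in range(i + 1, n):  # each pair once, since F_ij = -F_ji
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            if d2 < cutoff_sq:
                pairs.append((i, j))
    return pairs

positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(build_pair_list(positions, cutoff=1.5))  # [(0, 1)]
```

Because atoms move little per step, such a list can be reused for several steps before it is rebuilt.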
Ignoring Degrees of Freedom • Internal: • bonds, angles → Constraint algorithm • larger time steps • External: • “Solvent” → Langevin dynamics • fewer (explicit) particles • Inertia & “solvent” → Brownian dynamics • larger time steps
Trajectory on Energy Surface
Sampling in Conformational Space • Most of the computational time is spent on calculating (local, harmonic) vibrations. • $\Delta E \gg kT$ • (figure labels: Energy, vibration, Entropy)
Barriers • Kitao et al. (1998) Proteins 33, 496-517.
Psychology of Theorists (optimist scale, from 100% down to 0%) • “In theory, there should be no difference between theory and practice. In practice, however, there is always a difference...” (Witten and Frank) • “For every complex question there is a simple and wrong solution.” (Albert Einstein) • “All models are wrong, but some are useful.” (George Box)
Monte Carlo Sampling • Ergodic hypothesis: • Sampling over time (Molecular Dynamics approach); and • Ensemble averaging (Monte Carlo approach) • yield the same result: $\rho(r) = \langle \rho_i(r) \rangle_{NVE}$ • Detailed balance condition: $p(o)\, p(o \to n) = p(n)\, p(n \to o)$
Metropolis Selection Scheme • Metropolis acceptance rule that satisfies detailed balance: • $acc(o \to n) = p(n)/p(o) = e^{-\Delta E/kT}$ if $p(n) < p(o)$ • $acc(o \to n) = 1$ if $p(n) \geq p(o)$ • Metropolis Monte Carlo • Ergodic probability density for configurations around $r^N$: $p(r^N) = \frac{e^{-E/kT}}{\sum e^{-E/kT}}$
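A minimal Metropolis Monte Carlo sketch for one coordinate in a harmonic potential is given below. The potential, the uniform symmetric trial move, the step size, the seed, and all names are assumptions made for illustration. For $E = \tfrac{1}{2} q^2$ and $kT = 1$ the sampled average of $q^2$ should approach 1.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng):
    """acc(o -> n) = 1 if the energy does not rise, else exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def sample(energy, q0, kT, n_steps, max_move, seed=0):
    """Generate a chain of configurations distributed as exp(-E/kT)."""
    rng = random.Random(seed)
    q, e = q0, energy(q0)
    samples = []
    for _ in range(n_steps):
        q_trial = q + rng.uniform(-max_move, max_move)  # symmetric trial move
        e_trial = energy(q_trial)
        if metropolis_accept(e_trial - e, kT, rng):
            q, e = q_trial, e_trial
        samples.append(q)  # a rejected move counts the old state again
    return samples

samples = sample(lambda q: 0.5 * q * q, q0=0.0, kT=1.0,
                 n_steps=200000, max_move=1.5)
mean_q2 = sum(q * q for q in samples) / len(samples)
print(mean_q2)
```

Counting the old configuration again after a rejection is essential; dropping rejected steps would bias the distribution.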
Search Strategies
Leaps
Computational Scheme • Reduction of the leaps will lead to classical dynamics • Control parameter: • RMSD • Angle deviation
Computational Load: Solvation • Most computational time (>95%) is spent on calculating (bulk) water-water interactions
Implicit Solvation
POPS • Solvent accessible area • fast and accurate area calculation • resolution: • POPS-A (per atom) • POPS-R (per residue) • parametrised on 120000 atoms and 12000 residues • differentiable → usable in MD • Free energy of solvation: $\Delta G^{solv}_i = \mathrm{area}_i \cdot \sigma_i$ • POPS is implemented in GROMOS96 • parameters 'sigma' ($\sigma_i$) from simulations in water: • amino acids in helix, sheet and extended conformation • peptides in helix and sheet conformation
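The per-atom expression can be accumulated into a total solvation free energy, assuming the contributions are simply additive (an assumption of this sketch). The function name and the numbers below are hypothetical and are not POPS parameters.

```python
def solvation_free_energy(areas, sigmas):
    """Sum of area_i * sigma_i over atoms (or residues)."""
    if len(areas) != len(sigmas):
        raise ValueError("need one sigma per area")
    return sum(a * s for a, s in zip(areas, sigmas))

# Hypothetical values, for illustration only:
areas = [10.0, 25.0, 5.0]     # solvent accessible area per atom
sigmas = [0.02, -0.01, 0.04]  # sigma_i, energy per unit area
total = solvation_free_energy(areas, sigmas)
print(total)
```

Because the area is a differentiable function of the coordinates, the gradient of this energy can serve as an implicit-solvent force in MD.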
POPS server
Test molecules: alanine dipeptide
Test molecules: BPTI / Y35G-BPTI • Classical MD • Leap-dynamics • Essential dynamics
Calmodulin domains • Apparent unfolding temperatures (CD) • C-domain: 315 K (42 °C) • N-domain: 328 K (55 °C) • LD simulations: • 3 ns • 4 trajectories • 290 K • 325 K • 360 K
Snapshots
Trajectories
Example: Protein & Ligand Dynamics
Example: Essential Dynamics Analysis • Cyt-P450BM3 • 7 x 10 ns “free” MD simulations
CD