450 likes | 609 Views
Molecular dynamics and applications to amyloidogenic sequences. Nurit Haspel, David Zanuy, Ruth Nussinov (in cooperation with Ehud Gazit’s group). Contents. Molecular dynamics: goals, applications and basic principles: Newton’s equations of motion. Energy conservation and equations.
E N D
Molecular dynamics and applications to amyloidogenic sequences Nurit Haspel, David Zanuy, Ruth Nussinov (in cooperation with Ehud Gazit’s group)
Contents • Molecular dynamics: goals, applications and basic principles: • Newton’s equations of motion. • Energy conservation and equations. • The force field. • Solvation models. • Periodic boundary conditions. • A molecular dynamic protocol • Energy minimization.
Contents (cont.) • MD protocol (cont) • Assignment of initial velocities. • Equilibration. • Methods of integration. • Case study: Human calcitonin hormone • Basic background. • Simulated models. • Initial results and future work plans.
The goal of MD • Predicting the structure and energy of molecular systems (in our case – short peptide structures). • Simulating the behavior of the molecules in the solution (by solving the energy equations at every time interval). • Trying to find a model that explains the behavior of the system.
Applications of MD • Sampling the conformational space over time. Important for ligand docking, for example. • Determine equilibrium averages, structural and motional properties of the system. • Study the time development of the system. • Today, most (if not all) biomolecular structures obtained by X-ray crystallography or NMR are MD refined.
Types of simulated systems • Peptidic systems • Micelle formation • Nucleotides • Small molecules • Ligand docking • … • Note: Each type of system has its own unique parameters and equations.
The basic principle Solving the classical mechanics equations (Newton’s equations) over the pairs of atom distances, angles, dihedrals, VdW interactions and electrostatics in small time intervals (other parameters can be added). classical equations are usually sufficient for large scale systems. Quantum mechanical modifications are extremely costly and are used only on small scale system or where more accuracy is needed.
Newton’s mechanical equation (based on Newton’s second law) F V2 V1 F = Ma = M*(dv/dt) = M*(d2r/dt2) Or, with a small enough time interval Δt: ΔV = (F/M)* Δt → V2 = V1+(F/M)* Δt
Newton equations (cont.) The new position, r2 is determined by the old position, r1 and the velocity v2 over time Δt (which should be very small!). The above equation describes the changes in the positions of the atoms over time.
The process of MD The simulation is the numerical integration of the Newton equations over time. Positions and velocities at time t Positions and velocities at time t+dt. Positions + velocities = trajectory.
The connection between force and energy F=-dU/dr →U=-∫Fdr=-1/2*Mv2 U = the energy (scalar). r = the position vector.
Conservation of energy ½*ΣMiVi2 + ΣEpot,i=const The potential energy is taken from the force field parameters.
The potential energy equations – bonded interactions U(R) = • bond • angle • dihedral
The potential energy equations (cont., non-bonded interactions) • Van der Waals • electrostatic Etc… The energy parameters are defined in the force field
The force field definition All the equations and the adjusted parameters that allow to describe quantitatively the energy of the chemical system. Note, that mixing equations and parameters from different systems always results in errors! Force field examples: FF2, FF3, Sybyl, charmm etc.
Solvation models • No solvent – constant dielectric. • Continuum – referring to the solvent as a bulk. No explicit representation of atoms (saving time). • Explicit – representing each water molecule explicitly (accurate, but expensive). • Mixed – mixing two models (for example: explicit + continuum. To save time).
Periodic boundary conditions Problem: Only a small number of molecules can be simulated and the molecules at the surface experience different forces than those at the inner side. • The simulation box is replicated infinitely in three dimensions (to integrate the boundaries of the box). • When the molecule moves, the images move in the same fashion. • The assumption is that the behavior of the infinitely replicated box is the same as a macroscopic system.
A sample MD protocol • Read the force fields data and parameters. • Read the coordinates and the solvent molecules. • Slightly minimize the coordinates (the created model may contain collisions), a few SD steps followed by some ABNR steps. • Warm to the desired temperature (assign initial velocities). • Equilibrate the system. • Start the dynamics and save the trajectories every 1ps (trajectory=the collection of structures at any given time step).
Why is minimization required? • Most of the coordinates are obtained using X-ray diffraction or NMR. • Those methods do not map the hydrogen atoms of the system. • Those are added later using modeling programs (such as insight), which are not 100% accurate. • Minimization is therefore required to resolve the clashes that may “blow up” the energy function.
Common minimization protocols • First order algorithms: • Steepest descent • Conjugated gradient • Second order algorithms: • Newton-Raphson • Adopted basis Newton Raphson (ABNR)
Steepest descent This is the simplest minimization method: • The first directional derivative (gradient) of the potential is calculated and displacement is added to every coordinate in the opposite direction (the direction of the force). • The step is increased if the new conformation has a lower energy. • Advantages: Simple and fast. • Disadvantages: Inaccurate, usually does not converge.
Conjugated gradient • Uses first derivative information + information from previous steps – the weighted average of the current gradient and the previous step direction. • The weight factor is calculated from the ratio of the previous and current steps. • This method converges much better than SD.
Newton-Raphson algorithm • Uses both first derivative (slope) and second (curvature) information. • In the one-dimensional case: • In the multi-dimensional case – much more complicated (calculates the inverse of a hessian [curvature] matrix at each step) • Advantage: Accurate and converges well. • Disadvantage: Computationally expensive, for convergence, should start near a minimum.
Adopted basis Newton Raphson (ABNR) • An adaptation of the NR method that is especially suitable for large systems. • Instead of using a full matrix, it uses a basis that represents the subspace in which the system made the most progress in the past. • Advantage: Second derivative information, convergence, faster than the regular NR method. • Disadvantages: Still quite expensive, less accurate than NR.
Assignment of initial velocities • At the beginning the only information available is the desired temperature. Initial velocities are assigned randomly according to the Maxwell-Bolzmann distribution: Pv - the probability of finding a molecule with velocity between v and dv. Note that: 1. the velocity has x,y,z components. 2. The velocities exhibit a gaussian distribution
Bond and angle constraints (SHAKE algorithm) Constrain some bond lengths and/or angles to fixed values using a restraining force Gi. • Solve the equations once with no constraint force. • Determine the magnitude of the force (using lagrange multipliers) and correct the positions accordingly. • Iteratively adjust the positions of the atoms until the constraints are satisfied.
Equilibrating the system Velocity distribution may change during simulation, especially if the system is far from equilibrium. • Perform a simulation, scaling the velocities occasionally to reach the desired temperature. • The system is at equilibrium if: • The quantities fluctuate around an average value. • The average remains constant over time.
The verlet integration method Taylor expansion about r(t): Combining the equation results in: Which is velocity independent. The error is of order δt4 (the next expression of the series)
The verlet method (cont.) The velocities can be calculated using the derivation formula: Here the error is of order δt2 Note – the time interval is in the order of 1fs. (10-15s)
The verlet algorithm • Start with r(t) and r(t-δt) • Calculate a(t) from the Newton equation: a(t) = fi(t)/mi . • Calculate r(t+δt) according to the aforementioned equation. • Calculate v(t). • Replace r(t-δt) with r and r with r(t+δt). • Repeat as desired.
Amyloid fibril formation • Associated with a large number of degenerative diseases such as Alzheimer’s, Parkinson’s etc. • Associated with a structural change in the protein structure, resulting in the formation of stable fibrils. • The fibrils are richer in β-sheets (although their tertiary arrangements are usually undetermined). • Amyloid forming proteins do not share sequence homology, but the fibrillar structures exhibit similar physicochemical and structural characteristics.
The human calcitonin (hCT) • A 32 amino acid polypeptide hormone, produced by the C-cells of the thyroid and involved in calcium homeostasis. • Fibrillation of hCT was found to be associated with carcinoma of the thyroid. • Synthetic hCT can form amyloid fibrils in vitro with similar morphology to the deposits found in the thyroid. • The in vitro process is affected by the pH of the system.
The structure of hCT • In monomeric state, hCT has little ordered secondary structure in room temperature. • Fibrillated hCT have both helical and sheet components. • In DMSO/H2O a short double stranded anti-parallel β-sheet is formed in the region of residues 16-21. • Previous research indicated a critical role to residues 18-19.
The sequence of hCT -+ NH2-CGNLSTCMLGTYQDFNKFHTFPQTAIGVGAP-COOH
Experimental data regarding the fibril forming region • The DFNKF area was found to form fibrils rich in anti-parallel β-sheets. • The spectrum observed with the DFNK tetrapeptide is less typical of β-sheets, but may be interpreted as such. • The FNKF tetrapeptide exhibits a spectrum that is typical of a non-ordered structure. • The DFN tripeptide seems to be a mixture of β-sheet and non-ordered structure.
The effect of F→A mutation The DANKA mutation does not exhibit a typical spectrum of the β-sheet structure, although they exhibit a certain degree of order. This implies on the effect of the Phe aromatic residues in the fibrillation process.
Tested models • Combinations of parallel/anti parallel within sheet and between sheets. So far – about 20 models. • Each model is simulated for 4ns. (each such simulation takes about 5 days on a powerful cluster…). • The tested parameters for model stability: distance within/between sheets, aromatic interactions, HB contact conservation etc.
Initial results (trajectory analysis) A model that’s totally unstable: Before: After:
Future work plans • Test mutations once we focus on the correct model. • Make more analyses and find out what causes the fibril formation (suspicion: The aromatic ring π-stacking, salt bridges between the oppositely charged residues D and K) • …???