1 / 54

Parameterisation of a custom amino-acid, PCA

Parameterisation of a custom amino-acid, PCA. Contents. Force-fields and how they work What kind of interactions do we need to parameterise? Quantum Mechanical calculations Where force-fields obtain basic information CHARMM parameterisation process The goal of developers => follow same goal

drea
Download Presentation

Parameterisation of a custom amino-acid, PCA

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Parameterisation of a custom amino-acid, PCA

  2. Contents • Force-fields and how they work • What kind of interactions do we need to parameterise? • Quantum Mechanical calculations • Where force-fields obtain basic information • CHARMM parameterisation process • The goal of developers => follow same goal • Worked example of pyroglutamic acid

  3. What this talk will cover • It will: • Teach the necessary background • Advise starting points for parameterisation, by analogy with existing compounds • Follow the process for a CHARMM style strategy • pyroglutamic acid under CHARMM27 • We will be following a published process from developers • It will not: • Teach you how to create a force-field from scratch • Full parameterisation is a long process, with many potential pit-falls • This applies to new atoms and geometries not seen in force-fields • If your molecule falls under this category, it’s best to ask the experts whether someone else has done it NB: read Vanommeasleagh's home page and tutorials: http://dogmans.umaryland.edu/~kenno/#CGenFF

  4. 1: Force fields • All force-fields have a purpose • There are several force-fields, each with their own goals (e.g. following developer’s research interest) • They are thus not fully compatible • All force-fields are made from simple components • They are not really black-boxes • Each terms can be fully understood • Therefore, researchers like us can make force-fields • I will describe CHARMM-ffs from above points

  5. Purpose of force-fields • Force-fields are used to replicate chemical or biological environments. • This means they should produce reasonable results compared to experimental data • Force-fields must also be derived from quantum mechanical (QM) simulations for molecular accuracy • i.e. FFs must approximate behaviour of electrons and atomic bonds • FFs must choose basic functions with which to do this • FF-developers must choose what kind of environment or interaction they will replicate • Thus, FFs must make choices • AMBER targets proteins conformations • OPLS-AA targets organic liquid environments • CHARMM targets hydrous solvent interactions and energies • .˙. Many force-fields can be used for task X, but some will perform better.

  6. Purpose of force-fields Due to these differences: • Force-fields have an optimal range of performances, and limits • E.g. GROMOS works particularly well for studies of lipid behaviour • Results may be surprising if ffs used in exotic conditions • Force-fields should not be mixed with each other • The detailed mechanics are different Analogy: TIP3 water do not melt or boil at expected temperatures – they were never meant to!

  7. CHARMM-CGenFF force field • Basis: Generalforcefield for all organic molecules • Match MM-properties to QM calculations (MP2/6-31G*) • Minimum necessary work, to allow parameters to be shared across many different molecules • Can use for, e.g. drug-binding studies in MD • Overall process for parametrisation • Charges <= Dipole moments, interaction with TIP3 water • Bond and angle values <= QM Equilibrium geometry • Bond and angle constraints <= Vibrational modes • Dihedral constraints <= Potential energy scans • Van der Waals <= Heat of solvation, molecular volume

  8. CHARMM force field for biomolecules • Basis: Energetically accurate set of parameters • Replicate observed experimental quantities specific to proteins, nucleic-acids, etc. • i.e. Aim is for CHARMM proteins/DNA which act exactly like their counterparts in real life • Overall process for parametrisation • Fit parameters like CGenFF. • Modify to fit with relevant experimental observations. This means heat of solvation, solvent/solute interactions, etc. • Operating conditions: This force-field is optimised for room temperature, liquid phase experiments • Similar process: CGenFF can combine with CHARMM

  9. Comparison: AMBER’s protein FF • Basis: To replicate protein behaviour by modelling amino acid conformations accurately • (subtle difference to CHARMM) • Process: • Point charges fit to higher QM calculations (B3LYP/cc-pVTZ/HF/6-31G**) at dielectric constant = 4 • Bonds and angles matched to X-ray crystals and vibrational spectra • Torsions fit to reproduce peptide conformations and phi-psi energy surfaces. • Performs similar to CHARMM by different means • Complete amino acids are parameterised, where as CHARMM uses fragments with known experimental data • More dependent on QM and biochemical observations, less on strict chemical data

  10. Inner-working of CHARMM • CHARMM and CGenFF (from Duan e.t al., 2003) : • Harmonic bonds • Harmonic angles(modified) • Harmonicimpropers • Cosine dihedrals

  11. Inner-working of CHARMM Compositions Observations Intra-molecular terms are very simple in nature They must be fitted to approximate real covalently bonded molecules QM-calculations form the basis of these target data During parameterisation, we won’t usually need to modify vdW terms • Basic interactions between bonded atoms are harmonic potentials • This reduces computation cost by using simple functions and eliminating electrons from the equation • Electrostatics and vdW forces are preserved • They are essential to molecular systems!

  12. QM calculations: Gaussian • CHARMM developers use Gaussian to produce target data • Process documented for Gaussian • However, one can use other QM programs • I’ll explain what calculations are required and how to do them Using Gaussian with GaussView

  13. Gaussian: Scripting • Textual interface provides all input necessary • One file for one simulation • GaussView (GUI) is provided to assist process • The QM-level for parameterisation is MP2/6-31G* • Enough to describe common interactions • Probably not sufficient for certain organo-metal interactions

  14. Gaussian: Scripting Conditions and general purpose of this simulation Title Coordinates: different representations possible Detailed commands, follow-on simulations, etc.

  15. QM calculations required: 1: Equilibrium geometry 2: Vibrational spectra 3: Dihedral energy surfaces NB: CHARMM website provides a fully worked tutorial

  16. CHARMM parametrisation process • Priorities of different parameters • The energetics needs to be replicated ultimately to < 1 kcal/mol • Some parameters have wide-spread effects, other fill in important details • Flowchart • Parameterisation is iterative, and will take time • Illustrate process with pyroglutamic acid

  17. CHARMM parametrisation process • The priority of different parameters: • Charges • Equilibrium bond and angle values • Bonds and angles force constants • Torsions • This order will permeate throughout parameterisation • Each set of parameters depends on everything above • Hence, refinement of parameters follows: • Create/modify data • Refine, check results • repeat.

  18. Flowchart and notations • The rest of this talk follows like so: • 1 – Prepare entries and coordinates • 2 – Optimise charges • 3 – Optimise bonds and angles • 3.1 – equilibrium values by QM geometry • 3.2 – force constants by molecular vibrations • 4 – Optimise dihedrals • 4.1 – Generate Potential Energy Scans • 4.2 – Dihedral constants by matching and chemcial knowledge • 5 – Validation

  19. Individual Processes Set initial topologyand geometry Set initial geometry Calculate QM vib. spec.in CHARMM QM-minim. geometrywith water molecules QM-minim. geometry Modify bond/angleforce const. Set charges and vdW Modify bonds and angles Calculate MM vib. spec.in CHARMM MM-minim. geometrywith water molecules MM-minim. geometry Do theymatch? Do theymatch? Do theymatch? bond/angleforce const. complete charge and vdWfits complete bond and anglefits complete ...etc.

  20. Validation First parametrisation • After the last fit to dihedrals is complete, all outputs need to be verified • Starting again from partial charges and water interaction simulations... • Any changes likely propagate itself downwards through flowchart • Main reason why parameterisation is time-consuming Do charges Do bonds and angles Do dihedrals Validate all data all complete

  21. Worked example: • Pyroglutamic acid • As its name suggests, an amino acid • N-terminal only, cyclisation of glumatic acid or enzymatic activity • Used in protein-ligand simulation with scorpion toxins.

  22. 1.1: prepare topology (idea) • Observe existing molecules in the database • What chemical groups are your molecules? • Cut and paste sections together, borrowing charges and groupings

  23. 1.1: prepare topology (details) • Adapt from existing residues. • PRO as base residue, referred to GLN and backbone values. • Consider its use and clashes with existing parameters • Right now, backbone angles and torsion parameters will be used (bad idea for cyclic molecule) • Change atom-typings to allow modification of important atom bonds and angles • NB: Creating new atom-types are not preferred, as it will be more difficult for future work

  24. 1.1: atom-typings • From existing ff: • neutral Ns in CHARMM • NH2 is primary amine • NH1 is secondary amine • N is tertiary amine • carboxyl Cs in CHARMM • CC/CD are proteins • CE1/CE2 are elementary alkanes • Use NH1/CC typing, no need to modify O and H. • Topology done! Return to ICs later.

  25. 1.1: prepare parameters • Work out required bonds, angles, dihedrals • running CHARMM can help, as it stops with a warning when bonds and angles are missing • Borrow known bond and angle values • PRO provides most of existing • CGenFF has same philosophy in making bonded parameters • 2PDO is a very similar molecule in CGenFF. Use its values for dihedrals as an initial guess. • TIP: Establish what you *need* to parameterise now, and change only them in future edits.

  26. 1.2: Create molecule for Gaussian • The starting geometry will affect results • We grab some experimental coordinates from, e.g. DrugBank or PDB • These coordinates can also be created with some chemical softwares • NB: CHARMM uses IC tables which it will use to generate the residues when coordinates are not given • Either paste by analogy or transfer from minimised geometry

  27. 1.2: QM Minimisation • Gaussian outputs in formats offering more accuracy than pdb • Outputs all bond, angle and dihedral data • file conversions may be necessary for visualisation

  28. 1.3: Create IC table CHARMM manual entry(!) • Each entry in the IC table (see below) lists 4 connected atoms, I, J, K and L,; for a normal IC table entry, the I-J bond length, R(IJ); the I-J-K bond angle, T(IJK); the dihedral angle I-J-K-L, PHI; the bond angle T(JKL); and the K-L bond length, R(KL) are listed. Improper dihedral angles, which are used to keep sp2 atoms planar and sp3 atoms in a tetrahedral geometry, are marked with a star. The center atom of an improper dihedral angle is marked with a star. For an improper dihedral angle entry, the I-K bond length, R(IK); the I-K-J bond angle, T(IKJ); the dihedral angle I-J-K-L , PHI; the J-K-L bond angle T(JKL); and the K-L bond length, R(KL) are listed. The atom entry "-99" indicates an undefined atom. • CHARMM begins with a seed of three atoms, then defines the rest of the molecule with these three • Protein conventions follow backbone N-CA-C • When you create an IC table, remember to base your first entries on these and ‘grow’ the molecule from there

  29. 1.3: Create IC table

  30. 1.3: IC table notes • CHARMM does not need every possible entry in a given IC table • Only the dihedrals are necessary • The command “IC PARAM” will fill in bonds/angles using existing parameters • Will save you a lot of time • Number of entries: about 1 less than the total number of atoms.

  31. 2: Optimise charges • In CHARMM ff, protein charges are not parameterised by QM calculations alone. • Consistent behaviour with other amino-acids means that PCA should obey similar charges, rather than QM data per-se • Obtained charges from analogy are “good enough” when compared with QM data • No unique chemical groups such that charge optimisation is required • This step is thus skipped for PCA • I will show you a mocked example

  32. 2.1: Optimise charges • Begin with charges from Gaussian/analogy • (can read from output)

  33. 2.1: Optimise charges • Run water interaction simulation and also check molecule dipole • i.e. imitate H-bonding interaction with charges • Modify charges to fit both data • NB: MM dipole needs to overestimate QM dipole by 30-50%, since QM data is in vacuum

  34. 3: Optimise bonds and angles • Begin by setting equilibrium values to QM or crystal values • Compare results and modify equilibrium values • Using an IC table by analogy gives the wrong ring conformation to CHARMM • I constructed an IC to start PCA near the other minimum. CHARMM then finds the equatorial conformer • It is important to check that your conformation agrees after dihedras are fitted

  35. 3.1: Bond/angle equilibrium values • CHARMM developers used 0.2 Å and 3° as the upper limit • Can usually do much better • Developers seek to use same parameters to describe many ligands, we do not need to

  36. 3.1: Bond/angle force constants • Method: Comparison of QM and MM vibrational spectra • Analogous residues are good starting points for force constants • However, perfect agreement is impossible • This is due to differences between MM and QM minima, and mixing with torsion parameters • May need to modify again during validation • A “general” agreement (about 10%) is good enough when all parameterisation is finished

  37. 3.1: Vibrational spectra • Vibrations can be collected into components • change in dipole determines IR absorption • Components are defined by motions of collective parts • bond stretches and angle bending • rocking, scissoring, wagging motions • Tip: revise IR and Raman spectroscopy dihedrals, out-of-plane motion N-C stretch C=O stretch C-H stretch N-H,O-H stretch

  38. 3.1: Vibrational notation • In CHARMM, you need to convert motions of atoms into bonds, angles and dihedrals (internal coordinates, IC) • Forms basis set of all the degrees of freedoms • # Vibrations = DoG = 3N-6 • CHARMM then uses these to fit vibrational spetra • Then you need to convert these ICs into vibrational modes • Read Pulayet. al. (1979) NB: I wrote a relatively simple tutorial on the CHARMM forums to run users through the Pulay conversion for water and propane.

  39. MM QM

  40. 3.1: Bond/angle force constants • Visualising these vibrations with GaussView and others will help you identify important vibrations

  41. NB: Remember there is a distinction between fitting to QM calculations, and fitting to experimental spectra

  42. 4: Optimise dihedrals • Potential Energy Scans involve: fixing a dihedral at discrete points, minimising the rest of the geometry, and calculating absolute energy. • Determines conformational preferences of residues, especially important for packing

  43. 4: Optimise dihedrals • PCA with initial dihedrals favour the opposite ring conformer (top) • One will need to work out which parameter affect the rotations you need • After a series of fits and rationales for given parameters, a closer agreement can be obtained (bottom)

  44. Somerationales for the PCA case: • Fit only dihedrals that contains re-typed atoms • This includes NH1 and CC • Carboxyl backbone prefers equatorial over axial orientation w.r.t. ring • Fit a single 1-fold or 2-fold dihedral to CD-N-CA-C to express this • Keep all the 2PDO dihedrals as is, except where necessary • So, planar amide retains 2-fold only due to symmetry • Ring-dihedrals contain only 3-fold • Modify numbers to produce correct energy surface • The rest is heuristic searching

  45. Pointers for a good fit: • Single parameters of 1-fold,2-fold, 3-fold, 6-fold. • Accuracy to about 0.2 kcal/mol • Attention to barrier heights and relative minima positions

  46. Using multiple dihedral parameters and no restrictions on phase, one can achieve a fit like this. • The energy surface is very close to QM. However, the parameters used are arguably unphysical

More Related