Steps towards an Ensemble-Based Force Field Fitting Procedure…
Dragos Horvath, Benjamin Parent, Guy Lippens
UMR 8525 CNRS, Lille
Goal…
• To calibrate an empirical molecular force field for use in conformational sampling and docking:
  • Generally applicable to proteins, sugars, organic ligands
  • Full-atom simulations, no large protein folding
  • Tailor-made for use with torsional degrees of freedom only!
  • Continuum model for solvent effects!
  • Consistent, in the sense that docking affinities & folding propensities should be directly linked to computed force field energies of sampled ensembles
  • No a posteriori rescoring of docking poses! Docking is just simultaneous conformational sampling of several molecules!
The Prerequisite: an Exhaustive Conformational Sampling Tool!
• Based on a Genetic Algorithm, coding conformers as "chromosomes" in which each locus stands for a torsional angle value.
• The In Silico Darwinian Evolution, leading to fitter and fitter (lower energy) conformers, was enhanced by
  • hybridization with various optimization heuristics
  • fine-tuning of the parameters controlling the evolutionary strategy
[Figure: Generation of new offspring. Initial population → intermediate population → next generation, each chromosome being a row of torsions q1 q2 q3 … qn, sorted by energy, with random chromosomes added. Crossover: parent1 (q1 q2 … qi qi+1 … qn) and parent2 (q'1 q'2 … q'i q'i+1 … q'n) give child1 (q1 q2 … qi q'i+1 … q'n) and child2 (q'1 q'2 … q'i qi+1 … qn). Mutation: wild type (q1 q2 … qi qi+1 … qn) gives mutant (q1 q2 … q'i qi+1 … qn).]
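The crossover and mutation operators described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-point crossover matches the child1/child2 pattern shown, but the toy energy function, the 20% mutation rate, the elitist selection and all function names are hypothetical and not taken from the authors' sampler.

```python
import math
import random

def crossover(parent1, parent2, rng):
    """One-point crossover: children swap the tails of the two parents."""
    assert len(parent1) == len(parent2) >= 2
    cut = rng.randrange(1, len(parent1))
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

def mutate(wild_type, rng):
    """Replace the torsion at one randomly chosen locus by a random angle."""
    mutant = list(wild_type)
    locus = rng.randrange(len(mutant))
    mutant[locus] = rng.uniform(-180.0, 180.0)
    return mutant

def toy_energy(chromosome):
    """Hypothetical stand-in for a force field: threefold torsional term."""
    return sum(1.0 + math.cos(math.radians(3.0 * q)) for q in chromosome)

def next_generation(population, energy, rng, n_keep):
    """Keep the n_keep fittest (n_keep >= 2), refill with offspring, sort by energy."""
    survivors = sorted(population, key=energy)[:n_keep]
    offspring = []
    while len(survivors) + len(offspring) < len(population):
        p1, p2 = rng.sample(survivors, 2)
        for child in crossover(p1, p2, rng):
            # Assumed mutation rate of 20%, chosen only for illustration.
            offspring.append(mutate(child, rng) if rng.random() < 0.2 else child)
    return sorted(survivors + offspring, key=energy)[:len(population)]
```

Because the fittest chromosomes are always carried over, the best energy in the population cannot get worse from one generation to the next in this sketch.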
Hybrid Heuristics: (1) Targeted torsion angle choice!
• Biasing the probabilities to draw a given value for a given angle (according to a temperature parameter):
  • Knowledge-based bias: favoring locally stable torsions…
  • "Traditionalism": favoring torsion values seen in previously visited samples
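One way to bias torsion draws with a temperature parameter is a Boltzmann-like weighting of candidate values by their local energy. The slide does not give the actual formula, so the weighting below, the function names and the use of kcal/mol energies are assumptions made for illustration only.

```python
import math
import random

GAS_CONSTANT = 0.0019872  # kcal/(mol K)

def biased_torsion_probabilities(local_energies, temperature):
    """Boltzmann-like probabilities for candidate torsion values.

    local_energies: assumed local (e.g. fragment) energy of each candidate
    value, in kcal/mol. A low temperature strongly favors locally stable
    values; a high temperature flattens the distribution.
    """
    kt = GAS_CONSTANT * temperature
    e_min = min(local_energies)  # shift so exponents are <= 0 (no overflow)
    weights = [math.exp(-(e - e_min) / kt) for e in local_energies]
    total = sum(weights)
    return [w / total for w in weights]

def draw_torsion(values, local_energies, temperature, rng):
    """Draw one torsion value according to the biased probabilities."""
    probs = biased_torsion_probabilities(local_energies, temperature)
    return rng.choices(values, weights=probs, k=1)[0]
```

For example, `draw_torsion([60.0, 180.0, -60.0], [0.9, 0.0, 0.9], 300.0, random.Random(1))` picks 180.0 most of the time, while a very high temperature makes the three values nearly equiprobable.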
Hybrid Heuristics: (2) Directed Mutants or Explorers
• Evolution stuck in local minima, no mutation would help
• "Explorer" launched in parallel in order not to halt the evolution process
• Torsional angle driving:
  • Adding a constraint term
  • Gradient optimization in this new landscape
  • Final relaxation towards local minimum
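The three steps of torsional angle driving (constraint term, gradient optimization, final relaxation) can be illustrated on a one-dimensional toy landscape. Everything concrete here is assumed: the toy energy profile, the harmonic form and strength of the constraint, and the plain gradient-descent minimizer are hypothetical stand-ins, not the authors' implementation.

```python
import math

def energy(theta):
    """Toy torsional profile (theta in degrees): local minimum near 63,
    deeper minimum at 180."""
    t = math.radians(theta)
    return 1.0 + math.cos(3.0 * t) + 0.5 * math.cos(t)

def minimize(func, theta, step=20.0, n_steps=2000, h=1e-3):
    """Plain gradient descent with a central-difference gradient."""
    for _ in range(n_steps):
        grad = (func(theta + h) - func(theta - h)) / (2.0 * h)
        theta -= step * grad
    return theta

def launch_explorer(theta_start, theta_target, k=0.01):
    """Drive the torsion towards theta_target, then relax without constraint."""
    # Step 1: add a harmonic constraint term (assumed functional form).
    def constrained(theta):
        return energy(theta) + k * (theta - theta_target) ** 2
    # Step 2: gradient optimization in the constrained landscape.
    driven = minimize(constrained, theta_start)
    # Step 3: final relaxation towards the nearest local minimum.
    return minimize(energy, driven)

stuck = minimize(energy, 60.0)          # stays in the shallow well
explored = launch_explorer(60.0, 180.0)  # crosses the barrier
```

In this toy case the plain minimizer remains trapped in the shallow well, while the explorer ends in the deeper one.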
Other Hybrid Heuristics: Automated Fragment Presampling, Taboo Search
• Fragmentation: sampling of energetically permitted geometries of fragments in the presence of a buffer zone
  • allows the automated definition of "rotamer libraries" out of which to pick geometries during global sampling!
• Taboo Search & intrapopulation diversity control:
  • discarding chromosomes that are too similar to fitter conformers or to previously visited geometries
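The taboo and diversity rule (discard chromosomes too similar to fitter conformers or to previously visited geometries) might be sketched as follows. The similarity measure (largest circular torsion difference) and the 30 degree threshold are assumptions chosen for illustration; the slide does not specify which metric is used.

```python
def torsion_distance(a, b):
    """Largest circular difference (degrees) between matching torsions."""
    return max(abs((x - y + 180.0) % 360.0 - 180.0) for x, y in zip(a, b))

def diversity_filter(population, energy, taboo_list, threshold=30.0):
    """Keep a chromosome only if it is far from all fitter kept ones
    and from every geometry on the taboo list."""
    kept = []
    for chrom in sorted(population, key=energy):  # fittest first
        near_fitter = any(torsion_distance(chrom, k) < threshold for k in kept)
        is_taboo = any(torsion_distance(chrom, t) < threshold for t in taboo_list)
        if not (near_fitter or is_taboo):
            kept.append(chrom)
    return kept
```

Processing the population in order of increasing energy guarantees that, among a cluster of similar conformers, the fittest one is the one retained.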
Search for Optimal Sampling Setups in the Strategy Parameter Space…
Sampling Engine Overview
[Flowchart, with the following elements: "Meta-algorithm defines parameter setup"; "Meta-GA picks next set of configurations"; Run 1, Run 2, …, Run n; «Taboos»; «Tradition»; Explorer; Postprocessing…; "Base of diverse conformers [sampled at current setup]"; "Global Base of Diverse Conformers"; decision "News??" (yes / no); "3-fold repeat"; "µ-Fitness!!"; "GAME OVER".]
Conformational sampling with an optimally tuned GA is (reproducibly!) more efficient than a randomly parameterized simulation
[Figure: results with optimized parameters vs. random parameters]
Impact of the hybrid heuristics on the sampling of cyclodextrin…
[Plot. Series: Default; No Exploring; No Taboos; Flat distribution; Preference for locally stable torsions. Axis labels: "Nr. of diverse conformers within +20 kcal from best minimum" (0 to 35) and "Deepest energy well (kcal)" (1, 10, 100).]
Unfortunately, small molecules showing significant structuring in water (due to weak non-covalent interactions) are rare…
Wanted: *structured* compounds with ~100 torsional degrees of freedom!
• The "Trp cage" peptide 1L2Y (helix & turn, 20 AA)
• The "Trp zipper" peptide 1LE1 (β-sheet, 13 AA)
• Designed minimalist β-sheet peptide 1UAO (10 AA)
• The WW domain of PIN1 (34 AA, mostly β-sheet)
• Conformationally Restrained Helical peptide (CRH) with a chemically engineered helix inducer group (21 AA)
• Cyclodextrin (with "opened" rings!)
• Protein-ligand complexes, to be used as soon as the docking module is developed!
Force Fields: What's wrong with existing ones?
• Heisenberg's Frustration Principle applies:
  • (FF Inaccuracy) × (Chance of "Missing Parameters Error") >> ħ
• Most were fitted with respect to a few key points of the energy-geometry landscape, around which molecular dynamics simulations were supposed to gravitate…
• … but sampling methods that facilitate barrier crossings may discover deeper artefactual minima elsewhere!
• Ignoring valence angle flexibility requires some additional "fuzziness" of force field terms, to "accommodate" imprecise interatomic distances…
Considered Force Field terms
• Customized CVFF force field, employing:
  • a 10 Å cutoff (with a termination function)
  • a smoothing procedure of interatomic clash contributions
  • a continuum solvent model
[Figure labels: 'smoothing' distance; dij; effective interatomic distance; d0ij]
The Force Field Fitting Procedure…
• Distance-dependent dielectric constant
• Weighting factor of the desolvation penalty
• Weighting factor of the hydrophobic contacts
• Weighting factor of repulsive van der Waals
• Attractive & repulsive van der Waals coefficients of the following types:
  • 'co' (carbonyl C), 'o' (ether-type O), 'h' (aliphatic H), 'cp' (aromatic C), 'oc' (carbonyl O)
[Flowchart: For each training molecule: run the GA-driven Exhaustive Sampler; locally explore the neighborhood of the experimental geometry; add all sampled conformers to the Data Base & calculate their RMS deviation from the "native" geometry; recalculate the energies of stored conformers according to the current FF setup; calculate the folding ΔG according to the chosen RMS radius. Decision "All ΔG < 0?" with outcomes "NO!" (install a NEW FF parameter configuration), "Yes, for the first time!" and "Yes, reconfirmed!" (OK!).]
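The step "calculate folding ΔG according to chosen RMS radius" could, for instance, compare the Boltzmann populations of stored conformers inside and outside the RMS radius around the native geometry. The slides do not spell out the formula, so the definition below is an assumption, and the function and parameter names are hypothetical.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol K)

def folding_free_energy(energies, rmsds, rms_radius, temperature=300.0):
    """Assumed ensemble estimate: dG = -kT ln(Z_folded / Z_unfolded).

    energies: force field energies (kcal/mol) of the stored conformers,
              recalculated with the current FF parameter setup.
    rmsds:    RMS deviation of each conformer from the native geometry.
    Conformers with rmsd <= rms_radius count as folded.
    """
    kt = GAS_CONSTANT * temperature
    e_ref = min(energies)  # shift energies for numerical stability
    z_folded = 0.0
    z_unfolded = 0.0
    for e, r in zip(energies, rmsds):
        weight = math.exp(-(e - e_ref) / kt)
        if r <= rms_radius:
            z_folded += weight
        else:
            z_unfolded += weight
    if z_folded == 0.0 or z_unfolded == 0.0:
        raise ValueError("need conformers on both sides of the RMS radius")
    return -kt * math.log(z_folded / z_unfolded)
```

Under this definition a negative ΔG means that the native-like conformers dominate the sampled ensemble, which is the condition tested by "All ΔG < 0?" in the flowchart. Since only stored energies and RMS deviations enter, the value can be recomputed cheaply each time a new parameter configuration is installed.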
Status Quo: after eight iterations in force field parameter space…
• Compounds for which experimental conformations are being sampled and ranked among the energetically most stable
• Compounds for which correctly folded conformers were sampled, but misfolded conformers of lower energy were also found!
• Molecules for which the correctly folded conformers were never sampled:
  • the WW domain of PIN1 (34 residues)
  • the 1LE1 "Tryptophan Zipper" mini-protein (13 residues)
[Figure labels: Tryptophan cage (1L2Y); Conformationally restrained helical peptide; Cyclodextrin (open rings)]
Conclusions…
• This is a coherent approach to simultaneously evolve a conformational sampling and docking engine, together with its underlying force field
• Both the ability to find the minima and the quality of the energy landscape are paramount in ensuring that the measures of free energy defined here will be physically relevant…
• Will the resulting molecular force field be more "sampling-friendly" (with funnel-like landscapes)?
• At this point, it is unclear how quickly (if ever) it will converge, but it is well suited for GRID computations (deployment in progress).
• A genetic algorithm reproducibly finding a significant low-energy representative for each populated energy minimum cannot be envisaged without help from other minimum search heuristics…
THANKS