HIGH RESOLUTION LATTICE MODELS OF PROTEINS: DESIGN & APPLICATIONS Andrzej Kolinski LABORATORY OF THEORY OF BIOPOLYMERS WARSAW UNIVERSITY http://www.biocomp.chem.uw.edu.pl Structure and Function of Biomolecules, Bedlewo, May 12-15, 2004
WHY REDUCED MODELS? • Classical Molecular Mechanics studies of large-scale conformational rearrangements of biomolecules are still impractical (proteins fold on time scales of 0.001 s to 100 s, while “long” MD simulations cover about 100 nanoseconds). • The number of degrees of freedom treated explicitly needs to be reduced and the energy landscape smoothed. • Knowledge-based force fields of reduced models frequently seem to have higher predictive power than the all-atom potentials of Molecular Mechanics. • We know about 1000 times more protein sequences than protein structures (ca. 30M against ca. 30k), and this gap is increasing.
OUTLINE • Reduced protein models of intermediate and high resolution (representation, sampling and force field) • Ab initio folding (an illustration) • Loop (or fragment) modeling using various reduced representations: the SICHO, CABS and REFINER models. Comparison with standard modeling tools: MODELLER and SWISS-MODEL • Comparative modeling starting from multiple threading alignments
SICHO, CABS and REFINER: all three models use knowledge-based statistical potentials derived from an analysis of structural regularities seen in the solved structures of globular proteins.
Sampling of the conformational space of the SICHO and CABS models • Single-residue moves • Two-residue moves • Three-residue moves • Small-distance (rigid-body) moves of a randomly selected fragment of the model chain • Reptation-type moves
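To make the last move type concrete, here is a minimal sketch of a reptation ("slithering snake") move. It is an illustration only, not the SICHO or CABS move set: it assumes a simple cubic lattice with six unit steps, a bare self-avoidance test in place of the force field, and it ignores residue identities. The names `STEPS` and `reptation_move` are hypothetical.

```python
import random

# Six unit steps of a simple cubic lattice (an assumption for illustration;
# the models on the slides use much denser sets of lattice vectors).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def reptation_move(chain, rng):
    """Try one reptation move and return the resulting chain.

    chain is a list of (x, y, z) integer tuples, one per bead. One end bead
    is removed and a new bead is grown at the opposite end. The move is
    rejected (the old chain is returned) if the new site is occupied.
    """
    forward = rng.random() < 0.5            # slither in either direction
    work = chain if forward else chain[::-1]
    head = work[-1]
    dx, dy, dz = rng.choice(STEPS)
    new_bead = (head[0] + dx, head[1] + dy, head[2] + dz)
    body = work[1:]                         # the tail bead leaves its site
    if new_bead in body:                    # excluded volume: reject
        return list(chain)
    moved = body + [new_bead]
    return moved if forward else moved[::-1]
```

In a full sampler each accepted geometry would still be passed through a Metropolis test on the model energy; that step is omitted here.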
INTERACTION SCHEME • Generic “protein-like” biases • Statistical potentials for short-range conformational propensities • Model of main chain hydrogen bonds • Pairwise interactions between united atoms (including orientation- and secondary structure dependent potentials)
Generic (sequence independent) chain stiffness - regular secondary structure propensities
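Generic (sequence independent) chain stiffness ¹
$B_1 = f\,\varepsilon_g$ for $(\mathbf{v}_{i-1} \cdot \mathbf{v}_{i+3}) < 0$
$B_2 = -f\,\varepsilon_g - g\,\varepsilon_g$ for $|\mathbf{r}_{i+4} - \mathbf{r}_i| < 7.0$ Å and a “right-handed” twist, or for $|\mathbf{r}_{i+4} - \mathbf{r}_i| > 11.0$ Å and β-type geometry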
Generic (sequence independent) chain stiffness
$B_4 = h\,\varepsilon_g$ for $(\mathbf{r}_{i+5} - \mathbf{r}_i) \cdot (\mathbf{r}_{i+10} - \mathbf{r}_{i+5}) < 0$ and $(\mathbf{r}_{i+15} - \mathbf{r}_{i+10}) \cdot (\mathbf{r}_{i+5} - \mathbf{r}_i) > 0$, i.e., a penalty for overly crumpled main-chain conformations.
¹ For known or strongly predicted secondary structure fragments, an additional bias towards proper values of the medium-range distances along the chain can be superimposed.
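The two terms that are fully specified by dot-product conditions, B1 and B4, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: v_i is taken to be the virtual bond from Cα i to Cα i+1, the slide's "r10" is read as r_{i+10}, each term is summed over every window that fits in the chain, and the function name and default coefficients are hypothetical. B2 is left out because the slides do not define the twist and β-geometry tests.

```python
import numpy as np


def generic_stiffness(ca, f=1.0, h=1.0, eps_g=1.0):
    """Return (B1 total, B4 total) for a C-alpha trace of shape (N, 3).

    f, h and eps_g are placeholder coefficients; the slides give no values.
    """
    ca = np.asarray(ca, dtype=float)
    v = ca[1:] - ca[:-1]                 # assumed: v[i] joins residue i to i+1
    b1 = 0.0
    for i in range(1, len(v) - 3):       # needs v[i-1] and v[i+3]
        if np.dot(v[i - 1], v[i + 3]) < 0:
            b1 += f * eps_g
    b4 = 0.0
    for i in range(len(ca) - 15):        # needs r[i] through r[i+15]
        a = ca[i + 5] - ca[i]
        b = ca[i + 10] - ca[i + 5]
        c = ca[i + 15] - ca[i + 10]
        if np.dot(a, b) < 0 and np.dot(c, a) > 0:
            b4 += h * eps_g              # chain folds back on itself twice
    return b1, b4
```

A straight rod triggers neither condition, while a chain that reverses direction every five residues triggers both.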
Short-range conformational propensities
$E_{13}(r_{i+2,i}, A_i, A_{i+2})$
$E_{14}(r^*_{i+3,i}, A_{i+1}, A_{i+2})$
$E_{15}(r_{i+4,i}, A_{i+1}, A_{i+3})$
Note: the reduced backbone geometry correlates better with secondary structure than the phi-psi angles.
$E/kT \sim -\ln\left(n_{k,A_1,A_2} / \langle n_{k,A_i,A_j} \rangle\right)$, where $\langle\;\rangle$ denotes an average over the database.
Example values (22 bins; axis marks -10, -1, 0, 1, 10; regions from left to right: left-handed beta, unlikely or prohibited, alpha, right-handed beta):
ALA ALA  -0.25 -0.45 -0.39  0.73  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00 -1.12 -2.55  0.44  0.56  0.25  0.76  0.51
VAL THR  -1.71 -1.83  0.06  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  2.00  0.11 -1.51  0.56  0.56  0.44 -0.57 -0.75
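The relation E/kT ~ -ln(n/<n>) can be illustrated with a short sketch that turns binned counts into energies. It assumes that the observed and reference counts are normalized in the same way, and that the repeated 2.00 entries in the example rows act as an upper cap for rare or unobserved bins. Both points are assumptions, and `statistical_potential` is a hypothetical name.

```python
import math


def statistical_potential(counts, reference, cap=2.0):
    """Return E/kT per bin as -ln(observed / reference), limited to `cap`.

    counts:    observed frequency per distance bin for one residue pair
    reference: database-averaged frequency for the same bins
    """
    energies = []
    for n_obs, n_ref in zip(counts, reference):
        if n_obs == 0 or n_ref == 0:
            energies.append(cap)          # unobserved bin: treated as prohibited
        else:
            energies.append(min(cap, -math.log(n_obs / n_ref)))
    return energies
```

Bins that are populated more often than the database average come out negative (favorable), and bins that are never populated sit at the cap.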
Model of the main chain hydrogen bonds Hydrogen bonds cause a specific spatial arrangement of the α-trace vectors and the α-carbon united atoms. • The united atoms i and j are “hydrogen bonded” when: at least one of the vectors h points into the vicinity of the α-carbon i or j; the vectors h are “almost” parallel (or antiparallel); and $(\mathbf{b}_i \cdot \mathbf{b}_j) > 0$ (“roughly” parallel). • The strength of the hydrogen bond is moderated by a cooperative component that depends on the distance between the corresponding centers of the Cα-Cα virtual bonds (minimum of the potential at 4.25 Å). Additional rules: no hydrogen bonds between pairs assigned as (HE) and (HH for |i-j|>3). The Cα-based model of hydrogen bonds correlates very well with the real hydrogen bonds. When “translating”, the indices need to be shifted (by +/- 1) depending on the type of secondary structure.
Pairwise interactions (Cα, Cβ, Side Groups) • Hard-core excluded volume for Cα-Cα, Cβ-Cβ and Cα-Cβ pairs (the cut-off distances are amino acid independent). • Soft-core excluded volume for interactions with the side groups. • Pairwise potentials for side groups derived from a statistical analysis of known protein structures. • Two side groups are assumed to be “in contact” when any pair of their heavy atoms is “in contact” (4.5 Å cut-off); the average distance between the centers of mass is then taken as the contact distance for a pair of side groups. • Side group pairwise potentials are “context” dependent (mutual orientation, conformation of the main chain).
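The contact definition used when collecting the statistics can be sketched as below. Only the 4.5 Å heavy-atom cut-off comes from the slide. The function names are hypothetical, and the center of mass is approximated by the unweighted mean of the atom positions, which is an assumption.

```python
import numpy as np


def side_groups_in_contact(atoms_a, atoms_b, cutoff=4.5):
    """True if any heavy atom of group A is within `cutoff` Å of any atom of B."""
    a = np.asarray(atoms_a, dtype=float)
    b = np.asarray(atoms_b, dtype=float)
    # Pairwise distance matrix of shape (len(a), len(b)) via broadcasting.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return bool((d < cutoff).any())


def center_distance(atoms_a, atoms_b):
    """Distance between the (unweighted) centers of the two atom sets."""
    a = np.asarray(atoms_a, dtype=float).mean(axis=0)
    b = np.asarray(atoms_b, dtype=float).mean(axis=0)
    return float(np.linalg.norm(a - b))
```

Averaging `center_distance` over all contacting instances of a residue pair in a structure database would then give the pair-specific contact distance described on the slide.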
Pairwise interactions of the side groups Between centers of mass (all heavy atoms of a side group + Cα). Cut-off distances are pair dependent (not additive; they account for some packing details). Square-well shape of the potential (for charged residues a tail is added). Soft (though relatively large) excluded volume potential; its height is amino acid independent. For a given pair of amino acids, the strength of the interactions and the cut-off distances depend on the mutual orientation of the interacting side groups and on the local geometry of the main chain.
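A square-well potential with a soft repulsive core, as described above, might look like the following sketch. All parameter names and the three-zone layout are assumptions made for illustration. The tail for charged residues and the orientation dependence are not modeled.

```python
def side_group_energy(r, r_rep, r_cut, eps, e_rep):
    """Square-well pair energy for a center-of-mass distance r (in Å).

    r_rep: below this distance the soft excluded-volume penalty applies
    r_cut: pair-specific contact cut-off
    eps:   well depth for this residue pair (negative means attractive)
    e_rep: finite repulsion height, the same for every residue pair
    """
    if r < r_rep:
        return e_rep      # soft core: finite penalty, not a hard wall
    if r < r_cut:
        return eps        # inside the well: constant contact energy
    return 0.0            # beyond the cut-off: no interaction
```

In a context-dependent scheme, `eps` and `r_cut` would be looked up per residue pair, mutual orientation and main-chain conformation before this function is called.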
CONTEXT-DEPENDENT STATISTICAL POTENTIALS Three types of mutual orientation of the side groups: A-antiparallel, M-intermediate, P-parallel. Two types of main chain conformation: C-compact and E-extended. The pairwise contact potentials are derived from the statistics of the numbers of parallel, antiparallel and semi-orthogonal contacts for given residue types and the two types of main chain conformation.
NEW STATISTICAL POTENTIALS (AN EXAMPLE)
LYS-GLU POTENTIAL
        P      M      A
CC    -0.9   -0.4    0.9
EE    -1.1   -0.4    0.6
CE    -0.2    0.1    0.8
EC    -0.2    0.0    0.8
GAPLESS THREADING
           %NATIVE   Z-score
QUASI        86 %      6.72
QUASI3       94 %      7.84
QUASI3S      97 %      9.96
When tested on a large set of decoys, the orientation- and backbone-conformation-dependent potential QUASI3S exhibits a better correlation between energy and RMSD from native than the more “generic” potentials.
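For readers unfamiliar with the two threading metrics, the sketch below shows one common way to compute them. The slides do not give the exact definitions, so the convention used here (Z as the gap between the mean decoy energy and the native energy in units of the decoy standard deviation, and %NATIVE as the share of proteins whose native structure has the lowest energy) is an assumption, as are the function names.

```python
import statistics


def threading_z_score(native_energy, decoy_energies):
    """Z-score of the native energy against the gapless-threading decoys."""
    mean = statistics.mean(decoy_energies)
    sd = statistics.pstdev(decoy_energies)   # population standard deviation
    return (mean - native_energy) / sd       # larger means better separated


def percent_native(cases):
    """Share (in %) of (native_energy, decoy_energies) cases where the
    native energy is strictly lower than every decoy energy."""
    hits = sum(1 for native, decoys in cases if native < min(decoys))
    return 100.0 * hits / len(cases)
```

With this convention a more discriminating potential raises both numbers, which is the trend shown from QUASI to QUASI3S.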
Ab initio folding • “Pure” ab initio (with only statistical potentials) protein folding and macromolecular assembly (results for the SICHO model)
LOOP MODELING – STRUCTURE COMPLETION • Fixed templates (and an “ideal” alignment) taken from the PDB, with fragments of their native structure removed • Random starting conformation of the loops (non-entangled) • Loop optimization using SICHO, CABS and REFINER (sampling via Replica Exchange Monte Carlo) • The lowest energy structure is taken for comparison with MODELLER and SWISS-MODEL (automatic version) • No human intervention during the modeling procedures
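The replica-exchange step mentioned above can be sketched with the standard Metropolis swap criterion, min(1, exp[(1/T_i - 1/T_j)(E_i - E_j)]). Temperatures are expressed in energy units (kT), and the neighbor-only sweep and function names are assumptions for illustration. The slides do not describe the actual exchange schedule.

```python
import math
import random


def swap_probability(e_i, e_j, t_i, t_j):
    """Acceptance probability for exchanging two replicas.

    e_i, e_j: current energies of the conformations at temperatures t_i, t_j.
    """
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(energies, temperatures, rng):
    """Sweep over neighboring temperatures and return the new assignment.

    order[k] is the index of the conformation now held at temperatures[k].
    """
    order = list(range(len(energies)))
    for k in range(len(temperatures) - 1):
        i, j = order[k], order[k + 1]
        p = swap_probability(energies[i], energies[j],
                             temperatures[k], temperatures[k + 1])
        if rng.random() < p:
            order[k], order[k + 1] = j, i
    return order
```

A swap that moves the lower-energy conformation to the colder temperature is always accepted; the reverse swap is accepted with an exponentially small probability.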
EXAMPLES (a-SICHO, b-CABS, c-REFINER, d-MODELLER) Gray – template Green – native fragment or loop removed from the PDB structure Red – Modeled fragment
EXAMPLES (a-SICHO, b-CABS, c-REFINER, d-MODELLER) • Green – native fragment or loop removed from the PDB structure • Red – Modeled fragment • Gray – template
COMPARATIVE MODELING WITH MULTIPLE TEMPLATES • The highest-scoring templates detected by threading procedures are used to extract the distance restraints • “Soft” implementation of the restraints in the CABS algorithm (from the top four templates, when available) • Sampling via Replica Exchange Monte Carlo • Almost always a single cluster of structures is obtained, and its centroid is taken as the final model
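Picking a cluster centroid can be sketched as choosing the member with the smallest mean distance to all other members. The slides do not say how the centroid is defined, so this medoid-style choice is an assumption. The input is assumed to be a precomputed symmetric matrix of pairwise RMSD values, and `centroid_index` is a hypothetical name.

```python
import numpy as np


def centroid_index(rmsd):
    """Index of the structure with the lowest mean RMSD to the others.

    rmsd: square symmetric matrix, rmsd[a][b] = RMSD between structures a, b.
    """
    rmsd = np.asarray(rmsd, dtype=float)
    return int(rmsd.mean(axis=1).argmin())
```

For example, with three structures where the second lies between the other two, the second is returned as the representative model.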
SUMMARY OF COMPARATIVE MODELING Frequently the models are closer to the native structure than to any of the templates
CONCLUSIONS • Algorithms employing a reduced representation of the protein conformational space are now mature and efficient tools for protein modeling • Applications: - ab initio structure prediction - comparative modeling (also multitemplate) - structure assembly from sparse experimental data - dynamics and thermodynamics of proteins, prions - flexible docking, macromolecular assemblies • Tools exist for the all-atom reconstruction of the reduced models. (See: NIH Research Resources for Multiscale Modeling Tools in Structural Biology http://mmtsb.scripps.edu)
Acknowledgement Warsaw University, Poland: Michal Boniecki, Dominik Gront, Sebastian Kmiecik, Piotr Klein, Piotr Pokarowski, Piotr Rotkiewicz, Andrzej Kolinski. SUNY at Buffalo (NY): Piotr Rotkiewicz, Jeffrey Skolnick. More info: http://www.biocomp.chem.uw.edu.pl