CZ5226: Advanced Bioinformatics Lecture 8: Molecular Modeling Method Prof. Chen Yu Zong Tel: 6874-6877 Email: csccyz@nus.edu.sg http://xin.cz3.nus.edu.sg Room 07-24, level 7, SOC1, National University of Singapore.
References on Modeling of MHC Binding Peptide
• Protein Sci. 2004 Sep;13(9):2523-32
• J Am Chem Soc. 2004 Jul 14;126(27):8515-28
• Proteins. 2004 Feb 15;54(3):534-56
• Hum Immunol. 2003 Dec;64(12):1123-43
• Immunity. 2003 Oct;19(4):595-606
• Mol Med. 2003 Sep-Dec;9(9-12):220-5
• Nature. 2002 Aug 1;418(6897):552-6
• Eur J Immunol. 2002 Aug;32(8):2105-16
• Immunol Cell Biol. 2002 Jun;80(3):286-99
• Ann N Y Acad Sci. 2002 Apr;958:317-20
• Mol Immunol. 2002 May;38(14):1039-49
• J Pept Res. 2002 Mar;59(3):115-22
• Mol Immunol. 2002 Feb;38(9):681-7
• Tissue Antigens. 2002 Feb;59(2):101-12
• J Comput Aided Mol Des. 2001 Jun;15(6):573-86
• J Mol Biol. 2000 Jul 28;300(5):1205-35
• J Comput Aided Mol Des. 2000 Jan;14(1):71-82
• J Comput Aided Mol Des. 2000 Jan;14(1):53-69
• J Mol Graph Model. 1999 Jun-Aug;17(3-4):180-6, 217
What is Docking? • Given two molecules, find their correct association. [Figure: Receptor + Ligand = Complex, with a label T]
General Protein–Ligand Binding • Ligand - Molecule that binds with a protein - DNA, drug lead compounds, etc. • Protein active site(s) - Allosteric binding - Competitive binding • Function of binding interaction - Natural and artificial
What is Protein-Ligand Docking? • Definition: Computationally predict the structure of a protein-ligand complex by searching over the conformations and orientations of the two molecules. The orientation that maximizes the interaction is taken as the most likely structure of the complex. • Importance of complexes - structure -> function
Example: HIV-1 Protease Active Site (Aspartyl groups)
Docking Strategy • PDB files -> Surface Representation -> Patch Detection -> Matching Patches -> Scoring & Filtering -> Candidate complexes
Issues Involved in Docking • Protein Structure and Active Site - assumed knowledge (PDBs, etc.) - PROCAT database: 3D enzyme active site templates • Ligand Structure - pharmacophore (base fragment) in potential drug compound - well known groups • Rigid vs. Flexible - solution or vacuum - structure
Algorithmic Approaches to Docking • Qualitative • Geometric • shape complementarity and fitting • Quantitative • Energy Calculations • determine global minimum energy • free energy measure • Hybrid • Geometric and energy complementarity • 2 phase process: soft and hard docking
Design of HIV-1 Protease Inhibitor
Scoring in Ligand-Protein Docking Potential Energy Description:
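As an illustration only, a potential-energy score can be sketched with a generic pairwise form: a Lennard-Jones 12-6 term plus a Coulomb term summed over all ligand-receptor atom pairs. This functional form, the parameter values (`eps`, `sigma`, `dielectric`) and the function names are assumptions made for the sketch; they are not taken from the slide or from any particular docking program.

```python
import math

COULOMB_KCAL = 332.0637  # kcal*Angstrom/(mol*e^2), converts q1*q2/r to kcal/mol

def pair_energy(r, eps, sigma, q1, q2, dielectric=4.0):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair.

    r is the distance in Angstrom; eps, sigma and dielectric are
    placeholder values chosen for the sketch, not fitted parameters.
    """
    lj = 4.0 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_KCAL * q1 * q2 / (dielectric * r)
    return lj + coulomb

def interaction_energy(ligand, receptor, eps=0.1, sigma=3.4):
    """Sum pair energies over all ligand-receptor atom pairs.

    Each atom is a tuple (x, y, z, charge). Lower totals mean a
    more favourable orientation under this toy score.
    """
    total = 0.0
    for (x1, y1, z1, q1) in ligand:
        for (x2, y2, z2, q2) in receptor:
            r = math.dist((x1, y1, z1), (x2, y2, z2))
            total += pair_energy(r, eps, sigma, q1, q2)
    return total
```

Real force fields use per-atom-type parameters and usually a distance cutoff; the single `eps` and `sigma` here only keep the example short.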
Preprocessing • Determine internal representation - convert coordinates of both molecules from PDB files - e.g. Michael Connolly’s MS program (www.biohedron.com) - dot surface - AutoGrid - 3D grid (array) with discrete values - often used in rigid docking
Some techniques • Surface representation: efficiently represent the docking surface and identify the regions of interest (cavities and protrusions) • Connolly surface • Lenhoff technique • Kuntz et al. Clustered-Spheres • Alpha shapes • Surface matching: match surfaces to optimize a binding score • Geometric Hashing
Surface Representation • Dense MS surface (Connolly) • Sparse surface (Shuo Lin et al.)
Surface Representation • Each atomic sphere is given the van der Waals radius of the atom • Rolling a probe sphere over the van der Waals surface leads to the solvent reentrant surface, or Connolly surface
Lenhoff technique • Computes a “complementary” surface for the receptor instead of the Connolly surface, i.e. computes possible positions for the atom centers of the ligand [Figure labels: atom centers of the ligand; van der Waals surface]
Kuntz et al. Clustered-Spheres • Uses clustered-spheres to identify cavities on the receptor and protrusions on the ligand • Compute a sphere for every pair of surface points, i and j, with the sphere center on the normal from point i • Regions where many spheres overlap are either cavities (on the receptor) or protrusions (on the ligand) [Figure labels: i, j]
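The sphere construction for one pair of surface points can be sketched as follows. With d = p_j - p_i and unit normal n_i, a sphere centered at p_i + r n_i passes through both points when r = |d|^2 / (2 d·n_i). The function name and the tuple-based point format are hypothetical, and this is not SPHGEN's actual code.

```python
def tangent_sphere(p_i, n_i, p_j):
    """Sphere touching the surface at p_i and passing through p_j.

    p_i, p_j: 3D points; n_i: unit surface normal at p_i.
    The center lies on the normal from p_i, at c = p_i + r * n_i.
    Returns (center, radius), or None when p_j lies on or behind
    the tangent plane at p_i (no such sphere on that side).
    """
    d = [b - a for a, b in zip(p_i, p_j)]
    d_dot_n = sum(a * b for a, b in zip(d, n_i))
    if d_dot_n <= 1e-12:
        return None
    radius = sum(a * a for a in d) / (2.0 * d_dot_n)
    center = [a + radius * n for a, n in zip(p_i, n_i)]
    return center, radius
```

For example, `tangent_sphere((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))` gives a sphere of radius 1 centered at (0, 0, 1). Running this over all pairs and clustering the overlapping spheres would be the next step, which the sketch leaves out.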
Alpha Shapes • Formalizes the idea of “shape” • In 2D an “edge” between two points is “alpha-exposed” if there exists a circle of radius alpha such that the two points lie on the surface of the circle and the circle contains no other points from the point set
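The 2D alpha-exposed test can be sketched directly from the definition. Two points closer than 2·alpha lie on exactly two circles of radius alpha, and the edge is exposed if either circle is empty of other points. The function name and the tolerance of 1e-9 are assumptions for the sketch.

```python
import math

def alpha_exposed(p, q, points, alpha):
    """Return True if edge (p, q) is alpha-exposed in the 2D point set.

    p and q must be elements of points (tuples (x, y)). Points lying
    exactly on a circle are treated as outside it.
    """
    d = math.dist(p, q)
    if d == 0.0 or d > 2.0 * alpha:
        return False  # no circle of radius alpha passes through both
    mx, my = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    # Distance from the edge midpoint to each candidate circle center.
    h = math.sqrt(max(alpha * alpha - (d / 2.0) ** 2, 0.0))
    # Unit vector perpendicular to the edge.
    ux, uy = -(q[1] - p[1]) / d, (q[0] - p[0]) / d
    for sign in (1.0, -1.0):
        center = (mx + sign * h * ux, my + sign * h * uy)
        others = (r for r in points if r != p and r != q)
        if all(math.dist(center, r) >= alpha - 1e-9 for r in others):
            return True
    return False
```

On the corners of a 2 x 2 square plus its center point, with alpha = 1.5, a side of the square is alpha-exposed while the segment from a corner to the center is not.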
Alpha Shapes: Example • Alpha=infinity • Alpha=3.0 Å
Surface Matching • Find the transformation (rotation + translation) that maximizes the number of matching surface points from the receptor and the ligand • First satisfy steric constraints: find the best fit of the receptor and ligand using only geometrical constraints • Then use energy calculations to refine the docking: select the fit that has the minimum energy
Docking Programs More information in: http://www.bmm.icnet.uk/~smithgr/soft.html The programs are: • DOCK (I. D. Kuntz, UCSF) • AutoDock (Arthur Olson, The Scripps Research Institute) • RosettaDock (Baker, Univ. of Washington; Gray, Johns Hopkins Univ.) • INVDOCK (Y. Z. Chen, NUS)
DOCK as an Example DOCK works in 5 steps: • Step 1 Start with crystal coordinates of target receptor • Step 2 Generate molecular surface for receptor • Step 3 Generate spheres to fill the active site of the receptor: The spheres become potential locations for ligand atoms • Step 4 Matching: Sphere centers are then matched to the ligand atoms, to determine possible orientations for the ligand • Step 5 Scoring: Find the top scoring orientation
DOCK as an Example • HIV-1 protease is the target receptor • Aspartyl groups are its active site [Images 1, 2, 3]
DOCK as an Example • Three scoring schemes: shape scoring, electrostatic scoring and force-field scoring • Image 5 is a comparison of the top scoring orientation of the molecule thioketal with the orientation found in the crystal structure [Images 4, 5]
The DOCK Algorithm Two steps in rigid ligand mode: • Orienting the putative ligand in the site: guided by matching distances between pre-defined site points on the target and interatomic distances of the ligand. The RT (rotation-translation) matrix is used to transform the ligand. • Scoring the resulting orientation: each orientation is scored for quality of fit. The process is repeated until a user-defined number of orientations, or the maximum number of orientations, is reached.
• Define the target binding site points. • Match the distances. • Calculate the transformation matrix for the orientation. • Dock the molecule. • Score the fit.
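The transformation step can be sketched for the simplest case: three ligand atoms matched to three site points whose pairwise distances agree. An orthonormal frame is built on each triplet, and the rotation maps one frame onto the other. This frame-based construction is a simplification assumed for illustration, not DOCK's own fitting procedure, and all function names are hypothetical.

```python
import math

def _sub(a, b):
    return [x - y for x, y in zip(a, b)]

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def _unit(a):
    n = math.sqrt(sum(x * x for x in a))
    if n == 0.0:
        raise ValueError("points are collinear or coincident")
    return [x / n for x in a]

def frame(p0, p1, p2):
    """Right-handed orthonormal axes built on three non-collinear points."""
    e1 = _unit(_sub(p1, p0))
    e3 = _unit(_cross(e1, _sub(p2, p0)))
    e2 = _cross(e3, e1)
    return [e1, e2, e3]

def rt_from_triplets(lig, site):
    """Rotation R and translation t with R*lig[k] + t = site[k].

    Exact only when the two triangles are congruent, i.e. the
    distance matching step has already succeeded.
    """
    fl, fs = frame(*lig), frame(*site)
    # R is the sum over axes k of outer(fs[k], fl[k]).
    R = [[sum(fs[k][i] * fl[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [site[0][i] - sum(R[i][j] * lig[0][j] for j in range(3))
         for i in range(3)]
    return R, t

def apply_rt(R, t, p):
    """Apply the rigid transform to one point."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
```

Once `R` and `t` are known, `apply_rt` is applied to every ligand atom to dock the molecule, and the resulting pose is passed to the scoring step. With more than three matched pairs, a least-squares superposition would normally replace the exact three-point construction.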
Site Points Generation in DOCK • Program SPHGEN identifies the active site, and other sites of interest. • Each invagination is characterized by a set of overlapping spheres. • For receptors, a negative image of the surface invaginations is created; • For a ligand, the program creates a positive image of the entire molecule.
The Matching Can be directed by 2 additional features: • Chemical matching - labeling the site points such that only particular atom types are allowed to be matched to them. • Critical cluster - subsets of interest can be defined as critical clusters, so that at least one member of them will be part of any accepted ligand “match”. Increase in efficiency and speed due to elimination of potentially less promising orientations!
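A candidate match can be checked along these lines: every pair of matched site points must be separated by roughly the same distance as the corresponding pair of ligand atoms, and chemical labels on site points restrict which atom types may sit there. The function name, the 0.5 Å tolerance and the dictionary-based label format are assumptions for the sketch, not DOCK's input format.

```python
import math
from itertools import combinations

def is_valid_match(pairs, site_pts, lig_atoms, tol=0.5,
                   site_labels=None, atom_types=None):
    """Check one candidate match between site points and ligand atoms.

    pairs: list of (site_index, atom_index) tuples.
    site_labels: optional dict site_index -> set of allowed atom types
    (chemical matching); unlabeled site points accept any atom.
    atom_types: list of atom type strings, required with site_labels.
    """
    if site_labels is not None:
        for s, a in pairs:
            allowed = site_labels.get(s)
            if allowed is not None and atom_types[a] not in allowed:
                return False  # rejected before any distance is computed
    for (s1, a1), (s2, a2) in combinations(pairs, 2):
        d_site = math.dist(site_pts[s1], site_pts[s2])
        d_lig = math.dist(lig_atoms[a1], lig_atoms[a2])
        if abs(d_site - d_lig) > tol:
            return False
    return True
```

Checking labels first is what gives the speed-up: chemically implausible matches are discarded before the distance comparison. A critical-cluster rule could be added in the same way, by requiring that at least one `site_index` in `pairs` belongs to the cluster.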
Other Docking Programs AutoDock • AutoDock was designed to dock flexible ligands into receptor binding sites • The strongest feature of AutoDock is the range of powerful optimization algorithms available RosettaDock • It models physical forces and creates a very large number of decoys • It uses degeneracy after clustering as a final criterion in decoy selection INVDOCK • Docking strategy and algorithm similar to DOCK, but with the capability of finding the receptors to which a molecule can bind.
Conformational Ensembles Docking Observations: • Generating an orientation of a ligand in a binding site may be separated from calculating a conformation of the ligand in that particular orientation. • Multiple conformations of a given ligand usually have some portion in common (internally rigid atoms such as ring systems), and therefore, contain redundancies.
Conformational Ensemble Docking • Conformational ensembles are generated by overlaying all conformations of a given molecule onto its largest rigid fragment. • Only atoms within this largest rigid fragment are used during the distance matching step. The RT matrix is defined. • Each of the conformers is oriented into the site and scored. The score measures steric and electrostatic complementarity. • One matching step: all the conformers are docked and scored in the selected orientation.
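The "one matching step" idea can be sketched as follows. Because all conformers are pre-aligned on the rigid fragment, a single rotation R and translation t found from that fragment is reused for every conformer, and only the scoring is repeated. The function name, the data layout and the convention that lower scores are better are assumptions for the sketch.

```python
def dock_ensemble(conformers, R, t, score):
    """Place every conformer with the same rigid transform and score it.

    conformers: list of conformers, each a list of (x, y, z) atoms,
    already overlaid on the largest rigid fragment.
    R, t: 3x3 rotation (nested lists) and translation from matching.
    score: callable taking placed atoms and returning a number,
    lower meaning better.
    Returns (best_score, conformer_index, placed_atoms).
    """
    results = []
    for idx, atoms in enumerate(conformers):
        placed = [[sum(R[i][j] * p[j] for j in range(3)) + t[i]
                   for i in range(3)]
                  for p in atoms]
        results.append((score(placed), idx, placed))
    return min(results, key=lambda r: r[0])
```

The matching cost is paid once per orientation rather than once per conformer, which is where the speed increase comes from.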
Advantages of Conformational Ensemble Docking Speed increase due to: • One matching step for all the conformers. • The largest rigid fragment usually has fewer atoms (fewer potential matches are examined).
Disadvantages of Conformational Ensemble Docking • Loss of information when the orientations are guided only by a subset of the atoms in the molecule. Orientations may be missed because potential distance matches from non-rigid portions of the molecule are not considered. • The ensemble method will fail for ligands that lack internally rigid atoms. • The use of chemical matching and critical clusters is limited.
Results of Docking Studies The docked (blue) and crystal (yellow) structure of ligands in some PDB ligand-protein complexes. The PDB Id of each structure is shown.
Dataset and Testing Results
• Protein-protein cases from protein-protein docking benchmark [6]:
  - Enzyme-inhibitor: 22 cases
  - Antibody-antigen: 16 cases
• Protein-DNA docking: 2 unbound-bound cases
• Protein-drug docking: tens of bound cases (estrogen receptor, HIV protease, COX)
• Performance: several minutes for large protein molecules and seconds for small drug molecules on a standard PC.
[Figure: Estrogen receptor with estradiol (1A52); estradiol molecule from the complex and the docking solution. RMSD 0.9 Å, rank 1, running time: 11 seconds]
[Figure: Endonuclease I-PpoI (1EVX) with DNA (1A73); DNA from the complex and the docking solution. RMSD 0.87 Å, rank 2]
Results: Enzyme-Inhibitor docking [Table; footnote 1: number of highly penetrating residues in unbound structures superimposed onto the complex]
Results: Antibody-Antigen docking [Table; footnote 1: number of highly penetrating residues in unbound structures superimposed onto the complex]
Quality of INVDOCK Algorithm (Proteins. 1999; 36:1)
Description of Docking Quality | Molecule | Docked Protein | PDB Id | RMSD | Energy (kcal/mol)
Match | Indinavir | HIV-1 Protease | 1hsg | 1.38 | -70.25
Match | Xk263 Of Dupont Merck | HIV-1 Protease | 1hvr | 2.05 | -58.07
Match | Vac | HIV-1 Protease | 4phv | 0.80 | -88.46
One end match, the other in different orientation | Folate | Dihydrofolate Reductase | 1dhf | 6.55 | -46.02
Match | 5-Deazafolate | Dihydrofolate Reductase | 2dhf | 1.48 | -65.49
Match | Estrogen | Estrogen Receptor | 1a52 | 1.30 | -45.86
Complete overlap, flipped along short axis | 4-Hydroxytamoxifen | Estrogen Receptor | 3ert | 5.45 | -55.15
Match | Guanosine-5'-[B,G-Methylene] Triphosphate | H-Ras P21 | 121p | 0.94 | -80.20
Overlap, flipped along short axis | Glycyl-*L-Tyrosine | Carboxypeptidase A a | 3cpa | 3.56 | -40.63
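The RMSD values quoted for docking results compare the docked ligand with the crystal ligand atom by atom. A minimal sketch, assuming both structures list the same atoms in the same order and are already in the same coordinate frame (the function name is hypothetical):

```python
import math

def rmsd(docked, crystal):
    """Root-mean-square deviation between two equally ordered atom lists.

    docked, crystal: lists of (x, y, z) coordinates. No superposition
    is done, since docking poses are compared in the receptor frame.
    """
    if len(docked) != len(crystal) or not docked:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(docked, crystal))
    return math.sqrt(squared / len(docked))
```

A flipped pose can overlap the crystal ligand almost completely and still give a large RMSD, because the atom-by-atom pairing is wrong. That is consistent with the entries described as overlapping but flipped having some of the largest RMSD values.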
Identification of the N-terminal peptide binding site of GRP94 • GRP94 - Glucose regulated protein 94 • VSV8 peptide - derived from vesicular stomatitis virus Gidalevitz T, Biswas C, Ding H, Schneidman-Duhovny D, Wolfson HJ, Stevens F, Radford S, Argon Y. J Biol Chem. 2004
Biological motivation • The complex between the two molecules highly stimulates the response of the T-cells of the immune system. • The grp94 protein alone does not have this property. The activity that stimulates the immune response is due to the ability of grp94 to bind different peptides. • Characterization of peptide binding site is highly important.
GRP94 molecule • No structure of the grp94 protein was available. Homology modeling was used to predict a structure, using another protein with 52% identity as the template. • Recently the structure of grp94 was published. The RMSD between the crystal structure and the model is 1.3 Å.
Docking • PatchDock was applied to dock the two molecules, without any binding site constraints. • Docking results were clustered in the two cavities: