Property-Encoded Shape Distributions
Sourav Das, Department of Chemistry & Chemical Biology, Rensselaer Polytechnic Institute, Troy, NY
Overview
• Part I. PESD signatures for binding site comparison
• Part II. PESD signatures for binding affinity prediction (PESD-SVM)
• Conclusions & Further Work
Motivation
• Protein binding site similarity analysis can complement ligand-based rational drug discovery efforts:
(i) structure-based ligand design – finding alternate ligand substructures from ligand-bound sites
(ii) predicting ligand cross-reactivity – a ligand recognizing two different proteins that have binding-site similarity
• Gold, N.D.; Deville, K.; Jackson, R.M. Biochem. Soc. Trans. 2007, 35, 561.
Binding Site Representation
Atom/pseudoatom/amino acid (traditional) or surface based?
Advantages of surface representation
• Accounts for the steric-interaction surface (3D surface shape)
• Molecular Electrostatic Potential (MEP) mapping possible
• Closer to what a ligand “sees” – mapped surface properties free of residue or atom labels and their orientations
However, surfaces are computationally difficult to compare!
Techniques for comparison
• Molecular surface based
  • Clique Detection (eF-Site) – NP-complete, slow
  • Spherical Harmonics – alignment required
  • 3D-Zernike – scaling + separate treatment of positive & negative magnitudes
• Random Sampling & Binning – PESD
  • Rotation and translation invariant – no alignment required
  • Not scale invariant – suitable for direct comparison of binding sites
  • Simultaneous treatment of positive and negative magnitudes
  • Suitable for high-throughput screening
Property-Encoded Shape Distributions
• Conversion of the property distribution on surfaces to a string of numbers
• Two property-mapped Gauss-Connolly surfaces (EP and ActiveLP) generated in MOE, encoding MEP, hydrogen-bonding, polar and hydrophobic regions
Property-Encoded Shape Distributions
• A large number of randomly selected pairs of surface points (enough for the distribution to converge) are binned by combinations of the distance d between the points and the properties M1 and M2 at the two points
Osada R, Funkhouser T, Chazelle B, Dobkin D. Shape Distributions. ACM Trans. Graph. 2002, 21, 807
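The sampling-and-binning step can be sketched in Python as follows. This is a minimal illustration, not the code behind the talk: it assumes the surface points are already sampled and each carries a discretized property class, and the function name `pesd_signature`, the number of property classes, the bin count and the distance cutoff are all hypothetical choices.

```python
import math
import random
from itertools import combinations_with_replacement


def pesd_signature(points, n_pairs=100_000, n_dist_bins=20, max_dist=40.0,
                   n_classes=3, seed=0):
    """Histogram of random point pairs binned by distance and property pair.

    points: list of ((x, y, z), property_class), property_class in range(n_classes).
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # One block of distance bins per unordered property combination (M1, M2).
    combos = {c: i for i, c in
              enumerate(combinations_with_replacement(range(n_classes), 2))}
    counts = [0] * (len(combos) * n_dist_bins)
    for _ in range(n_pairs):
        (p1, m1), (p2, m2) = rng.choice(points), rng.choice(points)
        d = math.dist(p1, p2)  # raw distance: the signature is not scale invariant
        # Distances beyond max_dist are clamped into the last bin.
        b = min(int(d / max_dist * n_dist_bins), n_dist_bins - 1)
        key = (min(m1, m2), max(m1, m2))
        counts[combos[key] * n_dist_bins + b] += 1
    # Normalize to frequencies so signatures from different runs are comparable.
    return [c / n_pairs for c in counts]
```

Because only inter-point distances and property labels enter the histogram, rotating or translating the surface leaves the signature unchanged up to sampling noise, which is why no alignment is needed.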
Property-Encoded Shape Distributions
• MOE surfaces are triangulated and color coded (representative of the property and its magnitude) at each vertex
• To choose a surface point in an unbiased way:
  • Store the triangles as an array of cumulative areas
  • Randomly choose a value x between 0 and the total area
  • For the triangle in the array having x within the bounds of its cumulative area, calculate a random point inside that triangle
  • The selected point is assigned the property magnitude of its nearest vertex
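A minimal Python sketch of this area-weighted point selection follows. The point inside the chosen triangle (A, B, C) is computed with the formula from the Osada et al. paper cited earlier, P = (1 − √r1)·A + √r1·(1 − r2)·B + √r1·r2·C with r1, r2 uniform in [0, 1]; that this is the exact expression shown on the slide is an assumption. The function names are hypothetical, and the nearest vertex is searched only among the three corners of the chosen triangle, which is a simplification.

```python
import bisect
import math
import random
from itertools import accumulate


def triangle_area(a, b, c):
    """Area of a 3D triangle from half the norm of the cross product."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (ab[1] * ac[2] - ab[2] * ac[1],
             ab[2] * ac[0] - ab[0] * ac[2],
             ab[0] * ac[1] - ab[1] * ac[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def cumulative_areas(vertices, triangles):
    """Running sum of triangle areas; the last entry is the total area."""
    return list(accumulate(
        triangle_area(vertices[i], vertices[j], vertices[k])
        for i, j, k in triangles))


def sample_surface_point(vertices, props, triangles, cum_areas, rng):
    """Return (point, property) for one area-weighted random surface point."""
    x = rng.uniform(0.0, cum_areas[-1])
    # First triangle whose cumulative area reaches x.
    t = min(bisect.bisect_left(cum_areas, x), len(triangles) - 1)
    ia, ib, ic = triangles[t]
    a, b, c = vertices[ia], vertices[ib], vertices[ic]
    r1, r2 = rng.random(), rng.random()
    s = math.sqrt(r1)
    w = (1.0 - s, s * (1.0 - r2), s * r2)  # barycentric weights, sum to 1
    p = tuple(w[0] * a[i] + w[1] * b[i] + w[2] * c[i] for i in range(3))
    # Simplification: nearest vertex is taken among the triangle's corners.
    nearest = min((ia, ib, ic), key=lambda v: math.dist(p, vertices[v]))
    return p, props[nearest]
```

Choosing the triangle in proportion to its area, and then a uniform point inside it, keeps densely triangulated regions of the surface from being oversampled.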
Property-Encoded Shape Distributions
• Signature comparison by chi-squared distance
• The final distance score is a weighted sum of the EP and ActiveLP distances
• 1-2 minutes for signature computation; 100+ comparisons of pre-computed signatures per second
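A small Python sketch of the comparison step is given below. The exact chi-squared form used in the talk is not preserved, so the common histogram form, the sum of (a_i − b_i)² / (a_i + b_i) over non-empty bins, is assumed here (some definitions add a factor of 1/2). Applying the weight to the ActiveLP term only, and the default weight value, are also assumptions.

```python
def chi_squared_distance(a, b):
    """Chi-squared distance between two histograms of equal length.

    Bins that are empty in both histograms contribute nothing.
    """
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)


def pesd_distance(ep_a, ep_b, lp_a, lp_b, weight=0.5):
    """Weighted sum of the EP and ActiveLP signature distances.

    `weight` scales the ActiveLP term; 0.5 is a placeholder, not a tuned value.
    """
    return (chi_squared_distance(ep_a, ep_b)
            + weight * chi_squared_distance(lp_a, lp_b))
```

Since each comparison is a single pass over two short vectors, comparing pre-computed signatures is cheap, consistent with the quoted rate of 100+ comparisons per second.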
How to determine the optimum weights?
• Binding sites from 40 different proteins bound to 4 different types of ligands – ATP, NADP, steroid, heme (Morris et al. Bioinformatics 2005;21:2347-2355)
• Weight parameter for the ActiveLP distance systematically varied in steps of 0.1
• L1 (r=1) and L2 (r=2) metrics also tested for distance
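The weight scan can be illustrated with the Python sketch below. It shows the L_r metric (r=1 for L1, r=2 for L2) and a grid search over the ActiveLP weight in steps of 0.1. The scanned range (0.0 to 2.0) and the scoring criterion (whether each site's nearest neighbour is bound to the same ligand type) are assumptions made for the example; the talk does not state how clustering quality was scored.

```python
def minkowski_distance(a, b, r):
    """L_r distance between two signatures: r=1 gives L1, r=2 gives L2."""
    return sum(abs(x - y) ** r for x, y in zip(a, b)) ** (1.0 / r)


def nn_accuracy(dist, labels):
    """Fraction of sites whose nearest other site has the same ligand label."""
    n = len(labels)
    hits = 0
    for i in range(n):
        j = min((k for k in range(n) if k != i), key=lambda k: dist[i][k])
        hits += labels[i] == labels[j]
    return hits / n


def scan_weights(d_ep, d_lp, labels, weights=None):
    """Try each weight w in D = D_EP + w * D_ActiveLP; return (best_w, accuracy).

    d_ep, d_lp: square matrices of pairwise EP and ActiveLP distances.
    The default grid 0.0, 0.1, ..., 2.0 is a hypothetical range.
    """
    if weights is None:
        weights = [round(0.1 * k, 1) for k in range(21)]
    best = None
    for w in weights:
        dist = [[e + w * l for e, l in zip(row_e, row_l)]
                for row_e, row_l in zip(d_ep, d_lp)]
        acc = nn_accuracy(dist, labels)
        if best is None or acc > best[1]:
            best = (w, acc)  # ties keep the smallest weight
    return best
```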
How to determine the optimum weights?
• Best clustering of binding sites obtained
• Higher accuracy than PocketMatch (Yeturu et al. BMC Bioinformatics, 2008, 9, 543)
Virtual Screening with PESD
• A binding site query is compared against a database of binding sites, producing a sorted list of matches
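The query-against-database workflow can be sketched in Python as a simple ranking. The sketch assumes each site is stored as a pair of pre-computed (EP, ActiveLP) signatures and reuses a chi-squared distance with a weighted ActiveLP term. The function names, the dictionary layout and the default weight are hypothetical.

```python
def chi2(a, b):
    """Chi-squared distance between two equal-length histograms."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)


def screen(query, database, weight=0.5, top_k=10):
    """Rank database sites by distance to the query site.

    query: (ep_signature, lp_signature)
    database: dict mapping site id -> (ep_signature, lp_signature)
    Returns up to top_k (distance, site_id) tuples, most similar first.
    """
    q_ep, q_lp = query
    scored = [(chi2(q_ep, ep) + weight * chi2(q_lp, lp), site_id)
              for site_id, (ep, lp) in database.items()]
    scored.sort()
    return scored[:top_k]
```

Because the signatures are computed once per site, the screening loop only does vector comparisons, which is what makes the approach suitable for high-throughput use.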
Analyzing results
“Correct match” – no straightforward definition
• Are the amino acid compositions similar?
• Are the bound ligands similar?
• Do the ligands (from two different proteins) bind onto a common site?
• Do the binding sites belong to the same functional class of proteins?
Case Studies
Glucose/Galactose receptor vs. Glucokinase
High similarity of amino acids / Low similarity in orientations
Case Studies
1b55 (IP binding) vs. 1btn (IP binding)
Low similarity of amino acids / Low similarity in orientations
Case Studies
2jav (Nek2 kinase, bound SU11652) vs. 1cdk (cAMP-dependent kinase, bound ANP)
SU11652 is a chloro (Cl) analog of SU11248 (Sunitinib, Sutent)
ROCS ligand similarity: Shape 0.488, Combo 0.660
Ligands dissimilar
Overlap of inhibitor binding site with ATP binding site in Nek2
• Jun 2008: SU11652 (Nek2 – 2jav) with ANP (1cdk)
• Dec 2008: SU11652 (Nek2 – 2jav) with ANP (1cdk) and ATG (Nek2 – 2w5b)
Case Studies
1aer (Pseudomonas aeruginosa exotoxin) vs. 1isi (BST-1)
Ligand sub-structural similarity from binding site similarity
Case Studies
1bq4 (Phosphoglycerate mutase) vs. 2fvv (Human diphosphoinositol polyphosphate phosphohydrolase 1)
Ligand sub-structural similarity from binding site similarity
Case Studies
1gwq (Estrogen receptor; all-alpha protein; bound raloxifene core) vs. 1lhu (Sex hormone binding globulin; all-beta protein; bound estradiol)
Case Studies
Similarity among ATP binding sites: cross-reactivity of a promiscuous binder (ATP)
• cAMP-dependent protein kinase
• RNA-editing ligase
• TRP-related protein
• Pyruvate dehydrogenase kinase
• Topoisomerase II
Clustering
• 19 binding sites from TIM β/α barrel protein-ligand complexes
• PESD is fold-independent
• Sites cluster into Group 1, Group 2 and Group 3
Sael et al. Rapid comparison of properties on protein surface. Proteins 2008, 73, 1.
Clustering
Looking at what the bound ligands are …
Group 1: 1b57 and 1fdj – structural analogs bound to the protein fructose-1,6-bisphosphate aldolase
Clustering
Looking at what the bound ligands are …
Group 2: 1fcb and 1rhc – coenzymes (FMN and F420) bound to dehydrogenases
Clustering
Looking at what the bound ligands are …
Group 3: mostly polyols
PDB: 1k4g – natural ligand
Part II. PESD signatures for binding affinity prediction – PESD-SVM
Background
• Binding affinity prediction: a complex problem due to enthalpic and entropic contributions from protein side chains, ligand and solvent
• However, rapid computation of affinity is necessary for structure-based drug design
• Scoring functions are designed for rapid affinity prediction: force-field based, knowledge-based (potentials) or empirical
• PESD signatures + SVM model building = empirical scoring function
Background
• Currently PESD-SVM considers only surface signatures, with no explicit treatment of entropy or solvent
• However, apolar surface area burial seems to correlate well with affinity, and PESD signatures account for buried apolar surface
• Enthalpic factors are dominant in many systems, especially for natural ligands binding to proteins
Olsson et al. J. Mol. Biol., 2008, 384, 1002
PESD-SVM
• Signatures for both ligand and protein contact regions (ligand interaction surface and protein interaction surface; EP and ActiveLP)
• The resulting histograms (H1, H2, H3, H4) are used as features
• SVM models built
• No feature selection
• Applied to prediction of binding affinities of the PDBBind dataset (Wang et al. J. Med. Chem. 2004, 47, 2977)
PESD-SVM Binding affinity prediction * Sotriffer et al. Proteins 2008, 73, 395
PESD-SVM Binding affinity prediction
PESD-SVM Classification into strong/medium/weak binders
PESD-SVM Scoring of docked poses PDB: 1cbx
Conclusions
• PESD signatures are easy to compute and capture three-dimensional shape and mapped properties on molecular surfaces
• Fast & accurate similarity search of binding sites
• Possible to detect similarities in binding sites when bound ligand similarity is low, thereby complementing ligand-based drug discovery efforts
• Applications in finding alternate ligand substructures, predicting cross-reactivity of bound sites, and affinity prediction
Current / Future efforts
• Segmenting a binding site for local comparison & partial matching
• Comparison of unbound sites for protein function prediction
• Adding entropic and solvent terms to PESD-SVM
Acknowledgements
• Dr. N. Sukumar and Dr. Dominic Ryan
• Breneman group members
• NIH (1P20 HG003899)
THANK YOU