Characterization and prediction of drug binding sites in proteins

Characterization and prediction of drug binding sites in proteins Yariv Brosh & Alex Fardman Advisor: Dr. Yanay Ofran

Concepts • Background • Goals • Methods • Results • Conclusions • Implications

Background • Proteins – organic compounds that constitute the basic functional and computational unit in the cell. They are able to bind other molecules specifically and tightly. • Pocket – The region of the protein responsible for binding. • Ligand – Substance that is able to bind to a biomolecule. • Drug – Substance that alters normal body function.

Background • Most drugs achieve their effects by binding to a protein at a specific binding site and modifying its activity. • One may want a drug that binds to a specific location in a protein to prevent side effects. • Identifying those binding sites in proteins experimentally is time- and resource-intensive.

Our goal Find a way to predict whether a drug will bind to a protein or not. This will shorten the drug development time significantly…

Methods • Collecting data – Pocket creation • Choosing attributes & analysis of pockets accordingly • Machine Learning

Collecting Data Choosing drugs (ligands): Choosing Proteins:

Positive data set

Negative data set

Negative data set Docking algorithm: method to predict binding orientation.

Attributes & pocket analysis • Count the number of each amino acid. • Charge at physiological pH. • Shape matching = • Connectivity =
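The first two attributes above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names, the example pocket, and the simplified charge table (Asp/Glu as -1, Lys/Arg as +1, His treated as neutral) are all assumptions made here.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Simplified formal side-chain charges near pH 7.4 (an assumption of this
# sketch): Asp/Glu are deprotonated, Lys/Arg are protonated, and His is
# treated as neutral because its side-chain pKa is close to 6.
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def composition(pocket_residues):
    """Return a count for every standard amino acid in the pocket."""
    counts = Counter(pocket_residues)
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}


def net_charge(pocket_residues):
    """Sum the simplified side-chain charges over the pocket residues."""
    return sum(SIDE_CHAIN_CHARGE.get(aa, 0) for aa in pocket_residues)


# Hypothetical pocket, given as one-letter residue codes.
pocket = "DEKRHGGAL"
features = composition(pocket)
features["charge"] = net_charge(pocket)
```

Counting all 20 residue types, including those absent from the pocket, keeps the feature vector the same length for every pocket, which a classifier such as an SVM requires.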

Accessibility calculation • With : • Accessibility calculation is done by simulation of rolling water molecules over the protein surface.
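To make the rolling-probe idea concrete, here is a minimal sketch of the Shrake-Rupley numerical approximation, a standard way to estimate solvent-accessible area. It is not necessarily the algorithm used by the tool on this slide. The function names, the 1.4 Å water-probe radius, and the number of test points are assumptions of this sketch.

```python
import math


def sphere_points(n):
    """Spread n points roughly evenly on a unit sphere (golden-spiral rule)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return points


def accessible_area(atoms, probe=1.4, n_points=200):
    """Per-atom solvent-accessible area in square angstroms.

    atoms: list of (x, y, z, radius) tuples, all in angstroms.
    probe: radius of the water probe (1.4 is a common choice).
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        ri_ext = ri + probe  # surface traced by the probe centre
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + ri_ext * ux, yi + ri_ext * uy, zi + ri_ext * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                rj_ext = rj + probe
                d2 = (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2
                if d2 < rj_ext * rj_ext:
                    buried = True  # probe would clash with atom j here
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * ri_ext ** 2 * exposed / n_points)
    return areas
```

An isolated atom keeps the full area of its probe-extended sphere, while any neighbor close enough to block the probe reduces it. The double loop over atoms is quadratic, so real implementations use a neighbor list.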

Attributes & pocket analysis • With : • Accessibility difference of protein atoms before binding the ligand and after. • Accessibility difference of ligand atoms before binding to the pocket and after. • With HBPLUS: • Number of hydrogen bonds between ligand and pocket.
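The two accessibility-difference attributes reduce to a subtraction once per-atom accessibility is known for the free and bound states. The sketch below assumes those values are already available as dictionaries keyed by atom name. The function name and all numbers are hypothetical.

```python
def buried_area(free, bound):
    """Total accessibility lost on binding, summed over atoms.

    free, bound: dicts mapping the same atom identifiers to accessible
    area before and after binding, respectively.
    """
    return sum(free[atom] - bound[atom] for atom in free)


# Hypothetical per-atom accessibility values (square angstroms).
protein_free = {"CA": 12.5, "OG": 30.0}
protein_bound = {"CA": 10.0, "OG": 4.5}
ligand_free = {"C1": 40.0, "O2": 22.0}
ligand_bound = {"C1": 8.0, "O2": 2.0}

protein_delta = buried_area(protein_free, protein_bound)
ligand_delta = buried_area(ligand_free, ligand_bound)
```

A large drop on both sides suggests the ligand sits deep in the pocket, whereas a small drop suggests only a shallow surface contact.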

Positive set size: 285 • Negative set size: ~10,600 • Number of attributes: 26 • SVM

Machine learning With WEKA, using LibSVM: • Training data: True: 200, False: 1,000-10,000 • Testing data: True: 85, False: 544-9,544
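The set sizes above fit together arithmetically. The quoted ranges are consistent with a fixed pool of 10,544 negatives (1,000 + 9,544 = 10,000 + 544). That total is an inference from the numbers, not something the slides state. The sketch below works through it, and the function name is hypothetical.

```python
POSITIVES = 285
NEGATIVES = 10_544  # inferred total; the slides quote ~10,600


def split_sizes(train_pos, train_neg,
                total_pos=POSITIVES, total_neg=NEGATIVES):
    """Return train/test sizes and the negative:positive training ratio."""
    test_pos = total_pos - train_pos
    test_neg = total_neg - train_neg
    return {
        "train": (train_pos, train_neg),
        "test": (test_pos, test_neg),
        "train_ratio": train_neg / train_pos,
    }


# The two extremes of the training range quoted on the slide.
small = split_sizes(200, 1_000)
large = split_sizes(200, 10_000)
```

With 1,000 training negatives the classes are imbalanced 5:1 in training, and with 10,000 they are imbalanced 50:1. The Conclusions slide returns to this point when it suggests a larger share of positives in the learning set.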

Results

Conclusions • We were able to distinguish between real & non-biological binding sites without using computationally expensive energy functions or evolutionary conservation. • It is not possible to distinguish between binding sites with PatchDock alone. • Using the combination of simple and computationally “cheap” tools such as SVM, PatchDock and the algorithms for pocket analysis mentioned earlier, it is possible to give a good prediction regarding the nature of the binding site. • The advantage of the method is its simplicity: Taking the best docking conformations and comparing with characteristics of real and non-biological binding sites. (No need to compare entire proteins).

Conclusions • The few negative binding sites classified as positives may be potentially real binding sites. (Need to be checked experimentally). • The method can be improved and refined: • More attributes • More drugs and proteins • Analysis of attribute significance • Bigger learning set • Bigger positive set in relation to the negative set in the learning set (help the learning algorithm)

Implications • The tool can be used to check possible side effects during drug development. • Drug Repurposing - Find new targets for existing drugs. • Can significantly shorten the drug toxicity check during development.

Thanks • Dr. Yanay Ofran • Dr. Olga Leiderman • Dr. Guy Nimrod • Vered Kunik • Rotem Snir • Sivan Ophir • For your dedicated help!
