Chemical Data and Computer-Aided Drug Discovery. Mike Gilson, School of Pharmacy, mgilson@ucsd.edu, 2-0622
Outline
• Overview of drug discovery
• Structure-based computational methods: when we know the structure of the targeted protein
• Ligand-based computational methods: when we don’t know the protein’s structure
Small Molecule Drugs: Aspirin, Sildenafil (Viagra), Taxol, Darunavir, Glipizide (Glucotrol), Digoxin
Nanoparticles (e.g., packaged small-molecule drugs): Doxil (liposome package, extended circulation time, milder toxicity); Abraxane (albumin-packaged taxol). http://www.doxil.com/about_doxil.html http://www.abraxane.com/professional/nab-technology.aspx
Biopharmaceuticals: Etanercept (Enbrel): protein with TNF receptor + Ab Fc domain; scavenges TNF, diminishes inflammation. Erythropoietin (EPO): stabilized variant of a natural protein hormone. http://www.ganfyd.org/index.php?title=Erythropoietin_beta http://en.wikipedia.org/wiki/File:Enbrel.jpg
Natural Products: Aspirin (willow), Digoxin (foxglove), Taxol (Pacific yew)
How Aspirin Works. [Figure: pathway diagram with labels: aspirin, inflammation, platelet activation, platelet inactivation] lipidlibrary.aocs.org/lipids/eicintro/index.htm
Biomolecular Pathways and Target Selection, e.g., signaling pathways. [Figure label: target protein] http://www.isys.uni-stuttgart.de/forschung/sysbio/insulin/index.html
Empirical Path to Ligand Discovery: Compound library (commercial, in-house, synthetic, natural) → High-throughput screening (HTS) → Hit confirmation → Lead compounds (e.g., µM Kd) → Lead optimization (medicinal chemistry) → Potent drug candidates (nM Kd) → Animal and clinical evaluation
Compound Libraries: Government (NIH); Commercial (also in-house pharma); Academia
Computer-Aided Ligand Design: Aims to reduce the number of compounds synthesized and assayed. Lower costs; less chemical waste; faster progress.
1. We Know the Structure of the Targeted Protein: Structure-Based Ligand Discovery. [Figure: HIV protease/KNI-272 complex]
Protein-Ligand Docking (Structure-Based Ligand Design). Potential function: energy as a function of structure. Docking software: search for the structure of lowest energy. [Figure labels: VDW, screened Coulombic (+/- charges), dihedral]
Energy Determines Probability (Stability): Boltzmann distribution, $p(x) \propto e^{-E(x)/k_B T}$. [Figure: energy and probability plotted against coordinate x]
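To make the Boltzmann relation concrete, here is a minimal sketch that turns a few energies into relative probabilities. The three energies and the temperature are made-up illustrative values, and the function name is an assumption, not something from the slides.

```python
import math

# Gas constant in kcal/(mol*K); the per-mole form of Boltzmann's constant.
R_KCAL = 0.0019872


def boltzmann_probabilities(energies, temperature=298.15):
    """Return normalized Boltzmann probabilities for energies in kcal/mol."""
    kt = R_KCAL * temperature
    e_min = min(energies)  # shift energies so the exponentials cannot overflow
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    z = sum(weights)  # partition function (up to the constant shift)
    return [w / z for w in weights]


# Hypothetical energies of three conformations, in kcal/mol.
energies = [0.0, 1.0, 3.0]
probs = boltzmann_probabilities(energies)
for e, p in zip(energies, probs):
    print(f"E = {e:4.1f} kcal/mol   p = {p:.3f}")
```

Lower energy means higher probability, and a difference of only a few kcal/mol already shifts the population strongly toward the low-energy state.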
Structure-Based Virtual Screening: 3D structure of target (crystallography, NMR, modeling) + Compound database → Virtual screening (e.g., computational docking) → Candidate ligands → Experimental assay → Ligands → Ligand optimization (med chem, crystallography, modeling) → Drug candidates
Fragmental Structure-Based Screening: 3D structure of target (crystallography, NMR, modeling) + “Fragment” library → Fragment docking → Compound design → Experimental assay and ligand optimization (med chem, crystallography, modeling) → Drug candidates. http://www.beilstein-institut.de/bozen2002/proceedings/Jhoti/jhoti.html
Potential Functions for Structure-Based Design: Energy as a function of structure. Two families: physics-based and knowledge-based.
Physics-Based Potentials: Energy terms from physical theory
• Van der Waals interactions (shape fitting)
• Bonded interactions (shape and flexibility)
• Coulombic interactions (charge-charge complementarity)
• Hydrogen bonding
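As a rough illustration of what such energy terms look like, the sketch below sums a Lennard-Jones (van der Waals) term and a screened Coulomb term over ligand-protein atom pairs. The functional forms are common textbook choices, not necessarily the ones used by any particular docking program, and the parameter values (`epsilon`, `sigma`, `dielectric`) and the atom dictionaries are hypothetical. Bonded and hydrogen-bonding terms are omitted.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), a value commonly used in force fields.
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, sigma):
    """12-6 van der Waals energy: repulsive at short range, attractive beyond."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)


def screened_coulomb(r, q1, q2, dielectric=4.0):
    """Coulomb energy reduced by an assumed uniform dielectric constant."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def pair_energy(atom_a, atom_b, epsilon=0.1, sigma=3.4):
    """Nonbonded energy of one atom pair (illustrative shared LJ parameters)."""
    r = math.dist(atom_a["xyz"], atom_b["xyz"])
    return (lennard_jones(r, epsilon, sigma)
            + screened_coulomb(r, atom_a["q"], atom_b["q"]))


def interaction_energy(ligand, protein):
    """Sum pair energies over every ligand-protein atom pair."""
    return sum(pair_energy(a, b) for a in ligand for b in protein)


# Hypothetical two-atom ligand and two-atom protein fragment.
ligand = [{"xyz": (0.0, 0.0, 0.0), "q": -0.5}, {"xyz": (1.5, 0.0, 0.0), "q": 0.1}]
protein = [{"xyz": (0.0, 4.0, 0.0), "q": 0.5}, {"xyz": (1.5, 4.0, 0.0), "q": -0.1}]
print(f"Interaction energy: {interaction_energy(ligand, protein):.2f} kcal/mol")
```

A docking program would evaluate a function of this kind many times while searching over ligand positions, orientations, and conformations for the lowest-energy pose.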
Common Simplifications Used in Physics-Based Docking
• Quantum effects approximated classically
• Protein typically held rigid
• Configurational entropy neglected
• Influence of water treated crudely
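Proteins and Ligands are Flexible. Protein + Ligand ⇌ Complex, with standard binding free energy ΔG°
Binding Energy and Entropy. [Figure: unbound states (energy E_Free) and bound states (energy E_Bound); the binding free energy is split into an energy part and an entropy part]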
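One way to see where an "energy part" and an "entropy part" come from is the standard statistical-mechanics identity F = ⟨E⟩ − TS for a set of discrete states. The sketch below checks that identity numerically on made-up toy states; it ignores standard-state concentration and solvent, so it is an illustration of the decomposition only, not a binding free energy calculation.

```python
import math

K_T = 0.593  # approximate k_B*T in kcal/mol near 298 K


def free_energy(energies):
    """F = -kT ln sum_i exp(-E_i/kT) over a set of discrete states."""
    e_min = min(energies)
    z = sum(math.exp(-(e - e_min) / K_T) for e in energies)
    return e_min - K_T * math.log(z)


def energy_and_entropy_parts(energies):
    """Return (<E>, -T*S), which add up to the free energy F."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / K_T) for e in energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    # S = -k * sum p ln p, so -T*S = kT * sum p ln p (never positive).
    minus_ts = K_T * sum(p * math.log(p) for p in probs if p > 0.0)
    return mean_energy, minus_ts


# Hypothetical toy states (kcal/mol): many equivalent unbound states,
# and only a few low-energy bound states.
unbound = [0.0] * 100
bound = [-5.0, -4.5, -4.0]

for name, states in (("unbound", unbound), ("bound", bound)):
    e_part, s_part = energy_and_entropy_parts(states)
    print(f"{name}: F = {free_energy(states):.2f}  <E> = {e_part:.2f}  -TS = {s_part:.2f}")
```

In this toy case the bound states win on energy but lose on entropy, because far fewer states are accessible once the ligand is bound.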
Structure-Based Discovery: Physics-oriented approaches
• Weaknesses
  • Fully physical detail becomes computationally intractable
  • Approximations are unavoidable
  • Parameterization still required
• Strengths
  • Interpretable, provides guides to design
  • Broadly applicable, in principle at least
  • Clear pathways to improving accuracy
• Status
  • Useful, far from perfect
  • Multiple groups working on fewer, better approximations
    • Force fields, quantum
    • Flexibility, entropy
    • Water effects
  • Moore’s law: hardware improving
Knowledge-Based Docking Potentials. [Figure labels: histidine, ligand carboxylate, aromatic stacking]
Probability and Energy
Boltzmann: $p(r) \propto e^{-E(r)/k_B T}$
Inverse Boltzmann: $E(r) = -k_B T \ln p(r) + \text{const}$
Example: ligand carboxylate O to protein histidine N
1. Find all protein-ligand structures in the PDB with a ligand carboxylate O
2. For each structure, histogram the distances from O to every histidine N
3. Sum the histograms over all structures to obtain $p(r_{\mathrm{O-N}})$
4. Compute $E(r_{\mathrm{O-N}})$ from $p(r_{\mathrm{O-N}})$
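The histogram-and-invert procedure can be sketched in a few lines. This is a simplified illustration, assuming the O-to-N distances have already been extracted from each structure; the distance values are invented, and published potentials typically also divide by a reference-state distribution, which is omitted here.

```python
import math

K_T = 0.593  # approximate k_B*T in kcal/mol near 298 K


def histogram(distances, r_max=10.0, n_bins=10):
    """Count distances (Angstroms) into equal-width bins on [0, r_max)."""
    counts = [0] * n_bins
    width = r_max / n_bins
    for r in distances:
        if 0.0 <= r < r_max:
            counts[min(int(r / width), n_bins - 1)] += 1
    return counts


def inverse_boltzmann(per_structure_distances, r_max=10.0, n_bins=10):
    """Sum histograms over structures, then convert p(r) to E(r) = -kT ln p(r)."""
    total = [0] * n_bins
    for distances in per_structure_distances:
        for i, c in enumerate(histogram(distances, r_max, n_bins)):
            total[i] += c
    n = sum(total)
    # Bins that were never observed get infinite energy in this simple sketch.
    return [math.inf if c == 0 else -K_T * math.log(c / n) for c in total]


# Hypothetical O-to-N distances (Angstroms) from two structures.
data = [[2.8, 2.9, 3.1], [2.7, 6.5]]
energies = inverse_boltzmann(data)
for i, e in enumerate(energies):
    print(f"{i:2d}-{i + 1:2d} A: E = {e:.2f} kcal/mol")
```

Distances that occur often in known structures come out with low energy, so a docking pose that reproduces them is scored favorably.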
Knowledge-Based Docking Potentials: “PMF”, Muegge & Martin, J. Med. Chem. 42:791, 1999. [Figure: potentials for a few types of atom pairs, out of several hundred total; x-axis: atom-atom distance (Angstroms)]
Structure-Based Discovery: Knowledge-based potentials
• Weaknesses
  • Accuracy limited by availability of data
  • Accuracy may also be limited by overall approach
• Strengths
  • Relatively easy to implement
  • Computationally fast
• Status
  • Useful, far from perfect
  • May be at point of diminishing returns
Limitations of Knowledge-Based Potentials
1. Statistical limitations (e.g., to pairwise potentials)
2. Even if we had infinite statistics, would the results be accurate? (Is inverse Boltzmann quite right? Where is entropy?)
[Figure: 10 bins (r1 … r10) for a histogram of O-N distances; 100 bins for a joint histogram of O-N and O-C distances]
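The statistical limitation is a matter of simple arithmetic: with 10 bins per distance, the number of bins grows as a power of the number of distances treated jointly, so the same data are spread ever more thinly. A small sketch, with a hypothetical number of observations:

```python
def bins_needed(bins_per_distance, n_distances):
    """Number of histogram bins for a joint histogram over n distances."""
    return bins_per_distance ** n_distances


def mean_counts_per_bin(n_observations, bins_per_distance, n_distances):
    """Average number of observations that land in each bin."""
    return n_observations / bins_needed(bins_per_distance, n_distances)


N_OBSERVATIONS = 10_000  # hypothetical number of observed contacts
for n in (1, 2, 3, 4):
    print(f"{n} distance(s): {bins_needed(10, n):6d} bins, "
          f"{mean_counts_per_bin(N_OBSERVATIONS, 10, n):8.1f} observations per bin")
```

This is one reason such potentials are usually restricted to pairwise terms.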
2. We Lack the Structure of the Targeted Protein: Ligand-Based Discovery, e.g., MAP kinase inhibitors. Using knowledge of existing inhibitors to discover more.
Scenarios for Ligand-Based Discovery
• Experimental screening generated some ligands, but they don’t bind tightly
• A company wants to work around another company’s chemical patents
• An otherwise promising compound is toxic, is not well-absorbed, etc.
Ligand-Based Virtual Screening: Compound library + Known ligands → Molecular similarity, machine learning, etc. → Candidate ligands → Assay → Actives → Optimization (med chem, crystallography, modeling) → Potent drug candidates
Sources of Data on Known Ligands: Journals, e.g., J. Med. Chem.
Some Binding and Chemical Activity Databases: PubChem (NIH), pubchem.ncbi.nlm.nih.gov; ChEMBL (EMBL), www.ebi.ac.uk/chembl; BindingDB (UCSD), www.bindingdb.org
BindingDB www.bindingdb.org
Finding Protein-Ligand Data in BindingDB: e.g., by name of protein “target”; e.g., by ligand. [Screenshot labels: Draw, Search]
Sample Query Results: Download data in machine-readable format
Machine-Readable Chemical Format: Structure-Data File (SDF). PDB format lacks chemical bonding; SDF format defines chemical bonds.
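To show what "defines chemical bonds" means in practice, here is a minimal sketch that reads the atom and bond blocks of a single V2000 record using the format's fixed column positions. The embedded record (methanol heavy atoms with rough coordinates) is a hand-written illustration, and the parser ignores charges, properties, and data fields, so it is not a substitute for a real cheminformatics toolkit.

```python
# A hand-written V2000 record: 3 header lines, counts line, atoms, bonds.
MOLBLOCK = "\n".join([
    "methanol (heavy atoms only)",
    "  hypothetical-example",
    "",
    "  2  1  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.4300    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "M  END",
    "$$$$",
])


def parse_molblock(text):
    """Return (atoms, bonds) from one V2000 record.

    atoms: list of (element, (x, y, z)); bonds: list of (atom_i, atom_j, order),
    with 1-based atom indices as written in the file.
    """
    lines = text.split("\n")
    counts = lines[3]
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        xyz = (float(line[0:10]), float(line[10:20]), float(line[20:30]))
        atoms.append((line[31:34].strip(), xyz))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


atoms, bonds = parse_molblock(MOLBLOCK)
print(atoms)
print(bonds)
```

The explicit bond block is exactly the information a PDB file leaves out for ligands, which is why SDF is the more convenient format for chemical similarity work.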
There are Many Other Chemical File Formats: Interconvert with Babel
Chemical Similarity (Ligand-Based Drug Discovery): Compounds (available/synthesizable) → Compare with known ligands → Similar: test experimentally; Different: don’t bother
Chemical Fingerprints: Binary Structure Keys. Keys: carboxylate, …, aldehyde, naphthyl, S-S bond, chlorine, fluorine, alcohol, methyl, ketone, phenyl, amide, ethyl. [Figure: key bit patterns for Molecule 1 and Molecule 2]
Chemical Similarity from Fingerprints: Tanimoto Similarity or Jaccard Index, T. For Molecule 1 and Molecule 2: intersection $N_I = 2$, union $N_U = 8$, so $T = N_I / N_U = 2/8 = 0.25$
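The Tanimoto calculation is easy to reproduce with sets, where each set holds the structure keys that are "on" for a molecule. The key assignments below are hypothetical, chosen only so that the intersection has 2 keys and the union has 8, matching the counts on the slide.

```python
def tanimoto(keys_a, keys_b):
    """Tanimoto (Jaccard) similarity: shared keys divided by total distinct keys."""
    union = keys_a | keys_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(keys_a & keys_b) / len(union)


# Hypothetical key assignments (not the molecules shown on the slide).
molecule_1 = {"carboxylate", "methyl", "phenyl", "amide", "chlorine"}
molecule_2 = {"methyl", "phenyl", "alcohol", "ketone", "ethyl"}

print(len(molecule_1 & molecule_2))      # 2
print(len(molecule_1 | molecule_2))      # 8
print(tanimoto(molecule_1, molecule_2))  # 0.25
```

T ranges from 0 (no keys in common) to 1 (identical fingerprints), so a threshold on T gives a simple filter for "similar enough to test".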
Hashed Chemical Fingerprints: based upon paths in the chemical graph
1-atom paths: C, F, N, H, S, O
2-atom paths: F-C, C-C, C-N, C-S, S-O, C-H
3-atom paths: F-C-C, C-C-N, C-N-H, C-S-O
Each path sets a pseudo-random bit pattern in a very long molecular fingerprint. [Figure labels: C, S-O, etc.]
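A minimal sketch of the path-hashing idea follows. It enumerates linear paths of up to three atoms in a small molecular graph and lets each path set a couple of bits chosen by a hash. The fragment, the fingerprint length, the number of bits per path, and the use of SHA-256 are all illustrative assumptions; real toolkits differ in these details and also encode bond orders, which this sketch ignores.

```python
import hashlib


def enumerate_paths(atoms, bonds, max_atoms=3):
    """Return the set of linear atom paths, each written in a canonical direction."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    paths = set()

    def extend(path):
        labels = [atoms[i] for i in path]
        # A path and its reverse are the same path; keep one spelling.
        paths.add(min("-".join(labels), "-".join(reversed(labels))))
        if len(path) < max_atoms:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return paths


def hashed_fingerprint(atoms, bonds, n_bits=1024, bits_per_path=2, max_atoms=3):
    """Return the set of bit positions switched on by the molecule's paths."""
    bits = set()
    for path in enumerate_paths(atoms, bonds, max_atoms):
        digest = hashlib.sha256(path.encode()).digest()
        for k in range(bits_per_path):
            chunk = digest[4 * k:4 * k + 4]
            bits.add(int.from_bytes(chunk, "big") % n_bits)
    return bits


# Hypothetical fragment F-C-C-N (hydrogens omitted), atoms indexed from 0.
atoms = ["F", "C", "C", "N"]
bonds = [(0, 1), (1, 2), (2, 3)]
print(sorted(enumerate_paths(atoms, bonds)))
print(sorted(hashed_fingerprint(atoms, bonds)))
```

Because the canonical spelling picks the alphabetically smaller direction, the slide's "F-C" appears here as "C-F". Different paths can collide on the same bit, which is the price paid for not needing a predefined list of structure keys.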
Maximum Common Substructure. [Figure: two molecules with their maximum common substructure highlighted, N_common = 34]
Potential Drawbacks of Plain Chemical Similarity: May miss good ligands by being overly conservative. Too much weight on irrelevant details.
Scaffold Hopping: Identification of synthetic statins by scaffold hopping. Zhao, Drug Discovery Today 12:149, 2007
Abstraction and Identification of Relevant Compound Features: Ligand shape; Pharmacophore models; Chemical descriptors; Statistics and machine learning
Pharmacophore Models: Φάρμακο (drug) + Φορά (carry). A 3-point pharmacophore. [Figure labels: bulky hydrophobe; + 1; aromatic; inter-feature distances 3.2 ± 0.4 Å, 5.0 ± 0.3 Å, 2.8 ± 0.3 Å]
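A 3-point pharmacophore can be checked against a molecule by testing whether some choice of feature points satisfies all three distance ranges. In the sketch below, the pairing of each distance with a pair of features is an assumption (the flattened slide text does not say which distance belongs to which pair), the "+ 1" label is read as a positively charged group, and the coordinates are invented.

```python
import math
from itertools import product

# Assumed pairing of features to distances: (target distance, tolerance) in Angstroms.
PHARMACOPHORE = {
    ("hydrophobe", "cation"): (3.2, 0.4),
    ("hydrophobe", "aromatic"): (5.0, 0.3),
    ("cation", "aromatic"): (2.8, 0.3),
}
FEATURE_TYPES = ["hydrophobe", "cation", "aromatic"]


def matches(features):
    """Return True if any combination of feature points fits every distance range.

    features maps a feature type to a list of (x, y, z) points in one conformation.
    """
    candidates = (features.get(t, []) for t in FEATURE_TYPES)
    for combo in product(*candidates):
        placed = dict(zip(FEATURE_TYPES, combo))
        if all(abs(math.dist(placed[a], placed[b]) - d) <= tol
               for (a, b), (d, tol) in PHARMACOPHORE.items()):
            return True
    return False


# Hypothetical feature coordinates for two molecules.
fits = {
    "hydrophobe": [(0.0, 0.0, 0.0)],
    "cation": [(3.2, 0.0, 0.0)],
    "aromatic": [(4.28, 2.58, 0.0)],
}
does_not_fit = {
    "hydrophobe": [(0.0, 0.0, 0.0)],
    "cation": [(3.2, 0.0, 0.0)],
    "aromatic": [(8.0, 0.0, 0.0)],
}
print(matches(fits), matches(does_not_fit))
```

Since the test depends only on feature types and geometry, two molecules with quite different scaffolds can match the same pharmacophore, which is what makes the approach less conservative than plain fingerprint similarity.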