Cheminformatics in Drug Discovery and Chemical Genomics Research

UKY Seminar Weifan Zheng, Ph.D. Cheminformatics in Drug Discovery and Chemical Genomics Research. Weifan Zheng, Ph.D. Associate Professor Department of Pharmaceutical Sciences BRITE Institute, NC Central University Adjunct Associate Professor Department of Medicinal Chemistry


Presentation Transcript


  1. UKY Seminar • Cheminformatics in Drug Discovery and Chemical Genomics Research • Weifan Zheng, Ph.D. • Associate Professor, Department of Pharmaceutical Sciences, BRITE Institute, NC Central University • Adjunct Associate Professor, Department of Medicinal Chemistry, University of North Carolina at Chapel Hill

  2. Topics to Be Covered • Biotech/Pharma • Orphan Disease • Chemical Genomics • Computational Needs • Compound Collection • Docking • Scoring • Data Analytics • CECCR Cheminformatics Center

  3. Drug Discovery & Development Pipeline

  4. Phases and Costs of Drug Discovery

  5. Drug Discovery Process and the Roles of CADD • GR: Genetic Research; DR: Discovery Research; DD: Drug Discovery • CADD: computer-assisted drug discovery • ADMET: absorption, distribution, metabolism, elimination, toxicity • Pipeline diagram labels: GR, DR, DD, Preclin, IND, Clinical trials (I, II, III); T2H, H2L, LO; T, H, L, C; CADD

  6. Human Genome Project Success • “Genome announcement 'technological triumph': Milestone in genetics ushers in new era of discovery, responsibility” (CNN, June 26, 2000)

  7. Chemogenomics/Chemical Genomics • F. Collins • Chris Austin

  8. Chemical Genomics • Google hit counts on Oct. 16, 2006 • Chemogenomics: 69,000 • Chemical genomics: 113,000 • Chemical biology: 4,210,000 • Chemical genetics: 104,000

  9. Chemical genetics is a research method that uses small molecules to change the way proteins work—directly in real time rather than indirectly by manipulating their genes. It is used to identify which proteins regulate different biological processes, to understand in molecular detail how proteins perform their biological functions, and to identify small molecules that may be of medical value.

  10. to create a national resource in chemical probe development. The center uses the latest industrial-scale technologies to collect data that is useful for defining the cross-section between chemical space and biological activity (and do so on genomic scale).

  11. NIH Molecular Library Initiative (MLI) • Chemical Synthesis Centers: CombiChem, parallel synthesis, DOS; 4 centers + DPI • MLSCN (9+1): 9 centers, 1 NIH intramural; 20 x 10 = 200 assays • ECCR (6): Exploratory Centers • PubChem (NLM) • 100K – 1M compounds • SAR matrix: compounds x 200 assays

  12. Biological Assay Data • Biochemical assays • Cell-based functional assays • Phenotypic assays • Databases • PubChem (http://pubchem.ncbi.nlm.nih.gov/) • ChemBank (http://chembank.broad.harvard.edu/) • WOMBAT (http://sunsetmolecular.com/index.php) • Jubilant (http://www.jubilantbiosys.com/) • Gvk/Bio (http://www.gvkbio.com/)

  13. High Throughput Chemistry and Screening: Informatics • Diagram labels: Rules; Virtual Libraries; Diverse Lib Design; Targeted Lib Design; Drug Discovery; Chemical Genomics; KDD (QSAR, P.R.); Combinatorial Synthesis; Scientific Logistics; SAR Data; Real Libraries; HTS

  14. Topics to Be Covered • Biotech/Pharma • Orphan Disease • Chemical Genomics • Computational Needs • Compound Collection • Docking • Scoring • Data Analytics • CECCR Cheminformatics Center

  15. Challenges in Combinatorial Chemistry • Three substitution sites: R1 (3000), R2 (3000), R3 (3000) • 3,000^3 compounds / 1,000 per week = ~0.5 million years! • Library Design: rational selection of a subset of building blocks to obtain a maximum amount of information
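
The estimate on slide 15 can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the 52-weeks-per-year conversion is an assumption, since the slide does not say how weeks were turned into years.

```python
# Three substitution sites (R1, R2, R3), each with 3,000 candidate building blocks.
blocks_per_site = 3_000
sites = 3
library_size = blocks_per_site ** sites        # full combinatorial library

# Throughput quoted on the slide: 1,000 compounds per week.
compounds_per_week = 1_000
weeks = library_size / compounds_per_week

# Assumption: 52 weeks per year.
years = weeks / 52

print(f"{library_size:,} compounds -> about {years:,.0f} years")
```

The full library holds 27 billion compounds, which is why the slide turns to selecting a subset of building blocks instead of exhaustive synthesis.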

  16. Design for Activity: Similarity • If we know a compound is active, and we want to design a set of compounds that may be active against the same target, we may select a set of compounds that are similar to the active compound • The similarity principle: similar compounds should have similar biological activity

  17. Molecular Identity and Molecular Similarity • Descriptor table (structures as rows, descriptors X1, X2, X3, ..., X20 as columns): Str. 1 = (2, 5, 1, ..., 4); Str. 2 = (4, 7, 9, ..., 7); Str. 3 = (1, 6, 8, ..., 6); ...; Str. 100 = (0, 3, 5, ..., 1) • Plot: structures 1, 2, and 3 shown as points on X1 and X2 axes
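
One way to read the X1/X2 plot on slide 17 is that each structure is a point in descriptor space and similarity is a distance between points. The sketch below uses the X1 and X2 values of the first three structures from the slide's table; the choice of Euclidean distance is an assumption, since the slide does not name a metric.

```python
import math

# X1 and X2 values for the first three structures in the slide's table.
structures = {
    "Str. 1": (2, 5),
    "Str. 2": (4, 7),
    "Str. 3": (1, 6),
}

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors (assumed metric)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def pairwise_distances(points):
    """Distance for every unordered pair of named structures."""
    names = sorted(points)
    return {
        (m, n): euclidean(points[m], points[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }

distances = pairwise_distances(structures)
most_similar = min(distances, key=distances.get)

for pair, d in distances.items():
    print(pair, round(d, 3))
print("most similar pair:", most_similar)
```

Under this metric, identical structures have distance zero, and smaller distances mean more similar structures.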

  18. Design for General Application: Diversity

  19. Similarity and Diversity • MaxiMin: maximize the minimum pairwise distance Dij • Minimize Σ 1/Dij^2 (sum over compound pairs i, j)
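
The two diversity criteria named on slide 19 can be computed for any candidate subset. The sketch below is illustrative only: the greedy selection is a common heuristic for the MaxiMin criterion and is not claimed to be the method used in the talk, and the 2D points are made-up data.

```python
import math
from itertools import combinations

def min_pairwise_distance(subset):
    """MaxiMin criterion: larger values mean a more diverse subset."""
    return min(math.dist(a, b) for a, b in combinations(subset, 2))

def inverse_square_sum(subset):
    """Sum of 1/Dij^2 over all pairs: smaller values mean a more diverse subset."""
    return sum(1.0 / math.dist(a, b) ** 2 for a, b in combinations(subset, 2))

def greedy_maximin(points, k):
    """Greedy heuristic (an assumption, not from the slides): start from the
    first point, then repeatedly add the point whose nearest already-selected
    neighbor is farthest away."""
    selected = [points[0]]
    while len(selected) < k:
        remaining = [p for p in points if p not in selected]
        best = max(remaining, key=lambda p: min(math.dist(p, s) for s in selected))
        selected.append(best)
    return selected

# Hypothetical compounds described by two descriptors each.
points = [(0, 0), (0.1, 0), (0, 0.1), (1, 0), (0, 1), (1, 1)]
diverse = greedy_maximin(points, 4)
clustered = points[:4]  # three near-duplicates plus one distant point

print(min_pairwise_distance(diverse), min_pairwise_distance(clustered))
print(inverse_square_sum(diverse), inverse_square_sum(clustered))
```

Both criteria agree here: the greedy subset has the larger minimum distance and the smaller inverse-square sum. The inverse-square sum penalizes every close pair, while MaxiMin looks only at the closest one.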

  20. Cluster Hits Obtained by SAGE and Random Sampling

  21. Drug Discovery & Development Failures • Chart values: 6%, 21%, 39%, 29% • Source: Venkatesh & Lipper, J. Pharm. Sci. 89, 145-154 (2000)

  22. Multi-Factorial Design

  23. Total Score is the Weighted Sum of Individual Terms

  24. Initial Library (R1 x R2) → Better Library (R1 x R2) → Optimal Library (R1 x R2) • Penalty Scores: P450 Activity, Lipinski Properties, Diversity • Iteration
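
The scoring idea on slides 23 and 24 (a total score built as a weighted sum of individual penalty terms, reduced over iterations) can be sketched as follows. The term names loosely follow the slide labels; the weights and penalty values are hypothetical, and treating a lower total as better is an assumption based on the word "penalty".

```python
def total_score(terms, weights):
    """Weighted sum of individual penalty terms (assumed: lower is better)."""
    return sum(weights[name] * value for name, value in terms.items())

# Hypothetical weights and penalty values, for illustration only.
weights = {"p450": 1.0, "lipinski": 2.0, "diversity": 0.5}
initial_library = {"p450": 0.6, "lipinski": 0.8, "diversity": 0.4}
better_library = {"p450": 0.3, "lipinski": 0.2, "diversity": 0.3}

initial_total = total_score(initial_library, weights)
better_total = total_score(better_library, weights)

print(f"initial: {initial_total:.2f}, better: {better_total:.2f}")
```

An iterative design loop would propose a change to the R1 and R2 selections, recompute the total, and keep the change when the total improves. The weights control how the individual terms trade off against each other.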

  25. Designed Library Has a Better MW-clogP Distribution • Panels: initial ten solutions (undesigned); the final ten solutions (well designed) • Axis label: clogP

  26. Molecular Identity and Molecular Similarity • Descriptor table (structures as rows, descriptors X1, X2, X3, ..., X20 as columns): Str. 1 = (2, 5, 1, ..., 4); Str. 2 = (4, 7, 9, ..., 7); Str. 3 = (1, 6, 8, ..., 6); ...; Str. 100 = (0, 3, 5, ..., 1) • Plot: structures 1, 2, and 3 shown as points on X1 and X2 axes

  27. SPE (Stochastic Proximity Embedding) Algorithm (Agrafiotis) • Iterative Random Sampling • D(a,b): distance between a and b in the original space; D’(a,b): distance in the embedding space (2D) • If D’ > D, move a, b closer • If D’ < D, move a, b apart
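
The update rule on slide 27 can be sketched in a few lines. This is a simplified illustration, not the published implementation: the linearly decaying step size, the number of steps, the small `eps` constant, and the omission of any neighborhood cutoff are all assumptions made for the example.

```python
import math
import random

def spe_update(coords, a, b, d_orig, lam, eps=1e-9):
    """Adjust points a and b in the embedding so that their distance moves
    toward d_orig: closer if they are too far apart, apart if too close."""
    d_emb = math.dist(coords[a], coords[b])
    scale = lam * 0.5 * (d_orig - d_emb) / (d_emb + eps)
    for k in range(len(coords[a])):
        delta = scale * (coords[a][k] - coords[b][k])
        coords[a][k] += delta
        coords[b][k] -= delta

def spe_embed(points, n_steps=20_000, seed=0):
    """Embed points in 2D by iterative random sampling of pairs."""
    rng = random.Random(seed)
    coords = [[rng.random(), rng.random()] for _ in points]
    for step in range(n_steps):
        lam = 1.0 - step / n_steps            # assumed linear decay of step size
        a, b = rng.sample(range(len(points)), 2)
        spe_update(coords, a, b, math.dist(points[a], points[b]), lam)
    return coords

# Hypothetical input: four points in a 3D descriptor space.
points = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
embedded = spe_embed(points)
print(embedded)
```

With `lam = 1.0`, a single update sets the sampled pair exactly to its original distance. Smaller values of `lam` later in the run damp the corrections so the layout can settle.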

  28. Chemical Space - Compound Collection Comparison

  29. Chemical Space - Compound Collection Comparison

  30. Chemical Space - Compound Collection Comparison

  31. SPE Embedding of ChemSpace

  32. Topics to Be Covered • Biotech/Pharma • Orphan Disease • Chemical Genomics • Computational Needs • Compound Collection • Docking • Scoring • Data Analytics • CECCR Cheminformatics Center

  33. Quantitative Structure-Activity Relationship (QSAR) • Two scatter plots of predicted vs. actual activity: q2 = 0.8, R2 = 0.75 • Methods: multiple linear regression (MLR); partial least squares (PLS); artificial neural nets; k-nearest neighbor (kNN)

  34. Basic Assumptions of KNN-QSAR Method • Structurally similar compounds should have similar biological activities • Biological similarities are often due to similarities of substructures (pharmacophore) • Biological activities can be estimated from molecular similarities, which are calculated with pharmacophore-specific descriptors
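
The kNN assumption above (activity estimated from the most similar compounds) and the cross-validated q2 statistic quoted earlier can be illustrated with a minimal sketch. This is a deliberately simplified version: it uses an unweighted mean of the k nearest neighbors and Euclidean distance, has no descriptor selection, and runs on made-up data, so it should not be read as the KNN-QSAR method from the talk.

```python
import math

def knn_predict(query, train_x, train_y, k=2):
    """Predict activity as the mean activity of the k nearest training compounds."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(query, train_x[i]))
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def loo_q2(x, y, k=2):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        rest_x = x[:i] + x[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        press += (y[i] - knn_predict(x[i], rest_x, rest_y, k)) ** 2
    ss = sum((value - mean_y) ** 2 for value in y)
    return 1.0 - press / ss

# Hypothetical one-descriptor data set: two clusters with distinct activity.
descriptors = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,), (1.2,)]
activities = [1.0, 1.1, 1.2, 3.0, 3.1, 3.2]

q2 = loo_q2(descriptors, activities, k=2)
print(f"q2 = {q2:.3f}")
```

A q2 close to 1 means each left-out compound is well predicted from its neighbors, which is what the similarity assumption implies for well-behaved data.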

  35. Comparison of CoMFA, GA-PLS, and KNN-QSAR

  36. QSAR Based Virtual Screening for GPCR Ligand Design

  37. Topics to Be Covered • Biotech/Pharma • Orphan Disease • Chemical Genomics • Computational Needs • Compound Collection • Docking • Scoring • Data Analytics • CECCR Cheminformatics Center

  38. Docking and Scoring • In the early 1980s, I. D. Kuntz developed the first computerized molecular docking program: DOCK • Other programs: GOLD, FRED, GLIDE, FLEXX, AutoDock, ICM • Figure: X-ray structure

  39. Our Approach to Derive DT-SCORE 1. Use Delaunay tessellation to derive geometrical chemical descriptors of the protein-ligand interface 2. Establish a correlation between the geometrical chemical descriptors and protein-ligand binding affinity using the Perceptron learning algorithm

  40. Flowchart to Derive DT-SCORE • Receptor-ligand complexes → Tessellation of receptor-ligand interface → Descriptor generation → Perceptron learning algorithm → Model generation & prediction → DT-SCORE • Additional diagram label: Binding constant

  41. Delaunay Tessellation in 2D • Delaunay tessellation gives a rigorous definition of nearest neighbors in 2D and 3D space • Nearest neighbors are unambiguously defined in sets of three (in 2D) and in sets of four (in 3D)
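
The 2D case on slide 41 can be illustrated with the empty-circumcircle property: three points form a Delaunay triangle when no other point lies inside the circle through them. The brute-force sketch below checks every triple, which is fine for a handful of points but far too slow for real structures. It assumes the points are in general position (no four on a common circle), and the coordinates are made up.

```python
from itertools import combinations

def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if a, b, c are collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2

def delaunay_triangles(points):
    """Index triples whose circumcircle contains no other input point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        others = (p for n, p in enumerate(points) if n not in (i, j, k))
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9 for px, py in others):
            triangles.append((i, j, k))
    return triangles

# Hypothetical atoms: a triangle with one extra point inside it.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0), (1.0, 0.5)]
triangles = delaunay_triangles(points)
print(triangles)
```

The outer triangle (0, 1, 2) is rejected because point 3 lies inside its circumcircle, so each remaining triangle groups three mutual nearest neighbors. In 3D the same test uses circumspheres and yields tetrahedra.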

  42. Delaunay Tessellation of the Receptor-Ligand Interface

  43. A Detailed View of Active Site Tessellation • Figure labels: R (five atoms), L (one atom) • An atom is shared by several tetrahedra

  44. 3 Types of Tetrahedra at the Receptor-Ligand Interface • RLLL: formed by 1 receptor atom and 3 ligand atoms • RRLL: formed by 2 receptor atoms and 2 ligand atoms • RRRL: formed by 3 receptor atoms and 1 ligand atom • Each of the above tetrahedron types is further discriminated by the atom types on the vertices
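
The classification on slide 44 can be sketched as a small counting routine. Several details here are assumptions for illustration: each vertex is given as an (origin, atom type) pair, the atom-type key lists receptor atoms first and sorts each group alphabetically, and the descriptor value is taken to be an occurrence count. The slides do not specify the ordering convention or how the values are computed.

```python
from collections import Counter

def classify_tetrahedron(vertices):
    """vertices: four (origin, atom_type) pairs, with origin 'R' or 'L'.
    Returns (tetrahedron type, atom-type key), or None when the tetrahedron
    is not at the interface (all receptor or all ligand atoms)."""
    n_receptor = sum(1 for origin, _ in vertices if origin == "R")
    if n_receptor in (0, 4):
        return None
    kind = "R" * n_receptor + "L" * (4 - n_receptor)
    # Hypothetical ordering convention: receptor atoms first, each group sorted.
    receptor_types = sorted(t for origin, t in vertices if origin == "R")
    ligand_types = sorted(t for origin, t in vertices if origin == "L")
    return kind, "".join(receptor_types + ligand_types)

def descriptor_counts(tetrahedra):
    """Count interface tetrahedra per (type, atom-type key)."""
    counts = Counter()
    for tetrahedron in tetrahedra:
        label = classify_tetrahedron(tetrahedron)
        if label is not None:
            counts[label] += 1
    return counts

# Hypothetical tetrahedra from a tessellated complex.
tetrahedra = [
    [("R", "N"), ("R", "O"), ("L", "C"), ("L", "S")],
    [("R", "O"), ("R", "N"), ("L", "S"), ("L", "C")],
    [("R", "C"), ("L", "N"), ("L", "O"), ("L", "O")],
    [("R", "C"), ("R", "C"), ("R", "N"), ("R", "O")],  # receptor only: ignored
]
counts = descriptor_counts(tetrahedra)
print(counts)
```

Each complex then becomes one row of counts, indexed by tetrahedron type and atom-type key, which can serve as input to a statistical model.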

  45. Geometrical Descriptors According to Tetrahedron Types • Column groups: RRLL, RRRL, RLLL • Example atom-type columns: …, NOCS, COSC, CNOO, NCNO, OSXN, ONOS, … • Example row of values: …, 4, 0, …, 2, 8, 5, 3

  46. “QSAR” Input Table (R·L Interaction Pattern – Binding Affinity Relationship Table)

  47. Single-Layer Perceptron Network • Input layer: neurons 1, 2, 3, ..., N with inputs x1, x2, x3, ..., xN and weights w1, w2, w3, ..., wN • Output layer: output y • xi = input of neuron i • wi = weight associated with the input xi • fn(.) = activation function of the output neuron
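
The network on slide 47 computes y = fn(w1·x1 + ... + wN·xN). The sketch below shows that forward pass together with a simple delta-rule training loop. The identity activation, the learning rate, the number of epochs, the absence of a bias term, and the toy data are all assumptions for illustration; the slides do not give the training details used for DT-SCORE.

```python
def perceptron_output(x, w, activation=lambda s: s):
    """y = fn(sum_i w_i * x_i); the identity activation is an assumption."""
    return activation(sum(wi * xi for wi, xi in zip(w, x)))

def train_perceptron(samples, targets, n_epochs=2000, rate=0.05):
    """Delta-rule training of a single linear output neuron."""
    w = [0.0] * len(samples[0])
    for _ in range(n_epochs):
        for x, t in zip(samples, targets):
            error = t - perceptron_output(x, w)
            # Move each weight in proportion to its input and the output error.
            w = [wi + rate * error * xi for wi, xi in zip(w, x)]
    return w

# Hypothetical data generated from y = 2*x1 - 1*x2.
samples = [(1, 0), (0, 1), (1, 1), (2, 1)]
targets = [2, -1, 1, 3]

learned = train_perceptron(samples, targets)
print([round(v, 3) for v in learned])
```

In the DT-SCORE setting, the inputs would be the tetrahedron-based descriptors of a complex and the target its measured binding affinity.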

  48. Training vs. Test Set Selection and Validation • Entire dataset (264 complexes) • Training set: 80% (214 complexes), used for model development (q2) • Test set: 20% (50 complexes), used for prediction of the test set (R2)
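
The split sizes on slide 48 can be reproduced with a short sketch. Random selection and the seed are assumptions; the slide does not say how the 50 test complexes were chosen, and the integer identifiers below are stand-ins for the real complexes.

```python
import random

def split_dataset(items, n_test, seed=0):
    """Shuffle a copy of the items and hold out n_test of them as the test set."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical identifiers for the 264 receptor-ligand complexes.
complex_ids = list(range(264))
training_set, test_set = split_dataset(complex_ids, n_test=50)

print(len(training_set), len(test_set))
```

The training set is used for model development and cross-validation (q2), and the held-out test set is used only to assess predictions (R2).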

  49. Model Stability • Average value from multiple (ca. 80) models
