A Cross Platform Protocol for the Focusing, Construction & Neighborhood Analysis of a Diverse High-Throughput Screening Library for Drug Discovery. Shahul H. Nilar & Nanhua Yao Structural Biology / Computer-Aided Drug Design, Ribapharm Inc., 3300 Hyland Avenue, Costa Mesa, CA 92626.
OUTLINE:
• Brief Summary of the Drug Discovery Process
• Criteria for Library Construction
  • “Drug-Like” Properties
  • Chemical Diversity
• Algorithmic & Computational Requirements
  • Fingerprinting and Clustering
  • Hardware & Software
• Neighborhood Analysis of Library
  • Structural & Pharmacophoric Fingerprints / Similarity
  • Construction of Structure-Activity Relationships
Strategy for Drug Discovery
Stages: Target Identification / Validation → Lead Identification / Validation → Lead Optimization → Development / Clinical Trials
Activities: Biology / Bioinformatics; Target & Assays; High Throughput Screening (HTS); Synthesis (Compounds & Compound Libraries); ChemInformatics; Chemistry, Structure-Activity Relationships; Lead optimization; Absorption, Distribution, Metabolism & Excretion; Toxicology; Biology / Animal study; Clinical trials
Proteomics: accelerate this process
SELECTION OF COMPOUNDS USING CHEMICAL DIVERSITY
WHY BOTHER? FROM ALL THE MOLECULES, CHOOSE A DIVERSE SUBSET OF MOLECULES: A REPRESENTATIVE SET OF ALL THE MOLECULES FOR HIGH THROUGHPUT SCREENING AGAINST TARGETS OF INTEREST
Strategy for CADD
• Lead discovery libraries: high chemical diversity
  • exploiting real and virtual compound databases
• Lead optimization library: high similarity
  • exploiting high throughput combinatorial chemistry
• Drug-like character more important than synthetic accessibility
  • importance of medicinal chemistry and the application of known filters “ab-initio”
CRITERIA FOR THE DEFINITION OF CHEMICAL DIVERSITY REQUIRED
BENZENE vs. CYCLOHEXANE
• HYDROCARBON NATURE: NO DIVERSITY
• AROMATICITY: DIVERSITY
The Lipinski “Rule of Five” (1)
• Molecular Weight <= 500
• # Hydrogen Bond Acceptors <= 10
• # Hydrogen Bond Donors <= 5
• -2 < CLogP < 5 (or MLogP <= 4.15)
• # Rotatable Bonds <= 5 - Molecular Flexibility
Pharmacophoric Properties - “DRUG-LIKE BEHAVIOUR”
1: C. Lipinski et al., Adv. Drug Deliv. Rev. 23, 3-25 (1997)
Favorable Drug Properties • High affinity and selectivity • Synthetic accessibility • No chemically reactive group • Oral bioavailability • Favorable pharmacokinetics • Metabolism • Elimination pathway • Lack of side effects • Lack of toxic effects
Outline of Library:
• Acquire compounds from commercial suppliers based on their “drug-like” and “chemically diverse” nature
• Design of novel drugs based upon the structures of proteins that play key roles in human disease, and leads discovered from High Throughput Screens
• Nucleoside analog library serves as a solid foundation for this development - inherited from ICN Pharmaceuticals, Inc.
• Structure & Pharmacophore based approach.
FOCUSING OF LIBRARY 1. Drug-Like 2. Structural Diversity Chemical Space • Supplier A: purple • Supplier B: yellow • Supplier C: green • Supplier *: red
Focus: Extract a Diverse Set of Compounds for Screening from Suppliers’ Collections
Considerations: Purity, Availability, Cost, Overlap with other Suppliers, Periodic Updates, Reliability, Age of compounds (Screening Novelty), Stability issues
Supplier A, Supplier B, … Supplier * → Protocol P → Diverse set of compounds for high throughput screening
P-1: Create a Master Database of ALL Compounds from ALL Suppliers
• Multiple entries for each compound
• Each entry / compound has multiple suppliers
Supplier DB (in addition to compound DB): issues of availability, purity, quantity, reliability, previous experiences
Ribapharm Inc.: ~5.3 million compounds, ~4.0 million unique
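One way to picture the master database is a mapping from each unique structure to every supplier that lists it. The sketch below is only illustrative: the function name `build_master`, the use of a SMILES string as the structure key, and the catalog IDs are all assumptions, and a real pipeline would need a canonical structure representation before keys can be compared.

```python
from collections import defaultdict


def build_master(catalogs):
    """Merge supplier catalogs into {structure_key: {supplier: catalog_id}}.

    `catalogs` maps a supplier name to an iterable of
    (structure_key, catalog_id) pairs. The structure key is assumed to be
    canonical already, so identical compounds share the same key.
    """
    master = defaultdict(dict)
    for supplier, entries in catalogs.items():
        for structure_key, catalog_id in entries:
            master[structure_key][supplier] = catalog_id
    return dict(master)


# Hypothetical toy catalogs: benzene is offered by both suppliers.
catalogs = {
    "Supplier A": [("CC(=O)Oc1ccccc1C(=O)O", "A-001"), ("c1ccccc1", "A-002")],
    "Supplier B": [("c1ccccc1", "B-117")],
}
master = build_master(catalogs)
total_entries = sum(len(suppliers) for suppliers in master.values())
unique_compounds = len(master)
```

Here `total_entries` counts every supplier listing while `unique_compounds` counts distinct structures, which mirrors the distinction between the total and unique compound counts on the slide.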
P-2: Filtering of Compound Database
• Lipinski’s rules (1) based filter
• Ribapharm - nucleoside experience / flexibility
• Ribapharm - DRUG-LIKE CANDIDATES:
  Molecular weight <= 550
  # Rotatable bonds <= 7
  # Hydrogen bond acceptors <= 10
  # Hydrogen bond donors <= 5
  -2 < CLogP < 5
1: C. Lipinski et al., Adv. Drug Deliv. Rev. 23, 3-25 (1997)
Supplier A: 44.1%; Supplier C: 52.1%
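The drug-like criteria listed for P-2 amount to a conjunction of range checks on precomputed descriptors. A minimal sketch follows, assuming the descriptors have already been calculated by some cheminformatics package; the `Descriptors` class, the function name, and the example values are hypothetical and only illustrate the thresholds quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one compound (source of values is assumed)."""
    mol_weight: float
    rotatable_bonds: int
    hbond_acceptors: int
    hbond_donors: int
    clogp: float


def is_drug_like(d: Descriptors) -> bool:
    """Apply the P-2 thresholds: MW <= 550, rotatable bonds <= 7,
    acceptors <= 10, donors <= 5, and -2 < CLogP < 5."""
    return (
        d.mol_weight <= 550
        and d.rotatable_bonds <= 7
        and d.hbond_acceptors <= 10
        and d.hbond_donors <= 5
        and -2 < d.clogp < 5
    )


# Illustrative descriptor values only, not measured data.
small_molecule = Descriptors(180.2, 3, 4, 1, 1.2)
too_heavy = Descriptors(612.7, 6, 8, 3, 3.9)
too_flexible = Descriptors(430.5, 9, 6, 2, 2.5)
```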
P-3: Halogen Filter
• Ribapharm Concern
• Define: heavy halogens [ Cl, Br, I ]
• Retain compounds which have a maximum of 2 heavy halogens and no fluorines
• Retain compounds which have 1 heavy halogen and up to 3 fluorine atoms
• 3 F atoms = 1 heavy halogen (-CF3 groups)
Supplier A: 43.7%; Supplier C: 51.0%
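The halogen rules can be folded into a single budget if every started group of three fluorines is counted as one heavy halogen. The sketch below takes that reading, which is an assumption: it reproduces the two retention rules stated on the slide, but its treatment of cases the slide does not mention (for example, a compound with fluorines and no heavy halogens) is an extrapolation. The function name and the element-count input format are hypothetical.

```python
import math

HEAVY_HALOGENS = ("Cl", "Br", "I")


def passes_halogen_filter(element_counts):
    """Return True if the compound stays within a budget of 2 heavy-halogen
    equivalents, counting each started group of 3 F atoms as 1 equivalent.

    `element_counts` maps element symbols to atom counts, e.g. {"Cl": 1, "F": 3}.
    """
    heavy = sum(element_counts.get(symbol, 0) for symbol in HEAVY_HALOGENS)
    fluorine_equivalents = math.ceil(element_counts.get("F", 0) / 3)
    return heavy + fluorine_equivalents <= 2
```

Under this reading, two chlorines with no fluorine pass, one chlorine plus a -CF3 group passes, and two chlorines plus a single fluorine fail.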
P-4: Modified Reactive Group Filter (2)
Supplier A: 42.4%; Supplier C: 48.8%
2: Tudor I. Oprea, J. Comput.-Aided Mol. Des. 14, 251-264 (2000)
P-5: Structural Motif Filter (3)
HTS Experience: Identification of “Promiscuous” Structural Motifs
• Compounds with 3 or more NO2 groups - Supplier A: 42.3%; Supplier C: 48.8%
• Compounds with quinone groups - Supplier A: 42.2%; Supplier C: 48.7%
• Intercalation compounds (fused aromatic carbocyclic systems) - Supplier A: 42.2%; Supplier C: 48.5%
3: Olivier Roche et al., J. Med. Chem. 45, 137-142 (2002)
P-6: Clustering of Filtered Compounds
• Remove duplicate entries
• Fingerprint entries using MOLECULAR FINGERPRINTS
• Similarity criterion 85%, Tanimoto matrix
• Cluster compounds using the Jarvis-Patrick algorithm (4)
• Select the first member of each cluster
Current diverse collection of compounds: ~130,000
4: R. A. Jarvis & E. A. Patrick, IEEE Trans. Comput. C-22, 1025-1034 (1973)
CRITERION FOR CHEMICAL DIVERSITY - MOLECULAR FINGERPRINTS
STRUCTURAL FINGERPRINTS / KEYS: EACH KEY IS A NUMBER, REPRESENTING A STRUCTURAL FEATURE
Aspirin:
• 154: O in C=O
• 160: CH3 group
• 162: aromatic ring
• . . .
EACH STRUCTURE IS REDUCED TO A STRING OF NUMBERS - LINEAR ARRAY
MOLECULAR FINGERPRINTS - STRUCTURAL KEYS
• MACCS FINGERPRINTS - 166 KEYS, $K = \{ i \mid i = 1, 2, \ldots, 166 \}$
• EACH STRUCTURE IS REPRESENTED AS A POINT IN 166-DIMENSIONAL SPACE
• MORE DIVERSE STRUCTURES HAVE A LARGER “INTER-DISTANCE” IN THIS REPRESENTATION
• MOLECULAR SIMILARITY - COMPARISON OF STRUCTURES - TANIMOTO CRITERION
• TANIMOTO, for molecules A and B with key sets $K_A, K_B \subseteq K$:
$$T(A, B) = \frac{|K_A \cap K_B|}{|K_A \cup K_B|}$$
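The Tanimoto criterion above is the ratio of shared keys to all keys present in either molecule. A minimal sketch, assuming each fingerprint is given as the set of key numbers that are switched on; the function name and the convention of returning 0.0 for two empty fingerprints are assumptions, and the key sets in the example are illustrative rather than real MACCS output.

```python
def tanimoto(keys_a, keys_b):
    """Tanimoto coefficient: |intersection| / |union| of two key sets."""
    a, b = set(keys_a), set(keys_b)
    union = a | b
    if not union:
        # Convention chosen here: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / len(union)


# Illustrative key sets drawn from the 1..166 range.
molecule_a = {154, 160, 162}
molecule_b = {154, 162, 165}
similarity = tanimoto(molecule_a, molecule_b)  # 2 shared keys / 4 keys in total
```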
APPLICATION OF JARVIS-PATRICK ALGORITHM
• Calculate Similarity Matrix - S, where S[i,j] contains the similarity metric between fingerprints i and j
• Similarity Thresholding - From S, create a binary matrix B such that B[i,j] has the value 1 if S[i,j] >= C, or 0 otherwise. C is the similarity threshold used to determine if two fingerprints are similar - 85%
• Clustering - Each row of B is treated as the binary fingerprint of a molecule. Two molecules belong to the same cluster if the Tanimoto coefficient of their corresponding rows in B is greater than or equal to C.
The Jarvis-Patrick procedure is O(n^2), where n is the number of molecules
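The three steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation used in the protocol: fingerprints are sets of key numbers, each row of B is stored as the set of column indices holding a 1, and molecules are grouped by linking every pair whose rows pass the threshold (a union-find is used for the grouping, which the slide does not specify). The names `cluster`, `similarity_threshold`, and `overlap_threshold` are hypothetical; both thresholds default to the single value C = 0.85 from the slide.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(fingerprints, similarity_threshold=0.85, overlap_threshold=0.85):
    """Group molecule indices following the thresholded-matrix scheme."""
    n = len(fingerprints)
    # Steps 1 and 2: row i of B as the set of j with S[i,j] >= threshold.
    rows = [
        {j for j in range(n)
         if tanimoto(fingerprints[i], fingerprints[j]) >= similarity_threshold}
        for i in range(n)
    ]
    # Step 3: link molecules whose rows of B are themselves similar.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(rows[i], rows[j]) >= overlap_threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy fingerprints: 0 and 1 are near-identical, 2 and 3 are identical, 4 is alone.
fingerprints = [
    set(range(1, 11)),
    set(range(1, 12)),
    {50, 51, 52},
    {50, 51, 52},
    {90},
]
clusters = cluster(fingerprints)
representatives = [members[0] for members in clusters]  # first member of each cluster
```

Both nested loops visit every pair of molecules, which is where the quadratic cost comes from.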
Hardware Choices: Fast, Efficient • Pharmaceutical Computing Environment: • PC (Windows / Linux), Unix/SGI(Iris), Unix/HP, SUN (Solaris) Software Choices: Fast, Efficient, Cross-Platform • Molecular Operating Environment (MOE) • from Chemical Computing Group, Inc.
Timings ... HP: Hewlett-Packard; SGI®: Silicon Graphics Inc.
Distribution of Lipinski’s Observables of In-house Collection
Analytical Results of Weighted Number of Suppliers’ Compounds
• Supplier A: Acquisition based on in-house QC analysis of weighted sample. Future procurement based on LC-MS criteria.
• Supplier C: 50%-75% of collection have LC-MS data. Acquisition: only compounds with > 90% purity by LC-MS techniques
Quality Control of In-house Collection
• Biannual monitoring of decomposition of compounds: chemical stability, in-house storage techniques, quality of compounds by supplier
• Analytical chemistry techniques: LC-MS, NMR, HPLC…
• A root compound is chosen at random
• Diverse [ QC ] subset (MACCS Keys) of 960 compounds chosen and this set is evaluated periodically.
Management of Suppliers’ Updates
• Master database
  • Continuous addition of suppliers’ updates of compounds every 8 - 10 weeks
• Diverse (screening) database
  • Duplicate entries between current diverse set and updates removed
  • Entries filtered using protocol steps P-2 to P-5
  • Diverse compounds resulting from filters fingerprinted, clustered ( P-6 )
  • New entries added to diverse database
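The update cycle for the diverse database can be read as a small pipeline: drop what is already present, filter, cluster, then merge. The sketch below only shows that control flow. The function `apply_update` and its callable parameters are hypothetical stand-ins for the P-2 to P-5 filters and the P-6 clustering step, and the toy filter and selector in the example are placeholders, not the protocol's actual criteria.

```python
def apply_update(diverse, update, passes_filters, pick_representatives):
    """Return a new diverse database with qualifying update entries added.

    diverse, update: dicts mapping a compound key to its record.
    passes_filters(record) -> bool stands in for steps P-2 to P-5.
    pick_representatives(records) -> iterable of keys stands in for P-6.
    """
    # Remove entries that duplicate the current diverse set.
    new_entries = {k: r for k, r in update.items() if k not in diverse}
    # Apply the property and motif filters.
    kept = {k: r for k, r in new_entries.items() if passes_filters(r)}
    # Fingerprint and cluster, keeping one representative per cluster.
    chosen = pick_representatives(kept)
    merged = dict(diverse)
    merged.update({k: kept[k] for k in chosen})
    return merged


def toy_filter(record):
    """Placeholder filter: molecular weight only."""
    return record["mw"] <= 550


def toy_selector(records):
    """Placeholder for clustering: keep the first key in sorted order."""
    return sorted(records)[:1]


diverse = {"cpd-1": {"mw": 300}}
update = {
    "cpd-1": {"mw": 300},  # duplicate of an existing entry
    "cpd-2": {"mw": 420},
    "cpd-3": {"mw": 700},  # fails the placeholder filter
    "cpd-4": {"mw": 410},
}
merged = apply_update(diverse, update, toy_filter, toy_selector)
```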
Developing Strategies . . . . • Stability of compounds, reliability, purity, other experiences ….. Certain suppliers get preferential treatment • Emphasis on more recent collections • Stability of the supplier …… !!!!
Focussed Library for Screening, Drug Discovery
• Virology and Oncology based Therapeutic Targets
• Identification of Hit Candidates / Lead Scaffolds
• Lead Scaffolds for Structural Optimization towards the development of a Lead Candidate
• Structural & Pharmacophoric Analysis
• Structure-Activity Relationships
PHARMACOPHORIC ANALYSIS OF IN-HOUSE DATABASE
Strategy:
1. For each root compound
2. Similarity search using MACCS fingerprints & TGD fingerprints
3. Percentage similarity
4. For each root compound, retrieve structurally and pharmacophorically similar compounds from the IN-HOUSE DB
RESULTS OF NEIGHBOURHOOD ANALYSIS • USING MACCS FINGERPRINTS : ~ 75% SIMILARITY GIVES A REASONABLE NUMBER OF HITS • USING TGD FINGERPRINTS: ~85% SIMILARITY GIVES A REASONABLE NUMBER OF HITS
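A neighbourhood search of this kind reduces to scoring every database compound against the root and keeping those at or above the chosen similarity cutoff. A minimal sketch, assuming fingerprints are available as sets of key numbers; the function `neighbours`, the compound names, and the toy key sets are hypothetical, and the same routine would be run once per fingerprint type with its own cutoff (about 0.75 for MACCS and about 0.85 for TGD, per the results above).

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbours(root_keys, database, threshold):
    """Return (name, similarity) pairs with similarity >= threshold,
    sorted from most to least similar (ties broken by name)."""
    scored = ((name, tanimoto(root_keys, keys)) for name, keys in database.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Toy database with illustrative key sets.
database = {
    "cpd-1": {1, 2, 3, 4},
    "cpd-2": {1, 2, 3},
    "cpd-3": {1, 2, 9},
}
root = {1, 2, 3, 4}
maccs_hits = neighbours(root, database, threshold=0.75)
```

Raising the cutoff shrinks the neighbourhood, so the cutoff is effectively the knob that controls how many hits come back per root compound.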
Summary: • Presented a dynamic protocol - P used at Ribapharm, Inc. for the acquisition, maintenance and management of a Diverse set of compounds. • Algorithm & Computational Requirements. • Neighborhood Analysis & Structure-Activity Relationship Similarity Criteria.