MultiY Recursive Partitioning – Method and Applications. Robert Brown, Shashidhar Rao, Tom Stockfisch (Accelrys Inc); David Roush, Litai Zhang (FMC). UK-QSAR meeting, June 2002.


  1. MultiY Recursive Partitioning – Method and Applications • Robert Brown, Shashidhar Rao, Tom Stockfisch, Accelrys Inc • David Roush, Litai Zhang, FMC • UK-QSAR meeting – June 2002

  2. Outline • Introduction • PUMP-RP Methodology • Selectivity Study – COX2 inhibitors • HTS study – FMC • Summary

  3. Introduction • High-throughput chemistry and biology are creating a wealth of data that can lead to knowledge to expedite the drug-discovery process • Requirement for high-throughput methods to model HTS data for in-silico screening • HTS data are characterized by a huge number of observations, low hit rates and a lot of noise • Need high-speed methods for prediction • Recursive Partitioning (CART, FIRM), Linear Discriminant Analysis, Neural Nets, Binary QSAR, etc. • Would like to understand trends and selectivity across assays • Mine the HTS data matrix

  4. Standard CART RP • Input • Multiple descriptors (X) - continuous or categorical • Single screening result (Y) - categorical (e.g. yes/no) • Decision tree aims to separate different types of observation into different leaves of the tree • Two-step procedure: (1) overgrow, then (2) prune • (1) Decrease impurity during the growth phase • choose the split with the greatest drop in impurity • stepwise procedure with no look-ahead • examines only a small fraction of possible trees • over-grows the tree • (2) Decrease R during pruning • $R = R_0 + \alpha N_{\text{terminal}}$ • stepwise procedure finds the optimum R over all possible subtrees of the overgrown tree
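The growth and pruning criteria on this slide can be illustrated with a minimal sketch. It assumes Gini impurity as the impurity measure (the slide does not say which measure is used) and a single continuous descriptor; the function names `gini`, `best_split` and `cost_complexity` are hypothetical and are not taken from any CART implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(x, y):
    """Greedy, no-look-ahead search over one descriptor.

    Returns (threshold, impurity_drop) for the split 'x <= threshold'
    that gives the greatest drop in impurity, or (None, 0.0) if no
    split improves on the parent node.
    """
    n = len(y)
    parent = gini(y)
    best = (None, 0.0)
    values = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        drop = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        if drop > best[1]:
            best = (t, drop)
    return best


def cost_complexity(r0, alpha, n_terminal):
    """Pruning criterion R = R0 + alpha * N_terminal from the slide."""
    return r0 + alpha * n_terminal


# Toy data: inactives (I) at low descriptor values, actives (A) at high values.
threshold, drop = best_split([1, 2, 3, 4], ["I", "I", "A", "A"])
```

A full tree would apply `best_split` over every descriptor at every node; pruning would then compare `cost_complexity` across candidate subtrees for a chosen `alpha`.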

  5. Understanding Selectivity? [Figure: two separate decision trees, one for Target 1 and one for Target 2, with leaves labelled A and I] • Hard or impossible to compare trees to see what produces selectivity • Requires enough data to determine two separate trees

  6. PUMP-RP [Figure: a single tree with three kinds of split, labelled "Yk-generic splits", "Activity-type splits" and "Yk-specific splits"; leaves labelled I1, A1, I2, A2] • One tree combines both responses • Easy to see what makes a molecule selective • Easy to see what the targets have in common • Twice the activity data available to determine the generic portion • Use for specificity (e.g., Y1, Y2 different targets) • Use for multi-physical models (e.g., Y1 = activity, Y2 = toxicity) • Reference: Partially Unified Multiple Property Recursive Partitioning: A New Method for Predicting and Understanding Drug Selectivity, Thomas Stockfisch, in preparation for J. Chem. Inf. Comput. Sci.

  7. New algorithm [Figure: example trees in which the k-split (Yk = 1, "K-split wins") moves down from the root; schematic contrasting separate Y1 and Y2 models with a combined Yk model. Split thresholds in the figure are not recoverable from the transcript.] • Obtain a balance between a single general tree and a series of unrelated specific trees • Procedure • 1. Map data to a single Y variable • 2. Grow a pure specific tree - k node at level 1 • 3. Regrow a k-branch - save the k split and replace with a non-k split • 4. Recursively repeat step 3, moving the k-nodes "down" until arriving at the maximally generic tree • 5. Prune the generic tree - replace some generic branches with specifics • 6. Find the optimal tree to balance specificity and generality

     Data mapping shown on the slide (step 1):

     Multi-Y
     X1   X2   Y1  Y2
     0.1  0.2  I   I
     0.2  0.4  A   I
     0.3  0.6  A   unk

     Single-Y plus K column
     X1   X2   K   Y
     0.1  0.2  1   1,I
     0.1  0.2  2   2,I
     0.2  0.4  1   1,A
     0.2  0.4  2   2,I
     0.3  0.6  1   1,A
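Step 1 of the procedure (mapping multi-Y data to a single Y variable plus a K column) can be sketched as follows. The data are the three example rows from the slide; the function name `to_single_y` and the choice to skip unknown ("unk") responses are assumptions inferred from the slide's example table, not a description of the Cerius2 implementation.

```python
def to_single_y(rows, unknown="unk"):
    """Expand multi-Y rows into single-Y rows with a K column.

    rows: list of (descriptors, responses), where responses holds one
    label per assay/target (Y1, Y2, ...).
    Returns a list of (descriptors, k, combined_label) tuples, where
    combined_label is "k,label" as on the slide. Unknown responses are
    dropped, matching the slide's example.
    """
    out = []
    for descriptors, responses in rows:
        for k, label in enumerate(responses, start=1):
            if label == unknown:
                continue  # no measurement for this target: emit no row
            out.append((descriptors, k, f"{k},{label}"))
    return out


# The three molecules from the slide: (X1, X2) and (Y1, Y2).
multi_y = [
    ((0.1, 0.2), ("I", "I")),
    ((0.2, 0.4), ("A", "I")),
    ((0.3, 0.6), ("A", "unk")),
]

single_y = to_single_y(multi_y)
for descriptors, k, label in single_y:
    print(descriptors, k, label)
```

With K treated as one more categorical descriptor, a split on K separates the targets, which is what lets a single tree hold both generic and target-specific branches.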

  8. Outline • Introduction • PUMP-RP Methodology • Selectivity Study – COX2 inhibitors • HTS study – FMC • Summary

  9. Selectivity Study: COX-2 selectivity • Cyclooxygenase (COX) is a key enzyme in prostaglandin biosynthesis via the pathway of arachidonic acid breakdown. • Two isoforms, COX-1 (constitutive) and COX-2 (triggered by inflammatory insults), are known and characterized. • COX-2 inhibitors are anti-inflammatory agents with minimal GI side-effects. • Celebrex and Vioxx • Inhibition of COX-1 can lead to gastric damage, hemorrhage or ulceration • NSAIDs, e.g. ibuprofen, aspirin, etc. • Reference: Partially Unified Multiple Property Recursive Partitioning (PUMP-RP) Analyses of Cyclooxygenase (COX) Inhibitors, Shashidhar N. Rao & Thomas P. Stockfisch, in preparation for J. Chem. Inf. Comput. Sci.

  10. Study Input • 454 diaryl heterocycle cyclooxygenase (COX) inhibitors with phenyl sulfones and phenyl sulfonamides from published literature. • Inhibitory activities (IC50) against the COX-1 and COX-2 isoforms of the enzyme. • Divided into 2 classes for each target: • COX-1 - IC50 > 5 µM (Class 0), IC50 <= 5 µM (Class 1) • COX-2 - IC50 > 0.5 µM (Class 0), IC50 <= 0.5 µM (Class 1) • Divided into • Test set (TE) of 50 compounds: 17 COX-2 selective • Training set (TR) of 404 compounds: 181 COX-2 selective • External validation sets • 25 Merck cyclooxygenase inhibitors • represent a different class of chemistry from that covered by the training and test sets • 8 NSAIDs (aspirin, ketoprofen, naproxen, desmethylnaproxen, ibuprofen, indomethacin, phenytoin and diclofenac) • all active and non-selective
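The class assignment above amounts to thresholding each IC50 value. The sketch below assumes the thresholds are in micromolar (the unit prefix is lost in the transcript) and, as a further assumption that the slide does not state, treats a compound as COX-2 selective when it is Class 1 for COX-2 and Class 0 for COX-1. All names are hypothetical.

```python
# Thresholds from the slide; micromolar units are an assumption.
COX1_THRESHOLD_UM = 5.0
COX2_THRESHOLD_UM = 0.5


def activity_class(ic50_um, threshold_um):
    """Class 1 (active) if IC50 <= threshold, otherwise Class 0 (inactive)."""
    return 1 if ic50_um <= threshold_um else 0


def classify(ic50_cox1_um, ic50_cox2_um):
    """Return the (COX-1 class, COX-2 class) pair for one compound."""
    return (
        activity_class(ic50_cox1_um, COX1_THRESHOLD_UM),
        activity_class(ic50_cox2_um, COX2_THRESHOLD_UM),
    )


def is_cox2_selective(ic50_cox1_um, ic50_cox2_um):
    """ASSUMED definition: active on COX-2 and inactive on COX-1."""
    cox1, cox2 = classify(ic50_cox1_um, ic50_cox2_um)
    return cox1 == 0 and cox2 == 1


# Invented illustrative values, not data from the study.
example = classify(ic50_cox1_um=20.0, ic50_cox2_um=0.1)
```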

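  11. Example Tree [Figure: PUMP-RP tree. Legend: generic split, Yk = 1 split, specific split; TRUE/FALSE branches. Splits shown: HB Donor <= 1, Jurs-FNSA-3 <= -0.2, AlogP98 <= 3.1, ISIS_key59, JY <= 2.083. Leaves (counts): I1 (125), A2 (30), I2 (95), A2 (112), A1 (112), I2 (61), I1 (61), I2 (6), A1 (6), A2 (100), A1 (100). One region is annotated "COX-2 selective". The tree topology is not recoverable from the transcript.]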

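  12. Why not just calculate two trees? [Figure: two separate trees, "COX-2 Inhibition" and "COX-1 Inhibition". Splits shown across the two trees: FH2O <= -30.1, AlogP98 <= 2.6, Apol <= 14051.8, JX <= 2.01, ISIS Key #75, Dipole Mom. <= 5.87, JX <= 1.79, ISIS Key #94, Shdw-XZ fract <= 0.7, ISIS Key #66, Shdw-nu <= 2, AlogP98 <= 3.1. COX-2 leaves (counts): A2 (127), I2 (4), I2 (4), A2 (9), I2 (231), I2 (29). COX-1 leaves (counts): I1 (148), A1 (8), A1 (6), A1 (9), A1 (104), A1 (36), I1 (63), I1 (30). The assignment of splits to trees is not recoverable from the transcript.]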

  13. Prediction of Selectivity • Percentage of actives correctly predicted by RP trees compared to experiment • Enrichment in COX-2 selectives • 1.56 to 1.86 in the training set (TR) • 1.60 to 2.29 in the test set (TE) • Remember: 44% of TR is COX-2 selective (181 of 404), so the best possible enrichment in TR would be ~2.2
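The ceiling on enrichment quoted above follows from the base rate. The sketch below assumes the usual definition of enrichment (hit rate among the compounds a model selects, divided by the hit rate in the whole set); the slide does not spell out its definition, and the function names are hypothetical.

```python
def enrichment(n_selected_hits, n_selected, n_total_hits, n_total):
    """Hit rate in the selected subset divided by the overall hit rate."""
    return (n_selected_hits / n_selected) / (n_total_hits / n_total)


def max_enrichment(n_total_hits, n_total):
    """Best case: every selected compound is a hit, so the ratio is 1 / base rate."""
    return n_total / n_total_hits


# Set sizes from the Study Input slide.
tr_max = max_enrichment(n_total_hits=181, n_total=404)  # training set
te_max = max_enrichment(n_total_hits=17, n_total=50)    # test set
print(round(tr_max, 2), round(te_max, 2))
```

Under this definition the training-set ceiling is 404/181, which matches the "~2.2" on the slide, and the higher test-set ceiling is consistent with the test-set enrichments exceeding the training-set ones.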

  14. False positive and negative selectivity rates [Figure: SRfp and SRfn results; values not recoverable from the transcript]

  15. External Validation Sets • 25 Merck compounds – 21 actives including 13 COX-2 selective, 4 inactive • 21 correctly predicted COX-2 active, 8 correctly predicted COX-1 active • 8 correctly predicted COX-2 selective • Correctly predict that none are COX-1 selective • 8 NSAIDs: aspirin, ketoprofen, naproxen, desmethylnaproxen, ibuprofen, indomethacin, phenytoin and diclofenac • All predicted to be non-selective • Five of them (ketoprofen, naproxen, ibuprofen, indomethacin and diclofenac) are predicted to be active • Three, including aspirin, are predicted inactive • Aspirin is a weak inhibitor of both COX-1 and COX-2 (IC50 ~ 150-300 nM)

  16. Outline • Introduction • PUMP-RP Methodology • Selectivity Study – COX2 inhibitors • HTS study – FMC • Summary

  17. Assay Enrichment Study • A library of 66,000 FMC compounds was screened in two functional assays (I and II), each returning two classes of activity (0 and 1) • Assay I has two follow-up assays [I(1); I(2); I(3)] • 60, 33, 24 actives respectively • Assay II has one follow-up assay [II(1); II(2)] • 109, 12 actives respectively • X(1) is a primary assay, whilst X(2) and X(3) are related to specific mechanisms • Goal • Combine data from multiple assays for endpoint X to • Explain factors causing activity • Use maximum data to get the best predictive model

  18. Computational Protocol • The 66,000 compounds were divided in half for training and test sets with even distributions of actives/inactives for both assays • Six sets of descriptors • Bcuts (8) • Cerius2 Fast descriptors (199) • Jurs descriptors (30) • ISIS keys (166) • 3D Atom pairs (825) • CCG-2D (145) • Reference: Mining Large Databases Using Multiple Y Recursive Partitioning, David Roush, Litai Zhang, Thomas Stockfisch and Shashidhar Rao, in preparation for J. Chem. Inf. Comput. Sci.
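Dividing the library in half while keeping the active/inactive distribution even in both halves is a stratified split. The sketch below is one way to do it, assuming each compound carries a tuple of class labels (one per assay) and that stratifying on that tuple is acceptable; the slide does not describe how the split was actually made, and `stratified_half_split` is a hypothetical name.

```python
import random


def stratified_half_split(labels, seed=0):
    """Split compound indices into two halves, stratified by label.

    labels: one hashable, sortable label per compound, e.g. a tuple
    (assay I class, assay II class).
    Returns (train_indices, test_indices), both sorted.
    """
    rng = random.Random(seed)
    by_label = {}
    for index, label in enumerate(labels):
        by_label.setdefault(label, []).append(index)

    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        half = len(indices) // 2
        train.extend(indices[:half])  # first half of each stratum
        test.extend(indices[half:])   # remainder (gets the odd one out)
    return sorted(train), sorted(test)


# Invented toy library: 4 active in assay I only, 2 active in assay II only,
# 96 inactive in both. These counts are illustrative, not the FMC data.
toy_labels = [(1, 0)] * 4 + [(0, 1)] * 2 + [(0, 0)] * 96
train_idx, test_idx = stratified_half_split(toy_labels)
```

With rare actives (for example 12 in assay II(2)), an unstratified random split could leave one half with almost none, which is presumably why the protocol asks for even distributions.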

  19. Single Y vs Multi Y – Cerius2 Descriptors Test Set Results

  20. Single Y vs Multi Y – ISIS Keys Test Set Results

  21. Single Y vs Multiple Y • Multi Y produces better enrichments with better false positive rates • Single Y produces better false negative rates • => More information has produced a more selective screen • Logistically, only one experiment to run • Multi Y allows the factors/descriptors important to all assays to be identified

  22. PUMP-RP - Assays I(1)

  23. PUMP-RP - Assays I(2)

  24. PUMP-RP - Assays I(3)

  25. PUMP-RP - Assays I (all assays)

  26. Summary • The PUMP-RP procedure creates a tree with target-generic splits near the root and target-specific splits near the leaves, separated by splits on the activity type. • The generic splits benefit from being determined by a larger amount of data than if separate models were made • Easy to interpret which splits determine specificity and which show commonality of target • Prediction and understanding of COX-2 selective molecules • Large-scale experiments with FMC show the use of multiple assay data to enhance understanding of activity • Commercially released in Cerius2 4.6

  27. Forthcoming Publications • Methodology • Partially Unified Multiple Property Recursive Partitioning: A New Method for Predicting and Understanding Drug Selectivity, Thomas Stockfisch, in preparation for J. Chem. Inf. Comput. Sci. • COX Selectivity Study • Partially Unified Multiple Property Recursive Partitioning (PUMP-RP) Analyses of Cyclooxygenase (COX) Inhibitors, Shashidhar N. Rao & Thomas P. Stockfisch, in preparation for J. Chem. Inf. Comput. Sci. • FMC HTS Study • Mining Large Databases Using Multiple Y Recursive Partitioning, David Roush, Litai Zhang, Thomas Stockfisch and Shashidhar Rao, in preparation for J. Chem. Inf. Comput. Sci.
