
Compound Set Enrichment

Compound Set Enrichment. A novel approach to analysis of primary HTS data. Thibault Varin. Ansgar Schuffenhauer. Gubler, H., Parker, C., Zhang, JH., Raman, P., Ertl, P.


Presentation Transcript


  1. Compound Set Enrichment: A novel approach to analysis of primary HTS data. Thibault Varin, Ansgar Schuffenhauer, Gubler, H., Parker, C., Zhang, JH., Raman, P., Ertl, P.

  2. Compound Set Enrichment INTRODUCTION | Compound Set Enrichment | Thibault Varin | 10/07/14

  3. Introduction
      • Active series identification: Can relevant SAR be extracted from primary HTS data?
      • Are activity data binary or continuous?

  4. Introduction: Active series identification. Hypothesis 1: Within primary HTS data, structure-activity relationships (SAR) are apparent and can be used to help select active compound classes.

  5. Introduction: Are the activity data binary or continuous? [Figure: activity of the compounds in Scaffold 1 and Scaffold 2; legend: active compound (binary), inactive compound (binary).]
      • Binary activity: 1 active / 5 inactives for each scaffold, so Scaffold 1 = Scaffold 2
      • Continuous activity: Scaffold 1 > Scaffold 2

  6. Introduction: Are the activity data binary or continuous? [Figure: the activity data with two cut-offs, Threshold 1 and Threshold 2; legend: active compound (binary), inactive compound (binary).] Binary scaffold activity differs depending on the threshold. Hypothesis 2: Methods based on an activity cut-off distort the activity information, leading to the incorrect assignment of active series of compounds.
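The threshold dependence described on slides 5 and 6 can be illustrated with a small sketch. The activity values and the two cut-offs below are invented for illustration and do not come from the presentation; `count_actives` is a hypothetical helper, not part of the authors' method.

```python
from statistics import mean

# Hypothetical % activity values for six compounds per scaffold (not from the slides).
scaffold_1 = [80, 55, 50, 45, 40, 35]
scaffold_2 = [85, 20, 15, 10, 5, 0]


def count_actives(activities, threshold):
    """Number of compounds called active under a binary cut-off."""
    return sum(1 for a in activities if a >= threshold)


# Two illustrative cut-offs: the binary picture changes with the threshold.
for threshold in (70, 40):
    print(
        f"threshold {threshold}: "
        f"scaffold 1 = {count_actives(scaffold_1, threshold)} actives, "
        f"scaffold 2 = {count_actives(scaffold_2, threshold)} actives"
    )

# The continuous view (here simply the mean) separates the scaffolds
# without choosing any cut-off.
print(f"mean activity: {mean(scaffold_1):.1f} vs {mean(scaffold_2):.1f}")
```

With the high cut-off both scaffolds look identical (one active each), while the lower cut-off or the continuous values rank Scaffold 1 above Scaffold 2.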

  7. Compound Set Enrichment METHODS

  8. Methods: The Scaffold Tree classification. The Scaffold Tree – Visualization of the Scaffold Universe by Hierarchical Scaffold Classification. A. Schuffenhauer, P. Ertl et al., J. Chem. Inf. Model., 47, 47, 2007

  9. Methods: Datasets
      • 7 PubChem bioassays
      • Ranging from 9389 to 263679 compounds
      • Ranging from 0.03 to 26.29% of active compounds
      [Diagram: Hypothesis 1; PubChem annotation from CRC; simulation of the primary screening data.]

  10. Methods: Single hypothesis test, summary procedure
      • 1. State the null and the alternative hypotheses
      • H0: "the scaffold is inactive"
      • H1: "the scaffold is active"
      • 2. Specify a significance level: α = 0.01
      • 3. Compute the test statistic and the p-value → p-value = probability of observing data at least this extreme if the scaffold is inactive (H0)
      • 4. Decision step:
      • p-value > α: H0 is not rejected
      • p-value < α: H0 is rejected and H1 is accepted: "the scaffold is active"

  11. Methods: The KS and the Binomial hypothesis tests. [Figure: bioassay activity distribution and scaffold S3-2, with inactives and actives marked.]
      • Continuous data, KS test. H0: there is no difference between the activity distribution defined by compounds having the scaffold S3-2 and the background distribution.
      • Binary data, Binomial test. H0: there is no difference between the proportion of active compounds among compounds having the scaffold S3-2 and the proportion of active compounds in the full dataset.
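The two tests on slide 11 can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides do not say whether the tests are one-sided or two-sided, so the binomial test here is one-sided (enrichment) and the KS p-value is the two-sided asymptotic approximation. All function names and the example data are hypothetical. In practice a library routine such as SciPy's `scipy.stats.binomtest` or `scipy.stats.ks_2samp` would normally be preferred.

```python
from bisect import bisect_right
from math import comb, exp, sqrt


def binomial_enrichment_p(n_active, n_compounds, background_rate):
    """One-sided binomial p-value: P(X >= n_active) when each of the
    n_compounds is active with probability background_rate (H0)."""
    return sum(
        comb(n_compounds, k)
        * background_rate ** k
        * (1.0 - background_rate) ** (n_compounds - k)
        for k in range(n_active, n_compounds + 1)
    )


def ks_statistic(sample, background):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    s, b = sorted(sample), sorted(background)
    return max(
        abs(bisect_right(s, x) / len(s) - bisect_right(b, x) / len(b))
        for x in s + b
    )


def ks_p_value(sample, background, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    n, m = len(sample), len(background)
    lam = ks_statistic(sample, background) * sqrt(n * m / (n + m))
    if lam < 0.2:  # the alternating series is unreliable here; p is ~1
        return 1.0
    total = sum(
        (-1) ** (j - 1) * exp(-2.0 * (j * lam) ** 2)
        for j in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical data: background activities and five compounds sharing a scaffold.
background = list(range(50))
scaffold_members = [90, 92, 95, 97, 99]

alpha = 0.01
print("KS rejects H0:", ks_p_value(scaffold_members, background) < alpha)
# Binary view: 3 of 5 scaffold members active, 1% actives in the full dataset.
print("Binomial rejects H0:", binomial_enrichment_p(3, 5, 0.01) < alpha)
```

The KS version consumes the continuous activity values directly, whereas the binomial version first needs a cut-off to decide which compounds count as active.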

  12. Methods: Multiple hypothesis tests, Bonferroni correction
      • Problem of false positives
      • α = probability of identifying an inactive scaffold as active (for each test done...)
      • 100 inactive scaffolds: the probability of identifying at least one "active" by chance is equal to 63% (1 - 0.99^100)
      • Suggests testing each scaffold at a critical significance level equal to α = 0.01 / number of scaffolds
      • Makes the assumption that the individual tests are independent
      • Each level in the Scaffold Tree has been treated separately
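The arithmetic on slide 12 can be checked with a short sketch. The function names, scaffold labels, and p-values are hypothetical; the only inputs taken from the slide are α = 0.01, the 100-test example, and the rule of dividing α by the number of scaffolds tested.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def significant_scaffolds(p_values, alpha=0.01):
    """Scaffolds whose p-value passes the Bonferroni-corrected level.
    p_values maps a scaffold label to its p-value; the mapping is meant
    to hold the scaffolds of a single Scaffold Tree level."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]


# 100 inactive scaffolds tested at alpha = 0.01, as on the slide.
print(f"{familywise_error_rate(0.01, 100):.1%}")

# Hypothetical p-values for three scaffolds on one tree level.
level_p_values = {"S1": 0.00001, "S2": 0.004, "S3": 0.2}
print(significant_scaffolds(level_p_values))
```

In this example the scaffold labelled S2 would pass an uncorrected α of 0.01 but fails the corrected level of 0.01 / 3, which is the effect the correction is meant to have.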

  13. Methods: Determining the activity of classes. [Workflow diagram: Hypothesis 1, Hypothesis 2; scaffold activity evaluation; multiple hypothesis test correction (Bonferroni); comparison of results.]

  14. Compound Set Enrichment RESULTS

  15. Results: Comparison of KSP and BTP predictions
      • With:
      • KSP: KS Prediction
      • BTP: Binomial Threshold Prediction
      • Δ: KSP-BTP
      • BPCA: Binomial PubChem Annotation
      Both KSP and BTP retrieve the BPCA significantly active classes. Most of the new KSP active classes are not significantly active according to BPCA. Number of active classes: KSP > BTP.

  16. Results: KSP significantly active scaffolds that are among the PubChem inactives. [Figure: compound activity by PubChem annotation (Active, Inconclusive, Inactive); example scaffolds labelled "WA" or "Inconclusives?".]

  17. Results: Prioritize nodes instead of individual scaffolds. [Figure: scaffold activity (KS Prediction / Bonferroni); legend: not significantly active, significantly active.]

  18. Results: Visualization tool (Peter Ertl)

  19. Compound Set Enrichment CONCLUSION

  20. Conclusion: Compound Set Enrichment
      • Validation of initial hypotheses
      • A method to mine HTS data and identify active series of compounds
      • Chemical classification: Scaffold Tree
      • Statistical analysis: Kolmogorov-Smirnov hypothesis test
      • Multiple hypothesis test correction: Bonferroni correction
      • Uses all primary data
      • No activity cut-off
      • Identification of new active scaffolds not necessarily represented by very active compounds (latent hits) during the primary screen

  21. Acknowledgments, with many thanks to:
      • Primary mentor: Ansgar Schuffenhauer
      • Scientific advisers: Christian Parker, Hanspeter Gubler, Ji-Hu Zhang, Peter Ertl, Edgar Jacoby
      • Discussions: Martin Beibel, Sebastian Bergling, Meir Glick, Alain Dietrich, Marie-Cecile Didiot
      • Help: MLI group
      • Fellowship: Education office

  22. Questions?
