This presentation applies machine learning to HIV-1 RNase H bioassay data to generate new drug hypotheses. It focuses on noise reduction and knowledge exploration, motivated by drug-resistant mutations and toxic side effects in HIV treatment.
Classification Analysis of HIV RNase H Bioassay
Lianyi Han
Computational Biology Branch, NCBI/NLM/NIH
Rocky ’07
Introduction
• The need for new anti-HIV agents
  • Drug-resistant mutations
  • Side effects / toxicity
• The limits of virtual screening techniques
  • Huge chemical space
  • Structure and activities
• The challenge of generating new hypotheses
  • Noise reduction
  • Knowledge exploration
HIV-1 reverse transcriptase associated ribonuclease H assay (HIV-1 RT-RNase H assay)
• Designed by Dr. Michael Parniak of the University of Pittsburgh
• PubChem, AID 565
• 65,218 compounds tested, 1,250 of them are actives
• Distributions of all compounds tested in the HIV-1 RT-RNase H assay
[Figure: inactives and actives; associations among actives and inactives (Tanimoto ≥ 0.95)]
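The Tanimoto ≥ 0.95 threshold in the figure refers to a similarity coefficient between binary fingerprints. The following is a minimal sketch of how such associations could be found, not the method used in the talk. The compound names and bit positions are hypothetical toy data, and fingerprints are represented as sets of on-bit positions purely for illustration.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen for this sketch; toolkits differ here
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_pairs(fps: dict, threshold: float = 0.95) -> list:
    """Return (name_1, name_2, similarity) for every pair at or above threshold."""
    names = sorted(fps)
    pairs = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            s = tanimoto(fps[x], fps[y])
            if s >= threshold:
                pairs.append((x, y, s))
    return pairs


# Hypothetical toy fingerprints (sets of on-bit positions), not PubChem data.
fingerprints = {
    "cmpd_A": set(range(0, 40)),   # bits 0..39
    "cmpd_B": set(range(0, 39)),   # bits 0..38, nearly identical to cmpd_A
    "cmpd_C": set(range(20, 60)),  # bits 20..59, only partial overlap
}

for name_1, name_2, sim in similar_pairs(fingerprints):
    print(f"{name_1} ~ {name_2}: Tanimoto = {sim:.3f}")
```

The all-pairs loop is quadratic in the number of compounds, so at the scale of 65,218 compounds a real analysis would likely need an indexed or batched approach.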
A learning machine
• PubChem fingerprint: numerical understanding of molecular structures
  • 2-Methyl pentane (1,1,…0)
• Probabilistic Neural Network: machine learning
[Figure: network diagram labeled New Compounds, Fingerprint processing, Hidden Layer, Summation Layer, Output Layer; fingerprint bits 1 1 … … 0 as input]
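The layers named in the diagram can be illustrated with a small probabilistic neural network. This is a sketch under stated assumptions rather than the model from the talk: the Gaussian kernel on squared Euclidean distance, the smoothing parameter `sigma`, the per-class averaging, and the four-bit toy fingerprints are all assumptions, since the slides do not specify them.

```python
import math


def pnn_scores(train, query, sigma=1.0):
    """Return {label: mean kernel activation} for one query fingerprint.

    train is a list of (bit_vector, label) pairs; sigma is the kernel
    smoothing parameter (an assumed value, not taken from the slides).
    """
    sums, counts = {}, {}
    for x, label in train:
        # Hidden layer: one Gaussian kernel centred on each training compound.
        d2 = sum((xi - qi) ** 2 for xi, qi in zip(x, query))
        activation = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation layer: accumulate activations separately for each class.
        sums[label] = sums.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    # Averaging per class treats the classes as equally likely a priori.
    return {label: sums[label] / counts[label] for label in sums}


def pnn_predict(train, query, sigma=1.0):
    """Output layer: choose the class with the largest averaged activation."""
    scores = pnn_scores(train, query, sigma)
    return max(scores, key=scores.get)


# Hypothetical four-bit fingerprints standing in for real PubChem fingerprints.
train = [
    ((1, 1, 0, 0), "active"),
    ((1, 1, 1, 0), "active"),
    ((0, 0, 1, 1), "inactive"),
    ((0, 1, 1, 1), "inactive"),
]

print(pnn_predict(train, (1, 1, 0, 1), sigma=0.8))
```

A PNN has no iterative training step: the training compounds themselves form the hidden layer, so `sigma` is the main parameter left to tune, for example by cross-validation.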
Model evaluation
• 10-fold cross-validation
  • Sensitivity: 86.4%
  • Specificity: 92.0%
  • Matthews correlation coefficient: 0.26
• Receiver Operating Characteristic (ROC) curve analysis
  • Area Under Curve (AUC): 0.90
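For reference, the evaluation measures listed above can be computed from a confusion matrix and from ranked prediction scores. The sketch below uses the standard definitions; the counts and scores are hypothetical toy values, not the assay results, and the pairwise AUC computation is one of several equivalent ways to obtain the area under the ROC curve.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, Matthews correlation coefficient)."""
    sensitivity = tp / (tp + fn)  # fraction of actives recovered
    specificity = tn / (tn + fp)  # fraction of inactives rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


def roc_auc(scores, labels):
    """AUC as the probability that a random active outscores a random inactive.

    labels use 1 for active and 0 for inactive; tied scores count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical confusion counts and scores, for illustration only.
sens, spec, mcc = confusion_metrics(tp=8, fn=2, tn=90, fp=10)
auc = roc_auc([0.9, 0.8, 0.4, 0.7, 0.3, 0.2], [1, 1, 1, 0, 0, 0])
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f} AUC={auc:.3f}")
```

Unlike sensitivity and specificity, the Matthews correlation coefficient depends on the class balance, so it can be much lower than either when actives are a small fraction of the tested compounds.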
Conclusions
• The bioactivity data of the HIV-1 RT-RNH assay can be learned to generate new hypotheses
• Machine learning on HTS data can be used for virtual hit exploration
Acknowledgements
• Yanli Wang
• Steve Bryant
• This research was supported by the Intramural Research Program of the NIH/NLM