TCOF 3: Repositioning of Chemical Compounds from Different Classes as Part of Virtual Screening

Under the guidance of PI: Dr UCA Jaleel (IISc Research Unit, Bangalore)
Swati Gandhi [Shah], 3.2 TCOF Fellow (MSc Bioinformatics, The Maharaja Sayajirao University of Baroda)
Blog URL: swatigandhishah.wordpress.com

The aim of this project is to develop classes of anti-MTb compounds and reposition them by screening pesticides that are found to be active against TB, so that promising candidates can be taken forward to clinical trials.

Repositioning of the chemical compound database is divided into three sub-classes:
1) Pesticides
2) Antimicrobial molecules
3) Phytomolecules

My group is targeting the pesticides database, looking for pesticides that show anti-TB activity:
• To find a pesticide database we tried several sources, including PubChem, the PAN (Pesticide Action Network) pesticide database and the EU Pesticide Database, and finally settled on the EPA (Environmental Protection Agency).
• From the EPA we obtained 654 pesticide molecules. Structures and SDF files were available for 487 of them; the remaining structures were drawn by Ayisha Safeeda with the help of "MARVIN" and saved in SDF file format.
• The next slide gives an overview of the project as a flowchart explaining the process.

As per the flowchart on the previous slide, we have started the WEKA part. The algorithms are applied directly to the dataset. Training and test sets were generated in an 80:20 ratio, and the WEKA model is built with the help of these two sets. The next slide defines the steps.
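The 80:20 split mentioned above can be sketched in a few lines of Python. This is only an illustration, not the WEKA procedure used in the project: the slides do not say whether the split was stratified, so keeping the active/inactive proportion the same in both sets is an assumption, and the function name `stratified_split` and the example labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=42):
    """Return (train_indices, test_indices), splitting each class separately.

    Stratification is an assumption; the slides only state the 80:20 ratio.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical labels: 10 actives and 90 inactives (not project data).
labels = ["active"] * 10 + ["inactive"] * 90
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class keeps the rare active compounds represented in both sets, which matters when actives are a small minority of the screening data.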

Module – Work Flow
1) PubChem: access the HTS bioassay data (all compounds SDF file; bioassay result, all).
2) PowerMV: upload the SDF file and generate the descriptor file.
3) Excel: open the CSV file, append the bioassay result corresponding to the compounds, and select the active and inactive compounds.
4) WEKA (machine learning): remove the useless attributes, split the file, train, apply classifier algorithms, select the best classifier model (TP %, FP < 20%, Accuracy > 70%), and test.
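The model-selection criteria in the workflow can be checked from a confusion matrix. The sketch below assumes that "FP" means the false-positive rate and "TP" the true-positive rate; the TP threshold is not given on the slide, so it is only reported, not tested. The function names and the example counts are hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Compute TP rate, FP rate and accuracy from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "tp_rate": tp / (tp + fn),   # fraction of actives found
        "fp_rate": fp / (fp + tn),   # fraction of inactives flagged as active
        "accuracy": (tp + tn) / total,
    }


def passes_thresholds(metrics, max_fp_rate=0.20, min_accuracy=0.70):
    """Apply the FP < 20% and Accuracy > 70% criteria from the workflow."""
    return metrics["fp_rate"] < max_fp_rate and metrics["accuracy"] > min_accuracy


# Hypothetical counts for one classifier (not project results).
metrics = confusion_metrics(tp=15, fn=5, fp=18, tn=162)
print(metrics, passes_thresholds(metrics))
```

Accuracy alone can look good on imbalanced screening data even when few actives are found, which is presumably why the workflow lists the TP and FP figures alongside it.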

Current stage of the project: tuning the model generated by WEKA. We have generated results using different classifiers such as Naïve Bayes and Random Forest. We are now trying to tune the model to its most stable state by applying a cost matrix to it; the next slide sheds some light on this. The next stage is screening, after which we will proceed further.

Sheet Defining the Results After Applying the Cost Matrix
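One common way a cost matrix influences a classifier is by choosing, for each compound, the class with the lowest expected misclassification cost. The sketch below illustrates that idea only; it is not WEKA's implementation, and the cost values, class probabilities and function name are hypothetical, since the slides do not give the matrix that was used.

```python
CLASSES = ("active", "inactive")


def min_expected_cost_class(probabilities, cost_matrix):
    """Pick the predicted class with the lowest expected cost.

    cost_matrix[actual][predicted] is the cost of predicting `predicted`
    when the true class is `actual`.
    """
    expected = {
        predicted: sum(
            probabilities[actual] * cost_matrix[actual][predicted]
            for actual in CLASSES
        )
        for predicted in CLASSES
    }
    return min(CLASSES, key=lambda c: expected[c])


# Hypothetical costs: missing an active is 10 times worse than a false alarm.
cost_matrix = {
    "active": {"active": 0.0, "inactive": 10.0},
    "inactive": {"active": 1.0, "inactive": 0.0},
}
uniform_costs = {
    "active": {"active": 0.0, "inactive": 1.0},
    "inactive": {"active": 1.0, "inactive": 0.0},
}

# Hypothetical classifier output for one compound.
probabilities = {"active": 0.2, "inactive": 0.8}
weighted_choice = min_expected_cost_class(probabilities, cost_matrix)
uniform_choice = min_expected_cost_class(probabilities, uniform_costs)
print(weighted_choice, uniform_choice)
```

Raising the cost of a missed active pushes borderline compounds into the active class, which increases the TP rate at the price of more false positives; tuning the matrix is a matter of balancing the two.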

References:
1) Schierz AC. Virtual screening of bioassay data. J Cheminform. 2009 Dec 22;1:21. doi: 10.1186/1758-2946-1-21. PubMed PMID: 20150999.
2) Periwal V, et al. Predictive models for anti-tubercular molecules using machine learning on high-throughput biological screening datasets. BMC Res Notes. 2011 Nov 18;4:504. doi: 10.1186/1756-0500-4-504. PubMed PMID: 22099929.
3) Ekins S, et al. Combining Computational Methods for Hit to Lead Optimization in Mycobacterium Tuberculosis Drug Discovery. Pharm Res. 2013 Oct 17. [Epub ahead of print] PubMed PMID: 24132686.
4) Environmental Protection Agency

Heartiest Thanks and Acknowledgement:
1) Prof. Dr. Samir Brahmachari
2) Dr Jaleel (PI, TCOF 3)
3) Dr Bheemarao Ugarkar
4) Dr TS Balganesh
5) OSDD Team
6) IISc Research Unit, Bangalore
7) Group members [Yatindra Yadav and Ayisha Safeeda]
