Università degli Studi di Milano, Dipartimento di Scienze Farmaceutiche “Pietro Pratesi”
GriDock: An MPI-based software for virtual screening in drug discovery
Alessandro Pedretti
What is virtual screening?
• Virtual screening (VS) is a computational approach that can be used in drug discovery to find new hit compounds.
• It can be compared to high-throughput screening (HTS), which is a true experimental approach.
[Diagram: virtual screening passes a database of molecules through a database filter to obtain hit compounds; high-throughput screening passes a set of molecules through an experimental assay to obtain hit compounds.]
The database of molecules
• The database must contain molecules that are available in the real world or that are easy to synthesize.
• Pharmaceutical companies have databases built up over the years from research in different fields.
• Some databases are publicly available and provided by chemical compound resellers (AKos, Asinex, TimTec, etc.) or by non-profit institutions (Kyoto University, NCI, University of Padua, etc.).
• The database must contain a large number of molecules to allow an exhaustive exploration of chemical space.
The database filter
• The database filter performs the virtual test that checks whether a molecule could be bioactive.
• The kind of filter classifies virtual screening approaches into:
  • Ligand-based: the 3D structure of the biological target is unknown, and a set of geometric rules and/or physicochemical properties (pharmacophore model) obtained from QSAR studies is used to screen the database.
  • Structure-based: molecular docking calculations are run between each molecule to be tested and the biological target (usually a protein). A scoring function is applied to evaluate the affinity. The 3D structure of the target must be known.
Molecular docking
[Diagram: ligand + receptor → docking software → ligand – receptor complex]
• The quality of the complex is evaluated by the score.
GriDock – Main features
[Diagram: GriDock combines AutoDock 4 and VEGA for virtual screening.]
• GriDock is software developed to perform structure-based virtual screening.
• It is a front-end to the well-known AutoDock software, developed by D.S. Goodsell and A.J. Olson.
• It uses the VEGA command-line software to perform file format conversion, database extraction and molecular property calculations.
• It is written in highly portable C++ code (Linux 32 and 64 bit, Windows 32 and 64 bit).
• It can take full advantage of multi-CPU/multi-core systems and GRID-based architectures through its parallel design.
How GriDock works
• Input: database of molecules; receptor coordinates and maps.
• VEGA: calculation of the molecular properties; input file generation (PDBQT).
• AutoDock 4: molecular docking; score calculation.
• Results: ligand – receptor complexes, score analysis, output files.
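The workflow on this slide can be sketched as a short loop. This is only an illustration: `screen`, `fake_prepare`, `fake_dock` and the molecule names are hypothetical stand-ins, not GriDock, VEGA or AutoDock interfaces, and the scores are invented toy values. The sketch assumes that a lower (more negative) docking score means a better predicted complex.

```python
from typing import Callable, Iterable, List, Tuple


def screen(
    molecules: Iterable[str],
    prepare: Callable[[str], str],
    dock: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Prepare, dock and rank molecules (best score first)."""
    results = []
    for name in molecules:
        ligand_input = prepare(name)  # stand-in for the VEGA step (PDBQT generation)
        score = dock(ligand_input)    # stand-in for the AutoDock 4 step
        results.append((name, score))
    # Assumption: lower score = better predicted binding.
    return sorted(results, key=lambda pair: pair[1])


# Toy stand-ins so the sketch runs without VEGA or AutoDock installed.
def fake_prepare(name: str) -> str:
    return name + ".pdbqt"


FAKE_SCORES = {"mol_a.pdbqt": -7.2, "mol_b.pdbqt": -9.1, "mol_c.pdbqt": -5.4}


def fake_dock(ligand_input: str) -> float:
    return FAKE_SCORES[ligand_input]


ranked = screen(["mol_a", "mol_b", "mol_c"], fake_prepare, fake_dock)
for name, score in ranked:
    print(name, score)
```

Passing the preparation and docking steps in as callables keeps the loop independent of how the external programs are actually invoked.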
How VEGA works with GriDock
[Diagram: database of molecules → VEGA → AutoDock 4]
Operations performed by VEGA on each molecule:
• Addition of hydrogens.
• Potential attribution (AMBER force field).
• Calculation of charges (Gasteiger-Marsili method).
• Search of flexible torsions.
• Property calculation.
• Conversion to PDBQT.
GriDock multi-threaded version
[Diagram: GriDock main thread with the database and the receptor; threads 1 to n, each running VEGA and AutoDock 4 in a thread loop; mutex-controlled access; output files.]
• Symmetric multiprocessing (SMP) is provided by the pthread library or the Windows APIs.
Output files*
• Log file (gridock_DATE.log).
• CSV file containing the list of complexes ranked by docking score.
• Zip file containing the output complexes generated by AutoDock 4.
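The thread loop with mutex-controlled access can be illustrated with a minimal sketch. It assumes that the mutex guards the shared database position and the shared result list; the slide does not say exactly what is protected. `run_workers` and `toy_dock` are hypothetical names, and Python threads are used only to show the locking pattern (GriDock itself is C++ and uses pthread or the Windows APIs).

```python
import threading
from typing import Callable, List, Tuple


def run_workers(
    database: List[str],
    dock: Callable[[str], float],
    n_threads: int = 4,
) -> List[Tuple[str, float]]:
    """Dock every molecule once, sharing the database between threads."""
    next_index = 0                          # shared cursor into the database
    results: List[Tuple[str, float]] = []   # shared output
    lock = threading.Lock()                 # the mutex

    def worker() -> None:
        nonlocal next_index
        while True:
            with lock:                      # fetch the next molecule atomically
                if next_index >= len(database):
                    return
                molecule = database[next_index]
                next_index += 1
            score = dock(molecule)          # the expensive part runs unlocked
            with lock:                      # store the result atomically
                results.append((molecule, score))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Rank by score, lowest first (assumed to mean best).
    return sorted(results, key=lambda pair: pair[1])


def toy_dock(molecule: str) -> float:
    """Invented scoring stand-in: longer names get lower scores."""
    return -float(len(molecule))


ranked = run_workers(["a", "bbb", "cc", "dddd", "eeeee"], toy_dock, n_threads=3)
```

Each thread pulls the next molecule when it becomes free, so slow dockings do not hold up the other threads, and the lock is held only for the short fetch and store steps.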
GriDock MPI version
[Diagram: the GriDock MPI master node communicates over MPI with nodes 1 to n; each node has the database and the receptor and runs VEGA and AutoDock 4 in a node loop; the GriDock MPI master node collects the results and produces the output files.]
GriDock input requirements
To perform a virtual screening with GriDock, you need:
• The 3D structure of the biological target:
  • Protein Data Bank (http://www.rcsb.org).
  • Homology modeling.
• The 3D maps of the active site generated by AutoGrid 4:
  • AutoDockTools / MGLTools (http://mgltools.scripps.edu).
  • VEGA ZZ (http://www.vegazz.net).
• One or more databases of 3D structures in SDF or Zip format:
  • Ligand.Info: Small-Molecule Meta-Database (http://ligand.info).
  • MMsINC (http://mms.dsfarm.unipd.it/MMsINC.html).
  • ZINC (http://zinc.docking.org).
The Citrus tristeza virus case
• Citrus tristeza virus (CTV) is a positive-sense single-stranded RNA virus that causes a serious disease of citrus plants.
• No treatment is known that can save infected plants.
• A possible therapeutic target is the RNA-dependent RNA polymerase (RdRp) involved in virus replication.
[Diagram: viral replication in the infected cell. Labels: ssRNA (+), 5’ prot. mRNA, translation, early protein, protease, replicative complex prot., RdRp, (-)RNA, other proteins, structural proteins, virions.]
The RdRp model
No crystal structure exists, so a homology modeling procedure was performed:
[Workflow: primary structure (SwissProt Q2XP15) → folding prediction (Fugue) → rough 3D structure → refinement workflow (VEGA ZZ + NAMD) → RdRp model]
Model refinement
[Workflow, performed with VEGA ZZ + NAMD: rough model → addition of missing residues → addition of side chains → addition of hydrogens → energy minimization (30,000 steps, conjugate gradients) → structure check (Ramachandran plot) → model ready for the screening]
Mapping the active site
Calculation of the grid maps
AutoDock requires pre-calculated grid maps to evaluate the total interaction energy between the ligand and the target macromolecule. To calculate them, we used the script included in the VEGA ZZ package (script file: AutoDock/Receptor.c).
[Workflow: RdRp structure → potential attribution → calculation of charges → removal of apolar hydrogens → PDBQT file → AutoGrid 4 run → grid map files]
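To show why pre-calculated grid maps make the energy evaluation cheap, here is a minimal sketch of a grid lookup. It assumes a single map and trilinear interpolation between the eight surrounding grid points, a common scheme for this kind of lookup; real AutoDock maps are per atom type and include further terms, which are ignored here. The function names and the toy 2×2×2 grid are invented for illustration.

```python
import math
from typing import List, Sequence

Grid = List[List[List[float]]]  # indexed as grid[ix][iy][iz]


def interpolate(grid: Grid, origin: Sequence[float], spacing: float,
                point: Sequence[float]) -> float:
    """Trilinear interpolation of the map value at an arbitrary point."""
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    base, frac = [], []
    for axis in range(3):
        g = (point[axis] - origin[axis]) / spacing  # fractional grid coordinate
        if g < 0 or g > dims[axis] - 1:
            raise ValueError("point lies outside the grid box")
        i = min(int(math.floor(g)), dims[axis] - 2)  # lower corner of the cell
        base.append(i)
        frac.append(g - i)
    (ix, iy, iz), (tx, ty, tz) = base, frac
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((tx if dx else 1 - tx)
                          * (ty if dy else 1 - ty)
                          * (tz if dz else 1 - tz))
                value += weight * grid[ix + dx][iy + dy][iz + dz]
    return value


def ligand_energy(grid: Grid, origin: Sequence[float], spacing: float,
                  atoms: Sequence[Sequence[float]]) -> float:
    """Sum the interpolated map value over all ligand atom positions."""
    return sum(interpolate(grid, origin, spacing, atom) for atom in atoms)


# Toy map: value = x + 2*y + 4*z at each grid point (invented numbers).
toy_grid = [[[x + 2 * y + 4 * z for z in (0, 1)] for y in (0, 1)] for x in (0, 1)]
center_value = interpolate(toy_grid, (0.0, 0.0, 0.0), 1.0, (0.5, 0.5, 0.5))
total = ligand_energy(toy_grid, (0.0, 0.0, 0.0), 1.0, [(0, 0, 0), (1, 1, 1)])
```

Because the map is computed once per receptor, scoring a ligand pose reduces to a handful of table lookups per atom instead of a sum over all receptor atoms.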
Considered databases
All test databases, in SDF format, were downloaded from http://ligand.info:
• ChemBank
• ChemPDB
• KEGG Ligand
• Anti-HIV NCI
• Drug-likeness NCI
• Not annotated NCI
• AKos GmbH
• Asinex Ltd.
The total number of docked ligands is ~1,000,000.
Test system
• Tyan Transport VX50.
• 8 × AMD Opteron 875 dual-core CPUs @ 2.4 GHz.
• 8 GB RAM.
• 72 + 150 GB SATA hard disks.
• Linux 64 bit (CentOS 4).
Performance: 40,000 ligands/day.
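The figures on the slides allow a rough estimate of the screening time. The sketch below assumes that the 40,000 ligands/day rate refers to the whole test machine with all 16 cores (8 dual-core CPUs) in use and that the rate stays constant over the ~1,000,000 docked ligands; neither assumption is stated on the slides.

```python
# Figures taken from the slides.
total_ligands = 1_000_000   # approximate number of docked ligands
ligands_per_day = 40_000    # reported throughput
cpus = 8
cores_per_cpu = 2

# Derived estimates (assuming all cores are used at a constant rate).
total_cores = cpus * cores_per_cpu                       # 16 cores
days_needed = total_ligands / ligands_per_day            # whole screening
ligands_per_core_per_day = ligands_per_day / total_cores
seconds_per_ligand = 24 * 60 * 60 / ligands_per_core_per_day

print(f"Screening time: {days_needed:.0f} days")
print(f"Per core: {ligands_per_core_per_day:.0f} ligands/day, "
      f"{seconds_per_ligand:.2f} s per ligand")
```

Under these assumptions the full screening takes about 25 days, with each core spending roughly 35 seconds per ligand.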
Preliminary results
The top-ranked ligands contain one or more sulfur atoms in their structure.
They are sulfonic acid derivatives.
These compounds are known to be potent inhibitors of the HIV reverse transcriptase.
Some of them are naphthalene polysulfonic acids developed as anti-HIV agents (Anti-HIV NCI database).
Conclusions
• We developed a new parallel structure-based virtual screening software able to run on both multi-CPU and GRID systems.
• The complete model of the RNA-dependent RNA polymerase of Citrus tristeza virus was obtained to perform a virtual screening study.
• By screening ~1,000,000 ligands, potential RdRp inhibitors were found.
• These molecules contain sulfur atoms and, more specifically, multiple sulfonic acid moieties.
• Some of them belong to the Anti-HIV class.
• To complete the study, the activity of the molecules found must be confirmed experimentally by biological assays.
Acknowledgments
• Giulio Vistoli
• Cristina Marconi
• Alessandro Lombardo
• Santo Motta
• Francesco Pappalardo
• Emilio Mastriani
www.vegazz.net
www.ddl.unimi.it