Future CAMD Workloads and their Implications for Computer System Design
IEEE 6th Annual Workshop on Workload Characterization
What is CAMD?
• Computer-Assisted Molecular Discovery, used in …
  • drug discovery
  • agrochemical discovery (herbicides, insecticides, etc.)
  • “cosmeceutical” discovery
• common objectives of all CAMD applications:
  • find a small molecule (“drug” or “ligand” or “active”) with the right chemical structure for optimal …
    • interaction with a large biomolecule (“receptor” or “target” or “protein”)
    • ADMET (absorption, distribution, metabolism, excretion, toxicity) properties (getting & keeping the ligand near the receptor in the body)
  • decide which compounds (potential drugs) should be synthesized/purchased and tested (“screened”) next
    • decide by first using the computer to do “virtual screening”
Molecular discovery process
[Figure: discovery pipeline. Stages shown: Genomics, Proteomics; Target ID/Validation & Structure; Assay Development; Hits; Lead Identification; Lead Optimization; Preclinical/ADMET; Clinical Trials; Sales & Marketing]
CAMD:
• Bioinformatics
• Cheminformatics
• Modeling & Simulation
• Decision Support
Three types of CAMD problems
• Intensive computations on one structure or complex
  • getting the 3D structure of the target from genomic information
    • the “protein folding problem”: a classic CAMD problem area
    • parallel/distributable algorithms exist, but the problem is best done on a single processor
    • huge number of possible conformations, so shortcuts are taken
  • refining the 3D structure of the target from X-ray/NMR data
  • performing protein-ligand docking & scoring
    • virtual receptor-ligand complexation → virtual screening
    • flexibility of the ligand is currently addressed
    • flexibility of the protein is rarely addressed due to CPU time
    • scoring functions are crude due to CPU time
  • faster CPUs and more memory (for protein folding) would enable better quality results
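To illustrate why scoring functions stay crude when CPU time is tight, here is a minimal sketch of a toy contact-counting score. It is not any published scoring function: the names `crude_score` and `best_pose`, the distance cutoffs, and the penalty weights are all assumptions made for illustration. The point is only that even this trivial score costs one distance evaluation per receptor-ligand atom pair, per pose.

```python
import math

def crude_score(receptor_atoms, ligand_atoms, contact=4.0, clash=2.0):
    """Toy score (lower is better); cutoffs and weights are hypothetical.

    Each atom is an (x, y, z) tuple. Cost is O(len(receptor) * len(ligand)).
    """
    score = 0.0
    for r in receptor_atoms:
        for l in ligand_atoms:
            d = math.dist(r, l)      # Euclidean distance (Python 3.8+)
            if d < clash:
                score += 10.0        # penalize overlapping atoms
            elif d <= contact:
                score -= 1.0         # reward a favorable contact
    return score

def best_pose(receptor_atoms, poses):
    """Return the ligand pose (a list of atoms) with the lowest toy score."""
    return min(poses, key=lambda pose: crude_score(receptor_atoms, pose))
```

Treating the protein as flexible would multiply the number of receptor structures to score, which is one way to read the CPU-time limitation noted above.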
Three types of CAMD problems
• Modest computations on MANY structures
  • millions of real compounds; billions of “virtual compounds”
  • many subtasks associated with virtual screening; e.g.:
    • convert the 2D structure of each ligand to 3D (Concord)
    • generate multiple conformations of each ligand 3D structure
    • perform various CPU tests to identify which ligands merit further attention using CPU methods (e.g., docking)
      • crude estimates of ADMET-related properties (e.g., solubility, membrane permeability, etc.)
      • crude shape-complementarity tests
    • perform docking (at increasing levels of accuracy)
  • large input stream ideally suited for distributed processing
  • grid computing using many thousands of nodes (and faster nodes) would enable better quality results
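The screening funnel described on this slide can be sketched roughly as follows. This is a minimal illustration, not the workflow of any particular CAMD package: the compound fields (`solubility`, `shape_fit`), the thresholds, and the placeholder `expensive_docking_score` are all hypothetical, and a local thread pool stands in for grid nodes. Because each compound is processed independently, the input stream can be split into chunks and farmed out.

```python
from concurrent.futures import ThreadPoolExecutor

def passes_cheap_filters(compound):
    # Hypothetical thresholds standing in for crude ADMET and shape tests.
    return compound["solubility"] >= 0.5 and compound["shape_fit"] >= 0.6

def expensive_docking_score(compound):
    # Placeholder for a real docking run; lower is better.
    return -10.0 * compound["shape_fit"]

def screen_chunk(chunk):
    # Run cheap filters first so docking only sees the survivors.
    return [(c["id"], expensive_docking_score(c))
            for c in chunk if passes_cheap_filters(c)]

def chunked(items, size):
    # Split the input stream into independent work units.
    for i in range(0, len(items), size):
        yield items[i:i + size]

def virtual_screen(compounds, chunk_size=2, workers=4):
    # The thread pool is only a stand-in for distributed nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(screen_chunk, chunked(compounds, chunk_size)))
    hits = [hit for chunk_hits in results for hit in chunk_hits]
    return sorted(hits, key=lambda hit: hit[1])
```

Since chunks share no state, adding nodes scales throughput roughly with node count in this idealized picture, which matches the slide's case for grid computing.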
Three types of CAMD problems
• Storing data for virtual compounds
  • millions of real compounds; billions+ of virtual compounds
  • why store data for virtual compounds?
    • it costs time & money to generate & regenerate data
    • science-related reasons
      • data generated for one project is often useful for another
      • must store data for each conformation of each structure
      • must store data for each structure that a compound can adopt (Optive Research will introduce technology early next year)
      • new technology will result in HUGE volumes of virtual data
    • IP-related, competition-related reasons
      • the pharma industry is already planning for offensive and defensive needs in the coming virtual-screening and virtual-IP “wars”
  • need means to store and access huge volumes of data
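A back-of-envelope estimate shows how quickly the volume grows when data is kept per conformation of each structure of each compound. Every number in the usage note below is an illustrative assumption, not a figure from the presentation, and the function `estimate_storage_bytes` is a hypothetical helper that counts 3D coordinates only.

```python
def estimate_storage_bytes(n_compounds,
                           structures_per_compound,
                           conformations_per_structure,
                           atoms_per_conformation,
                           bytes_per_coordinate=4):
    """Raw bytes for x, y, z coordinates only (no properties, no indexes)."""
    coordinates = (n_compounds
                   * structures_per_compound
                   * conformations_per_structure
                   * atoms_per_conformation
                   * 3)                      # x, y, z per atom
    return coordinates * bytes_per_coordinate
```

For example, assuming one billion virtual compounds, 10 structures per compound, 100 conformations per structure, and 50 atoms per conformation, `estimate_storage_bytes(10**9, 10, 100, 50)` gives 6 × 10^14 bytes, i.e. about 600 TB before any computed properties are stored.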
Closing comments
• practitioners of CAMD are well aware that the quality of current methods is limited by compute resources
• rate of discovery and quality of actives discovered would both improve if CAMD methods improved
• given that the sales of many actives each exceed $1 billion per year, the market for improved compute power (and improved CAMD software) is quite substantial
• I sure hope that you computer architects can help!! ;-)
Contact Info
• for questions about this short presentation, please feel free to contact me at:
  Dr. Robert S. Pearlman, Pres. & CSO
  Optive Research, Inc.
  512-514-6222
  bob.pearlman@optive.com
• for questions about Optive Research, Inc. and/or about the Computer-Assisted Molecular Discovery software which we develop, contact me as indicated above or visit our web-site at: www.optive.com