Targeting Drug-like properties in Chemical Libraries. David Winkler, Frank Burden, Mitchell Polley Centre for Complexity in Drug Design CSIRO Molecular Science and Chemistry Department, Monash University. VICS. Complexity in Drug Design Group.
Complexity in Drug Design Group • Prof. Frank Burden - Scimetrics Ltd -consultant to CSIRO • Dr. Mitchell Polley - CSS postdoctoral fellow • Darryl Jones - CSS PhD top-up student - Flinders University (Physics) • Prof. Dave Winkler - CSIRO Molecular Science and Monash University/VICS
Overview of Project • Aims to develop a method for evolving a chemical library of heterogeneous agents (molecules) using 'drug-like' fitness functions • Chemical space is vast (>10^80 possibilities) • Method must explore drug-like chemical space and identify islands of activity and novelty • Application in the discovery of novel bioactive agents such as drugs and crop care products • Methodology applicable to design of new materials and nanomachines using different fitness functions
Overview of Project Steps… • Devise sparse, informative mathematical representations of molecules • Devise sparse methods of selecting these for models • Use agent-based methods (Bayesian neural nets) to map representations to properties and use models as fitness functions • Develop methods for evolving chemicals using mutation operators so that maximum chemical space can be traversed • Evolve chemical libraries using drug-like fitness functions
Highlights: Representations • Novel charge fingerprint descriptor devised and tested • Theory of eigenvalue descriptors cracked • Momentum space descriptor work started • Novel selectivity index developed
Sparse Descriptors • Many thousands of descriptors have been devised (e.g. CoMFA fields, DRAGON) • Many are highly correlated with other descriptors - contain the same information • Some (e.g. molecular weight) are information-poor • Models using sparse descriptors can be more predictive • We work to the premise that it is possible to devise sparse, information-rich descriptors from which suitable subsets could be drawn for a wide variety of modelling problems
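The point about highly correlated descriptors can be illustrated with a simple correlation filter. This is only a minimal sketch, not the sparse selection method the slides describe: the function name `drop_correlated`, the 0.95 threshold and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def drop_correlated(X, threshold=0.95):
    """Greedily keep columns whose |Pearson r| with every kept column is <= threshold.

    X is an (n_molecules, n_descriptors) array. The threshold is an assumed value.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        # Keep descriptor j only if it is not near-duplicated by one already kept.
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept

# Synthetic stand-in for a descriptor table (not real molecular data).
rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = rng.normal(size=50)
# Column 1 is a rescaled copy of column 0, so it carries the same information.
X = np.column_stack([a, 2.0 * a + 1.0, b])

kept = drop_correlated(X)
print(kept)
```

A filter like this only removes pairwise redundancy; it does not judge whether a surviving descriptor is information-rich for a given modelling problem.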
Charge fingerprints • These are widely applicable, easily computed descriptors calculated by binning charges on atoms in different environments
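A minimal sketch of the binning idea, under stated assumptions: the atomic "environment" is reduced here to the element symbol, the bin edges are arbitrary, and the partial charges are made-up illustrative numbers rather than computed values. The function name `charge_fingerprint` is hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Assumed bin edges for partial charges (in units of e); not taken from the slides.
EDGES = (-0.5, -0.25, 0.0, 0.25, 0.5)

def charge_fingerprint(atoms, edges=EDGES):
    """Count atoms per (environment, charge-bin) pair.

    atoms is a list of (environment, partial_charge) tuples. Here the
    environment is simply the element symbol, which is an assumption.
    """
    counts = Counter()
    for environment, charge in atoms:
        # bisect_right returns how many edges are <= charge, i.e. the bin index.
        counts[(environment, bisect_right(edges, charge))] += 1
    return counts

# Illustrative (invented) partial charges for a small molecule.
atoms = [("C", -0.05), ("C", 0.12), ("O", -0.65),
         ("H", 0.40), ("H", 0.05), ("H", 0.06)]

fp = charge_fingerprint(atoms)
for key in sorted(fp):
    print(key, fp[key])
```

The resulting counts form a fixed-length vector once every (environment, bin) pair is given a position, which is what makes such a fingerprint usable as a model input.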
EEM-based property descriptors • Density Functional Theory (DFT) proposes that knowledge of the electron density allows computation of many other properties • The electronegativity equalization method (EEM; Mortier, Bultinck and others) is a rapid, approximate DFT method • All work to date has concentrated on charges or a few other 'observables' • Main strength will probably lie in the calculation of other molecular properties, once the method is generalized and parameterized for more atom types
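Why do eigenvalue descriptors work? With $A$ the EEM matrix, $L$ the diagonal eigenvalue matrix and $T$ the transformation that diagonalizes $A$: $AT = TL$, so $A = TLT'$ and $A^{-1} = TL^{-1}T'$, since $T' = T^{-1}$ for an orthogonal transformation, i.e. the inverse of $A$ is related to the eigenvalues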
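The relation between the inverse of A and its eigenvalues can be checked numerically. The sketch below uses a random symmetric positive definite matrix as a stand-in for an EEM matrix (an assumption made only so that the inverse is guaranteed to exist); it is not an actual EEM parameterization.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
# Symmetric positive definite stand-in for the EEM matrix A (assumed, not real EEM data).
A = M @ M.T + 4.0 * np.eye(4)

# eigh handles symmetric matrices: eigenvalues in L, orthonormal eigenvectors as columns of T.
L, T = np.linalg.eigh(A)

# A^{-1} = T L^{-1} T', using T' = T^{-1} for an orthogonal T.
A_inv = T @ np.diag(1.0 / L) @ T.T

print(np.allclose(A @ T, T @ np.diag(L)))       # A T = T L
print(np.allclose(T.T @ T, np.eye(4)))          # T' T = I
print(np.allclose(A_inv, np.linalg.inv(A)))     # matches the direct inverse
```

Because the inverse is built entirely from the eigenvalues and eigenvectors, quantities obtained by solving with A carry the same information as the eigen-decomposition.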
Momentum space descriptors • The part of the electron density distribution that is more interesting in terms of biological activity is located near the k-space origin. The corresponding r-space density distribution is associated with the outermost valence regions of the molecule • k-space descriptions of electron density are more compact and simpler
Highlights: Sparse feature selection • Automatic Relevance Determination (ARD) method refined • Sparse Bayesian feature detection theory mastered • Linear sparse feature detection using an EM algorithm and a Jeffreys prior • Nonlinear Bayesian feature detection achieved but needs more work • Novel variable selection when the number of descriptors is much larger than the number of molecules in the data set
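One way linear sparse feature detection with an EM algorithm and a Jeffreys prior can look in practice is sketched below. This is an illustrative assumption, not necessarily the group's implementation: it uses the iteratively reweighted update beta = U (sigma2 I + U X'X U)^{-1} U X'y with U = diag(|beta|), keeps the noise variance `sigma2` fixed at an assumed value, and runs on synthetic data. The function name `sparse_em` is hypothetical.

```python
import numpy as np

def sparse_em(X, y, sigma2=0.05, n_iter=50):
    """EM-style sparse linear regression under a Jeffreys hyperprior (sketch).

    sigma2 is an assumed, fixed noise variance; a fuller treatment would
    estimate it as well.
    """
    p = X.shape[1]
    # Start from the ordinary least-squares solution.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        # E-step summary: current coefficient magnitudes act as per-feature scales.
        U = np.diag(np.abs(beta))
        # M-step: reweighted ridge-like solve; small coefficients shrink towards zero.
        lhs = sigma2 * np.eye(p) + U @ X.T @ X @ U
        beta = U @ np.linalg.solve(lhs, U @ X.T @ y)
    return beta

# Synthetic data: only descriptors 0 and 3 are relevant.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 6))
true_beta = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0])
y = X @ true_beta + rng.normal(scale=0.05, size=60)

beta = sparse_em(X, y)
print(np.round(beta, 3))
```

Descriptors whose coefficients collapse to (numerically) zero can be dropped before fitting the final model; the update needs no separately tuned sparsity parameter beyond the noise variance.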
Sparse Bayesian variable selection [Figure: plot with axis label "Descriptor"]
Highlights Optimum nonlinear modelling • Bayesian regularized neural networks working well • Linear sparse feature detection and modelling • Nonlinear Bayesian feature detection and modelling using radial basis function regression • Use of sparse Bayesian methods in neural networks under study
Highlights Models built • Blood-brain barrier partitioning • Drug intestinal absorption • Acute toxicity • Phase II metabolism - substrates and inhibitors (Flinders medical school collaboration) - SVM • Several drug target models - e.g. farnesyl transferase
Blood-brain barrier model • Topological descriptors, 3 hidden nodes • Training set 85 compounds, test set 21 compounds
Intestinal absorption QSAR model • Property-based descriptors, 5 hidden nodes (optimum model)
Acute toxicity model • Burden index/binned charge descriptors, 8 hidden nodes • Training set 450 compounds, external test set 53 compounds
COX 1 and 2 QSAR and selectivity • Built QSAR models for cyclooxygenase 1 and 2, and S0, using a large data set from Tom Stockfisch at Accelrys (454 compounds obtained from http://www.accelrys.com/references/datasets/) • Used atomistic (A), Burden eigenvalue (B) and charge fingerprint (C) descriptors together with a Bayesian regularized neural net to build the models • Compared MLR with a Bayesian neural net with 3 nodes in the hidden layer
COX 1 and 2 QSAR and selectivity [Figure: Selectivity of cyclooxygenase 1 and 2 inhibitors]
Selectivity Index So QSAR Model • MLR: R^2 = 0.77, Q^2 = 0.69 • BRANN (3 nodes): R^2 = 0.92, Q^2 = 0.74
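For readers unfamiliar with the two statistics, the sketch below shows how a fitted R^2 and a cross-validated Q^2 can be computed for a multiple linear regression (MLR) model. It assumes leave-one-out cross-validation, which the slides do not specify, and uses synthetic data rather than the COX data set; all function names are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares MLR with an intercept column."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(X, y):
    """Q^2 from leave-one-out predictions (assumed cross-validation scheme)."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        # Refit without compound i, then predict the held-out compound.
        coef = fit_mlr(X[mask], y[mask])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    return r_squared(y, preds)

# Synthetic descriptors and activities (not the COX data).
rng = np.random.default_rng(3)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=30)

r2 = r_squared(y, predict_mlr(fit_mlr(X, y), X))
q2 = q_squared_loo(X, y)
print(round(r2, 3), round(q2, 3))
```

R^2 measures how well the model reproduces the data it was fitted to, while Q^2 measures how well it predicts compounds it has not seen, so a large gap between the two is a sign of overfitting.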