A Research Council’s Basic Technology Research Programme A new paradigm for virtual screening
Background • Cross research council endeavour • administered by EPSRC • Funding for research to create a new technology • Change the way we do science • Underpin the future industrial base
Atom based modelling: QSAR & QSPR • Almost all modelling techniques are based on atomistic descriptions of molecules • Although these techniques have been successful over several decades, they have disadvantages • poor scaling characteristics • lack of a solid physical justification, e.g. scoring functions • interpretation difficult due to abstract nature of many descriptors • tendency to produce high dimensional models
Improved molecular modelling? • Can we define a more parsimonious and explicit description of molecules than has so far been achieved using atomistic models? • leading to better prediction AND a clearer understanding of the properties of molecules and how they arise
A non-atom based approach • We are developing an alternative approach in which molecules are described by their surfaces [Figure: benzodiazepine analogues]
A non-atom based approach • The approach is based on calculation of a set of local properties at or near the molecular surface • the local molecular electrostatic potential (MEP) • the local ionisation energy (LIE, IE_L) • the local electron affinity (LEA, EA_L) • the local polarisability (LP, α_L)
Calculation of the surface properties • Molecules defined as isodensity surfaces • using semi-empirical AM1 electron density • can also be defined using a shrink-wrap or a marching cube algorithm • Fitted to a spherical harmonic expansion • the shape of the shrink-wrapped surface, or • the four local properties • MEP, LIE, LEA & LP
Describing surface shape: spherical harmonic expansion • The accuracy of the surface description is a function of the order N of the expansion • The greater N, the larger the computational penalty
Advantages of this approach • This gives a completely analytical description of the molecule’s shape & the 4 local properties • intermolecular binding properties & chemical reactivity • Spherical harmonics can be truncated at low orders for fast QSAR scans (HTS), fast superposition of molecules & rapid calculation of similarity indices • for ligands (MW < 750), N = 6-8 • for peptides & proteins (MW > 5,000), N = 25-30
Putative resolutions for in silico screening • For ligands N=6 • For receptors N=25
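As a rough illustration of how the cost grows with N: an expansion that keeps every term of degree l = 0..N has (N + 1)^2 coefficients per expanded function. The sketch below assumes that convention (the slides do not say whether the sum starts at l = 0), and `n_coefficients` is a hypothetical helper name, not part of ParaSurf.

```python
def n_coefficients(order: int) -> int:
    """Number of (l, m) terms with 0 <= l <= order and -l <= m <= l."""
    if order < 0:
        raise ValueError("order must be non-negative")
    # Each degree l contributes 2l + 1 terms; the total is (order + 1) ** 2.
    return sum(2 * l + 1 for l in range(order + 1))


# Orders quoted in the talk for ligands (6-8) and proteins (25-30).
for n in (6, 8, 25, 30):
    print(n, n_coefficients(n))
```

Under this assumption a ligand at N = 6 needs 49 coefficients per function, while a receptor at N = 25 needs 676.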
Application to QSAR & QSPR • Several classes of QSAR/QSPR descriptors can be derived from the local properties, including: • the spherical harmonics coefficients for constant order N • the number of coefficients is independent of the number of atoms in a molecule • the critical points for each surface property • maxima, minima & saddle points • the distribution of field intensities at the molecular surface • four fields with local intensities varying between molecules • sample using grid points? • the surface integrals for each field
Public domain datasets • Small: Consensus Set of 74 drug molecules (diverse); QSAR set (31 CoMFA steroids) • Medium: WDI subset (2,400 compounds); Harvard ChemBank dataset (2,000 compounds) • Large: WDI (50,000); Maybridge (50,000)
An example grid of surface points A grid is placed on this molecular surface in order to reduce the number of surface points from 4038 to 55
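The slide does not say how the grid selects its 55 points. One simple possibility, shown here purely as an assumption, is to bin the surface points into cubic cells and keep one representative per occupied cell. The point cloud, the cell size and the name `grid_reduce` are all made up for illustration.

```python
import numpy as np


def grid_reduce(points: np.ndarray, cell: float) -> np.ndarray:
    """Keep one representative surface point per occupied cubic grid cell."""
    # Integer cell index of every point along x, y and z.
    cells = np.floor(points / cell).astype(np.int64)
    # Index of the first point that falls into each occupied cell.
    _, first = np.unique(cells, axis=0, return_index=True)
    return points[np.sort(first)]


# Synthetic stand-in for a molecular surface: random points on a unit sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(4038, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

reduced = grid_reduce(pts, cell=0.5)
print(len(pts), "->", len(reduced))
```

A coarser cell size gives fewer retained points, so the cell size plays the role of the resolution knob here.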
Gradient flows & molecular surface property graphs • Characterize the behaviour of a property f : S → ℝ on a molecular surface S, in terms of a directed graph G on S derived from the gradient vector field dx/dt = grad f(x) • The molecular surface property graph G is defined by • Vertices(G) = fixed points of grad f = critical points of f • Edges(G) = stable and unstable manifolds of the saddle points
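On a triangulated surface the vertices of such a graph can be located with a discrete test: walk around the ring of neighbours of each mesh vertex and count how often f(neighbour) - f(vertex) changes sign. This is a minimal sketch of that standard discrete classification, not necessarily the method used in the project. The octahedron mesh and the test function are toy assumptions, and ties in f are assumed not to occur.

```python
from collections import Counter
from typing import Dict, List


def classify_vertex(f: Dict[int, float], v: int, ring: List[int]) -> str:
    """Classify v from the signs of f(neighbour) - f(v) around its ring.

    `ring` must list the neighbours of v in cyclic order.
    """
    higher = [f[u] > f[v] for u in ring]
    # Index i - 1 wraps to the last entry for i = 0, closing the cycle.
    changes = sum(higher[i] != higher[i - 1] for i in range(len(higher)))
    if changes == 0:
        return "minimum" if higher[0] else "maximum"
    return "regular" if changes == 2 else "saddle"


# Toy closed surface: an octahedron (topologically a sphere).
coords = {0: (0, 0, 1), 1: (0, 0, -1), 2: (1, 0, 0),
          3: (0, 1, 0), 4: (-1, 0, 0), 5: (0, -1, 0)}
rings = {0: [2, 3, 4, 5], 1: [2, 5, 4, 3], 2: [0, 3, 1, 5],
         3: [0, 4, 1, 2], 4: [0, 5, 1, 3], 5: [0, 2, 1, 4]}

# Toy property: height, slightly tilted so that no two vertices tie.
f = {v: z + 0.1 * x + 0.01 * y for v, (x, y, z) in coords.items()}

kinds = Counter(classify_vertex(f, v, rings[v]) for v in coords)
chi = kinds["maximum"] - kinds["saddle"] + kinds["minimum"]
print(dict(kinds), "Euler characteristic:", chi)
```

The edges of the graph would additionally require tracing gradient lines out of each saddle, which this sketch does not attempt.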
Example Molecule Allopurinol
Allopurinol RGB surfaces • LIE encoded on the red channel • LEA encoded on the green channel • LP or MEP encoded on the blue channel
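A colour encoding of this kind can be sketched by rescaling each property to the 0..255 range of one channel. The linear min-max scaling and the synthetic property values below are assumptions; the slides do not state how the channels were scaled.

```python
import numpy as np


def to_channel(values: np.ndarray) -> np.ndarray:
    """Linearly rescale one property to the 0..255 range of a colour channel."""
    lo, hi = values.min(), values.max()
    if hi == lo:  # constant property: avoid dividing by zero
        return np.zeros(values.shape, dtype=np.uint8)
    return np.round(255 * (values - lo) / (hi - lo)).astype(np.uint8)


def rgb_encode(lie: np.ndarray, lea: np.ndarray, blue: np.ndarray) -> np.ndarray:
    """Stack LIE (red), LEA (green) and LP or MEP (blue) into an (n, 3) array."""
    return np.stack([to_channel(lie), to_channel(lea), to_channel(blue)], axis=1)


# Synthetic stand-ins for property values at 1000 surface points.
rng = np.random.default_rng(3)
colours = rgb_encode(rng.normal(size=1000), rng.normal(size=1000), rng.normal(size=1000))
print(colours[:3])
```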
Critical points of allopurinol • 8 maxima • 7 minima • 13 saddles • No. of maxima – no. of saddles + no. of minima = Euler characteristic χ(S) = 2
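The counts can be checked against this relation directly; a closed surface with the topology of a sphere must give 2. The function name is a hypothetical helper.

```python
def euler_characteristic(n_max: int, n_saddle: int, n_min: int) -> int:
    """Maxima minus saddles plus minima of a property on a closed surface."""
    return n_max - n_saddle + n_min


# Critical-point counts reported for allopurinol.
chi = euler_characteristic(n_max=8, n_saddle=13, n_min=7)
print(chi)  # 2
```

A value other than 2 would signal a missed or spurious critical point, so the relation doubles as a consistency check on the critical-point search.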
Distribution based descriptors • 34 descriptors were measured, including • maximum field intensity • minimum field intensity • mean field intensity • range of field intensities • variance of field intensities • The principal components of the descriptors were calculated to provide a set of orthogonal descriptors derived from the local properties at the molecular surface
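A minimal sketch of this pipeline, assuming only the five descriptors named on the slide per field (20 in total rather than the full 34) and random numbers in place of real surface properties. The principal components are obtained from a singular value decomposition of the mean-centred descriptor matrix.

```python
import numpy as np


def field_descriptors(values: np.ndarray) -> np.ndarray:
    """Max, min, mean, range and variance of one field over the surface points."""
    return np.array([values.max(), values.min(), values.mean(),
                     values.max() - values.min(), values.var()])


def principal_components(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project mean-centred rows of X onto the leading principal axes."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


# Synthetic data: 50 molecules x 4 fields x 200 surface points.
rng = np.random.default_rng(0)
fields = rng.normal(size=(50, 4, 200))
X = np.array([np.concatenate([field_descriptors(f) for f in mol]) for mol in fields])

scores = principal_components(X, n_components=6)
print(X.shape, scores.shape)
```

The score columns are mutually orthogonal by construction, which is what makes them usable as a decorrelated descriptor set.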
Other distribution based descriptors • Moments • 1st: mean • 2nd: variance • 3rd: skewness • 4th: kurtosis • > 4th: higher moments as required • Overlapping Gaussians • Kernel density procedure
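The first four moments can be computed as below. This sketch assumes population (biased) moments, standardised skewness, and non-excess kurtosis (a normal distribution gives 3); the slides do not say which conventions were used.

```python
import numpy as np


def moment_descriptors(values: np.ndarray) -> dict:
    """Mean, variance, skewness and kurtosis of a field over the surface."""
    mean = values.mean()
    centred = values - mean
    var = np.mean(centred ** 2)
    std = np.sqrt(var)
    return {
        "mean": float(mean),
        "variance": float(var),
        "skewness": float(np.mean(centred ** 3) / std ** 3),
        "kurtosis": float(np.mean(centred ** 4) / std ** 4),
    }


# Tiny symmetric example, so the skewness is exactly zero.
m = moment_descriptors(np.array([-1.0, 0.0, 1.0]))
print(m)
```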
Correlation matrix for properties of allopurinol

        LIE     LEA     LP      MEP
LIE     1       0.44    0.26    0.39
LEA     0.44    1       0.58    0.47
LP      0.26    0.58    1       -0.1
MEP     0.39    0.47    -0.1    1
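A matrix of this form can be obtained by treating each property as a variable sampled at the surface points. The values below are synthetic, so the numbers will not match the allopurinol figures; the coupling between the first two rows is an arbitrary choice to produce a non-trivial correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_points = 500

# Rows: LIE, LEA, LP, MEP at each surface point (synthetic stand-ins).
props = rng.normal(size=(4, n_points))
props[1] += 0.5 * props[0]  # make "LEA" partly follow "LIE"

# np.corrcoef treats each row as one variable and returns a 4 x 4 matrix.
corr = np.corrcoef(props)
print(np.round(corr, 2))
```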
Physical-Property Mapping • Maybridge used as the “chemistry” dataset • Use the top six principal components to train a 100 × 100 Kohonen net (unsupervised training) • 2,105 compounds selected from the World Drug Index as real drugs used as the drug dataset
[Figure: Physical property map. A Kohonen net is trained on the “chemistry” dataset; the “drugs” are shown on the map.]
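The mapping step can be sketched with a small self-organising (Kohonen) map. This is a generic textbook version with a Gaussian neighbourhood and linearly decaying learning rate, not the project's implementation. It uses a 10 × 10 grid and random six-dimensional data in place of the 100 × 100 net and the real principal components, and all parameter values are assumptions.

```python
import numpy as np


def best_matching_unit(weights: np.ndarray, x: np.ndarray) -> tuple:
    """Grid index (row, col) of the node whose weight vector is closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    r, c = np.unravel_index(np.argmin(dist), dist.shape)
    return int(r), int(c)


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a rows x cols Kohonen map without labels."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        decay = 1.0 - t / n_iter
        lr = lr0 * decay              # learning rate shrinks linearly
        sigma = sigma0 * decay + 0.5  # neighbourhood radius shrinks too
        x = data[rng.integers(len(data))]
        bmu = np.array(best_matching_unit(weights, x))
        d2 = np.sum((grid - bmu) ** 2, axis=-1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        # Pull every node towards x, weighted by its grid distance to the BMU.
        weights += lr * h[..., None] * (x - weights)
    return weights


rng = np.random.default_rng(42)
chemistry = rng.normal(size=(500, 6))      # synthetic: 6 components per compound
drugs = rng.normal(loc=0.5, size=(50, 6))  # synthetic stand-in for the drug set

som = train_som(chemistry, rows=10, cols=10)
drug_cells = [best_matching_unit(som, d) for d in drugs]
print(drug_cells[:5])
```

Training uses only the "chemistry" set; the drugs are then located on the finished map through their best matching units.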
Surface-integral models • P = target property • A_i = area of triangle i • n_tri = number of triangles
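The model equation itself is not reproduced in this transcript. The definitions suggest a sum over the triangles of the surface, each weighted by its area, so the sketch below assumes the form P = sum over i = 1..n_tri of f(local properties of triangle i) × A_i. The function `f`, its coefficients and the two-triangle mesh are hypothetical placeholders.

```python
import numpy as np


def triangle_areas(vertices: np.ndarray, triangles: np.ndarray) -> np.ndarray:
    """Area A_i of every triangle, from the cross product of two edges."""
    a, b, c = (vertices[triangles[:, k]] for k in range(3))
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)


def surface_integral(local_props: np.ndarray, areas: np.ndarray, f) -> float:
    """Assumed model form: P = sum_i f(local_props[i]) * A_i."""
    return float(np.sum(f(local_props) * areas))


# Toy mesh: a unit square split into two triangles.
vertices = np.array([[0.0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]])
triangles = np.array([[0, 1, 2], [0, 2, 3]])
areas = triangle_areas(vertices, triangles)

# Hypothetical per-triangle values of MEP, LIE, LEA and LP.
local_props = np.array([[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]])
coeffs = np.array([1.0, -1.0, 0.5, 2.0])  # made-up coefficients for a linear f

P = surface_integral(local_props, areas, lambda p: p @ coeffs)
total_area = surface_integral(local_props, areas, lambda p: np.ones(len(p)))
print(P, total_area)
```

With f identically 1 the sum reduces to the total surface area, which is a convenient sanity check on the mesh.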
Free energies & enthalpies of hydration, free energies of solvation for n-octanol & chloroform
Surface comparison • Two different approaches: • Using spherical harmonic molecular surfaces [Ritchie and Kemp 1999; J. Comp. Chem. 20(4) 383-395; University of Aberdeen] • Partial molecular alignment via local structure analysis [Robinson, Lyne and Richards 2000; J. Chem. Inf. Comput. Sci. 40(2) 503-512; University of Oxford]
Voting pairs provide possible local alignments Try all possible voting pairs to produce a large number of alignments. The choice of voting pairs can have a critical effect on the quality of the surface alignment.
Example alignments [Figure: four example alignments, labelled 1 to 4]
ParaSurf v1.0 • Surfaces: isodensity surfaces (shrink wrap, marching cube); surfaces fitted to spherical harmonics • Properties: MEP, LIE, LEA and LP; encoded at points on the surface; encoded as spherical harmonic expansions
GRID Computing • ParaSurf compiled on: SGI IRIX, Windows, Linux (SUSE), IBM AIX • Future platforms: SUN Solaris • GRID enabling at Portsmouth, Southampton and Oxford
Summary [Diagram: contributions to compound screening, by site] • Critical features (Portsmouth) • Pattern matching on surfaces (Southampton/Oxford) • Molecular surfaces (Aberdeen) • QM properties on surface (Erlangen) • Data reduction and QSAR (Portsmouth) • Spherical harmonic representation (Aberdeen)
Conclusions • Properties can be calculated at the surface of molecules • These properties can be RGB encoded • The properties are local • Descriptor sets derived from these properties can be used for robust QSPR & QSAR models • The algorithms will soon be available commercially for use in virtual high throughput screening
ParaSurf – in silico Screening Technology • Basic Technology Funding for October 2003 to September 2004 • Proof of concept studies • Consortia building networking • Academic partners • University of Portsmouth • University of Erlangen • University of Southampton • University of Aberdeen • University of Oxford