Molecular Modeling: Comparative Molecular Field Analysis (CoMFA) C372 Dr. Kelsey Forsythe
CoMFA • Cramer and Milne (1979) • Compared molecules by alignment and field generation • Wold (1986) • Proposed using PLS instead of PCA for problems with far more variables than compounds (1000’s of non-orthogonal field “variables”), correlating field values with activities • Cramer, Patterson and Bunce (1988) • Introduced CoMFA
CoMFA Assumptions • Activity is directly related to structural properties of the system • Structural properties are determined by non-bonding forces
Outline of CoMFA • Hypothesize mechanism for binding • Structure of binding site • Most important/difficult • Find equilibrium geometry • Construct lattice or grid of points • Compute interaction of probe with molecule at each point • Apply PLS • Predict
CoMFA Structural Focus • Hypothesize mechanism for binding • Structure of active site and/or common pharmacophore between all compounds • Most important/difficult • Structural errors propagate to later stages • Superpose structures • SEAL • Similarity index
CoMFA Structural Focus • [Figure: poor alignment vs. better alignment]
CoMFA Equilibrium Geometry • Find equilibrium geometry • Ab initio, semi-empirical or molecular mechanics • Choice of method depends on • System size • Required accuracy
CoMFA Lattice Construction • Construct lattice or grid of points for field analysis • Example: steroid (1 representative conformer shown), 14 x 11 x 7 = 1078 points
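The lattice step can be sketched in a few lines of Python. This is a minimal illustration, not the slide author's procedure: the 14 x 11 x 7 point counts come from the slide, while the origin and the 2.0 Å spacing are assumed values chosen only for the example.

```python
from itertools import product

def build_lattice(origin, counts, spacing):
    """Return the (x, y, z) coordinates of a regular rectangular lattice."""
    ox, oy, oz = origin
    nx, ny, nz = counts
    return [
        (ox + i * spacing, oy + j * spacing, oz + k * spacing)
        for i, j, k in product(range(nx), range(ny), range(nz))
    ]

# Point counts are from the slide; origin and spacing are assumptions.
lattice = build_lattice(origin=(-13.0, -10.0, -6.0), counts=(14, 11, 7), spacing=2.0)
print(len(lattice))  # 1078
```

Every aligned compound is evaluated on the same lattice, so the field values at a given grid point are directly comparable across compounds.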
CoMFA Field Data Generation • Compute interaction of probe with molecule at each point • Interaction is typically non-covalent (e.g. non-bonding forces) • Steric, electrostatic and hydrophobic • Probe depends on interaction • Kim et al. • H+ (electrostatic) • CH3 (steric) • H2O (hydrophobic)
CoMFA Field Data Generation • Compute interaction of probe with molecule at each point • Ncalc = Ngrid x Ncmpds x Nprobes
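A rough sketch of the field calculation at one grid point follows. It assumes a Lennard-Jones 6-12 term for the steric field and a Coulomb term for the electrostatic field, with energies truncated at a cutoff. The atom parameters, the cutoff of 30 kcal/mol, and the compound and probe counts in the final arithmetic are illustrative assumptions, not values from the slides.

```python
import math

COULOMB = 332.06  # kcal*Å/(mol*e^2), for charges in e and distances in Å

def probe_energies(point, atoms, probe_charge=1.0, cutoff=30.0):
    """Steric (Lennard-Jones 6-12) and electrostatic (Coulomb) probe energies at one grid point.

    atoms: list of (x, y, z, partial_charge, epsilon, r_min) tuples.
    The parameter values used below are illustrative, not a real force field.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, q, eps, r_min in atoms:
        r = max(math.dist(point, (x, y, z)), 1e-3)  # guard against r = 0 on an atom
        ratio6 = (r_min / r) ** 6
        steric += eps * (ratio6 * ratio6 - 2.0 * ratio6)
        electrostatic += COULOMB * probe_charge * q / r
    # Truncate the very large values that occur close to nuclei.
    steric = min(steric, cutoff)
    electrostatic = max(-cutoff, min(electrostatic, cutoff))
    return steric, electrostatic

# Hypothetical two-atom "molecule" probed at a single grid point.
atoms = [(0.0, 0.0, 0.0, -0.4, 0.1, 3.5), (1.5, 0.0, 0.0, 0.4, 0.1, 3.5)]
steric, electrostatic = probe_energies((4.0, 0.0, 0.0), atoms)

# Ncalc = Ngrid x Ncmpds x Nprobes, with assumed compound and probe counts.
n_grid, n_cmpds, n_probes = 1078, 30, 3
n_calc = n_grid * n_cmpds * n_probes
print(n_calc)  # 97020
```

Even this modest example needs close to 10^5 probe evaluations, which is why the number of field variables far exceeds the number of compounds.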
Outline of CoMFA • Apply PLS • Problem has far more field variables/descriptors than compounds • Sieve out the most important field components (as in PCA) • Use them in regression
QSAR/QSPR-Regression Types • Partial Least Squares • Cross-validation determines number of descriptors/components to use • Derive equation • Use bootstrapping and t-test to test coefficients in QSAR regression
QSAR/QSPR-Regression Types • Partial Least Squares (a.k.a. Projection to Latent Structures) • Regression of a regression • Provides insight into variation in x’s (bi,j’s, as in PCA) AND y’s (ai’s) • The ti’s are orthogonal • M = (# of field points OR molecules, whichever is smaller)
QSAR/QSPR-Regression Types • PLS is NOT multiple regression (MR) or principal component regression (PCR) in practice • PLS is MR with cross-validation • PLS is faster • Couples the target representation (QSAR generation) and component generation, whereas in PCR the PCA step and the regression are separate • PLS is well suited to multivariate problems
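The coupling of component generation and regression can be seen in a compact single-response PLS (PLS1) sketch using the NIPALS algorithm. This is one common formulation and not necessarily the one used in any particular CoMFA package. The data are random numbers standing in for field values, and the function names are hypothetical.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Fit single-response PLS (PLS1) with the NIPALS algorithm.

    Returns (b, T, x_mean, y_mean): regression coefficients for the centred
    descriptors, the score vectors t_i as columns of T, and the means.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean          # mean-centred residual blocks
    W, P, q, T = [], [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                        # weights: covariance of each variable with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                           # latent variable (score) per compound
        tt = t @ t
        p = Xr.T @ t / tt                    # X loadings
        c = (yr @ t) / tt                    # y loading
        Xr = Xr - np.outer(t, p)             # deflate X
        yr = yr - c * t                      # deflate y
        W.append(w); P.append(p); q.append(c); T.append(t)
    W, P, T = np.column_stack(W), np.column_stack(P), np.column_stack(T)
    b = W @ np.linalg.solve(P.T @ W, np.array(q))
    return b, T, x_mean, y_mean

def predict(X, b, x_mean, y_mean):
    """Predict activities from descriptors using the fitted coefficients."""
    return (X - x_mean) @ b + y_mean

# Synthetic stand-in data: 8 compounds, 50 field variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 50))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=8)

b, T, x_mean, y_mean = pls1_nipals(X, y, n_components=3)
gram = T.T @ T
off_diagonal = gram - np.diag(np.diag(gram))
print(np.abs(off_diagonal).max())            # near zero: the t_i are orthogonal
```

Each weight vector is built from the covariance between the field variables and the activity, so the components are chosen with the regression target in view, unlike PCA followed by a separate regression.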
CoMFA PLS Regression • Sij - field value for jth probe at ith grid point • cij - regression weight for Sij
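The equation for this slide did not survive extraction. Given the definitions of Sij and cij, it presumably has the usual linear form sketched below. The intercept C and the summation limits are assumptions.

```latex
\text{Activity} = C + \sum_{i=1}^{N_{\text{grid}}} \sum_{j=1}^{N_{\text{probes}}} c_{ij}\, S_{ij}
```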
3-D QSAR (CoMFA) Post-Qualifications • Confidence in Regression • TSS - Total Sum of Squares • ESS - Explained Sum of Squares • RSS - Residual Sum of Squares
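The three sums of squares can be computed as in the sketch below. The activity values are made up for illustration. The fitted values come from an ordinary least-squares line with intercept, for which TSS = ESS + RSS holds.

```python
def sums_of_squares(y_obs, y_fit):
    """Return (TSS, ESS, RSS) for observed and fitted activities."""
    y_mean = sum(y_obs) / len(y_obs)
    tss = sum((y - y_mean) ** 2 for y in y_obs)            # total variation
    ess = sum((f - y_mean) ** 2 for f in y_fit)            # variation explained by the model
    rss = sum((y - f) ** 2 for y, f in zip(y_obs, y_fit))  # variation left unexplained
    return tss, ess, rss

# Hypothetical activities and their least-squares fitted values.
y_obs = [1.0, 3.0, 2.0, 4.0]
y_fit = [1.3, 2.1, 2.9, 3.7]

tss, ess, rss = sums_of_squares(y_obs, y_fit)
r_squared = 1.0 - rss / tss
print(round(tss, 6), round(ess, 6), round(rss, 6), round(r_squared, 6))  # 5.0 3.2 1.8 0.64
```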
3-D QSAR (CoMFA) Post-Qualifications • Cross-validation • Bootstrapping • Reassign ‘wrong’ activity
3-D QSAR (CoMFA) Post-Qualifications • Standard Deviation in Error Prediction • N - Number of observations • No penalty for exclusion/inclusion of latent variables
3-D QSAR (CoMFA) Post-Qualifications • Standard Deviation in Predictions • PRESS (Predictive Error Sum of Squares) • N - Number of observations • c - Number of latent variables used in regression • Choose ‘c’ such that adding another latent variable (c + 1) no longer lowers sPRESS by at least 5%
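The formula on this slide was lost in extraction. The sketch below assumes the commonly quoted form sPRESS = sqrt(PRESS / (N - c - 1)), which penalizes each added latent variable, and applies the 5% rule to pick c. The PRESS values are invented for illustration, and the function names are hypothetical.

```python
import math

def press(y_obs, y_cv):
    """Predictive error sum of squares from cross-validated predictions."""
    return sum((y - p) ** 2 for y, p in zip(y_obs, y_cv))

def s_press(press_value, n_obs, n_components):
    """Standard deviation of prediction error (assumed form, see lead-in)."""
    return math.sqrt(press_value / (n_obs - n_components - 1))

def choose_components(press_by_c, n_obs, min_gain=0.05):
    """press_by_c[k] is PRESS for k + 1 components.

    Keep adding components while each one lowers sPRESS by at least min_gain.
    """
    best = 1
    s_prev = s_press(press_by_c[0], n_obs, 1)
    for c in range(2, len(press_by_c) + 1):
        s_next = s_press(press_by_c[c - 1], n_obs, c)
        if s_next > s_prev * (1.0 - min_gain):
            break
        best, s_prev = c, s_next
    return best

# Hypothetical PRESS values for 1 to 4 latent variables, 30 compounds.
print(choose_components([12.0, 8.0, 7.9, 7.8], n_obs=30))  # 2
```

In the example the third component barely changes PRESS, so the N - c - 1 denominator makes sPRESS rise and the selection stops at two components.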
3-D QSAR (CoMFA) Post-Qualification • Randomly re-assign activities to compounds • Compare predictability of ‘wrong’ regressions with true regression • Determine random correlation • Determine efficacy of ‘true’ regression
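The activity-scrambling test can be sketched as follows. To keep the example short, a one-descriptor least-squares fit stands in for the full PLS model. That substitution, the data, and the function names are assumptions made for illustration.

```python
import random

def r_squared(x, y):
    """r^2 of a one-descriptor least-squares fit (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def y_randomization(x, y, n_trials=200, seed=0):
    """Compare the true r^2 with r^2 values obtained after shuffling activities."""
    rng = random.Random(seed)
    true_r2 = r_squared(x, y)
    scrambled = []
    for _ in range(n_trials):
        y_perm = list(y)
        rng.shuffle(y_perm)                  # assign 'wrong' activities to compounds
        scrambled.append(r_squared(x, y_perm))
    # Fraction of scrambled models that do at least as well as the true one.
    fraction = sum(r2 >= true_r2 for r2 in scrambled) / n_trials
    return true_r2, scrambled, fraction

# Hypothetical descriptor values and activities.
x = [float(i) for i in range(10)]
y = [1.1, 2.9, 5.2, 7.1, 8.8, 11.2, 12.9, 15.1, 17.0, 18.9]
true_r2, scrambled, fraction = y_randomization(x, y)
print(true_r2, max(scrambled), fraction)
```

If scrambled models often match the true one, the original correlation is likely due to chance. A small fraction supports the ‘true’ regression.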
3-D QSAR (CoMFA) Dependencies • Active compounds in data set • Grid size • Energy model • Probe groups (# and type)
Application • Nilsson, J., De Jong, S., Smilde, A. K. Multiway Calibration in 3D QSAR. J. Chemometrics 1997, 11, 511-524. • Multilinear PLS applied to a group of benzamides interacting with the dopamine D3 receptor subtype (anti-schizophrenia drugs)
Application • Aligned set of 30 benzamides and naphthamides • Regions indicate principal components
Application Field Generation • 5 Modes • Molecular (1) • 30 molecules • Field (3) • X, Y and Z • Probes (1) • Steric (C) • Hydrophobic (H2O) • Electrostatic (H+)
Application Pre-Qualifications • Scaling (Not Applied Here) • Unit Variance (Auto Scaling) • Ensures equal statistical weights (initially) • Mean Centering
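Although the slide notes that scaling was not applied in this application, the two pre-processing options are easy to illustrate. The sketch below mean-centers each descriptor column and divides by its sample standard deviation. The data and the function name are hypothetical.

```python
from statistics import mean, stdev

def autoscale(columns):
    """Mean-center each descriptor column and scale it to unit variance."""
    scaled = []
    for col in columns:
        m, s = mean(col), stdev(col)
        # A constant column carries no information; map it to zeros.
        scaled.append([(v - m) / s if s > 0 else 0.0 for v in col])
    return scaled

# Two hypothetical descriptors on very different numerical scales.
columns = [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]]
scaled = autoscale(columns)
print(scaled[0])  # [-1.0, 0.0, 1.0]
```

After autoscaling, both descriptors start with the same variance, so neither dominates the regression purely because of its units.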
Application Principal Components • First 4 PCs in space of original descriptors
Application Regression • X - Principal Components • B - Regression coefficients
Application Steric Plot • Y = x1b1 + … + xibi • Guide placement of substituents on novel compounds depending on the value of Y (log(Ki)) desired
Application Validation • Cross Validation • Leave-One-Out • External Predictions • Test Set • 21 compounds
Application Validation • Cross Validation • Leave-One-Out (ypred from 29) • External Predictions • Test Set (ypred from regression)
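Leave-one-out cross-validation predicts each compound from a model built on all the others, which is what "ypred from 29" means for a 30-compound training set. The sketch below uses a one-descriptor least-squares line as a stand-in for the PLS model. The data are invented, and q² = 1 - PRESS/TSS is the commonly reported cross-validated statistic rather than something stated on the slide.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def leave_one_out(x, y):
    """Predict each compound from a model fitted to all other compounds."""
    preds = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        preds.append(slope * x[i] + intercept)
    return preds

def q_squared(y, preds):
    """Cross-validated r^2: 1 - PRESS / TSS."""
    my = sum(y) / len(y)
    press = sum((a - p) ** 2 for a, p in zip(y, preds))
    tss = sum((a - my) ** 2 for a in y)
    return 1.0 - press / tss

# Hypothetical descriptor values and activities.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.9, 4.1]
preds = leave_one_out(x, y)
print(q_squared(y, preds))
```

External test-set predictions differ only in that the model is fitted once on the whole training set and then applied to compounds it has never seen.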
3-D QSAR (CoMFA) Potential Pitfalls • Sensitivity to binding structure • Hydrophobicity not well-quantified • Sensitivity to Nlatent • Relation between latent variables NOT intuitive • Test compounds should not differ significantly in properties from training set • Low S/N (too many useless field variables)
CoMFA Assumptions • Activity is directly related to structural properties of the system • Dynamical corrections? • Structural properties are determined by non-bonding forces • Covalent • Hydrophobic
Advanced CoMFA • SRD (Smart Region Definition) • Local sets of variables/grid values display similar behavior in response to structural changes • Reduce M grid points to one focal point or seed • Use “distance” cutoff (nearest, next nearest etc.) to define reduced set of field points • Reduced PLS • Use only high-weight PCs in regression
Other QSAR-based Methods • HQSAR • Convert 3D --> 2D string • Generate random collections of string elements • CoMSIA (Comparative Molecular Similarity Indices Analysis) • Wprobe,k = +1 (charge), +1 (hydrophobicity), +1 (h-bond acceptor), +1 (h-bond donor); probe radius 1 Å
References • Cramer III, R. D., Patterson, D. E., Bunce, J. D. Comparative Molecular Field Analysis (CoMFA). 1. Effect of Shape on Binding of Steroids to Carrier Proteins. J. Am. Chem. Soc. 1988, 110, 5959-5967. • Hansch, C., Leo, A. Exploring QSAR: Fundamentals and Applications in Chemistry and Biology. American Chemical Society (1995). • Leach, A. R. Molecular Modelling: Principles and Applications. Prentice Hall, New York (2001).
Additional Resources • The QSAR and Modelling Society (http://www.pharma.ethz.ch/qsar) • Quantitative Structure Activity Relationships (Journal)
Additional Resources • SYBYL-Molecular Modeling Software, 6.9, Tripos Incorporated, 1699 S. Hanley Rd. St. Louis, Mo. 63144, USA • GRID, Goodford, P. J. Molecular Discovery Ltd, University of Oxford, England