QSAR/QSPR: the Universal Approach to the Prediction of Properties of Chemical Compounds and Materials. V.A.Palyulin, I.I.Baskin, N.S.Zefirov. Department of Chemistry. Moscow State University.
"Every attempt to employ mathematical methods in the study of chemical questions must be considered profoundly irrational and contrary to the spirit of chemistry. If mathematical analysis should ever hold a prominent place in chemistry - an aberration which is happily almost impossible - it would occasion a rapid and widespread degeneration of that science." A. Comte, 1798-1857
Fundamental Problem in Chemistry: Evaluation of relationships between the structures of chemical compounds and their properties or biological activity
QSAR/QSPR: General Approach Model F: A=F(S) Predictivity ΔA Prediction
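The general scheme A = F(S) and the fit statistics quoted on later slides (n, R, s, F) can be illustrated with a minimal sketch. It assumes the simplest possible F, a straight line in a single descriptor fitted by least squares, and it assumes the conventional regression definitions of R, s and F, since the slides do not state how these were computed. The data values are hypothetical and serve only to make the example run.

```python
import math

def fit_line(s_values, a_values):
    """Least-squares fit of A = intercept + slope * S for one descriptor S."""
    n = len(s_values)
    mean_s = sum(s_values) / n
    mean_a = sum(a_values) / n
    sxx = sum((x - mean_s) ** 2 for x in s_values)
    sxy = sum((x - mean_s) * (y - mean_a) for x, y in zip(s_values, a_values))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_s
    return intercept, slope

def fit_statistics(s_values, a_values, intercept, slope, k=1):
    """Return (R, s, F) for a model with k descriptors (assumed definitions)."""
    n = len(a_values)
    mean_a = sum(a_values) / n
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(s_values, a_values))   # residual sum of squares
    sst = sum((y - mean_a) ** 2 for y in a_values)   # total sum of squares
    r = math.sqrt(1.0 - sse / sst)                   # correlation coefficient
    s = math.sqrt(sse / (n - k - 1))                 # standard error of the fit
    f = ((sst - sse) / k) / (sse / (n - k - 1))      # Fisher criterion
    return r, s, f

# Hypothetical toy data: one descriptor value and one property value per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

intercept, slope = fit_line(descriptor, activity)
r, s, f = fit_statistics(descriptor, activity, intercept, slope)

# "Prediction" step: apply the fitted F to a structure outside the training set.
predicted = intercept + slope * 7.0
print(f"n = {len(activity)}, R = {r:.3f}, s = {s:.3f}, F = {f:.0f}")
print(f"predicted A for S = 7.0: {predicted:.2f}")
```

Real QSAR/QSPR models use many descriptors and, as later slides show, non-linear models such as neural networks; the single-descriptor line is chosen here only to keep the arithmetic visible.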
PROPERTIES Physico-chemical properties: Boiling points, melting points, density, viscosity, surface tension, solubility in various solvents, lipophilicity, magnetic susceptibility, retention indices, dipole moments, enthalpy of formation, etc. Biological activity: IC50, EC50, LD50, MEC, ILS, etc.
DESCRIPTORS
Topological indices: Connectivity indices (Randić, $\chi$; Kier-Hall, $^m\chi^v$; solvation indices, $^m\chi^s$), Wiener W and expanded Wiener, Balaban J, Gutman indices, Hosoya, Merrifield-Simmons indices, indices based on local invariants, informational indices, …
Fragmental descriptors: The number of fragments of various size (chains, cycles, branched fragments) in a molecule with several levels of classification of atoms
Physico-chemical descriptors: Indices based on atomic charges and electronegativities, atomic inductive constants, VdW volume and surface, H-bond descriptors, Lipophilicity (Log P), …
Quantum-mechanical
3D
Usp. Khim. (Russ. Chem. Rev.), 57 (3), 337-366 (1988)
Randić Index ($\chi$)
$\chi = \sum_{\text{bonds}} \frac{1}{\sqrt{v_i v_j}}$
$\chi = \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{6}} + \frac{1}{\sqrt{2}} = 2.27$
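The bond sum above can be checked with a short sketch. It takes v_i and v_j to be the numbers of non-hydrogen neighbors of the two atoms in each bond, which is the usual reading of the Randić index. The structure drawing did not survive on this slide, so the 2-methylbutane skeleton used below is an assumption: it was chosen because its four bonds give exactly the terms 1/√3, 1/√3, 1/√6 and 1/√2 of the worked example.

```python
import math

def randic_index(adjacency):
    """Sum 1/sqrt(v_i * v_j) over all bonds of a hydrogen-suppressed graph.

    adjacency maps each atom label to the list of its bonded neighbors;
    the degree v of an atom is the length of that list.
    """
    total = 0.0
    for atom, neighbours in adjacency.items():
        for other in neighbours:
            if atom < other:  # visit each bond once
                v_i = len(neighbours)
                v_j = len(adjacency[other])
                total += 1.0 / math.sqrt(v_i * v_j)
    return total

# Assumed example skeleton: 2-methylbutane, C1-C2(-C5)-C3-C4.
skeleton = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}

chi = randic_index(skeleton)
print(round(chi, 2))
```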
Prediction of Non-Specific Solvation Enthalpy of Organic Compounds
Plots: solvation enthalpy (kJ/mol); vaporization enthalpy (kJ/mol)
n = 141, R = 0.985, s = 2.1; n = 528, R = 0.989, s = 2.0
μ – dipole moment
$^1\chi^S$ – 1st-order solvation topological index
Zi – period number (measure of atom size)
δi – number of non-hydrogen neighbors
Dokl. Akad. Nauk, 1993, 331 (2), 173-176
The scheme of the design of new topological indices (TIs)
Construction of graph matrices and their storage
Selection of functions
Selection of fragments
Construction of topological indices: a) using matrices; b) using already constructed TIs
The set of constructed TIs for QSAR/QSPR studies
Prediction of Diffusion of Small Molecules in Polymers
Plot: log D pred. vs. log D exp.
n = 14, R = 0.989, s = 0.103, F = 145
D – diffusion coefficient (cm²/s)
Nat – number of non-hydrogen atoms
min ρHOMO – minimal HOMO π-electron density
Extended and inverted extended Wiener indices
Dokl. Akad. Nauk, 1994, 337 (2), 211-214
Sulfenamide Vulcanization Accelerators
Resistance to preliminary vulcanization (min): n = 12, R = 0.989, s = 0.004, F = 444
Vulcanization rate constant (min⁻¹): n = 12, R = 0.990, s = 0.15, F = 213
Maximum torque increase (Nm): n = 12, R = 0.989, s = 0.054, F = 134
N – number of non-hydrogen atoms
Maximum carbon LUMO π-electron density
Sm – molecular electronegativity
Indices based on atomic induction effect parameters
Dokl. Akad. Nauk, 1993, 333 (2), 189-192
Prediction of Mutagenicity of Substituted Biphenyls
Plots (two models): ln (Nhis+) pred. vs. ln (Nhis+) exp.
n = 19, R = 0.95, s = 0.69, F = 35
n = 19, R = 0.94, s = 0.75, F = 39.3
Nhis+ – number of revertants
Fr1-3 – number of fragments (Fr1, Fr2, Fr3)
d1 – minimum squared C-atom LUMO contribution
d2 – minimum squared N-atom LUMO contribution
d3 – maximum C-atom free valence index
d4 – average O-atom free valence index
Dokl. Akad. Nauk, 1993, 332 (5), 587-589
Fragmental Descriptors The numbers of fragments of various kinds and sizes (chains, cycles, branched fragments) in a molecule, with several levels of classification of atoms. For each molecule, hundreds of fragmental descriptors can be computed. If a structure-property data set is large enough to allow building statistically significant models, then any topological index can be replaced with a set of substructural (or fragmental) descriptors.
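How such fragment counts arise can be sketched for the simplest fragment type, unbranched chains. The sketch below counts every chain of k connected atoms in a hydrogen-suppressed molecular graph. It is an illustration under simplifying assumptions, not the authors' program: it ignores the atom classification levels mentioned above, it does not handle cycles or branched fragments, and the 2-methylbutane skeleton is a hypothetical example input.

```python
def count_chain_fragments(adjacency, max_atoms):
    """Count chains (simple paths) containing 1..max_atoms atoms.

    adjacency maps integer atom labels to lists of bonded neighbors.
    Every chain is found from both of its ends, so a chain is counted
    only when its first label is smaller than its last one.
    """
    counts = {k: 0 for k in range(1, max_atoms + 1)}

    def extend(path):
        k = len(path)
        if k == 1 or path[0] < path[-1]:
            counts[k] += 1
        if k == max_atoms:
            return
        for neighbour in adjacency[path[-1]]:
            if neighbour not in path:  # keep the path simple (no revisits)
                extend(path + [neighbour])

    for atom in adjacency:
        extend([atom])
    return counts

# Hypothetical example input: 2-methylbutane, C1-C2(-C5)-C3-C4.
skeleton = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}

fragment_counts = count_chain_fragments(skeleton, max_atoms=5)
print(fragment_counts)
```

Each entry of the resulting dictionary is one fragmental descriptor. Distinguishing atom types within the chains would multiply the number of descriptors, which is how a single molecule can yield hundreds of them.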
Boiling point [1] (diverse set of 885 compounds) fragment types p1, p2, p3, p4, p5, p6, c3, c4, c5, c6, s4, s5, s6
Architecture of the Neural Device for Direct QSAR
Neural device in application to the propane molecule (atoms 1: CH3, 2: CH2, 3: CH3)
SENSOR FIELD (each sensor detects the number of the attached hydrogen atoms)
EYE 1 ("looks" at atoms): (1), (2), (3)
EYE 2 ("looks" at bonds): (2,1), (2,3), (3,2), (1,2)
BRAIN
Baskin, I. I.; Palyulin, V. A.; Zefirov, N. S., J. Chem. Inf. Comput. Sci., 37, 715 (1997)
EXAMPLES OF THE DIRECT STRUCTURE-PROPERTY CORRELATIONS Baskin, I. I.; Palyulin, V. A.; Zefirov, N. S., J. Chem. Inf. Comput. Sci., 37, 715 (1997)
New approach in QSAR: Neural Quantitative Structure-Conditions-Property Relationships R – correlation coefficient; St and Sv – RMSE for the training and validation sets
Molecular Field Topology Analysis (MFTA)
Construction of Molecular Supergraph
Construction of Descriptor Matrix
Local descriptors: Electrostatic, Steric, Lipophilic, Hydrogen bonding, Stereochemical, Topological
Model building
Generation of novel promising structures
Palyulin, V. A.; Radchenko, E. V.; Zefirov, N. S., J. Chem. Inf. Comput. Sci., 40, 659 (2000)
Local Descriptors
• Sufficient coverage of major interaction types
• Easy extension of the descriptor set
• Electrostatic
  • Gasteiger's atomic charge Q (electronegativity equalization)
  • Absolute atomic charge Qa = abs(Q)
  • Sanderson's electronegativity
  • Electrotopological state ETS (Hall, Mohney, Kier)
• Steric
  • Bondi's van der Waals radius R
  • Atomic contribution to the molecular van der Waals surface S
  • Relative steric accessibility A = S/Sfree
• Lipophilic
  • Atomic lipophilicity contribution La (environment-dependent - Ghose, Crippen)
  • Group lipophilicity Lg (atom and attached hydrogens)
• Hydrogen bonding
  • Hydrogen bond donor (Hd) and acceptor (Ha) ability of an atom (Abraham)
• Stereochemical
  • Local stereochemical indicator variables
• Topological
  • Site occupancy factors for atoms Pa and bonds Pb (1 if a feature is present)
Affinity of substituted 2,5-diazabicyclo[2.2.1]heptanes to nicotinic acetylcholine receptor Training set: 31 compounds R1 = H, Me, CH2CN R2 = R = H, Me, F, Cl, Br, OH, NH2, OMe, CN, CH2NH2, CONH2, NO2, PhCOO
Affinity of substituted 2,5-diazabicyclo[2.2.1]heptanes to nicotinic acetylcholine receptor Ki – inhibition of competitive binding MED – minimum effective dose (hot plate test) Predicted lg(1/Ki) Experimental
Affinity of substituted 2,5-diazabicyclo[2.2.1]heptanes to nicotinic acetylcholine receptor Ki – inhibition of competitive binding R Q Lg Ha
Affinity of substituted 2,5-diazabicyclo[2.2.1]heptanes to nicotinic acetylcholine receptor
Construction of novel potentially active structures
Total generated structures: 171
R1 = Me, Et, CN, Pr, i-Pr, t-Bu, Ph, where R = CH3, Cl, Br, NO2
R2 = Me, Et, Pr, CN, i-Pr, t-Bu
5 best structures wrt lg(1/Ki): 4.01, 3.69, 3.44, 3.66, 3.69
Activity range in training set: -3.41 ... 2.05
Bradycardic activity of 3,7,9,9-tetraalkyl- 3,7-diazabicyclo[3.3.1]nonanes Training set: 26 compounds R1, R2 = Me, Pr, i-Pr, Bu, i-Bu, C5H11, C6H13, C10H21, CH2-c-Pr, CH2-c-C6H11, CH=CH2, CH2CH2CH=CH2 R3, R4 = Me, Et, Pr, Bu, -(CH2)3-, -(CH2)4-, -(CH2)5-
Bradycardic activity of 3,7,9,9-tetraalkyl-3,7-diazabicyclo[3.3.1]nonanes SR75 – ability to decrease pacemaker pulse frequency (target effect) F75 – ability to decrease myocardium contraction force (side effect) SelF – selectivity wrt F FRP75 – ability to increase refractory period (side effect) SelFRP – selectivity wrt FRP
Bradycardic activity of 3,7,9,9-tetraalkyl-3,7-diazabicyclo[3.3.1]nonanes SR75 – ability to decrease pacemaker pulse frequency (target effect) Predicted Q R Experimental
Bradycardic activity of 3,7,9,9-tetraalkyl-3,7-diazabicyclo[3.3.1]nonanes SelF – selectivity of antiarrhythmic activity wrt myocardium contraction force Predicted Q R Experimental Ha
Bradycardic activity of 3,7,9,9-tetraalkyl-3,7-diazabicyclo[3.3.1]nonanes Construction of novel potentially active structures Total generated structures: 105 R1, R3 = Me, Et, Pr, i-Pr, t-Bu, R2 = Me, Et, Pr, i-Pr, t-Bu 5 best structures wrt SelF: 70.75, 70.74, 63.83, 63.82, 63.12 Activity range in training set: 0.4 ... 177
Conclusions QSAR/QSPR (quantitative structure-activity/property relationships) approaches can be considered universal techniques for modeling and predicting nearly any property of chemical compounds and many properties of materials. Some properties of materials can be predicted from the structure of the small molecules used as additives (e.g. antioxidants). A number of polymer properties have been modelled as functions of the chemical structure of the monomeric unit (e.g. glass transition temperature, molar heat capacity for the liquid and solid states, dielectric constant, refractive index).
The group of molecular design Academician N. S. Zefirov – Head of Organic Chemistry Division Dr. V.A. Palyulin – Head of Group Dr. I.I. Baskin Dr. A.A.Oliferenko Dr. E.V.Radchenko Dr. M.I.Skvortsova Dr. I.G.Tikhonova Dr. M.S.Belenikin Dr. A.A.Ivanov Dr. A.Yu.Zotov S.A.Pisarev A.A.Ivanova A.A.Melnikov