In the name of GOD

In the name of GOD. Basic Steps of QSAR/QSPR Investigations. M.H. FATEMI, Mazandaran University, mhfatemi@umz.ac.ir. QSAR: Quantitative Structure-Activity Relationships. Can one predict activity (or properties, in QSPR) simply on the basis of knowledge of the structure of the molecule?

Presentation Transcript


  1. In the name of GOD Basic Steps of QSAR/QSPR Investigations M.H. FATEMI Mazandaran University mhfatemi@umz.ac.ir

  2. QSAR • Quantitative Structure-Activity Relationships • Can one predict activity (or properties, in QSPR) simply on the basis of knowledge of the structure of the molecule? • In other words, if one systematically changes a component, will it have a systematic effect on the activity?

  3. What is QSAR? • A QSAR is a mathematical relationship between a biological activity of a molecular system and its geometric and chemical characteristics. • QSAR attempts to find a consistent relationship between biological activity and molecular properties, so that these “rules” can be used to evaluate the activity of new compounds.

  4. Why QSAR? • Placing 10 different groups at 4 positions of a benzene ring would require synthesizing $10^4$ = 10,000 compounds. • Solution: synthesize a small number of compounds and, from their data, derive rules to predict the biological activity of the other compounds.

  5. QSXR, where X = A (activity), X = P (property), or X = R (retention) • Model: $X = b_0 + b_1 D_1 + b_2 D_2 + \dots + b_n D_n$ • $b_i$: regression coefficients • $D_i$: descriptors • $n$: number of descriptors
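
The linear model on this slide can be fitted by ordinary least squares. The following is a minimal sketch, assuming NumPy is available; the function names `fit_mlr` and `predict_mlr` and the six-compound data set are hypothetical and made up for illustration, not taken from the presentation.

```python
import numpy as np

def fit_mlr(D, x):
    """Least-squares fit of x = b0 + b1*D1 + ... + bn*Dn.

    D: (m, n) descriptor matrix for m compounds and n descriptors.
    x: (m,) observed activity, property, or retention values.
    Returns the coefficient vector [b0, b1, ..., bn].
    """
    D = np.asarray(D, dtype=float)
    x = np.asarray(x, dtype=float)
    A = np.column_stack([np.ones(len(D)), D])  # leading column of ones gives b0
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    return coef

def predict_mlr(coef, D):
    """Evaluate b0 + sum(bi * Di) for each row of D."""
    return coef[0] + np.asarray(D, dtype=float) @ coef[1:]

# Synthetic, made-up data: 6 compounds, 2 descriptors, noise-free response
D = np.array([[1.0, 0.5], [2.0, 1.5], [3.0, 0.2],
              [4.0, 2.2], [5.0, 1.1], [6.0, 3.0]])
x = 1.0 + 0.5 * D[:, 0] - 2.0 * D[:, 1]
coef = fit_mlr(D, x)
```

Because the synthetic response is generated without noise, the fit should recover the generating coefficients; real QSAR data would of course leave residuals.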

  6. History

  7. Early Examples • Hammett (1930s-1940s)

  8. Hammett (cont.) • Now suppose we have a related series • σ reflects sensitivity to the substituent • ρ reflects sensitivity to a different system
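
For reference, the Hammett relation is commonly written as below, with σ the substituent constant and ρ the reaction constant. The notation $K_X$ and $K_H$ (equilibrium or rate constants of the substituted and unsubstituted compounds) is the usual textbook convention and is assumed here, since the transcript does not preserve the slide's own equation.

```latex
\log \frac{K_X}{K_H} = \rho \, \sigma_X
```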

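  9. Free-Wilson Analysis • $\log(1/C) = \sum_i a_i + \mu$, where $\log(1/C)$ is the predicted activity ($C$ being the concentration that produces the measured response), $a_i$ is the contribution of group $i$, and $\mu$ is the activity of the reference compound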

  10. Free-Wilson example: activity of analogs • Log 1/C = -0.30 [m-F] + 0.21 [m-Cl] + 0.43 [m-Br] + 0.58 [m-I] + 0.45 [m-Me] + 0.34 [p-F] + 0.77 [p-Cl] + 1.02 [p-Br] + 1.43 [p-I] + 1.26 [p-Me] + 7.82 • Problems: at least two substituent positions are necessary, and the model can only predict new combinations of the substituents used in the analysis.
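
The Free-Wilson equation above can be evaluated by summing the contributions of the groups present. This is a minimal sketch: the coefficients are copied from the slide, while the function name `free_wilson_activity` and the convention that an unsubstituted position contributes zero are assumptions made for illustration.

```python
# Group contributions and reference activity copied from the slide's equation
CONTRIBUTIONS = {
    "m-F": -0.30, "m-Cl": 0.21, "m-Br": 0.43, "m-I": 0.58, "m-Me": 0.45,
    "p-F": 0.34, "p-Cl": 0.77, "p-Br": 1.02, "p-I": 1.43, "p-Me": 1.26,
}
REFERENCE_ACTIVITY = 7.82

def free_wilson_activity(substituents):
    """Return the predicted log(1/C) for a list of substituents such as ["m-Br", "p-Cl"].

    Assumption: a position left unsubstituted adds nothing to the sum.
    Substituents outside the fitted set cannot be predicted, so they raise an error.
    """
    unknown = [s for s in substituents if s not in CONTRIBUTIONS]
    if unknown:
        raise KeyError(f"no fitted contribution for: {unknown}")
    return REFERENCE_ACTIVITY + sum(CONTRIBUTIONS[s] for s in substituents)

# Example: meta-bromo, para-chloro analog -> 0.43 + 0.77 + 7.82
predicted = free_wilson_activity(["m-Br", "p-Cl"])
```

The error raised for an unknown group mirrors the limitation stated on the slide: only new combinations of already-analyzed substituents can be predicted.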

  11. Hansch Analysis • $\log(1/C) = a\,\pi + b\,\sigma + c$, where $\pi(X) = \log P_{RX} - \log P_{RH}$ and $P$ is the octanol/water partition coefficient • This is also a linear free-energy relationship

  12. Applications of QSAR • 1-Drug design • 2-Prediction of Chemical toxicity • 3-Prediction of environmental activity • 4-Prediction of molecular properties • 5-Investigation of retention mechanism

  13. Steps in QSPR/QSAR • Structure entry & molecular modeling → Descriptor generation → Feature selection → Construct model (MLRA or CNN) → Model validation

  14. Data set selection • 1-Structural similarity of the studied molecules • 2-Data collected under the same conditions • 3-The data set should be as large as possible

  15. Steps in QSPR/QSAR • Structure entry & molecular modeling → Descriptor generation → Feature selection → Construct model (MLRA or CNN) → Model validation

  16. Introduction to Molecular Descriptors • Molecular descriptors are numerical values that characterize properties of molecules • They encode structural features of molecules as numbers • They vary in the complexity of the encoded information and in compute time • Examples: • Physicochemical properties (empirical) • Values from algorithms, such as 2D fingerprints

  17. Classical Classification of Molecular Descriptors • Constitutional, topological: 2-D structural formula • Geometrical: 3-D shape and structure • Quantum chemical • Physicochemical • Hybrid descriptors

  18. Topological Indexes: Examples • Wiener index • Counts the number of bonds between pairs of atoms and sums the distances between all pairs • Molecular connectivity indexes • Randić branching index • Defines the “degree” of an atom as the number of adjacent non-hydrogen atoms • The bond connectivity value is the reciprocal of the square root of the product of the degrees of the two atoms in the bond. • The branching index is the sum of the bond connectivities over all bonds in the molecule. • Chi indexes: introduce valence values to encode sigma, pi, and lone-pair electrons
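
The two indexes described above can be computed directly from a hydrogen-suppressed molecular graph. The sketch below assumes the molecule is given as an adjacency dictionary of heavy atoms; the function names and the two example graphs (n-butane and isobutane carbon skeletons) are illustrative choices, not part of the presentation.

```python
from collections import deque
from math import sqrt

def wiener_index(adj):
    """Sum of shortest-path bond counts over all unordered pairs of atoms."""
    total = 0
    for start in adj:
        # Breadth-first search gives the bond-count distance from `start`
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end

def randic_index(adj):
    """Sum over bonds of 1 / sqrt(degree(u) * degree(v))."""
    total = 0.0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each bond only once
                total += 1.0 / sqrt(len(adj[u]) * len(adj[v]))
    return total

# Hydrogen-suppressed carbon skeletons (atom labels are arbitrary integers)
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

For these skeletons the Wiener index is 10 for n-butane and 9 for isobutane, so the more branched isomer gets the smaller value; the Randić index shows the same ordering (about 1.914 versus 1.732).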

  19. Electronic descriptors • Electronic interactions play a very important role in controlling molecular properties. • Electronic descriptors are calculated to encode aspects of the structure that are related to the electrons • Electronic interaction is a function of the charge distribution on a molecule

  20. Physicochemical Properties Used in this QSAR • Liquid solubility Sw,L in mg/L and mmol/m³ • Octanol-water partition coefficient Kow • Liquid vapor pressure Pv,L in Pa • Henry’s law constant Hc in Pa∙m³/mol • Boiling point

  21. Steps in QSPR/QSAR • Structure entry & molecular modeling → Descriptor generation → Feature selection → Construct model (MLRA or CNN) → Model validation

  22. Feature Selection • E.g. comparing faces first requires the identification of key features. • How do we identify these? • The same applies to molecules.

  23. Objective feature selection • After descriptors have been calculated for each compound, the set must be reduced to one that is as information-rich but as small as possible • 1-Deleting constant or near-constant descriptors • 2-Pair correlation cut-off selection • 3-Cluster analysis • 4-Principal component analysis • 5-K correlation analysis
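
The first two reductions listed above can be sketched in a few lines of NumPy. The thresholds `var_tol` and `r_max`, the function name `objective_filter`, and the rule of keeping the earlier column of a correlated pair are all assumptions chosen for illustration; the presentation does not specify them.

```python
import numpy as np

def objective_filter(D, var_tol=1e-8, r_max=0.95):
    """Return the indices of descriptor columns that survive two filters.

    1) Columns whose variance is at or below `var_tol` are dropped
       (constant or near-constant descriptors; the threshold is scale-dependent).
    2) Scanning left to right, a column is dropped if its absolute Pearson
       correlation with any already-kept column is `r_max` or higher.
    """
    D = np.asarray(D, dtype=float)
    varying = [j for j in range(D.shape[1]) if D[:, j].var() > var_tol]
    selected = []
    for j in varying:
        if all(abs(np.corrcoef(D[:, j], D[:, k])[0, 1]) < r_max for k in selected):
            selected.append(j)
    return selected

# Made-up descriptor table: 5 compounds, 4 descriptors
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
D = np.column_stack([
    base,                                 # column 0: kept
    np.full(5, 7.0),                      # column 1: constant, dropped
    2.0 * base + 1.0,                     # column 2: perfectly correlated with column 0
    np.array([5.0, 1.0, 4.0, 2.0, 3.0]),  # column 3: weakly correlated, kept
])
kept = objective_filter(D)
```

Which member of a correlated pair to keep is a design choice; a common alternative is to keep the descriptor that is easier to interpret or cheaper to compute.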

  24. Variable reduction • Principal Component Analysis

  25. Principal Component • $PC_1 = a_{1,1}x_1 + a_{1,2}x_2 + \dots + a_{1,n}x_n$ • $PC_2 = a_{2,1}x_1 + a_{2,2}x_2 + \dots + a_{2,n}x_n$ • Keep only those components that possess the largest variation • PCs are orthogonal to each other
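
One way to obtain the coefficients $a_{k,j}$ is a singular value decomposition of the mean-centered descriptor matrix. The sketch below assumes NumPy and mean-centering (the slide does not say whether the descriptors are centered or scaled); the function name `pca` is hypothetical.

```python
import numpy as np

def pca(D, n_components):
    """Principal components of a descriptor matrix D of shape (m, n).

    Returns (loadings, scores, variances):
      loadings[k] holds the coefficients a_{k+1,1..n} of one component,
      scores[:, k] holds that component's value for every compound,
      variances[k] is the sample variance captured by that component.
    """
    D = np.asarray(D, dtype=float)
    centered = D - D.mean(axis=0)  # assumption: center each descriptor
    # Rows of Vt are orthonormal directions, ordered by decreasing singular value
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    loadings = Vt[:n_components]
    scores = centered @ loadings.T
    variances = (s ** 2) / (len(D) - 1)
    return loadings, scores, variances[:n_components]
```

Keeping the first few rows of `loadings` corresponds to retaining the components with the largest variation, and the orthonormal rows reflect the statement that the PCs are orthogonal to each other.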

  26. Subjective Feature Selection • The aim is to reach the optimal model • 1-Search all possible models (best MLR) • 2-Forward, backward & stepwise methods • 3-Genetic algorithm • 4-Mutation and selection uncover models • 5-Cluster significance analysis • 6-Leaps & bounds regression
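
As an illustration of the forward method in the list above, the following sketch greedily adds the descriptor that most reduces the residual sum of squares of a linear fit. It assumes NumPy; the function names, the use of RSS as the evaluation function, and stopping at a fixed `max_features` are illustrative assumptions rather than the presentation's procedure.

```python
import numpy as np

def residual_sum_of_squares(D, y, cols):
    """RSS of the least-squares fit of y on an intercept plus the chosen columns."""
    A = np.column_stack([np.ones(len(y))] + [D[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_selection(D, y, max_features):
    """Greedy forward selection of descriptor columns.

    Starts from the empty set, and at each step adds the remaining column
    that gives the lowest RSS, until `max_features` columns are chosen.
    """
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    selected = []
    remaining = list(range(D.shape[1]))
    while remaining and len(selected) < max_features:
        best = min(remaining,
                   key=lambda j: residual_sum_of_squares(D, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

In practice the stopping rule is usually statistical (for example an F-test or a cross-validated error that stops improving) rather than a fixed count, and backward or stepwise variants also allow descriptors to be removed again.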

  27. Feature Selection • Most existing feature selection algorithms consist of: • A starting point in the feature space • A search procedure • An evaluation function • A criterion for stopping the search
