CZ3253: Computer Aided Drug design Lecture 6: QSAR part II Prof. Chen Yu Zong Tel: 6874-6877 Email: csccyz@nus.edu.sg http://xin.cz3.nus.edu.sg Room 07-24, level 7, SOC1, National University of Singapore.
Examples of QSAR Applications: Application of in silico technology to screen out potentially toxic compounds using expert and QSAR models
Commercial Software
• Commercial toxicity estimation packages are available to predict a variety of toxic endpoints, including mutagenicity, carcinogenicity, teratogenicity, skin and eye irritation, and acute toxicity:
  • DEREK (Deductive Estimation of Risk from Existing Knowledge) – www.chem.leeds.ac.uk/luk
  • HazardExpert – www.compudrug.com/hazard
  • CASE (Computer Automated Structure Evaluation) – www.multicase.com
  • TOPKAT (Toxicity Prediction by Computer Assisted Technology) – www.accelrys.com/products/topkat
  • OncoLogic – www.logichem.com
Pharma Algorithms: Providers of Databases, Predictors and Development Tools
Database (N):
• Log P: 10,000
• DMSO Solubility: 22,000
• pKa: 8,000
• Stability at pH < 2: 20,000
• Aqueous Solubility: 5,500
• Permeability (HIA): 1,000
• Active Transport: 500
• Pgp Transport: 1,000
• Oral Bioavailability (Human): 900
• LD50 Intraperitoneal: 36,000
• ...
Pharma Algorithms Development Tools
Algorithm Builder development platform:
• Data storage and manipulation
• Generation of fragmental descriptors
• Statistical procedures: MLR, PLS, PCA, Recursive Partitioning, HCA
• Tools for predictive algorithm development
Generation of Descriptors
[Figure: data table with one row per structure (Structure 1, Structure 2, ..., Structure N) and columns for the activity Y and the fragment descriptors F1, F2, F3, ..., FM]
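A minimal sketch of how a structures-by-fragments table of this kind could be assembled. It assumes each structure is given as a SMILES-like string and treats "fragments" as plain substrings, which is a deliberate simplification and not the Algorithm Builder fragmentation. The fragment list, the example structures, and the function names are all hypothetical.

```python
from typing import Dict, List

# Hypothetical fragment vocabulary (columns F1..FM).
# Plain substrings only: no real substructure matching is attempted.
FRAGMENTS: List[str] = ["C(=O)O", "N", "Cl", "c1ccccc1"]


def count_fragments(smiles: str, fragments: List[str]) -> List[int]:
    """Return one descriptor row: substring counts for each fragment."""
    return [smiles.count(frag) for frag in fragments]


def build_descriptor_matrix(
    structures: Dict[str, str], fragments: List[str]
) -> Dict[str, List[int]]:
    """Map each structure name to its row of fragment counts."""
    return {
        name: count_fragments(smiles, fragments)
        for name, smiles in structures.items()
    }


# Hypothetical input structures (rows Structure 1..N).
structures = {
    "acetic acid": "CC(=O)O",
    "chlorobenzene": "Clc1ccccc1",
    "glycine": "NCC(=O)O",
}

matrix = build_descriptor_matrix(structures, FRAGMENTS)
for name, row in matrix.items():
    print(name, row)
```

A real implementation would use a cheminformatics toolkit for substructure matching; substring counting is used here only to keep the example self-contained.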
“Causal” Descriptors
[Figure: atom chains of increasing fragment size and specificity, with example structures and the activity effects they capture]
• One-atom: non-specific (“topological”) effects (size, PSA)
• Three-atom (e.g. COOH, CONH): ionization, H-bonding
• Five-atom: reactivity, internal interactions
• Larger chains, ring scaffolds: similarity to natural compounds
Algorithm Development
• Graphical interface provides easy-to-use tools for programming complex algorithms
• Combine fragmental, descriptor and similarity based methods
• Use logical expressions, conditions and equations based on descriptors, sub-fragments, internal interactions or any other chemical criteria
• Combine multiple sub-algorithms into general algorithms
• Rapidly develop ‘custom’ filters incorporating ‘expert’ in-house or project-specific rules
Tox Effects in Drug Design
Tox Effect: Programs
• Acute (LD50): Topkat, AB/LD50 (our focus)
• Organ-specific effects: AB/Tox* (next version)
• Mutagenicity: Many programs, AB/Tox
• Reproductive effects: Many programs, AB/Tox*
• Carcinogenicity: Many programs, AB/Tox*
Existing Programs
[Figure: programs arranged under the column headings ADME, LD50 and Other; the combined programs are marked “Will consider these”]
• QSAR (descriptors): QikProp, TopKat
• Mixed (“manually” derived skeletons): DEREK, HAZARD Expert, COMPACT
• C-SAR (“statistical” skeletons): M-CASE, META
• Combined (combinations of above): AB/Oral %F, AB/LD50, AB/Tox
What Is LD50?
• A dose that kills 50% of animals during 24 hrs
• In drug design, used at pre-clinical stage
• In early stages, replaced with “reductionist” considerations
• Some scientists question its utility
Complexity of LD50
[Figure: approaches labelled “Reactivity + log P”, “Empirical knowledge” and “Empirical knowledge + simulations”, set against the groups Informatics, Toxicologists and PK Specialists]
Acute Tox in Drug Design
• Lead Selection: no tests performed; reactive groups discarded
• Lead Optimization: basal cytotoxicity tested; intra-cellular effects considered
• Pre-clinical Stage: animal tests are required; ADME effects considered
Is this good enough?
Acute Tox in Drug Design An LD50 Model for mouse (intraperitoneal administration) was developed using data from the RTECS database (35,000 compounds)
Distribution of Acute Effects
RTECS DB: mouse, intraperitoneal administration
[Figure: distribution for all compounds (N ~ 35,000) and for compounds with LD50 < 50 mg/kg (N = 4,099)]
• Extra-cellular effects may be “invisible” in cytotoxic assays
In Vivo vs. In Vitro
• IC50 cannot model LD50 when extra-cellular effects occur
How to Predict These Effects?
• LD50 involves much more than “log P + reactivity”
• “Reductionist” QSARs do not work
• Quality of Predictions = Knowledge of Specific Effects
• How much knowledge do we get?
How Much Knowledge?
[Figure: three approaches compared by the knowledge they yield across structural space, with regions marked Active and Inactive]
• QSAR Model: $\log(1/\mathrm{LD}_{50}) = \sum_i a_i x_i$
• Expert Deduction: little knowledge
• C-SAR + Deduction: more knowledge
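As a worked illustration, the sketch below reads the slide's QSAR equation as a linear sum of descriptor contributions, log(1/LD50) = a1·x1 + a2·x2 + ..., and evaluates it for one compound. The coefficients and descriptor values are hypothetical, chosen only so the arithmetic is easy to follow, and no intercept term is assumed because the slide shows none. LD50 is in whatever units the model was trained on.

```python
from typing import Sequence


def predict_log_inv_ld50(a: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate log10(1/LD50) as the sum of a_i * x_i."""
    if len(a) != len(x):
        raise ValueError("coefficients and descriptors must have equal length")
    return sum(ai * xi for ai, xi in zip(a, x))


def ld50_from_log_inv(y: float) -> float:
    """Convert y = log10(1/LD50) back to LD50 (same units as the model)."""
    return 10.0 ** (-y)


# Hypothetical fragment increments a_i and fragment counts x_i.
a = [0.5, -0.25, 1.0]
x = [2, 4, 1]

y = predict_log_inv_ld50(a, x)   # 0.5*2 - 0.25*4 + 1.0*1
ld50 = ld50_from_log_inv(y)
print(y, ld50)
```

A larger value of log(1/LD50) corresponds to a smaller LD50, that is, a more toxic compound.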
C-SAR + Deduction
• LD50 values are split into groups using fragmental descriptors from AB
• The most significant skeletons are “potential toxicophores”
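The grouping idea can be sketched as follows: split the compounds by presence or absence of each fragment and compare the mean toxicity of the two groups. This is a simplified stand-in for illustration, not the AB procedure; a real C-SAR analysis would also test statistical significance. The fragment labels and log(1/LD50) values are hypothetical.

```python
from statistics import mean
from typing import List, Set, Tuple

# Hypothetical data: (fragments present, log(1/LD50)) for each compound.
compounds: List[Tuple[Set[str], float]] = [
    ({"amine", "aryl"}, 3.0),
    ({"amine"}, 2.5),
    ({"acid"}, 1.0),
    ({"acid", "aryl"}, 1.5),
    ({"aryl"}, 2.0),
]


def split_by_fragment(
    data: List[Tuple[Set[str], float]], fragment: str
) -> Tuple[List[float], List[float]]:
    """Split toxicity values into compounds with and without the fragment."""
    with_frag = [y for frags, y in data if fragment in frags]
    without_frag = [y for frags, y in data if fragment not in frags]
    return with_frag, without_frag


def rank_fragments(data: List[Tuple[Set[str], float]]) -> List[Tuple[str, float]]:
    """Rank fragments by mean toxicity shift (with minus without), largest first."""
    all_fragments = set().union(*(frags for frags, _ in data))
    shifts = {}
    for frag in all_fragments:
        with_frag, without_frag = split_by_fragment(data, frag)
        if with_frag and without_frag:  # need both groups to compare
            shifts[frag] = mean(with_frag) - mean(without_frag)
    return sorted(shifts.items(), key=lambda item: item[1], reverse=True)


ranking = rank_fragments(compounds)
for frag, shift in ranking:
    print(f"{frag}: {shift:+.2f}")
```

Fragments at the top of the ranking are the candidates that an expert would then inspect as possible toxicophores.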
Specific Effects in AB/LD50
• More than 33,000 compounds with LD50 from RTECS DB
Low-Specific Effects
[Figure: arrows denote increasing toxicity]
• Small non-bases are least toxic
• Hydrophobic amines are most toxic
C-SAR + Deduction
• To get new knowledge, statistics must help deduction
• To use QSAR models, they must work in narrow structural spaces
[Figure: Efficacy Comparison of Expert Deduction and QSAR Model; axis labels Knowledge, Effort and Struct. Diversity]
QSAR Models in AB/LD50
1. Narrow structural spaces
2. Dynamic fragmentation
3. “Causal” parameters
What is novel?
• The novel features of the Pharma Algorithms approach are:
  • Combination of approaches used separately in earlier software, i.e. Expert Rules (e.g. DEREK), C-SAR (e.g. CASE) and QSAR (e.g. TOPKAT)
  • Reliable Confidence Intervals are generated from QSAR models (class-specific and global) that are derived using an automated multi-step process:
    • Chain fragmentation and PLS with multiple bootstrapping
    • Selection of best fragments with ‘stable’ increments
    • Derivation of multiple models from subsets of the training set to produce ranges of predictions
    • Selection of the best model to use for a particular compound by comparison of the different ranges
    • Calculation of the confidence interval from the range of predictions produced by the most appropriate model
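The idea of turning a range of predictions into an interval can be sketched with a plain bootstrap. This is an illustration under stated assumptions, not the Pharma Algorithms procedure: it swaps PLS on fragment descriptors for a one-descriptor least-squares line, uses bootstrap resamples as the "subsets of the training set", and takes a percentile interval over the resulting predictions. The training data and all function names are hypothetical.

```python
import random
from statistics import mean
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def bootstrap_predictions(
    xs: List[float], ys: List[float], x_new: float,
    n_models: int = 200, seed: int = 0,
) -> List[float]:
    """Fit one model per bootstrap resample; return sorted predictions at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    predictions: List[float] = []
    while len(predictions) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        slope, intercept = fit_line(bx, by)
        predictions.append(slope * x_new + intercept)
    return sorted(predictions)


def percentile_interval(
    sorted_preds: List[float], coverage: float = 0.9
) -> Tuple[float, float]:
    """Central interval covering roughly `coverage` of the predictions."""
    n = len(sorted_preds)
    lo = int((1.0 - coverage) / 2.0 * n)
    return sorted_preds[lo], sorted_preds[n - 1 - lo]


# Hypothetical training set: one descriptor x, activity y = log(1/LD50).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

preds = bootstrap_predictions(xs, ys, x_new=3.5)
low, high = percentile_interval(preds)
print(f"prediction range at x = 3.5: {low:.2f} to {high:.2f}")
```

A narrow range indicates that the resampled models agree for this compound; a wide range signals that the prediction depends heavily on which training compounds were included.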
Screening the Specs DB
• SPECS is a supplier of diverse compound screening collections
• A set (N = 14,902) was randomly selected (from > 200,000) and screened using the AB/LD50 toxicity predictor
• Calculation of LD50 for the set takes about 30 min on a standard Windows laptop
• Compounds were deemed “Toxic” if LD50 < 50 mg/kg
• Results:
  • Overall only 2.1% were “toxic” (i.e. 310 of 14,902)
  • As expected, a higher proportion (3.9%) of the bases (i.e. alkylamines) were toxic (i.e. 92 of 2,351)
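The screening step itself reduces to a threshold filter on predicted LD50 values. A minimal sketch, assuming the predictions are already available as a mapping from compound ID to LD50 in mg/kg; the compound IDs and values are hypothetical, and only the 50 mg/kg cut-off comes from the slide.

```python
from typing import Dict, List

# Cut-off stated on the slide: "Toxic" if LD50 < 50 mg/kg.
TOXIC_THRESHOLD_MG_PER_KG = 50.0


def flag_toxic(
    predicted_ld50: Dict[str, float],
    threshold: float = TOXIC_THRESHOLD_MG_PER_KG,
) -> List[str]:
    """Return IDs of compounds whose predicted LD50 is below the threshold."""
    return [cid for cid, ld50 in predicted_ld50.items() if ld50 < threshold]


def toxic_fraction(
    predicted_ld50: Dict[str, float],
    threshold: float = TOXIC_THRESHOLD_MG_PER_KG,
) -> float:
    """Fraction of the screened set flagged as toxic."""
    if not predicted_ld50:
        raise ValueError("no predictions supplied")
    return len(flag_toxic(predicted_ld50, threshold)) / len(predicted_ld50)


# Hypothetical predictions (mg/kg); a value of exactly 50.0 is not flagged.
predictions = {
    "cmpd-1": 12.0,
    "cmpd-2": 480.0,
    "cmpd-3": 49.9,
    "cmpd-4": 50.0,
    "cmpd-5": 1500.0,
}

toxic = flag_toxic(predictions)
fraction = toxic_fraction(predictions)
print(toxic, f"{fraction:.1%}")
```

The same function can be applied to a subset, such as the bases only, to compare proportions between compound classes.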
Most significant Toxic Skeletons
What We Have Learned So Far
• Screening for basal cytotoxicity is not enough
• The “C-SAR + Deductive” method opens new possibilities
• The extra-cellular effects can be estimated in silico
• Can we model in vivo toxicity?
Administration vs. ADME
[Figure: administration routes (OR, Sc, IP, IV) and the compartments Stomach, Intestine, Liver, Vein, and Tissue/organs (toxic action); ADME effects: dissolution, permeation, hydrolysis, metabolism]
Legend: OR – Oral; Sc – Subcutaneous; IP – Intraperitoneal; IV – Intravenous
Complexity of ADME
[Figure: labels “Simple descriptors”, “Simulations”, Informatics, ADME Specialists]
• “Simple descriptors” disregard many factors
• Can we simulate them in HT mode?
Oral %F Prediction in HT Mode
• Non-Batch Interface: reliability validated by the consistency of independent predictions
Cost/Benefit Considerations
• In silico bioavailability and toxicity predictions for compound collections are inexpensive to perform
• The value of predictions is variable; decisions still need to be made by expert scientists in a project context
• In silico tools can assist the expert in a detailed evaluation of ‘hits’, ‘leads’ and ‘candidates’, but there is a need for:
  • Predictions for a range of toxicity types:
    • LD50 (oral, i.v., s.c.)
    • Genotoxicity and Carcinogenicity
    • Organ-specific effects (e.g. hepatotoxicity)
  • Integration of the prediction software with databases containing the training data, so that the availability and behaviour of similar compounds can be checked
Drug Design General Principles
• Aim for low logP
• Aim for low M.Wt.
C. Hansch et al., ‘The Principle of Minimal Hydrophobicity in Drug Design’, J. Pharm. Sci., 1987, 76, 663
M.C. Wenlock et al., ‘Comparison of Physicochemical Property Profiles of Development and Marketed Oral Drugs’, J. Med. Chem., 2003, 46, 1250
Simulations in HT Screening
• HT Simulations aim at: High Activity = High %F + Low Tox
• “Reductionist” Methods: High Activity = Low %F + High Tox. Very rough estimations, assuming that activity increases with increasing log P and MWt
[Figure: two diagrams relating %F, Tox and Activity, one for each case]