A Web-Based Computational Tool for Combinatorial Library Design that Simultaneously Optimises Multiple Properties
Weifan Zheng, Sunny T. Hung, Joel T. Saunders, Stephen R. Johnson, George L. Seibel
A short paper: http://www-smi.stanford.edu/projects/helix/psb-online/
Outline
• Library Design - Problem Definition
• Criteria in Early Computational Techniques
• Important Developability Parameters
• Multifactorial Nature of Library Design
• PICCOLO
• Optimisation Protocol
• Individual Penalty Terms and Their Definition
• Snapshots of the Intranet-Based System
• Conclusions
Library Design - Problem Definition
[Figure: a 10 x 10 array of R1 and R2 reagents reduced to a 5 x 5 full-combination library (10 x 10 => 5 x 5). Which 5 x 5 should be chosen?]
Criteria Used in Early Computational Design Techniques
• Diverse design:
  • diversity analysis and void-filling
• Targeted design:
  • similarity to leads
  • docking to a binding site
  • predicted activity using QSAR/QualSAR models
  • pharmacophore models
Failure of Compounds in Development
• Poor biopharmaceutical properties, 39%
• Lack of efficacy, 29%
• Toxicity, 21%
• Market reasons, 6%
  - Venkatesh & Lipper, J. Pharm. Sci. 89, 145-154 (2000)
“an efficacious but non-absorbed agent is no better than a well absorbed but in-efficacious one”
  - Curatolo W., Pharm. Sci. Tech. Today 1, 387 (1998)
Developability Should Be Considered in Library Design
To avoid serious ADME liabilities as early as possible in the drug discovery process
• Empirical rules
  • Lipinski's rule of 5 (MW, clogP, #HD, #HA)
• Drug-likeness
  • Ajay & Murcko (JMC, 1998, 41, 3314-3324)
  • Sadowski & Kubinyi (JMC, 1998, 41, 3325-3329)
Some Fundamental Properties Contributing to Pharmacokinetics (PK)
• Aqueous solubility
• Membrane passive permeability
• Cytochrome P450 activities
• Plasma protein binding
• Efflux pumping and active transport
• ...
Factors That Are Optimised
• Similarity to leads
• Reagent diversity/coverage
• Product novelty with respect to the corporate compound inventory
• Lipinski parameters
• Liabilities against P450 enzymes
• Aqueous solubility; [Permeability]
• Molecular flexibility; MS redundancy; reagent price
PICCOLO: reagent PICking by COmbinatorial Library Optimisation
[Figure: starting from an initial library of R1 and R2 reagents, each iteration computes penalty scores (P450 activity, Lipinski properties, diversity) and moves to a better library, ending at the optimal library.]
The Size of the Solution Space is Huge
50 amines + 50 carboxylic acids
• Total number of compounds: 50 x 50 = 2500
• Total number of solutions for an 8 x 12 library: 50!/(8!42!) * 50!/(12!38!) = 6.52 x 10^19
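The counts on this slide can be checked with a few lines of Python. This is only an arithmetic check; the variable names are illustrative and not part of PICCOLO.

```python
from math import comb

# Reagent pool sizes and library shape taken from the slide.
n_amines, n_acids = 50, 50
rows, cols = 8, 12

# Every amine can be paired with every acid.
n_products = n_amines * n_acids

# Choose 8 of 50 amines and, independently, 12 of 50 acids.
n_libraries = comb(n_amines, rows) * comb(n_acids, cols)

print(n_products)
print(f"{n_libraries:.2e}")
```

Exhaustive enumeration of that many candidate libraries is clearly not feasible, which motivates a sampling approach.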
Stochastic Optimisation to Sample the Solution Space
[Flowchart]
• Reagent pool: randomly pick a 5x5 trial solution
• Enumerate the trial library
• Calculate penalty scores for the trial solution and save the scores
• Metropolis criteria met?
  • Y: save the trial solution
  • N: reject the trial solution
• Swap a fraction of reagents and repeat
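The loop in this flowchart can be sketched as a small Metropolis search. This is a minimal illustration under stated assumptions, not the PICCOLO implementation: the function name `metropolis_select`, the temperature schedule, the single-reagent swap (the slide says "a fraction of reagents"), and the toy penalty `mean_sum` are all hypothetical.

```python
import math
import random

def metropolis_select(pool_r1, pool_r2, n1, n2, penalty, steps=2000,
                      temperature=0.5, cooling=0.995, seed=0):
    """Pick n1 x n2 reagents that minimise penalty(r1_list, r2_list)."""
    rng = random.Random(seed)
    # Random initial selection from each reagent pool.
    current = (rng.sample(pool_r1, n1), rng.sample(pool_r2, n2))
    current_score = penalty(*current)
    best, best_score = current, current_score
    for _ in range(steps):
        # Copy the current selection and swap one reagent (assumed move size).
        trial = (list(current[0]), list(current[1]))
        side = rng.randrange(2)
        pool = pool_r1 if side == 0 else pool_r2
        chosen = trial[side]
        candidates = [r for r in pool if r not in chosen]
        if candidates:
            chosen[rng.randrange(len(chosen))] = rng.choice(candidates)
        trial_score = penalty(*trial)
        delta = trial_score - current_score
        # Metropolis criterion: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = current, current_score
        temperature *= cooling  # assumed geometric cooling schedule
    return best, best_score

def mean_sum(r1, r2):
    """Toy penalty: mean of a + b over the enumerated products."""
    return sum(a + b for a in r1 for b in r2) / (len(r1) * len(r2))

# 10 x 10 => 5 x 5, as in the problem definition slide.
best, best_score = metropolis_select(list(range(10)), list(range(10)), 5, 5, mean_sum)
print(sorted(best[0]), sorted(best[1]), best_score)
```

In a real design the toy penalty would be replaced by the weighted sum of the individual penalty terms.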
Perturbation Scheme
• Which R-group to perturb
  • bias toward the R-groups that need more sampling
• Which new reagent to pick
  • uniform sampling by cycling through the selected R-group list
• Which old reagent to kick out
  • randomly chosen
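One way to read the three choices above is sketched below. The slide does not say how "need more sampling" is measured, so the per-R-group `weights` argument is an assumption (here, pool size), and `make_perturber` is a hypothetical helper name.

```python
import random
from itertools import cycle

def make_perturber(pools, weights, seed=0):
    """pools: R-group name -> list of reagents; weights: R-group name -> bias."""
    rng = random.Random(seed)
    names = list(pools)
    # One endless iterator per R-group gives uniform coverage of its reagent list.
    cyclers = {name: cycle(pool) for name, pool in pools.items()}

    def perturb(selection):
        # 1. Which R-group: biased random choice (weighting is an assumption).
        name = rng.choices(names, weights=[weights[n] for n in names])[0]
        chosen = list(selection[name])
        if len(chosen) >= len(pools[name]):
            return selection  # nothing left to swap in
        # 2. Which new reagent: next one in cyclic order that is not selected.
        new = next(r for r in cyclers[name] if r not in chosen)
        # 3. Which old reagent leaves: chosen at random.
        chosen[rng.randrange(len(chosen))] = new
        return {**selection, name: chosen}

    return perturb

pools = {"R1": list(range(10)), "R2": list(range(100, 150))}
weights = {name: len(pool) for name, pool in pools.items()}
perturb = make_perturber(pools, weights)

selection = {"R1": [0, 1, 2], "R2": [100, 101, 102, 103]}
new_sel = perturb(selection)
print(new_sel)
```

Cycling through each list, rather than drawing at random, guarantees every reagent is offered as a candidate once per pass.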
Total Penalty Score is the Weighted Sum of Individual Penalty Terms
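The slide's formula did not survive in this transcript. A weighted sum as described in the title can be written as follows, where the weight symbol w_i is an assumed notation and the named terms are the ones defined on the following slides:

```latex
E(S) = \sum_{i} w_i \, E_i(S),
\qquad
E_i \in \{\, E_{sim},\ E_{sdiv},\ E_{pn},\ \dots \,\}
```

Here S is a candidate library (a selection of reagents), and a lower E(S) means a better library.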
Similarity to Leads
• Esim(S) = Daylight Tanimoto “distances” between all the compounds in a given library and the lead, averaged over the size of the library
• In case of multiple leads, the Tanimoto distance between a compound and the leads is defined as the nearest neighbour distance
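A minimal sketch of this term follows. It assumes fingerprints are represented as Python sets of "on" bit positions, a stand-in for Daylight fingerprints, and takes Tanimoto distance as 1 minus the Tanimoto coefficient. The function names are illustrative.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - |A & B| / |A | B| for two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union

def similarity_penalty(library, leads):
    """Average, over the library, of each compound's distance to its nearest lead."""
    nearest = [min(tanimoto_distance(c, lead) for lead in leads) for c in library]
    return sum(nearest) / len(nearest)

# Toy fingerprints, purely illustrative.
leads = [frozenset({1, 2, 3}), frozenset({7, 8, 9})]
library = [frozenset({1, 2, 3}), frozenset({1, 2}), frozenset({7, 8})]

e_sim = similarity_penalty(library, leads)
print(round(e_sim, 4))
```

With a single lead the `min` reduces to the plain distance to that lead, matching the first bullet.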
Reagent Diversity: S-Optimal Criterion
$S_{opt} = \dfrac{N_D}{\sum_{y \in D} \dfrac{1}{d(y,\, D - y)}}$
• Esdiv(S) = reverse S-optimal scores for all R-groups, averaged over the number of R-groups
D: a set of design points (i.e., the selected reagents)
N_D: the number of points in D
d(x, A): minimum TD (Tanimoto distance) between point x and set of points A
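The S-optimal score is the harmonic mean of each selected reagent's distance to its nearest selected neighbour. The sketch below assumes set-based fingerprints and reads "reverse" as 1 minus the score, which is an interpretation rather than something the slide states. Function names are illustrative.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - Tanimoto coefficient for fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return 0.0 if union == 0 else 1.0 - len(fp_a & fp_b) / union

def s_optimal(points, distance=tanimoto_distance):
    """N_D divided by the sum over y in D of 1 / d(y, D - y)."""
    total = 0.0
    for i, y in enumerate(points):
        others = points[:i] + points[i + 1:]
        nearest = min(distance(y, x) for x in others)
        if nearest == 0.0:
            return 0.0  # duplicate points drive the harmonic mean to zero
        total += 1.0 / nearest
    return len(points) / total

def diversity_penalty(r_groups):
    """Assumed form of Esdiv: mean of (1 - S_opt) over the R-groups."""
    return sum(1.0 - s_optimal(points) for points in r_groups) / len(r_groups)

# Toy reagent fingerprints for one R-group, purely illustrative.
r1 = [frozenset({1, 2}), frozenset({3, 4}), frozenset({1, 2, 3})]
score = s_optimal(r1)
print(round(score, 4), round(diversity_penalty([r1]), 4))
```

Because a harmonic mean is dominated by its smallest terms, one pair of near-duplicate reagents lowers the score sharply, which is the behaviour wanted from a diversity measure.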
Product Novelty with Respect to Corporate Collection
• All S.B. compounds were mapped onto a 6D cell space (PCA, or formed by selected features to distinguish biological activities)
• Epn(S) = the smoothed average number of S.B. compounds in the neighbouring cells
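The slide does not define the smoothing, so the sketch below makes an assumption: each library compound is scored by the mean corporate occupancy of its own cell and all adjacent cells, and the scores are averaged over the library. The cell width, function names, and the 2D toy data (the slide uses 6D) are illustrative.

```python
from collections import Counter
from itertools import product

def cell_of(coords, width):
    """Map continuous coordinates to an integer grid cell."""
    return tuple(int(c // width) for c in coords)

def novelty_penalty(library_coords, corporate_coords, width=1.0):
    """Mean corporate-compound count per cell in each product's neighbourhood."""
    occupancy = Counter(cell_of(c, width) for c in corporate_coords)
    dims = len(library_coords[0])
    # Offsets to the cell itself and every adjacent cell (3**dims in total).
    offsets = list(product((-1, 0, 1), repeat=dims))
    scores = []
    for coords in library_coords:
        cell = cell_of(coords, width)
        count = sum(
            occupancy.get(tuple(a + b for a, b in zip(cell, off)), 0)
            for off in offsets
        )
        scores.append(count / len(offsets))
    return sum(scores) / len(scores)

# 2D toy data for readability; the same code works for 6D coordinates.
corporate = [(0.5, 0.5), (0.6, 0.4), (5.5, 5.5)]
library = [(1.2, 0.3), (9.0, 9.0)]

e_pn = novelty_penalty(library, corporate)
print(round(e_pn, 4))
```

A product landing in an empty region of the cell space contributes zero, so libraries that explore unoccupied regions are penalised least.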
Developability Penalty Scores
• Lipinski parameters
  • MW <= 500
  • ClogP: -1 to 5
  • NHD <= 5
  • NHA <= 10
• P450s: predicted non-inhibitory by the P450 classifiers
• Solubility: should be higher than a limit
Each penalty term is the percentage of library compounds that violate the limits for each term
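The "percentage of violators" definition can be sketched for the four Lipinski limits listed above. The dictionary keys and the toy compounds are illustrative assumptions; the P450 and solubility terms are left out because they need the predictive models, which are not reproduced here.

```python
# Limits from the slide, expressed as "passes" predicates.
LIMITS = {
    "mw": lambda v: v <= 500,
    "clogp": lambda v: -1 <= v <= 5,
    "n_hd": lambda v: v <= 5,
    "n_ha": lambda v: v <= 10,
}

def violation_percentages(library):
    """For each property, the percentage of compounds outside its limit."""
    n = len(library)
    return {
        name: 100.0 * sum(not passes(c[name]) for c in library) / n
        for name, passes in LIMITS.items()
    }

# Toy property values, purely illustrative.
library = [
    {"mw": 320, "clogp": 2.1, "n_hd": 2, "n_ha": 5},
    {"mw": 540, "clogp": 5.6, "n_hd": 1, "n_ha": 8},
    {"mw": 480, "clogp": 3.0, "n_hd": 6, "n_ha": 11},
    {"mw": 610, "clogp": 4.2, "n_hd": 3, "n_ha": 9},
]

penalties = violation_percentages(library)
print(penalties)
```

Keeping one percentage per property, rather than one combined count, lets each term carry its own weight in the total penalty.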
P450 Classifiers and Solubility Predictor
• P450s: 2D6, 3A4, 1A2, 2C9
  • dataset (2D6): active: ~3500; inactive: ~4000
  • method: 3-layer ANN
  • FP: 20%; FN: 10%; ambiguous: 12-18%
• Solubility
  • N = ~550
  • 3-layer ANN
  • rms error ~1.0 log unit
Conclusions
• PICCOLO is an in-house library design system that can simultaneously optimise all the factors we care about
• Important developability parameters are taken into account
• Expandable to include other criteria
• A Web-based system being used by SB chemists worldwide
Acknowledgements
Colleagues in Cheminformatics Department
Ken Kopple
Jie Liang (now at Univ. Illinois at Chicago)
Medicinal Chemists
Todd Graybill, Jian Jin, Ronggang Liu, Tom Ku, Dennis Yamashita, Scott Thompson, Jia-Ning Xiang