Getting Past Diversity in Assessing Virtual Library Designs
Bob Clark
Tripos, Inc.
St. Louis, Missouri USA
bclark@tripos.com
www.tripos.com
2001 Tripos, Inc.
Where be the dragons?
• Stylized data sets
  • pyridine, pyrimidine & cyclohexane libraries
  • semi-homologous “series”
• Nearest-neighbor profiles
  • problems & advantages of subsetting
• 4-Ureidopiperidine Sulfonamides
  • combinatorial sub-libraries, OptiSim™ design
• Fingerprint visualization
  • horizon NLM
Cyclohexane, Pyrimidine and Pyridine Library Compositions*
[Figure: Chex, Pym and Pyr core structures with substituent positions R1-R4]
Position: All libraries; Chex & Pym; Pyr only
R1: F, Br, NO2, Et; H, Cl, CF3; none; NMe2, Ac, COCF3; Me, iPr, SMe; SPh, OPh, CH2Ph; Ph
R2: F, Et, CF3, COCF3; Br, NO2, NMe2; Cl, Me, SMe, Ph; OPh, CH2Ph; Ac, SPh
R3: CF3, Ac, COCF3; F, Br, NO2; CN, CO2Me, CONH2; Et, NMe2, Ac; SPh, OPh, CH2Ph
R4: none; none; F, iPr, CF3, SMe; Ac, COCF3, Ph; SPh, OPh, CH2Ph
*RD Clark. J Chem Inf Comput Sci 1997, 37, 1181-1188.
Nearest Neighbor Database Comparisons (wrt UNITY 2D substructural fingerprints)*
[Figure: two histograms, frequency (%) vs. NN similarity]
• Chex Pym: 0.271 ± 0.05
• Chex Pyr: 0.311 ± 0.04
* RD Clark. Relative and Absolute Diversity Analysis of Combinatorial Libraries. In: Combinatorial Library Design and Evaluation, pp 337-362; AK Ghose & VN Viswanadhan, Eds.; Marcel Dekker, New York, in press.
Asymmetry of Nearest Neighbor Profiles
[Figure: histogram, frequency (%) vs. NN similarity]
• Pyr5500 Pyr500: 0.932 ± 0.05
• Pyr500 Pyr5500: 0.834 ± 0.08
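A minimal sketch of how a nearest neighbor profile can be computed, and why it need not be symmetric. It assumes Tanimoto similarity over fingerprints stored as sets of on-bits; the toy fingerprints and the names `tanimoto`, `nn_profile`, `lib_a` and `lib_b` are hypothetical and are not UNITY output or Tripos code.

```python
from statistics import mean, stdev

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def nn_profile(queries, references):
    """Similarity of each query to its nearest neighbor in the reference set."""
    return [max(tanimoto(q, r) for r in references) for q in queries]

# Hypothetical toy libraries: lib_b covers more of "chemistry space" than lib_a.
lib_a = [frozenset({1, 2, 3}), frozenset({1, 2, 4})]
lib_b = [frozenset({1, 2, 3, 5}), frozenset({1, 2, 4, 5}),
         frozenset({7, 8, 9}), frozenset({7, 8})]

forward = nn_profile(lib_a, lib_b)  # every lib_a compound has a close neighbor in lib_b
reverse = nn_profile(lib_b, lib_a)  # some lib_b compounds have no neighbor in lib_a

for label, profile in (("lib_a vs lib_b", forward), ("lib_b vs lib_a", reverse)):
    print(f"{label}: {mean(profile):.3f} ± {stdev(profile):.3f}")
```

The two directions answer different questions (how well one set is covered by the other), so the summary statistics differ whenever one library contains regions the other lacks.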
Nearest Neighbor Profiles Using Maximally Diverse Subsets*
[Figure: panels C and D, frequency (%) vs. NN similarity]
• Pyr* Pyr*: 0.544 ± 0.02
• Pyr2K* Pyr2K*: 0.560 ± 0.02
• Pyr* Pyr: 0.722 ± 0.08
• Pyr2K* Pyr2K: 0.729 ± 0.09
* RD Cramer, DE Patterson, RD Clark, F Soltanshahi & MS Lawless. J Chem Inf Comput Sci 1998, 38, 1010-1023.
4-Ureidopiperidine Sulfonamide Library*

Property         Primary amines        Sulfonyl chlorides
                 cut-off    passed     cut-off    passed
structure        --         436        --         178
mol. weight      200        361        350        163
mol. volume      190 Å³     363        255 Å³     165
cLogP            2.6        370        5.0        168
aromatic rings   1          394        2          171
combined         --         308        --         154

*RD Clark, DE Patterson, F Soltanshahi, JF Blake & JB Matthew. J Mol Graph Modelling 2000, 18, 404-411.
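The cut-offs in the table above can be read as a reagent filter. The sketch below assumes each cut-off is an upper bound (the slide does not say so explicitly); the amine cut-off values come from the table, while the reagent records, property keys and the helper `passes_all` are hypothetical.

```python
# Primary amine cut-offs from the table; treating them as upper bounds is an assumption.
AMINE_CUTOFFS = {
    "mol_weight": 200.0,
    "mol_volume": 190.0,   # cubic angstroms
    "clogp": 2.6,
    "aromatic_rings": 1,
}

def passes_all(reagent, cutoffs):
    """True if every filtered property is at or below its cut-off."""
    return all(reagent[prop] <= limit for prop, limit in cutoffs.items())

# Hypothetical reagent records; the property values are made up for illustration.
amines = [
    {"name": "amine_a", "mol_weight": 101.2, "mol_volume": 118.0,
     "clogp": 1.4, "aromatic_rings": 0},
    {"name": "amine_b", "mol_weight": 187.3, "mol_volume": 182.0,
     "clogp": 3.1, "aromatic_rings": 1},   # fails cLogP only
    {"name": "amine_c", "mol_weight": 243.3, "mol_volume": 221.0,
     "clogp": 2.2, "aromatic_rings": 2},   # fails weight, volume and ring count
]

kept = [r["name"] for r in amines if passes_all(r, AMINE_CUTOFFS)]
print(kept)
```

A reagent has to pass every filter at once, which is why the combined count in the table (308 amines) is lower than the count for any single property.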
Ureidopiperidine Sulfonamide Sublibraries
• All were constructed using an extension of “standard” OptiSim™ selection technology
  • subsample size k = 5
  • exclusion radius 0.10
  • incremental pivot method
• Sublibrary 1: Cherry picked
  • 200 diverse representative products
• Sublibrary 2: four blocks, 10 x 5 each
  • 32 amines + 20 sulfonyl chlorides
• Sublibrary 3: single 20 x 10 block
  • 20 amines + 10 sulfonyl chlorides
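For orientation, here is a simplified sketch of basic OptiSim-style selection using the subsample size (k = 5) and exclusion radius (0.10) listed above. It is not the combinatorial extension or the incremental pivot method mentioned on the slide, and it is not Tripos code: the function names, the use of Tanimoto distance on sets of on-bits, and the toy fingerprints are all assumptions.

```python
import random

def tanimoto_distance(a, b):
    """1 - Tanimoto similarity for fingerprints given as sets of on-bits."""
    union = len(a | b)
    return 1.0 - (len(a & b) / union if union else 1.0)

def optisim(fingerprints, n_select, k=5, radius=0.10, seed=0):
    """Pick up to n_select indices; each pick is the best of a random subsample of k."""
    rng = random.Random(seed)
    pool = list(range(len(fingerprints)))
    rng.shuffle(pool)
    selected = [pool.pop()]          # random starting compound
    recycle = []                     # unpicked subsample members wait here
    while len(selected) < n_select:
        subsample = []
        while len(subsample) < k and pool:
            cand = pool.pop()
            d = min(tanimoto_distance(fingerprints[cand], fingerprints[s])
                    for s in selected)
            if d >= radius:          # candidates inside the exclusion radius are dropped
                subsample.append((d, cand))
        if not subsample:
            if not recycle:
                break                # nothing left that is far enough away
            pool, recycle = recycle, []
            rng.shuffle(pool)
            continue
        subsample.sort()
        _, best = subsample.pop()    # largest minimum distance to the selected set
        selected.append(best)
        recycle.extend(c for _, c in subsample)
        if not pool:                 # pool exhausted: reuse the recycled candidates
            pool, recycle = recycle, []
            rng.shuffle(pool)
    return selected

# Hypothetical toy fingerprints; the first two are identical on purpose.
fps = [frozenset({1, 2, 3}), frozenset({1, 2, 3}), frozenset({4, 5, 6}),
       frozenset({7, 8, 9}), frozenset({1, 2, 3, 4})]
picked = optisim(fps, n_select=10)
print(picked)
```

With k = 1 this reduces to random selection subject to the exclusion radius, and with k equal to the pool size it approaches a maximum-dissimilarity pick; intermediate k trades diversity against representativeness.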
OptiSim Design Scheme
[Figure: block growth from A1 x B1 up to A1-A5 x B1-B5, with candidate reagents a21-a23 through a51-a53 and b21-b23 through b51-b53]
Ureidopiperidine Sulfonamide Nearest Neighbor Profiles
[Figure: two histograms, frequency (%) vs. NN similarity]
Panels, in order: single block cherry picked; cherry picked single block
Values, in order: 0.74 ± 0.09 (median 0.72); 0.81 ± 0.09 (median 0.80)
Self-similarity Profiles for Diverse Subsets from Sub-libraries (20 compound subsets)
[Figure: histograms, frequency (%) vs. NN similarity]
• cherry-picked: 0.52 ± 0.02 (median 0.515)
• four-block: 0.55 ± 0.02 (median 0.545)
• single block: 0.60 ± 0.05 (median 0.615)
Nearest Neighbor Profiles for Diverse Subsets are Symmetric
[Figure: histograms, frequency (%) vs. NN similarity]
Panels, in order: cherry picked four block; four block cherry picked; cherry picked single block; single block cherry picked
Values, in order: 0.61 ± 0.09 (median 0.61); 0.62 ± 0.09 (median 0.61); 0.63 ± 0.10 (median 0.58); 0.62 ± 0.11 (median 0.58)
[Figure panels: PCA (Euclidean); NLM (Tanimoto)]
Effect of Horizon Distance (cyclohexanes)
[Figure: projections with points labeled 1-4]
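The slides do not spell out how the horizon enters the nonlinear map. One plausible reading, offered only as an assumption, is a Sammon-type stress in which both the original and the mapped distances are clipped at the horizon, so that pairs already "far apart" in both spaces contribute no error. The function `horizon_stress` and the toy data below are hypothetical.

```python
import math

def horizon_stress(orig_dist, coords, horizon):
    """Sammon-type stress with original and mapped distances clipped at `horizon`."""
    num = 0.0
    denom = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            target = min(orig_dist[i][j], horizon)
            mapped = min(math.dist(coords[i], coords[j]), horizon)
            if target > 0:
                num += (target - mapped) ** 2 / target
                denom += target
    return num / denom if denom else 0.0

# Hypothetical data: two close compounds and one distant one (distances in 0..1).
orig_dist = [[0.0, 0.2, 0.9],
             [0.2, 0.0, 0.9],
             [0.9, 0.9, 0.0]]
# A 2D layout that preserves the short distance but exaggerates the long ones.
coords = [(0.0, 0.0), (0.2, 0.0), (5.0, 0.0)]

with_horizon = horizon_stress(orig_dist, coords, horizon=0.5)
without_horizon = horizon_stress(orig_dist, coords, horizon=math.inf)
print(with_horizon, without_horizon)
```

Under this reading the layout is penalized only for misplacing near neighbors, which frees the optimizer from trying to reproduce large dissimilarities exactly.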
Homolosine Projection
source: Cartography Laboratory, Indiana State University
www.indstate.edu/gga/gga_cart
PCA vs. NLM with Horizon
[Figure, two slides: PCA and NLM with Horizon projections; point labels 22-39]
Comparison of Sub-Libraries
[Figure, three slides: cherry picking, four blocks and single block; point labels 40-55]
Acknowledgements
• NIH SBIR grant 1R43GM58919
• David Patterson, Sr. Fellow
• Fred Soltanshahi, Technologist
• Trevor Heritage, VP Software R&D
1999 Tripos, Inc.
Take-home: fingerprint similarity is biologically relevant (good neighborhood behavior)