Privileged Substructures Revisited: Target Community-Selective Scaffolds. Jürgen Bajorath, Life Science Informatics, University of Bonn.
Privileged Substructures • First postulated by Evans et al. in 1988 based on the observation that many cholecystokinin antagonists contained conserved substructures not frequently seen in other active compounds • Since then the search for target class-privileged chemotypes has continued in medicinal chemistry • Generally accepted definition: • Recurrent fragments in ligands of a given target family • Selective at the family level, but not for individual targets • Evans BE et al. J. Med. Chem. 1988, 31, 2235-2246
Privileged Substructures • The existence of truly target family-privileged substructures has remained controversial • Intrinsic limitation: the search for privileged substructures has relied on frequency-of-occurrence analysis of pre-selected substructures • Frequent conclusion: a substructure may occur with high frequency among ligands of a particular target family yet also be active against other families
Privileged Substructures • Are target family-privileged substructures truly privileged? • Schnur DM et al. J. Med. Chem. 2006, 49, 2000-2009
Changing the Analysis Concept • Do molecular scaffolds exist that exclusively occur in ligands of individual target families? • Bemis & Murcko framework (scaffold) • Large-scale distribution in target families • Departing from frequency of occurrence analysis of pre-selected substructures • Systematic compound data mining taking all available activity annotations into account [Figure labels: Peptidases, GPCRs, Kinases, ...]
Hierarchical Scaffolds [Figure: a compound broken down into R-groups, framework, linker, and ring system] • Bemis GW and Murcko MA. J. Med. Chem. 1996, 39, 2887-2893
Public Data Source - BindingDB • BindingDB database: • Public repository of activity information of small molecules • ~31,000 compound entries with ~57,000 activity annotations • 17,745 compounds active against human targets extracted
Analysis Strategy - Compound Sets • Target pair sets: • Active compounds are organized into target pair sets • A set contains all compounds active against two individual targets (i.e. a compound may belong to multiple sets) • BindingDB target pair sets: • Sets obtained for 520 pairs of targets that share >= 5 compounds • 6,343 compounds active against 259 human targets • PubChem confirmatory bioassays: • Only 3 relevant human target pairs meet the >= 5 compound criterion
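The target pair set construction described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_target_pair_sets`, the input layout (a mapping from compound ID to its annotated targets), and the toy data are all hypothetical and not taken from the original analysis.

```python
from collections import defaultdict
from itertools import combinations

def build_target_pair_sets(compound_targets, min_shared=5):
    """Group compounds by every pair of targets they are active against.

    compound_targets: dict mapping a compound ID to an iterable of target names
    (hypothetical input layout). Returns only pairs sharing >= min_shared compounds.
    """
    pair_sets = defaultdict(set)
    for compound, targets in compound_targets.items():
        # A compound active against n targets lands in n*(n-1)/2 pair sets.
        for pair in combinations(sorted(set(targets)), 2):
            pair_sets[pair].add(compound)
    # Apply the ">= 5 shared compounds" criterion from the slide.
    return {pair: cpds for pair, cpds in pair_sets.items() if len(cpds) >= min_shared}

# Toy data, invented for illustration only.
toy = {f"cpd{i}": ["FXa", "Thrombin"] for i in range(5)}
toy["cpd5"] = ["FXa", "Thrombin", "Trypsin"]
pair_sets = build_target_pair_sets(toy)
```

In the toy data only the FXa/Thrombin pair passes the filter; the two pairs involving Trypsin share a single compound and are dropped.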
Compound-Based Target Network • 520 target pairs are visualized in a network representation • Nodes: targets • Edges: target pair sets • Edge width: number of shared compounds • Densely connected communities • 18 communities • >= 4 targets • Different target families [Figure: target network with communities numbered 1-18; labeled communities include Ser/Thr kinases, serine proteinases, tyrosine kinases, MMPs & CAs, and caspases]
Community-Selective Scaffolds • 520 human target pair sets (6,343 BDB compounds; 259 targets); 18 target communities • 206 community-selective scaffolds: • Exclusively act in a single community • With 5 - 45 compounds/scaffold (av. ~12) • Yielding 147 distinct carbon skeletons (topological diversity)
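A minimal sketch of how community-selective scaffolds could be picked out once every compound has a scaffold and a community assignment. The record layout, the function name, and the toy data are assumptions made for illustration. The minimum of 5 compounds per scaffold is inferred from the "5 - 45 compounds/scaffold" range on the slide and may not match the original filter exactly.

```python
from collections import defaultdict

def community_selective_scaffolds(records, min_compounds=5):
    """Return {scaffold: community} for scaffolds seen in exactly one community.

    records: iterable of (compound_id, scaffold, community) tuples
    (hypothetical layout, one tuple per compound/community occurrence).
    """
    communities = defaultdict(set)
    compounds = defaultdict(set)
    for compound_id, scaffold, community in records:
        communities[scaffold].add(community)
        compounds[scaffold].add(compound_id)
    return {
        scaffold: next(iter(comms))
        for scaffold, comms in communities.items()
        # Exclusive to one community and represented by enough compounds.
        if len(comms) == 1 and len(compounds[scaffold]) >= min_compounds
    }

# Toy data, invented for illustration only.
records = (
    [(f"a{i}", "A", 3) for i in range(5)]                      # only in community 3
    + [(f"b{i}", "B", 3 if i < 3 else 7) for i in range(6)]    # spans two communities
    + [("c0", "C", 7), ("c1", "C", 7)]                         # too few compounds
)
selective = community_selective_scaffolds(records)
```

Here scaffold A qualifies, B is rejected because it occurs in two communities, and C is rejected for having fewer than five compounds.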
Adding Selectivity Information • For each compound active against a target pair, its target selectivity (TS) is calculated as: [equation not preserved in this transcript; judging from the values below, TS is a potency difference on a log10 scale] • Compound |TS| values range from 0 to 6.86 • 0: equal potency, no selectivity • 6.86: potency difference of nearly 7 orders of magnitude, i.e. highly selective for one target over another • Selectivity profiles of scaffolds • Community-based • Target-based
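The TS equation itself is not preserved in this transcript. The sketch below assumes that TS is the difference between a compound's negative log10 potency values for the two targets, which matches the slide's reading of |TS| as orders of magnitude of potency difference. The function names and the choice of nM input units are hypothetical.

```python
import math

def p_value(potency_nM):
    """Convert a potency in nM (e.g. Ki or IC50) to its negative log10 molar value."""
    return -math.log10(potency_nM * 1e-9)

def target_selectivity(potency_a_nM, potency_b_nM):
    """Assumed form of TS: log-scale potency for target A minus that for target B.

    Positive values mean the compound is more potent against target A.
    """
    return p_value(potency_a_nM) - p_value(potency_b_nM)

# Toy example: 10 nM against A and 1000 nM against B, i.e. a 100-fold difference.
ts = target_selectivity(10.0, 1000.0)
```

Under this assumption a 100-fold potency difference gives a TS of about 2, and equal potencies give 0.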
Selectivity Profiles • Community-based selectivity profile: • For each scaffold found in a given community • All corresponding compounds active against any target pair in this community are pooled • Median of their absolute TS values is determined (median |TS|) • Target-based selectivity profile: • For each scaffold active against a given target • All corresponding compounds active against this target are pooled • Selectivity against any other target is calculated • Median of their TS values is determined (median TS)
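Both profiles come down to pooling TS values per scaffold and taking a median, so a single helper can illustrate them. This is a sketch under assumptions: the record layout, the helper name `median_profile`, and the toy values are hypothetical, and TS values are taken as already computed.

```python
from collections import defaultdict
from statistics import median

def median_profile(ts_records, absolute):
    """Pool TS values per (scaffold, group) key and return their medians.

    ts_records: iterable of (scaffold, group, ts) tuples (hypothetical layout),
    where group is a community ID for the community-based profile or a target
    name for the target-based profile.
    absolute: True gives median |TS|, False gives the signed median TS.
    """
    pooled = defaultdict(list)
    for scaffold, group, ts in ts_records:
        pooled[(scaffold, group)].append(abs(ts) if absolute else ts)
    return {key: median(values) for key, values in pooled.items()}

# Toy TS values, invented for illustration only.
toy_ts = [("S1", "community 3", 2.5), ("S1", "community 3", -1.5), ("S1", "community 3", 0.5)]
community_based = median_profile(toy_ts, absolute=True)   # median |TS|
signed = median_profile(toy_ts, absolute=False)           # median TS
```

The sign is kept only in the target-based profile, where it indicates whether the scaffold favors the target in question or other members of the community.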
Community Selectivity of Scaffolds • Scaffold / Community heat map: • Columns: target communities • Rows: scaffolds • Color spectrum: median |TS| • Red: scaffold yields many compounds with different potency against individual targets • Yellow: scaffold does not yield selective compounds • Non-selective scaffolds • Occur in multiple communities • Community-selective scaffolds • Exclusively occur in one community
Target Selectivity of Scaffolds • Scaffold / Target heat map: • Columns: targets in a community • Rows: scaffolds • Cell: the scaffold represents >= 5 compounds active against the target • Color spectrum: median TS • Red (positive): more selective for the target over others in the community • Yellow (negative): more selective for other members of the community
Target Selectivity of Scaffolds • Community 3: 16 serine proteases • Different scaffolds display the same selectivity profile • e.g. Factor Xa/Thrombin • Scaffolds with no apparent target selectivity • Number of scaffolds per target varies • Factor Xa: 17; Thrombin: 18 • Tryptase: 0; Hepsin: 0
Target Selectivity Ranking • Community-selective scaffolds are ranked according to median |TS| • 37 scaffolds with at least half of their compounds having >= 100-fold potency differences against >= 2 community targets • 111 scaffolds with target-selective tendency [Figure: scaffold ranking by median |TS|; axis labels 0, 1, 2, 5.2]
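The ranking step can be sketched as a sort on median |TS| plus a flag for highly selective scaffolds. The cutoff of 2.0 is an assumption derived from the ">= 100-fold" criterion on the slide. A median |TS| of at least 2 guarantees that at least half of the pooled compounds show a 100-fold or larger difference, but it may not be the exact test used in the original analysis. Names and toy values are hypothetical.

```python
def rank_scaffolds(median_abs_ts, high_cutoff=2.0):
    """Rank scaffolds by descending median |TS|.

    median_abs_ts: dict mapping scaffold -> median |TS| (hypothetical input).
    Returns a list of (rank, scaffold, median |TS|, is_highly_selective) tuples.
    """
    ranked = sorted(median_abs_ts.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, scaffold, value, value >= high_cutoff)
        for rank, (scaffold, value) in enumerate(ranked, start=1)
    ]

# Toy medians, invented for illustration only.
ranking = rank_scaffolds({"S1": 0.4, "S2": 4.0, "S3": 2.0})
```

In the toy ranking S2 and S3 are flagged as highly selective and S1 is not.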
Community-Selective Scaffolds • Color spectrum: median TS • Red: high potential to yield target-selective compounds • Yellow: low potential [Figure: scaffold/target heat map with columns DPP8, CA9, CA2, DPP4, CA1, CA14, CA12, CA5A, CA7, CA5B, CA4, CA6, CA3; example scaffolds labeled by rank and median |TS|: rank 3, 4.03; rank 98, 1.10]
Selectivity Searching (MDDR) • Highly selective for FXa over other serine proteases [Figure labels: Thrombin, FXa]
Selectivity Searching • Inhibit both caspase 3 and 7 with nM potency; ~200-fold selective over caspases 1, 6, 8 [Figure labels: Caspase 7, Caspase 3]
Extending the Analysis: ChemblDB • Recent public domain database: ChemblDB • ~500,000 compounds with activity information • 32,848 compounds with high-confidence annotations active against 671 human targets • High-confidence activity annotations: • Target confidence level: 9 • Interaction type: D(irect) ftp://ftp.ebi.ac.uk/pub/databases/chembl/latest/
ChemblDB vs. BindingDB • Comparison at different levels • Active compounds (human targets) • Scaffolds • Network • Community-selective scaffolds • Topologically distinct scaffolds [Figure: Venn diagrams. Compounds: ChemblDB 32,848; BDB 17,745; shared 3,589. Scaffolds: ChemblDB 12,902; BDB 6,291; shared 1,409]
ChemblDB vs. BindingDB • Comparison at different levels • Active compounds (human targets) • Scaffolds • Network • Community-selective scaffolds • Topologically distinct scaffolds [Figure: CDB and BDB target networks; labels: tyrosine kinases, GPCRs; legend: shared targets, unique targets]
ChemblDB vs. BindingDB • Comparison at different levels • Active compounds (human targets) • Scaffolds • Network • Community-selective scaffolds • Topologically distinct scaffolds [Figure: Venn diagrams. Community-selective scaffolds: ChemblDB 311; BDB 206; shared 34. Topologically distinct scaffolds: ChemblDB 227; BDB 147; shared 85]
Community-Selective Scaffolds • Distribution in drugs? • DrugBank: 1,247 approved drugs with 726 unique scaffolds • Only 11 overlap with 206 community-selective BDB scaffolds • Community-selective scaffolds currently underrepresented in drugs; opportunities for further chemical exploration
Conclusions • The existence of target class-privileged substructures has remained controversial over the years • From putative privileged substructures to confirmed target community-selective scaffolds through systematic data mining • Community-selective scaffolds are abundant and topologically diverse • A subset of community-selective scaffolds displays a notable tendency to produce compounds with different target selectivity • BDB and CDB contain complementary target and scaffold information
Acknowledgments • Ye Hu • Anne Mai Wassermann • Eugen Lounkine