Synthetic Sprout: Generating Synthetically Accessible Ligands by De Novo Design
A. Peter Johnson, Krisztina Boda, Attilla Ting, Jon Baber
SPROUT is the de novo design system developed in Leeds
SPROUT components:
• Identification of potential interaction sites complementary to the receptor, i.e. H-bonding sites, hydrophobic sites, metal co-ordination sites, etc.
• Automated docking of small fragments at the interaction sites.
• Generation of hypothetical structures by linking the docked fragments together.
• Tools for scoring, sorting and navigating the answer set.
Hydrogen Bond Sites
[Figure: example 3D shapes of an H-bond acceptor site and an H-bond donor site]
Docking of small fragments at target sites
Target sites are either generated by the SPROUT module HIPPO (or a similar system) or taken from a pharmacophore hypothesis. Small fragments with complementary functionality are selected by the user and automatically docked into the target site(s). In addition to these small fragments, it is also possible to dock large fragments which are known to satisfy several of the target sites. Such a large fragment can then act as a “seed” for further growth. A successful dock must place the small fragment at the target site with the correct orientation to satisfy any directional constraints. The docking process is very fast and uses a novel hierarchical least squares optimisation procedure.
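The requirement that a dock satisfies both position and directional constraints can be illustrated with a small check. This is only a sketch under stated assumptions: the function names, the distance tolerance (0.5) and the angle tolerance (30 degrees) are hypothetical, and it does not reproduce SPROUT's hierarchical least squares procedure.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

def satisfies_site(atom_pos, atom_dir, site_pos, site_dir,
                   max_dist=0.5, max_angle=30.0):
    """True if the atom sits close to the site and points the right way.

    max_dist and max_angle are illustrative tolerances, not SPROUT values.
    """
    close_enough = math.dist(atom_pos, site_pos) <= max_dist
    aligned = angle_deg(atom_dir, site_dir) <= max_angle
    return close_enough and aligned

# A donor atom 0.1 units from the site, pointing almost along the site direction.
good_dock = satisfies_site((0, 0, 0), (1, 0, 0), (0.1, 0, 0), (1, 0.1, 0))
# Same position, but pointing at right angles to the site direction.
bad_dock = satisfies_site((0, 0, 0), (0, 1, 0), (0.1, 0, 0), (1, 0, 0))
```

A real docking step would optimise the fragment pose rather than just test it; the check above only shows what "correct orientation" means as a pass/fail criterion.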
Structure generation
The SPIDER module links the target sites together in a pairwise fashion to make complete molecular structures that satisfy the target sites. It does this by sequentially adding new fragments in an exhaustive fashion. There is no element of random choice in this process, so various heuristics have to be adopted to avoid a combinatorial explosion. The main approximations employed are:
• Only a sampling of the possible conformations about single bonds is considered.
• Growth is only permitted from the atoms/bonds that are closest to the target site to be reached.
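The two approximations can be sketched in a few lines. The code below is an illustration only: the 120 degree torsion step, the `keep` parameter and the function names are assumptions, not values or identifiers taken from SPIDER.

```python
import math
from itertools import product

def sampled_torsions(n_rotatable_bonds, step_deg=120):
    """Enumerate torsion-angle combinations on a coarse grid.

    The count is (360 / step_deg) ** n_rotatable_bonds, which is why only a
    sampling of conformations is affordable.
    """
    angles = range(0, 360, step_deg)
    return list(product(angles, repeat=n_rotatable_bonds))

def growth_atoms(atom_coords, target_site, keep=2):
    """Return the names of the `keep` atoms nearest to the target site."""
    ranked = sorted(atom_coords,
                    key=lambda name: math.dist(atom_coords[name], target_site))
    return ranked[:keep]

conformations = sampled_torsions(2)          # 3 angles per bond, 2 bonds
partial = {"C1": (0.0, 0.0, 0.0), "C2": (1.5, 0.0, 0.0), "C3": (3.0, 0.0, 0.0)}
allowed = growth_atoms(partial, target_site=(4.0, 0.0, 0.0))
```

Restricting growth to the nearest atoms removes successors that would have to double back towards the target, at the cost of possibly missing some valid skeletons.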
Main algorithm of SPIDER
• Multiphase heuristic graph search on a forest (set of trees)
• Two trees are searched and removed in each phase, and a new tree is generated which contains skeletons connecting both sets of sites
• Each phase consists of:
• a bi-directional search
• Breadth First Search (BFS)
• Depth First Search (DFS)
Typical saving from bi-directional search (10 successors, 6 levels): 2 × 10^3 << 10^6
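The saving quoted on the slide follows from simple arithmetic: a one-directional search to depth 6 with 10 successors per node reaches 10^6 nodes at the deepest level, whereas two searches meeting in the middle each reach only 10^3. A minimal sketch of that calculation (function names are illustrative):

```python
def nodes_unidirectional(branching, depth):
    """Nodes at the deepest level of a single search tree."""
    return branching ** depth

def nodes_bidirectional(branching, depth):
    """Deepest-level nodes of two searches that meet halfway."""
    half_a = depth // 2
    half_b = depth - half_a
    return branching ** half_a + branching ** half_b

one_way = nodes_unidirectional(10, 6)   # 1000000
two_way = nodes_bidirectional(10, 6)    # 2000
```

The count ignores the cost of matching the two frontiers, so it is an upper bound on the benefit rather than a measured speed-up.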
Connection of Partial Structures
• A common template is located in two structures (one from each tree)
• The structures are overlaid on the common template
• The combined structure is docked to the united set of target sites, also considering the steric constraints of the receptor site
• Side-effect joins are examined for validity (e.g. the fusion shown in the figure)
Navigating the answer sets
Estimated binding energy score:
• Ranking the final de novo answer set
• Ranking and pruning (with caution) intermediate trees to reduce the combinatorial problem
Estimated ease of synthesis score:
• Ranking the final de novo answer set
• Too slow (~1 structure per minute) to be useful for intermediate pruning
• Faster methods are needed for intermediate pruning
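Ranking and pruning by score can be sketched as keeping the best few partial structures at each stage. The example assumes that a more negative estimated binding score is better; the skeleton names, scores and the `keep` cut-off are invented for illustration.

```python
import heapq

def prune_by_score(partials, keep):
    """Return the `keep` best (name, score) pairs, best (most negative) first."""
    return heapq.nsmallest(keep, partials, key=lambda item: item[1])

# Hypothetical intermediate structures with estimated binding scores.
candidates = [
    ("skeleton_a", -5.2),
    ("skeleton_b", -7.8),
    ("skeleton_c", -3.1),
    ("skeleton_d", -6.4),
]
survivors = prune_by_score(candidates, keep=2)
```

The "with caution" on the slide matters here: a partial structure with a poor intermediate score may still grow into a good ligand, so an aggressive `keep` value can discard useful answers.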
Recent Advances
• Parallelization of structure generation
• Farm of SGs or PCs
• SPROUT server: Beowulf cluster, currently 11 dual-processor 600 MHz Pentium III machines
• VLSPROUT screens virtual libraries
• SYNSPROUT generates synthetically accessible ligands
• Receptor SPROUT generates potential synthetic receptors for small molecules
The perennial modeller's problem
Hypothetical ligands, including those predicted to bind very strongly, have no practical value unless they can be readily synthesised.
Our attempts to provide solutions:
• CAESA: post-design estimation of synthetic accessibility
• SynSPROUT: synthetic constraints built into the de novo design process
• VLSPROUT: even greater synthetic constraints; only members of a specific virtual library are generated
Synthetic Sprout Approach
• Pool of readily available starting materials, e.g. a subset of ACD
• Knowledge base of reliable, high-yielding reactions, e.g. esterification, amide formation, reductive amination
• Virtual synthesis in the receptor cavity
• Result: readily synthesisable putative ligand structures
Creation of Starting Material Libraries
• Obvious classes, e.g. amino acids
• “Drug like” starting materials selected by hand
• “Drug like” starting materials generated automatically by retrosynthetic analysis of drug databases
Retro-Synthetic Knowledge Base: Retro-Synthetic Rule

EXPLANATION Amide Formation
IF Amide THEN
  disconnect bond between 1 and 3
  add-atom O[Hs=1] to 1 with -
  add-hydrogen to 2
END-THEN

EXPLANATION Ether Formation
IF Ether THEN
  disconnect bond between 2 and 3
  add-atom O[Hs=1], Cl, Br 3 with -
  add-hydrogen to 2
END-THEN
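The effect of a retro-synthetic amide disconnection can be sketched on a toy molecular graph. This is not the knowledge-base rule interpreter: the dictionary representation, the function names and the atom numbering are assumptions made for the example, and the slide does not show the atom numbering of the Amide pattern, so the mapping used here (break the C-N bond, add OH to the carbonyl carbon, add H to the nitrogen) is the usual chemical reading rather than a transcription of the rule.

```python
def neighbours(bonds, atom):
    """Yield (neighbour, bond order) pairs for an atom."""
    for (a, b), order in bonds.items():
        if a == atom:
            yield b, order
        elif b == atom:
            yield a, order

def disconnect_amide(atoms, bonds):
    """Break the first amide C-N bond found; return new (atoms, bonds) or None.

    atoms: {id: {"el": element, "h": implicit hydrogen count}}
    bonds: {(low_id, high_id): bond order}
    """
    atoms = {k: dict(v) for k, v in atoms.items()}
    bonds = dict(bonds)
    for (a, b), order in list(bonds.items()):
        for c, n in ((a, b), (b, a)):
            is_cn_single = (order == 1 and atoms[c]["el"] == "C"
                            and atoms[n]["el"] == "N")
            has_carbonyl = any(o == 2 and atoms[x]["el"] == "O"
                               for x, o in neighbours(bonds, c))
            if is_cn_single and has_carbonyl:
                del bonds[(a, b)]                    # disconnect the C-N bond
                new_id = max(atoms) + 1
                atoms[new_id] = {"el": "O", "h": 1}  # add O[Hs=1] to the carbon
                bonds[(c, new_id)] = 1
                atoms[n]["h"] += 1                   # add hydrogen to the nitrogen
                return atoms, bonds
    return None

# N-methylacetamide: CH3-C(=O)-NH-CH3
amide_atoms = {1: {"el": "C", "h": 3}, 2: {"el": "C", "h": 0},
               3: {"el": "O", "h": 0}, 4: {"el": "N", "h": 1},
               5: {"el": "C", "h": 3}}
amide_bonds = {(1, 2): 1, (2, 3): 2, (2, 4): 1, (4, 5): 1}
result = disconnect_amide(amide_atoms, amide_bonds)
```

For this input the result corresponds to acetic acid plus methylamine, the kind of starting-material pair that retro-synthetic analysis of a drug database would collect.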
Automatic Template Library Generation
[Flow diagram. Stages shown: 2D drug-like structures, ring perception, fragmentation, clustering, filter, single 3D conformer generation (Corina), multiple conformer generation (Omega), synthetic template library]
Knowledge bases used:
• Perception Knowledge Bases: aromatic, normalisation, hybridisation, H-bonding properties
• Retro-Synthetic Knowledge Base: retro-synthetic patterns, retro-synthetic rules
• Synthetic Knowledge Base: functional groups
Automatic Chemical Perception
Information perceived:
• Aromatic atoms and bonds
• Normalised bonds
• Hybridisation, including induced hybridisation
• H-donors / acceptors
• Number of hydrogens attached to an atom
• Number of connections to an atom
• Number of available electron pairs
• Charge at an atom
Rule-based system in which rules are encoded using the PATRAN language (similar to SMILES).
Example from the Hybridisation knowledge base:
CHEMICAL-LABEL <NitrogenWithLP--SP2> X[SPCENTRE=2]-N[HS=0,1,2];[SPCENTRE=3]
EXPLANATION N with lone pair next to sp2 centre behaves as sp2.
IF NitrogenWithLP--SP2 THEN
  set-av-eps 2 to 0
  set-hybridisation 2 to 2
END-THEN
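The hybridisation rule above can be mimicked on a toy atom table. The sketch below is an assumption-laden illustration, not a PATRAN interpreter: the field names (`el`, `hyb`, `h`, `av_eps`) and the function name are hypothetical, and only the one pattern from the example is handled.

```python
def apply_nitrogen_lp_rule(atoms, bonds):
    """Set sp3 nitrogens single-bonded to an sp2 centre to sp2, in place.

    atoms: {id: {"el", "hyb", "h", "av_eps"}}; bonds: {(id, id): order}.
    Returns the sorted ids of the nitrogens that were changed.
    """
    matches = []
    # Collect all matches first so the result does not depend on bond order.
    for (a, b), order in bonds.items():
        if order != 1:
            continue
        for x, n in ((a, b), (b, a)):
            if (atoms[x]["hyb"] == 2 and atoms[n]["el"] == "N"
                    and atoms[n]["hyb"] == 3 and atoms[n]["h"] in (0, 1, 2)):
                matches.append(n)
    for n in matches:
        atoms[n]["av_eps"] = 0   # set-av-eps 2 to 0
        atoms[n]["hyb"] = 2      # set-hybridisation 2 to 2
    return sorted(set(matches))

# Atom 2: nitrogen on an sp2 carbon; atom 4: nitrogen on an sp3 carbon.
toy_atoms = {
    1: {"el": "C", "hyb": 2, "h": 0, "av_eps": 0},
    2: {"el": "N", "hyb": 3, "h": 2, "av_eps": 1},
    3: {"el": "C", "hyb": 3, "h": 3, "av_eps": 0},
    4: {"el": "N", "hyb": 3, "h": 2, "av_eps": 1},
}
toy_bonds = {(1, 2): 1, (3, 4): 1}
changed = apply_nitrogen_lp_rule(toy_atoms, toy_bonds)
```

Only the nitrogen next to the sp2 centre is relabelled; the amine on the sp3 carbon keeps its lone pair and sp3 label.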
Perception - Binding Properties
• O (original method): single-atom based
• C (current method): functional-group based
Binding property labels:
• D - H donor
• A - H acceptor
• J - Joinable (according to the reaction knowledge base)
• H - Hydrophobic
• N - None
Synthetic Template
[Figure: example templates with atoms labelled by binding property: primary amine (donor), phenol (acceptor-donor), carboxylic acid (acceptor)]
Synthetic Knowledge Base: Synthetic Rules
EXPLANATION Amide Formation 1
IF Carboxylic Acid INTER Primary Amine THEN
  destroy-atom 3
  form-bond - between 1 and 5
  change-hybridization 5 to SP2
  Dihedral 0 0
  Dihedral 0 180
  Bond-length 1.35
END-THEN
Joining rules specify:
• Steps of formation
• Hybridization change
• Bond type
• Bond length
• Dihedral angles/penalties
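A joining rule of this kind can be pictured as a small record plus a function that applies it to two fragments. The sketch below is hypothetical throughout: the `JoiningRule` class, its field names and the `join` function are invented for illustration, the atom numbering is the example's own rather than the rule's, and reading each `Dihedral` line as a (penalty, angle) pair is an assumption. The decrement of the nitrogen hydrogen count is also an assumption of this toy representation, since the rule text does not list it as a step.

```python
from dataclasses import dataclass, field

@dataclass
class JoiningRule:
    name: str
    groups: tuple            # functional groups that react
    bond_order: int          # "-" in the rule text is taken as a single bond
    bond_length: float       # value from the rule text, presumably angstroms
    new_hybridisation: str   # applied to the amine nitrogen
    dihedrals: list = field(default_factory=list)  # assumed (penalty, angle)

AMIDE_FORMATION_1 = JoiningRule(
    name="Amide Formation 1",
    groups=("Carboxylic Acid", "Primary Amine"),
    bond_order=1,
    bond_length=1.35,
    new_hybridisation="SP2",
    dihedrals=[(0, 0), (0, 180)],
)

def join(atoms, bonds, rule, carbon, leaving_oxygen, nitrogen):
    """Apply the rule to two fragments stored in one atom/bond table."""
    # destroy-atom: remove the hydroxyl oxygen and every bond to it
    atoms = {k: dict(v) for k, v in atoms.items() if k != leaving_oxygen}
    bonds = {pair: o for pair, o in bonds.items() if leaving_oxygen not in pair}
    # form-bond between the carbonyl carbon and the amine nitrogen
    bonds[tuple(sorted((carbon, nitrogen)))] = rule.bond_order
    # change-hybridization of the nitrogen; it also loses one hydrogen
    atoms[nitrogen]["hyb"] = rule.new_hybridisation
    atoms[nitrogen]["h"] -= 1
    return atoms, bonds

# Acetic acid (atoms 1-4) and methylamine (atoms 5-6) in one table.
frag_atoms = {1: {"el": "C", "h": 3, "hyb": "SP3"}, 2: {"el": "C", "h": 0, "hyb": "SP2"},
              3: {"el": "O", "h": 0, "hyb": "SP2"}, 4: {"el": "O", "h": 1, "hyb": "SP3"},
              5: {"el": "N", "h": 2, "hyb": "SP3"}, 6: {"el": "C", "h": 3, "hyb": "SP3"}}
frag_bonds = {(1, 2): 1, (2, 3): 2, (2, 4): 1, (5, 6): 1}
joined_atoms, joined_bonds = join(frag_atoms, frag_bonds, AMIDE_FORMATION_1,
                                  carbon=2, leaving_oxygen=4, nitrogen=5)
```

In a 3D setting the bond length and dihedral entries would constrain how the second fragment is placed relative to the first; the toy graph above only tracks connectivity.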
De Novo Design Using Synthetic Sprout
[Figure labels: donor site, acceptor site]
1. Amide Formation (Carboxylic Acid - Primary Amine)
2. Reductive Amination (Carbonyl - Primary Amine)
New Problems - Hybridisation change (SP3 to SP2)
Secondary amine nitrogen becomes SP2.
Hybridisation change in Amide Formation 2 (Carboxylic Acid - Secondary Amine)
Hybridisation change (SP2 to SP3)
Carbonyl carbon becomes SP3.
Hybridisation change in Reductive Amination 1 (Carbonyl - Primary Amine)
Selection of Synthetic Reactions
• Amide Formation
• Ether Formation
• Ullmann reaction
• Amine Alkylation
• Ester Formation
• Aldol
• Wittig
• Imine
• C-S-C Formation
• Reductive Amination
Example result (CDK2)
[Figure: generated ligand with numbered joins 1 to 5. Docked fragments: 890, 358, 780, 935. Other numeric labels in the figure: 1079, 71, 1534]
Act Score: -7.80
Joins:
1. Amide Alkylation 2 (Secondary Amide - Primary Alkyl Halide)
2. Wittig Reaction (Carbonyl = Primary Alkyl Halide)
3. Ether Formation 1 (Alcohol - Alcohol)
4 & 5. Amine Alkylation 1 (Primary Amine - Primary Alkyl Halide)
CDK2 library: 300 fragments / 1055 conformations
Run time: 10 h
SynSPROUT
Current status:
• Works well for small starting material libraries (low hundreds).
• Several libraries now built, including an amino acid library for peptide generation.
• Library from MDDR being built.
• Potential for suggesting starting points for new combinatorial libraries.
Future work:
• Extend the types of chemistry allowed.
• Develop algorithms which would permit the use of libraries of hundreds of thousands of starting materials (such as ACD).
Parallelisation helps, but on its own it is not sufficient to cope with the inevitable combinatorial explosion.
Acknowledgements
Co-workers: Krisztina Boda, Attilla Ting, Jon Baber
Special thanks to OpenEye Scientific Software for providing access to OMEGA