Understanding biosynthesis of complex metabolites using computational biology. D. Mohanty National Institute of Immunology New Delhi. Substrate Specificity of Catalytic Domains. Domains Involved in Protein-Protein Interactions. Levels of Functional Annotation.
Understanding biosynthesis of complex metabolites using computational biology D. Mohanty National Institute of Immunology New Delhi
Substrate Specificity of Catalytic Domains Domains Involved in Protein-Protein Interactions
Levels of Functional Annotation. Sequence based methods: fundamental for functional annotation. Drawback: cannot predict substrate specificity.
PCPS SCoA Luciferin Oxyluciferin Amino acid Coumarate SCoA Amino acyl PCP Coumaroyl CoA Fatty acid Acyl CoA
Knowledge Based Approach vs. Computational Chemistry. Computational chemistry: homology modeling. Model the protein based on the known structure of a similar protein. Find the substrate which binds to the model protein. Range of possible substrates.
Knowledge Based Approach: Sequence-Product correlation. Predictive rules. Sequence/structure information for a large number of proteins with known specificity. Predicting substrate specificity of new members of the family using evolutionary information. Design of novel proteins with altered specificity. In silico identification of PKS/NRPS products. Design of novel polyketides/nonribosomal peptides.
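The sequence-product correlation idea can be sketched as a lookup against active-site patterns of proteins with known specificity. This is a minimal sketch, not the authors' implementation: the function names, the alignment columns, and the substrate labels and patterns below are all hypothetical placeholders.

```python
def extract_pattern(aligned_seq, columns):
    """Return the residues found at the given 0-based alignment columns."""
    return "".join(aligned_seq[c] for c in columns)


def pattern_identity(a, b):
    """Fraction of positions at which two equal-length patterns agree."""
    if len(a) != len(b) or not a:
        raise ValueError("patterns must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict_specificity(query_pattern, known_patterns):
    """Return (substrate, score) for the best-matching known pattern.

    known_patterns maps a substrate label to the pattern seen in
    proteins of known specificity (hypothetical data in this sketch).
    """
    scored = ((s, pattern_identity(query_pattern, p))
              for s, p in known_patterns.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical example: columns and patterns are invented for illustration.
KNOWN = {"substrate_A": "HAFH", "substrate_B": "YASH"}
query = extract_pattern("MK-LYASQSV", [4, 5, 6, 7])
print(predict_specificity(query, KNOWN))
```

A real predictor would derive the columns from a structure-based alignment and would need a rule for ties and for queries that match no known pattern well.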
BIOSYNTHESIS OF A MODULAR PKS rapamycin (immunosuppressant) erythromycin A (antibacterial) rifamycin B (antituberculosis) KS: KETOSYNTHASE; AT: ACYL TRANSFERASE; DH: DEHYDRATASE; ER: ENOYL REDUCTASE; KR: KETOREDUCTASE; ACP: ACYL CARRIER PROTEIN; TE: THIOESTERASE
ACYL TRANSFERASE (AT) DOMAIN Involved in selection of starter and extender units during Biosynthesis of Fatty acids and Polyketides Propionyl CoA POSSIBLE STARTER AND EXTENDER UNITS PKS FAS Isobutyryl CoA Acetyl CoA Acetyl CoA Malonyl CoA Butyryl CoA Benzoyl CoA Methylmalonyl CoA Acetyl CoA Acetoacetyl CoA Malonyl CoA
Yadav G, Gokhale R.S and Mohanty D. (2003) Nucl. Acids Res. 31:3654-3658 PKSDB
Substrate Specificity of AT domains of PKS. Yadav, G., Gokhale, R. S. and Mohanty, D. (2003) J. Mol. Biol. 328, 335-363. Trivedi et al. Mol Cell 2005, 17: 1-13
Classification of KS sequence into subfamilies: Modular or Iterative?
• KS domain dendrogram
• Different sub-families of KS domain sequences show distinct clustering.
• How to quantify this difference?
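One simple way to put a number on the clustering seen in a dendrogram is to compare mean pairwise identity within subfamilies to mean pairwise identity between them. The sketch below assumes pre-aligned sequences of equal length; the subfamily labels and toy sequences are hypothetical, and the slides do not state which measure was actually used.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical residues over columns where neither has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def within_between(groups):
    """Return (mean within-group identity, mean between-group identity).

    groups maps a subfamily label to a list of aligned sequences.
    """
    labelled = [(g, s) for g, seqs in groups.items() for s in seqs]
    within, between = [], []
    for (g1, s1), (g2, s2) in combinations(labelled, 2):
        (within if g1 == g2 else between).append(identity(s1, s2))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(within), mean(between)


# Toy aligned sequences, invented for illustration only.
toy = {"modular": ["ACDEF", "ACDEY"], "iterative": ["GHIKL", "GHIKM"]}
print(within_between(toy))
```

A large gap between the two means indicates well-separated subfamilies; values that are close suggest the grouping is not supported by sequence alone.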
Identification of residues in KS which control number of iterations
Analyses of KS domains from iterative PKS
• Threading of sequence onto known structural folds
• Homology models based on highest scoring structural templates
• Active site residue extraction and analyses
• Cavity analyses of various models in terms of volume, hydrophobicity and topology
• Comparison across iterative PKS subfamilies
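The hydrophobicity part of the cavity analysis can be illustrated by averaging a hydropathy scale over the residues that line a cavity. This sketch uses the Kyte-Doolittle scale as an assumption, since the slides do not say which scale was used, and the residue lists in the example are hypothetical.

```python
# Kyte-Doolittle hydropathy values (more positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(residues):
    """Mean Kyte-Doolittle hydropathy of one-letter residue codes."""
    if not residues:
        raise ValueError("no cavity-lining residues supplied")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


# Hypothetical cavity-lining residues for two models.
print(mean_hydropathy("IVLFA"))  # mostly apolar lining
print(mean_hydropathy("DEKST"))  # mostly polar lining
```

Comparing this value across models of different iterative PKS subfamilies gives one number per cavity that can be plotted against product properties.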
The E. coli KAS-II Catalytic Pocket
Plots of iterative KS model cavity parameters Are we analyzing the correct cavity? Saturation of Product Vs Hydrophobicity of CLRs Cavity Volume Vs No. of Iterations
Shapes of Active Sites of Iterative PKSs MSAS NAPHTHOPYRONE
CORRECT ORDER OF ORFs WITHIN A PKS BIOSYNTHETIC CLUSTER Simocyclinone PKS Mupirocin PKS cluster
INTERMODULAR DOCKING INTERACTIONS Cognate – Non Cognate Differentiation (An example):
PREDICTION OF ORF ORDER USING LINKER INTERACTIONS The Spinosyn Biosynthetic cluster ORFs: 1 2 3 4 5 Charged Interaction (+) Bad Interaction (-) Neutral Interaction (.)
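Predicting ORF order from linker interactions can be sketched as scoring every possible ordering by the interactions between consecutive ORFs and keeping the best one. The scoring values (+1 for a charged interaction, -1 for a bad one, 0 for neutral) mirror the slide's legend but are an assumption, as are the function names and the toy score table.

```python
from itertools import permutations


def order_score(order, pair_scores):
    """Sum the interaction scores of each consecutive (upstream, downstream) pair."""
    return sum(pair_scores.get((u, d), 0) for u, d in zip(order, order[1:]))


def best_order(orfs, pair_scores):
    """Return (best ordering, score) by exhaustive search over permutations.

    Exhaustive search is fine for a handful of ORFs (5 ORFs = 120 orderings).
    """
    best = max(permutations(orfs), key=lambda o: order_score(o, pair_scores))
    return list(best), order_score(best, pair_scores)


# Hypothetical scores: +1 favourable, -1 unfavourable, missing pairs are neutral.
scores = {("A", "B"): 1, ("B", "C"): 1, ("B", "A"): -1, ("C", "B"): -1}
print(best_order(["C", "A", "B"], scores))
```

With many ORFs the factorial growth would call for a smarter search, and ties between orderings would need an explicit rule.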
Consensus Active Site Patterns Of Six Subfamilies Lys 517 3.02 Å Gly 324 4.82 Å * No consensus Specificity determining residues (SDR) Active site pattern
SEARCHNRPS Ansari, M. Z., Yadav, G., Gokhale, R. S. and Mohanty, D. (2004) Nucl. Acids Res. 32:W405-13.
How good are models at such low sequence homology? Genetic algorithm for 250 runs. Grid size: 22.5 × 22.5 × 22.5 Å³. Cluster RMSD: 1.7 Å. Major cluster: 223. Minor cluster: 27. Crystal structure of the long chain CoA ligase (1V26). Model of the same protein based on 1AMU. 10/18 Ligand RMSD = 2.3 Å
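The ligand RMSD quoted on this slide is a root-mean-square deviation over matched atom positions. A minimal sketch of that calculation follows; it assumes the two coordinate lists are already in the same frame and in the same atom order, and it does not perform any superposition.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples in Å.

    Atoms are matched by list position; no fitting or superposition is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_a, coords_b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates, invented for illustration: every atom is shifted by 1 Å.
docked = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
crystal = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0)]
print(rmsd(docked, crystal))
```

The same function can be used to compare docked poses with each other when grouping runs into clusters by an RMSD cutoff.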
Glycosyltransferases (GT), enzymes that transfer sugars to other molecules. R’ = Sugar, Lipid, Protein, DNA Secondary metabolites R=nucleoside nucleoside monophosphate
GTr sequences cluster according to substrate specificity Vancomycin group Prediction accuracy = 77% Orthosomycin group Aminoglycoside AB Hybrid NRPS-PKS Polyene macrolide AB Enediyne group Angucycline AB Aminocoumarin AB Anthracycline group Macrolide group Aureolic acid AB
Identifying substrate (donor/acceptor) binding residues. [Figure: alignment of the query with its best match, 1RRV, across the N-domain, linker and C-domain, marking acceptor binding residues and donor binding residues; DVV and TYD are shown with their binding residues. Residue strings on the slide: CGTDLMMLQMPPPELTGDDPYNT, SGSDMLGKRVPRLQSAGVPGFMT, TRGELGSEVFHSGTV, SRGEIGSEVHASGUA.]
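Transferring binding residues from a best-match template to a query can be sketched as walking a pairwise alignment and reading off the query residue opposite each known template position. This is a sketch under assumptions: the function name, the toy alignment, and the 1-based numbering of ungapped sequences are all choices made here, not taken from the slides.

```python
def map_binding_residues(template_aln, query_aln, template_positions):
    """Map template residue numbers to aligned query residues.

    template_aln, query_aln: aligned sequences of equal length, '-' for gaps.
    template_positions: 1-based residue numbers in the ungapped template.
    Returns {template_pos: (query_pos, query_residue)} or None if the
    template position is aligned to a gap in the query.
    """
    if len(template_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(template_positions)
    t_idx = q_idx = 0
    mapped = {}
    for t, q in zip(template_aln, query_aln):
        if t != "-":
            t_idx += 1
        if q != "-":
            q_idx += 1
        if t != "-" and t_idx in wanted:
            mapped[t_idx] = None if q == "-" else (q_idx, q)
    return mapped


# Toy alignment, invented for illustration.
print(map_binding_residues("AC-DEF", "ACGD-F", [2, 4, 5]))
```

Template positions that fall opposite a gap are reported as None, so a caller can see which binding residues have no counterpart in the query.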
Benchmark the Prediction Accuracy GtfD Correctly predicted the same site and 90% of the donor binding residues GtfA C % identity ~ 55% C % identity ~ 20% MurG Correctly predicted approximately the same site and 50% of the donor binding residues C
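The percentages on this slide read as the share of known donor binding residues that the prediction recovered. A minimal sketch of that measure is below; the function name and the residue numbers in the example are hypothetical, and the slides do not spell out the exact definition used.

```python
def fraction_recovered(predicted, actual):
    """Fraction of the actual binding residues that appear in the prediction."""
    actual = set(actual)
    if not actual:
        raise ValueError("no reference binding residues supplied")
    return len(actual & set(predicted)) / len(actual)


# Hypothetical residue numbers for illustration.
known_site = [10, 11, 12, 55, 59, 60, 61, 62, 65, 66]
predicted_site = [10, 11, 12, 55, 59, 60, 61, 62, 65, 80]
print(fraction_recovered(predicted_site, known_site))
```

This measure ignores false positives, so a fuller benchmark would also report how many predicted residues are not in the known site.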
Organization of SEARCHGTr and its backend database GTrDB amino acid Kamra P, Gokhale RS and Mohanty D (2005) Nucl. Acids Res., In Press
CAT NRPS CRAT PREDICT STRUCTURAL FOLD AND CORRELATE WITH KNOWN CHEMISTRY PapA5 CAT Fold E2p BAHD
CAT Superfamily NRPS Domains BAHD CRAT CAT NRPS Epi Cyc Con D-X L-X
Identification of crucial residues involved in protein-protein interaction PapA5 (Crystal Structure) MAS ACP (Homology Model) PapA5 Protein Docking Mutational studies of these crucial residues WT R234E R312E Trivedi et al. Mol. Cell (2005) 17, 631-643
Acknowledgements Gitanjali Yadav Md. Zeeshan Ansari Pankaj Kamra Dr. Rajesh S. Gokhale Chemical Biology Group Dr. S.K. Basu, Director, NII BTIS, DBT, India
Coumarate CoA Ligase Substrates of Coumarate CoA Ligases Coenzyme A Cinnamate Coumarate Caffeate Sinapate 3,4-DMC Ferulate
Adenylation domain of NRPS Substrates of NRPS PCP Domain
Coenzyme A Substrates of Fatty Acid CoA Ligases Fatty acid CoA Ligase Acetic acid n ~ 4 - 8 : Medium chain fatty acid n ~ 5 -11: Long chain fatty acid n > 11 : Very Long chain fatty acid Enzymic activation and transfer of fatty acids as acyl-adenylates in mycobacteria Trivedi, O.A., Arora, P., Sridharan, V., Tickoo, R., Mohanty, D. and Gokhale, R.S. 2004 Nature 428:441.