From Structure to Function. Janet Thornton European Bioinformatics Institute. From Structure to Functional Annotation. Mid-West Center for Structural Genomics (MCSG). University of Toronto Aled Edwards. Argonne National Laboratory Andrzej Joachimiak. Northwestern University
From Structure to Function Janet Thornton European Bioinformatics Institute
Mid-West Center for Structural Genomics (MCSG)
• University of Toronto: Aled Edwards
• Argonne National Laboratory: Andrzej Joachimiak
• Northwestern University: Wayne Anderson
• EBI / University College London: Janet Thornton, Christine Orengo
• Washington University in St. Louis: Daved Fremont
• University of Virginia: Wladek Minor
• UT Southwestern Medical Center: Zbyszek Otwinowski
60 structures solved to date • ~30% are ‘hypothetical proteins’
Some examples …
• ylxR: hypothetical cytosolic protein
• ygbM: hypothetical protein (EC1530)
• Hypothetical protein (MTH1)
• Conserved hypothetical protein (MT777)
• Hypothetical protein (EC4030_F)
• cutA: protein implicated in Cu homeostasis (TM1056)
TIM barrel enzymes – 18 different homologous families, >60 different E.C. numbers • Structure of TIM barrel: Triose phosphate isomerase • EC Wheel of TIM barrels
Pairwise sequence identity and conservation of enzyme function (Todd et al. 2001) • Single-domain proteins: >81,000 homologous enzyme / enzyme and enzyme / non-enzyme pairs • [Plot axis label: fractional percentage]
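The quantity on this slide, pairwise sequence identity, can be sketched in a few lines. This is a minimal illustration, not the procedure used by Todd et al.: it assumes the two sequences are already aligned, and it assumes identity is counted only over columns where neither sequence has a gap (other conventions exist). The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned fragments: 4 of the 5 ungapped columns are identical.
example = percent_identity("ACDE-G", "ACDFWG")
```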
From Structure To Biochemical Function: Gene → Protein → 3D Structure → Function • Given a protein structure: • Where is the functional site? • What is the multimeric state of the protein? (PQS – Hannes Ponstingl, this morning) • Which ligands bind to the protein? • What is the biochemical function?
Automated Structure Comparison • The most powerful method for assigning function from structure is global or partial 3D structure comparison (e.g. Dali, SSAP; SSM) • Hidden Markov Models derived from structural domains can often recognise distant relatives from sequence • Christine Orengo (tomorrow)
Aspartate Amino Transferase Superfamily • Aspartate Aminotransferase • Tyrosine Phenol-lyase • 2,2-Dialkylglycine Decarboxylase • Ornithine Decarboxylase
Aspartate Amino Transferase Superfamily [Figure: pairwise comparison of Aspartate Aminotransferase (2.6.1.1), Tyrosine Phenol-lyase (4.1.99.2), 2,2-Dialkylglycine Decarboxylase (4.1.1.64) and Ornithine Decarboxylase (4.1.1.17); edge labels 77, 11, 73, 6, 76, 10, 7, 76, 9, 79, 7, 77, pairing not recoverable]
Aspartate Amino Transferase Family: all bind Pyridoxal 5’ Phosphate (PLP) co-factor • 2.6.1.1 Aspartate Aminotransferase • 4.1.99.2 Tyrosine Phenol-lyase • 4.1.1.64 2,2-Dialkylglycine Decarboxylase • 4.1.1.17 Ornithine Decarboxylase
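One way to quantify how far function has diverged within this family is to count how many leading levels two E.C. numbers share. The sketch below is an illustration under that assumption; the helper name `shared_ec_levels` is hypothetical, and only the four E.C. numbers come from the slide.

```python
def shared_ec_levels(ec_a, ec_b):
    """Count the leading E.C. levels (0 to 4) that two E.C. numbers have in common."""
    shared = 0
    for level_a, level_b in zip(ec_a.split("."), ec_b.split(".")):
        if level_a != level_b:
            break  # the hierarchy diverges here; deeper levels are not comparable
        shared += 1
    return shared


# E.C. numbers taken from the slide.
family = {
    "Aspartate Aminotransferase": "2.6.1.1",
    "Tyrosine Phenol-lyase": "4.1.99.2",
    "2,2-Dialkylglycine Decarboxylase": "4.1.1.64",
    "Ornithine Decarboxylase": "4.1.1.17",
}

# The two decarboxylases agree to the third level; the transferase
# differs from the lyases already at the first level.
decarboxylases = shared_ec_levels(family["2,2-Dialkylglycine Decarboxylase"],
                                  family["Ornithine Decarboxylase"])
```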
Number of enzyme functions TIM barrel glycosyl hydrolases / hydrolases type I PLP-dependent enzymes
Convergent and Divergent Evolution • Unrelated proteins can perform the same function (convergent evolution), sometimes using the same mechanism – sometimes using different mechanisms • Related proteins can perform different functions – divergent evolution
Active site convergence Trypsin Subtilisin
Trypsin • Subtilisin • Alpha/beta hydrolase • Brain platelet activating factor acetylhydrolase • Clp protease • CheB methylesterase
Predicting Binding Site • Binding-site analysis: cutA • Surface clefts • Residue conservation • Conserved surface patches • Most likely binding site
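Residue conservation is listed here as one input to binding-site prediction, but the slide does not say which conservation score was used. A common choice, assumed here purely for illustration, is the Shannon entropy of each column of a multiple sequence alignment: low entropy means a conserved position. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # sum of p * log2(1/p); 0.0 for a fully conserved column
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def conservation_profile(alignment):
    """Entropy per column for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


# Toy alignment: column 0 is invariant, column 2 is split evenly between D and E.
profile = conservation_profile(["ACD", "ACE", "ACD", "AGE"])
```

Positions with low entropy that also cluster on the surface would be candidates for the "conserved surface patches" named on the slide.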
Identifying Binding Site Function Using Motifs
- 3D enzyme active site structural motifs (Craig Porter)
- Catalytic Site Atlas
- Identification of catalytic residues (Gail Bartlett, Alex Gutteridge)
- Metal binding sites (Malcolm MacArthur)
- Binding site features (Gareth Stockwell)
- Automatically generated templates of ligand-binding and DNA binding motifs (Sue Jones, Hugh Shanahan)
- “Reverse” templates (Roman Laskowski)
JESS – fast template search algorithm (Jonathan Barker)
PINTS – searches for similar clusters (Aloy, Russell … – EMBL Heidelberg)
Catalytic Site Atlas: Enzyme reports from primary literature information • β-lactamase Class A • EC: 3.5.2.6 • PDB: 1btl • Reaction: β-lactam + H2O → β-amino acid • Active site residues: S70, K73, S130, E166 • Plausible mechanism:
3-D templates • Use 3D templates to describe the active site of the enzyme • analogous to 1-D sequence motifs such as PROSITE, but in 3-D • Sequence position independent • Captures essence of functional site in protein
TESS: TEmplate Search and Superposition (Wallace et al., 1997) • defines a functional site as a sequence-independent set of atoms in 3-D space • searches a new structure for a functional site • searches a database of structures for similar clusters, e.g. serine proteinase catalytic triad
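The idea of a sequence-independent 3-D template can be sketched as follows. This is not the TESS or JESS algorithm: instead of superposing atoms, it compares inter-atomic distances within the template to those within each candidate atom set, a simpler stand-in that needs no rotation step (and, as a consequence, cannot distinguish mirror images). The atom labels, coordinates, function names and cut-off value are all hypothetical, and the score is not directly comparable to a superposition rmsd.

```python
import itertools
import math


def distance_rms(template_xyz, candidate_xyz):
    """RMS difference between corresponding inter-atomic distances."""
    pairs = list(itertools.combinations(range(len(template_xyz)), 2))
    if not pairs:
        raise ValueError("a template needs at least two atoms")
    total = 0.0
    for i, j in pairs:
        d_template = math.dist(template_xyz[i], template_xyz[j])
        d_candidate = math.dist(candidate_xyz[i], candidate_xyz[j])
        total += (d_template - d_candidate) ** 2
    return math.sqrt(total / len(pairs))


def search_template(template, structure, cutoff):
    """Return (score, atom_indices) hits, best first.

    template and structure are lists of (label, (x, y, z)) tuples;
    atoms are matched by label only, never by sequence position.
    """
    pools = [
        [idx for idx, (label, _) in enumerate(structure) if label == t_label]
        for t_label, _ in template
    ]
    template_xyz = [xyz for _, xyz in template]
    hits = []
    for combo in itertools.product(*pools):  # brute force; fine for tiny inputs
        if len(set(combo)) < len(combo):
            continue  # the same atom cannot fill two template slots
        candidate_xyz = [structure[idx][1] for idx in combo]
        score = distance_rms(template_xyz, candidate_xyz)
        if score <= cutoff:
            hits.append((score, combo))
    return sorted(hits)


# Hypothetical three-atom template and a toy structure containing a
# rotated, translated copy of it plus one decoy histidine atom.
template = [("SER:OG", (0.0, 0.0, 0.0)),
            ("HIS:NE2", (3.0, 0.0, 0.0)),
            ("ASP:OD1", (3.0, 4.0, 0.0))]
structure = [("SER:OG", (10.0, 10.0, 10.0)),
             ("HIS:NE2", (10.0, 13.0, 10.0)),
             ("ASP:OD1", (14.0, 13.0, 10.0)),
             ("HIS:NE2", (30.0, 30.0, 30.0))]
hits = search_template(template, structure, cutoff=1.2)
```

A production search would need spatial indexing rather than the exhaustive product used here, and a superposition step to report a true rmsd.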
Aspartic Proteinase - Active Site residues - [DTG]x2 Eukaryotic & Fungal Aspartic Proteinases: all-atom DTG-DTG Template
Aspartic Proteases: Active Site Template • A template of 8 atoms is sufficient to identify all Aspartic Proteinases • [Figure labels: Asp CO2, Gly C, Asp O, Gly C, Thr/Ser O, Thr O]
Aspartic Protease Template Search against all PDB (green = true, red = false)
3D Templates to Characterise Functional Sites • Template searches • 189 enzyme active site templates • ~600 metal binding site templates
Database of enzyme active site templates: 189 templates • GAR Tfase • Cholesterol oxidase • IIAglc histidine kinase • Carbamoylsarcosine amidohydrolase • Ser-His-Asp catalytic triad • Dihydrofolate reductase • …
An example: MCSG structure BioH • Unknown function, involved in biotin synthesis in E. coli • Expected to be an enzyme • Sequence contains two Gly-X-Ser-X-Gly motifs typical of acyltransferases and thioesterases • Structure: Rossmann fold, hence many structural homologues
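Locating Gly-X-Ser-X-Gly motifs in a sequence is a simple pattern scan. The sketch below uses a lookahead so that overlapping occurrences are also reported; the sequence is a made-up fragment, not the BioH sequence, and the function name is hypothetical.

```python
import re

# Gly-X-Ser-X-Gly, where X is any residue; the lookahead allows overlapping hits.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_gxsxg(sequence):
    """Return (0-based start, matched pentapeptide) for every G-X-S-X-G occurrence."""
    return [(m.start(), m.group(1)) for m in GXSXG.finditer(sequence)]


# Hypothetical sequence fragment containing two motifs.
motifs = find_gxsxg("AAGHSMGKKGGSAGTT")
```

As the slide notes, a sequence motif like this only suggests a class of function; in the BioH case it was the structural template match that pinned down the catalytic residues.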
CSA template search • One very strong hit: Ser-His-Asp catalytic triad of the lipases with rmsd = 0.28 Å (template cut-off is 1.2 Å) • Novel carboxylesterase acting on short acyl chain substrates • Experimentally confirmed by hydrolase assays
Templates of Active Sites • Catalytic cluster conserved – simple template, e.g. Aspartic Proteinase (DTG)x2 • Order and geometry of catalytic residues varies – multiple templates, e.g. Polymerases • Same catalytic cluster used in many different enzyme functions – one template identifies multiple active sites in unrelated structures, e.g. the Asp/His/Ser catalytic triad is well conserved in structure
Instances of convergence • Ser-His-Asp triads • Cys-His-Asp triads • Ribonuclease T1s • Malic enzyme and isocitrate dehydrogenase • Haloperoxidases • Creatinase and carboxypeptidase G2 • Glycosidases • Class II extradiol-type dioxygenase and class III extradiol-type dioxygenase • Receptor tyrosine phosphatase and low-molecular weight tyrosine phosphatase • Pyridoxal 5' phosphate enzymes James Torrance
Template databases • HAND CURATED • Enzyme active sites (PROCAT) – 189 templates • Currently being extended • Metal-binding sites – 600 templates • AUTOMATED • Ligand-binding sites – 10,000 templates • DNA-binding sites – 800 templates
Another example of convergent evolution: The DNA HTH Binding Motif (Sue Jones) • PDB entries shown: 1b9m, 1eto, 1hcr, 1ais, 1jhg, 1lmb, 1orc
ProFunc – function from 3D structure • Homologous structures of known function • Homologous sequences of known function • Functional sequence motifs, e.g. Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC] • HTH-motifs • DNA-, ligand-binding and “reverse” templates • Enzyme active site 3D-templates • Binding site identification and analysis • Residue conservation analysis • Electrostatics • Surface comparison • … etc
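The functional sequence motif on this slide, Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC], is written in PROSITE pattern syntax. A minimal sketch of how such a pattern can be translated into a regular expression follows. It is not ProFunc's implementation, it handles only a subset of the syntax (single residues, `x`, `[...]`, `{...}` and repeat counts, no `<` or `>` anchors), and the function name and test sequence are hypothetical.

```python
import re

# One PROSITE element: a residue, x, [allowed] or {forbidden}, with optional (n) or (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported PROSITE element: " + element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


regex = prosite_to_regex("Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]")

# Hypothetical sequence containing one occurrence of the motif.
hit = re.search(regex, "MKQAAAGLCYPPSRT")
```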
Three MCSG Examples (James Watson)
Three examples show the varying levels of information that can be retrieved from structures:
1. Almost full functional information. GOOD • APC 1040
2. General information. NOT SO GOOD • APC 012
3. Little or no information obtained. UGLY • APC 078
Acknowledgements • Roman Laskowski, James Watson, Richard Morris, Rafael Najmanovich, Fabian Glaser – EBI • Christine Orengo, Annabel Todd, James Bray, Russell Marsden – University College, London • MCSG members – Andrzej Joachimiak, Al Edwards etc • Funding: NIH – PSI; EU – SPINE; DoE – DNA Motifs; UK BBSRC LINK