Ontologies in biological chemistry Kirill Degtyarenko, Sergio Contrino @ EBI
Links • COMe http://www.ebi.ac.uk/~kirill/come/ • IntEnz http://www.ebi.ac.uk/intenz/ • This talk and more info http://www.ebi.ac.uk/~kirill/biometal/ kirill@ebi.ac.uk
Ontologies for “biochemical” compounds • Structure: 2-D, 2.5-D or 3-D structure • Physico-chemical properties: not deduced from the structure • Biological function: biochemical reactions, mostly; structural
Physico-chemical property • Molecular property • Supramolecular property • System property • Reaction property
Molecular entity has… • Mass (“molecular weight”) • Size • Shape • Charge • Structure • One can derive many properties from a known complete structure • Spectra ?
Molecular property • Heat capacity • Mass • Net charge • Shape • Size • Structure • Geometry • Connectivity • Topography
Method • Calorimetry • Centrifugation • Crystallography • Electrophoresis • Isotope method • Mass spectrometry • Microscopy • Spectroscopy
[Diagram: semantic network around Raman spectroscopy. Nodes: Chandrasekhara Venkata Raman; 1928; Raman effect; vibrational spectroscopy; Raman spectroscopy; Raman spectrum; amide bands; protein conformation; protein secondary structure. Relationship labels: DISCOVERED BY; DISCOVERED IN; ISA (twice); IS BASED ON; YIELDS; IS USED TO INVESTIGATE; FEATURES; AFFECTS.]
Physico-chemical ontology • Physico-chemical property • Physico-chemical method
Chemical data • 0-D: ENZYME, ChemicalOntology • 2-D: COMPOUND, GSK, NIST • 3-D: MSD http://www.ebi.ac.uk/msd-srv/chempdb/
%observable universe (cf. "Lenin's definition of matter") <energy (has *no rest mass*) %electromagnetic radiation <photon %(other forms of energy, can elaborate later) <matter (has *rest mass*) %free elementary particles having non-zero rest mass (*molar!*) <elementary particle %electron %proton %neutron %molecular matter (here the chemical ontology starts) %grouped_by_composition %compound ; synonym:chemical substance <formula unit <molecular entity %atom <electron <nucleus <proton <neutron %element %atomic ion %atomic radical %molecule <group %molecular ion %molecular radical %noncovalent crystal molecule %ionic crystal molecule %metallic crystal molecule %covalent molecule %discrete covalent molecule %giant covalent molecule %coordination molecule (continued) %ion %atomic ion %molecular ion %radical %atomic radical %molecular radical %mixture <compound %heterogeneous mixture %colloidal suspension %liquid aerosol %solid aerosol %foam %emulsion %sol %solid foam %gel %solid sol %homogeneous mixture %solution <solute <solvent %solid solution ... %grouped_by_state_of_matter %plasma %gas %liquid %solid %heterogeneous mixture
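The outline above marks every term with a prefix symbol. A minimal sketch of how such an outline could be read into relationships follows. It assumes (the slide does not say so) that `%` means is-a, that `<` means part-of, and that nesting depth is given by leading spaces, one term per line. The function name `parse_outline` and the indentation of the sample fragment are hypothetical.

```python
def parse_outline(text):
    """Return (child, relation, parent) triples from an indented outline.

    Assumed syntax: one term per line, depth given by leading spaces,
    '%' meaning is-a and '<' meaning part-of, both relative to the
    nearest shallower line.
    """
    relation_names = {"%": "is_a", "<": "part_of"}
    stack = []    # (depth, term) for each ancestor of the current line
    triples = []
    for line in text.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            continue
        depth = len(line) - len(stripped)
        symbol, term = stripped[0], stripped[1:].strip()
        # Drop every stack entry that is not an ancestor of this line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if stack:
            triples.append((term, relation_names[symbol], stack[-1][1]))
        stack.append((depth, term))
    return triples


# Hypothetical fragment: the indentation shown here is an assumption.
FRAGMENT = """\
%molecular entity
 %atom
  <electron
  <nucleus
   <proton
   <neutron
 %molecule
  <group
"""

for child, relation, parent in parse_outline(FRAGMENT):
    print(f"{child} {relation} {parent}")
```

Under these assumptions, the fragment yields statements such as "nucleus part_of atom" and "molecule is_a molecular entity".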
Molecular ontology • noncovalent crystal • ionic crystal • metallic crystal • covalent • giant covalent • discrete covalent • coordination
Complex proteins • Metalloproteins • Organic prosthetic group proteins • Modified amino acid proteins • Proteins consisting of more than one polypeptide chain • Combinations of all groups
Bioinorganic motif (BIM) • A common structural feature of a class of functionally related, but not necessarily homologous, proteins, that includes the metal atom(s) and first coordination shell ligands [Degtyarenko K.N. (2000) Bioinformatics 16, 851–864]
0-D: Fe
1-D: Fe P-D-x(2)-H-[DE]-[LI]-[LIVMF]-G-H-[LIVMC]-P-x(n)-E
2-D
2.5-D
3-D
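The 1-D representation above is written in a PROSITE-like pattern syntax. As a rough sketch, such a pattern can be translated into a regular expression and searched against a sequence. The helper `pattern_to_regex` is hypothetical, the reading of `x(n)` as a gap of any length is an assumption, and the test sequence is made up for illustration.

```python
import re


def pattern_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        if element == "x(n)":
            parts.append(".*?")          # assumption: gap of any length
            continue
        fixed_gap = re.fullmatch(r"x\((\d+)\)", element)
        if fixed_gap:
            parts.append(".{%s}" % fixed_gap.group(1))   # x(2) -> .{2}
        elif element == "x":
            parts.append(".")            # any single residue
        else:
            parts.append(element)        # 'P' or '[DE]' is already valid regex
    return "".join(parts)


MOTIF = "P-D-x(2)-H-[DE]-[LI]-[LIVMF]-G-H-[LIVMC]-P-x(n)-E"
regex = re.compile(pattern_to_regex(MOTIF))

# Made-up sequence, not a real protein.
hit = regex.search("AAPDKKHDLIGHLPAAAAEAA")
print(regex.pattern)
print(hit.group() if hit else "no match")
```

A pattern match of this kind only locates candidate ligand residues in the sequence; it says nothing about the geometry that the 2-D to 3-D representations capture.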
• Italian word come (how) • English word come (not GO) • Classification Of Metalloproteins • COfactors and Metals • COMplex proteins, etc. • Co-Ordination of Metals in proteins • http://www.ebi.ac.uk/~kirill/come/
COMe version 2.11 • Controlled vocabulary • No definitions • 1079 protein classes (PRX) • 351 bioinorganic motifs (BIM) • 132 small molecules (MOL) • Organised as: XML version (master); Oracle version
Relationships in COMe • isKindOf: inherits all attributes (PRX to PRX; BIM to BIM; MOL to MOL) • isPartOf: no inheritance (BIM to BIM; MOL to MOL; MOL to BIM; BIM to PRX) • isBoundTo: no inheritance (MOL to PRX)
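The three relationships can be sketched as a small data model in which only isKindOf passes attributes on, and each relationship is restricted to the type pairs listed above. The `Entry` class, its method names, the attribute `metal` and the identifiers below are all hypothetical; they are not taken from the COMe XML or Oracle schema.

```python
# Allowed (source type, target type) pairs, as listed on the slide.
ALLOWED = {
    "isKindOf": {("PRX", "PRX"), ("BIM", "BIM"), ("MOL", "MOL")},
    "isPartOf": {("BIM", "BIM"), ("MOL", "MOL"), ("MOL", "BIM"), ("BIM", "PRX")},
    "isBoundTo": {("MOL", "PRX")},
}


class Entry:
    """Hypothetical in-memory entry; the type is the three-letter ID prefix."""

    def __init__(self, entry_id, **attributes):
        self.entry_id = entry_id
        self.kind = entry_id[:3]
        self.own = attributes
        self.links = []          # list of (relation, target Entry)

    def relate(self, relation, target):
        if (self.kind, target.kind) not in ALLOWED[relation]:
            raise ValueError(f"{relation} not allowed: {self.kind} to {target.kind}")
        self.links.append((relation, target))

    def attributes(self):
        """Own attributes plus those inherited along isKindOf links only."""
        merged = {}
        for relation, target in self.links:
            if relation == "isKindOf":
                merged.update(target.attributes())
        merged.update(self.own)   # own values override inherited ones
        return merged


# Made-up identifiers and attribute values, for illustration only.
parent = Entry("PRX900001", metal="Cu")
child = Entry("PRX900002")
motif = Entry("BIM900001")
child.relate("isKindOf", parent)
motif.relate("isPartOf", child)

print(child.attributes())   # inherited through isKindOf
print(motif.attributes())   # nothing inherited through isPartOf
```

Trying `parent.relate("isBoundTo", child)` raises `ValueError` in this sketch, because isBoundTo is only defined from MOL to PRX.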
- <term value="rusticyanin" id="PRX000193" dbxref="InterPro:IPR001243"> </term> <bim coordination="T-4">Cu(ND.His)2(SD.Met)(SG.Cys)</bim> - <substructure id="BIM000085"> </substructure>
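A BIM string such as `Cu(ND.His)2(SD.Met)(SG.Cys)` can be split into a centre and a list of ligands with a short regular expression. The sketch below assumes that each parenthesised group reads as donor atom, dot, residue, and that a trailing digit is a multiplicity; the slides do not spell out this grammar, and `parse_bim` is a hypothetical helper.

```python
import re

# One ligand group: "(ATOM.Residue)" optionally followed by a count.
LIGAND = re.compile(r"\(([A-Za-z0-9]+)\.([A-Za-z]+)\)(\d*)")


def parse_bim(formula):
    """Return (centre, ligands) for a BIM string; ligands are (atom, residue)."""
    centre = formula.split("(", 1)[0]     # text before the first group
    ligands = []
    for atom, residue, count in LIGAND.findall(formula):
        ligands.extend([(atom, residue)] * int(count or 1))
    return centre, ligands


centre, ligands = parse_bim("Cu(ND.His)2(SD.Met)(SG.Cys)")
print(centre, ligands)
```

For the rusticyanin motif this gives a Cu centre with four ligands, which agrees with the 4 in the `coordination="T-4"` attribute.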
- <substructure id="BIM000245"> </substructure> <bim>heme(OD.Asp)(OE.Glu)(SD.Met)</bim> <term value="hemediol-L-aspartyl ester-L-glutamyl ester-L-methionine sulfonium" dbxref="RESID:AA0280" /> <term value="heme m" /> - <substructure id="BIM000246"> <bim>(CBB.heme)(SD.Met)</bim> </substructure> - <substructure id="BIM000247"> <bim>(CMD.heme)(OD.Asp)</bim> </substructure> - <substructure id="BIM000248"> <bim>(CMB.heme)(OE.Glu)</bim> </substructure>
- <substructure id="BIM000281"> </substructure> <bim>MIO</bim> <!-- originates from cyclization and dehydration of internal Ala-Ser-Gly--> <term value="3,5-dihydro-5-methylidene-4H-imidazol-4-one" lref=" MEDLINE:21462607" /> <term value="4-methylidene-imidazole-5-one" dbxref="PDB:1B8F" />
Verdict on structure representation in COMe • Metal-containing BIM: intuitively understandable • Organic prosthetic group: sometimes not • Modified amino acid(s): not really… but we can link to RESID @ http://srs.ebi.ac.uk/
COMe: future work • More PRX, BIM, MOL • Building blocks for BIMs • Representation of inheritance • 2.5-D representation of BIM / MOL (?)
Biochemical reactions (I) • Enzymatic reactions • Non-enzymatic reactions
Biochemical reactions (II) • Binding: A + M → A–M (A = “small molecule”) • Biotransformation: A + B → C + D (A, B, C, D = small molecules) • Molecular transport: A(compartment X) → A(compartment Y) • Electron and exciton transfer reactions • Conformation change (e.g. folding)
Biotransformation reactions • Catalytic (catalyst in parentheses) • Enzymatic (protein) • Ribozymatic (RNA) • Heterogeneous (surface, e.g. metal) • Homogeneous (solute, e.g. metal) • Non-catalytic • Photoinduced (no catalyst) • “Spontaneous” (no catalyst)
Aromatic amino acid hydroxylases • EC 1.14.16.1: L-Phe + H4B + O2 = L-Tyr + H2B + H2O • EC 1.14.16.2: L-Tyr + H4B + O2 = 3,4-dihydroxy-L-Phe + H2B + H2O • EC 1.14.16.4: L-Trp + H4B + O2 = 5-hydroxy-L-Trp + H2B + H2O • In fact... • I. RH + tetrahydrobiopterin + O2 = ROH + 4a-hydroxytetrahydrobiopterin • II. 4a-hydroxytetrahydrobiopterin = 6,7-dihydrobiopterin + H2O • III. 6,7-dihydrobiopterin = 7,8-dihydrobiopterin
ABC of enzyme use • Alcohol (~2500 BC) • Bread (~2600 BC) • Cheese (~1000 BC)
Enzyme Nomenclature • Enzyme Commission: established 1956 by IUB • Now: NC-IUBMB • Assigns EC numbers, which were supposed to serve as unique identifiers of enzymatic reactions • Classification by overall reaction catalysed (not by the reaction mechanism or any other specific property of an enzyme) • EC numbers form a strict hierarchy of ISA relationships • http://www.ebi.ac.uk/intenz/
Overall transformations in “enzymatic reactions” • EC1 Oxidoreductases: Aox + Dred → Ared + Dox • EC2 Transferases: A–X + B–H → A–H + B–X • EC3 Hydrolases: A–B + HOH → A–H + B–OH • EC4 Lyases: X–Y–Z → X=Y + Z • EC5 Isomerases: A → B • EC6 Ligases: A + B + XTP → A–B + XDP + Pi; A + B + XTP → A–B + XMP + PPi
Overall transformations in organic chemistry (after R.B. Grossman, 1999) • Addition: A + B → A–B • Elimination: A–B → A + B • Substitution: A–X + B–Y → A–Y + B–X • Rearrangement: A → B
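The four overall transformations differ in how many species enter and leave the reaction. A deliberately crude sketch that classifies a reaction by those counts alone is given below; the function name and the "unclassified" fallback are hypothetical, and real classification would need to look at which bonds change.

```python
def overall_transformation(reactants, products):
    """Classify by the number of species on each side of the reaction."""
    shapes = {
        (2, 1): "addition",        # A + B -> A-B
        (1, 2): "elimination",     # A-B -> A + B
        (2, 2): "substitution",    # A-X + B-Y -> A-Y + B-X
        (1, 1): "rearrangement",   # A -> B
    }
    return shapes.get((len(reactants), len(products)), "unclassified")


print(overall_transformation(["A", "B"], ["A-B"]))
print(overall_transformation(["A-X", "B-Y"], ["A-Y", "B-X"]))
```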
Reaction mechanisms in organic chemistry (after R.B. Grossman, 1999) • Polar • Polar acidic • Polar basic • Free-radical • Pericyclic • Metal-mediated and -catalysed
Mechanism of biochemical reaction • Mechanism of reaction irrespective of catalyst • e.g. homolytic vs heterolytic bond scission • Mechanism of reaction according to the nature of the catalyst • e.g. Cu-containing vs FAD-containing
Anatomy of an EC number • EC 1.2.3.4: oxalate + O2 = 2 CO2 + H2O2 • EC 1 Oxidoreductase • EC 1.2 Acting on the aldehyde or oxo group of donors • EC 1.2.3 With oxygen as acceptor • EC 1.2.3.4 Oxalate oxidase
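Because each prefix of an EC number names a broader class, the ISA chain can be recovered by splitting on the dots. The sketch below uses the four names from the slide; the function `ec_lineage` and the `NAMES` lookup table are hypothetical helpers, not part of IntEnz.

```python
def ec_lineage(ec_number):
    """Return every prefix of an EC number, from class down to the entry."""
    parts = ec_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


# Names as given on the slide for EC 1.2.3.4.
NAMES = {
    "1": "Oxidoreductase",
    "1.2": "Acting on the aldehyde or oxo group of donors",
    "1.2.3": "With oxygen as acceptor",
    "1.2.3.4": "Oxalate oxidase",
}

for level in ec_lineage("1.2.3.4"):
    print(f"EC {level}: {NAMES[level]}")
```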
Redundancy and deficiency • Overall transformations in “enzymatic reactions” are mostly based on those of organic chemistry • EC3 (Hydrolases), many EC1 (Oxidoreductases) and some EC4 (Lyases) can be considered a kind of EC2 (Transferases) • Part of the EC6 (Ligases) reactions can be considered a kind of EC3 (Hydrolases) • No classification for some fundamental reaction types, e.g. addition to something other than a double bond
EC 4.99 Other Lyases (example thanks to Keith Tipton) • EC 4.99.1.1 ferrochelatase: protoporphyrin + Fe2+ = protoheme + 2 H+ (creates metal–N bond) • EC 4.99.1.2 alkylmercury lyase: RHg+ + H+ = RH + Hg2+ (breaks metal–C bond) • Neither of these enzymes should be classified as a lyase
Other problems • No explicit distinction is made between enzymatic reactions and enzymes; therefore, some enzymes were given different EC numbers on the basis of different cofactor or origin (!) • The reactions are always written as if they were reversible; therefore, enzymes catalysing opposite reactions are given the same EC number
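The second problem can be illustrated with two ways of keying a reaction. In this sketch a directed key keeps substrates and products apart, while a "reversible" key treats the two sides as an unordered pair, so opposite reactions collapse into one entry. Both functions and the species labels A to D are hypothetical.

```python
def directed_key(substrates, products):
    """Key that distinguishes A + B -> C + D from C + D -> A + B."""
    return (frozenset(substrates), frozenset(products))


def reversible_key(substrates, products):
    """Key that ignores direction: the two sides form an unordered pair."""
    return frozenset(directed_key(substrates, products))


forward = (["A", "B"], ["C", "D"])
backward = (["C", "D"], ["A", "B"])

print(directed_key(*forward) == directed_key(*backward))       # False
print(reversible_key(*forward) == reversible_key(*backward))   # True
```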
EC 1.18.1.2 • Ferredoxin:NADP+ reductase: [Fe2S2]+ ferredoxin + NADP+ → [Fe2S2]2+ ferredoxin + NADPH • Adrenodoxin reductase: [Fe2S2]2+ adrenodoxin + NADPH → [Fe2S2]+ adrenodoxin + NADP+ • In fact... these may be different enzymes!
Sub-subclasses in EC 1.1–1.10 • EC 1.x.1 With NAD or NADP as acceptor • EC 1.x.2 With a heme protein as acceptor • EC 1.x.3 With oxygen as acceptor • EC 1.x.4 With a disulfide as acceptor • EC 1.x.5 With a quinone as acceptor • EC 1.x.6 With a nitrogenous group as acceptor • EC 1.x.7 With an iron–sulfur protein as acceptor • EC 1.x.8 With a flavin as acceptor • EC 1.x.99 With other acceptors
Current classification: EC 1.1.1 With NAD or NADP as acceptor • ISA EC 1.1.1.2 alcohol dehydrogenase (NADP) • ISA EC 1.1.1.91 aryl-alcohol dehydrogenase (NADP) • ISA EC 1.1.1.97 3-hydroxybenzyl-alcohol dehydrogenase