The importance of enzymes and their occurrences: from the perspective of a network. W.C. Liu, W.H. Lin, S.T. Yang, F. Jordan, A.J. Davis and M.J. Hwang.
The importance of enzymes and their occurrences: from the perspective of a network. W.C. Liu (1), W.H. Lin (1), S.T. Yang (1), F. Jordan (2), A.J. Davis (3) and M.J. Hwang (1). (1) Institute of Biomedical Sciences, Academia Sinica, Taiwan. (2) Collegium Budapest, Hungary. (3) Max Planck Institute for Chemical Ecology, Germany.
Introduction What is a metabolic network? The textbook version, as defined by humans! Nodes: compounds. Links: enzyme-catalyzed reactions. Figure from http://www.genome.jp/kegg/
Introduction What is a metabolic network? The sum of all chemical transformations in a cell. Enzyme-catalyzed reactions. Reversible and irreversible reactions. All living things have one! So, metabolic networks consist of compounds (i.e. substrates and products) and enzymes.
Introduction Why do we study metabolic networks? Obviously, they are a popular and fashionable topic! Global properties and connectivity (Jeong et al. 2000, Nature, 407:650). Network organization and modules (Ravasz et al. 2002, Science, 297:1551). All living organisms rely on metabolism, so metabolic networks are important!
Introduction Why do we study metabolic networks? Of course, we have better reasons than those two! Metabolic pathways are chosen because they are relatively complete and have more data available than other molecular networks.
Introduction Why do we study metabolic networks? Given the same metabolic pathway, some enzymes are present in some organisms but absent in others. Figure: the same pathway in E. coli and in B. halodurans, from http://www.genome.jp/kegg/
Introduction Why do we study metabolic networks? Distribution of enzymes in 228 species of bacteria
Introduction Questions: Given the same metabolic pathway, why do enzymes occur differentially in different organisms? Can this difference be explained by network topology, i.e. are some enzymes topologically more important than others from the perspective of an enzyme network?
Methodology A metabolic network consists of several sub-networks. As a first step, we focus on one smaller and better known component, glycolysis, then expand from there. We focus on organisms that are comparable with each other, i.e. bacteria, for which a large amount of data is available.
Methodology Basic outline: 1. From the KEGG website (http://www.genome.jp/kegg/), for a given enzyme, we determine how many bacterial species have this particular enzyme (we do this for all enzymes). Let this be the frequency of occurrence. 2. From the KEGG website we extract information on glycolysis for 228 bacterial species to construct a reference enzyme network. The reference network is thus the union of the glycolysis networks of all 228 bacterial species. We assume that the reference network contains all of the biologically possible nodes and links in these 228 bacterial species. 3. Determine the topological importance (to be defined later) of enzymes. 4. Analyze the results from 1 and 3, and test whether topological importance is associated with frequency of occurrence.
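Steps 1 and 2 of the outline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the species names, enzyme labels (E1, E2, E3) and per-species link sets are hypothetical toy data, not KEGG records, and the function names are invented for this sketch.

```python
from collections import Counter

# Hypothetical toy data (NOT taken from KEGG): enzymes and enzyme-enzyme links per species.
species_enzymes = {
    "species_a": {"E1", "E2", "E3"},
    "species_b": {"E1", "E2"},
    "species_c": {"E2", "E3"},
}
species_links = {
    "species_a": {("E1", "E2"), ("E2", "E3")},
    "species_b": {("E1", "E2")},
    "species_c": {("E2", "E3")},
}


def frequency_of_occurrence(enzymes_by_species):
    """Count, for every enzyme, the number of species that have it."""
    counts = Counter()
    for enzymes in enzymes_by_species.values():
        counts.update(enzymes)
    return dict(counts)


def reference_network(links_by_species):
    """Union of all per-species links, stored as undirected (unordered) pairs."""
    reference = set()
    for links in links_by_species.values():
        for a, b in links:
            reference.add(frozenset((a, b)))
    return reference


frequency = frequency_of_occurrence(species_enzymes)
reference = reference_network(species_links)
```

Storing each link as a `frozenset` means that the same link reported in either orientation by two species is counted once, which matches the assumption that link directions are ignored.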
Methodology Details: How to define a link between enzymes? If two enzymes catalyze successive reaction steps, we place a link between those two enzymes. Consider a hypothetical reaction chain: A is converted to B by enzyme E1, and B is converted to C by enzyme E2; the enzyme network then contains the link E1 - E2. Assumption: we ignore link directions.
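The link rule can be sketched as follows, assuming each reaction is given as a hypothetical triple (enzyme, substrates, products). Two enzymes are linked when a product of one reaction is a substrate of the other; directions are ignored, as on the slide. The data layout and function name are assumptions of this sketch.

```python
from itertools import combinations

# Hypothetical reaction chain: A -> B (enzyme E1), B -> C (enzyme E2), C -> D (enzyme E3).
reactions = [
    ("E1", {"A"}, {"B"}),
    ("E2", {"B"}, {"C"}),
    ("E3", {"C"}, {"D"}),
]


def enzyme_links(reactions):
    """Return undirected enzyme-enzyme links for successive reaction steps."""
    links = set()
    for (e1, subs1, prods1), (e2, subs2, prods2) in combinations(reactions, 2):
        if e1 == e2:
            continue  # the same enzyme catalyzing two reactions gives no self-link
        # Successive steps: a product of one reaction is a substrate of the other.
        if prods1 & subs2 or prods2 & subs1:
            links.add(frozenset((e1, e2)))
    return links


links = enzyme_links(reactions)
```

In this toy chain E1 and E3 are not linked, because they share no intermediate compound.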
Methodology Details: How to define importance or topological properties? Several network indices: Degree or connectivity, Closeness centrality, Betweenness centrality.
Methodology Details: Degree, D: the number of direct neighbors of a node. Figure: example network with nodes of degree D=1, D=2 and D=4.
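A minimal sketch of the degree calculation on an undirected link list. The toy network is hypothetical and was chosen only so that degrees 1, 2 and 4 all occur.

```python
from collections import defaultdict


def degrees(links):
    """Degree of each node = number of distinct direct neighbors."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}


# Hypothetical toy network: one hub with four neighbors, two of which are also linked.
toy_links = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
deg = degrees(toy_links)
```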
Methodology Details: Closeness centrality, C (a simplified version). C_i measures how short the shortest paths are from node i to all other nodes j (i ≠ j); d_ij is the length of the shortest path between nodes i and j in the network, and N is the number of nodes.
Methodology Details: Betweenness centrality, B. B_i measures how often node i occurs on the shortest paths between pairs of other nodes j and k (i ≠ j, k); g_jk is the number of equally short shortest paths between nodes j and k, g_jk(i) is the number of these shortest paths on which node i lies, and N is the number of nodes.
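The slide's own formula is not reproduced in this text, so the sketch below uses one common normalized form, C_i = (N - 1) / sum over j of d_ij, as an assumption. The "simplified version" mentioned on the slide may differ, for example by using the mean distance itself, in which case smaller values would mean more central nodes. The sketch also assumes a connected, unweighted, undirected network.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj):
    """Assumed form: C_i = (N - 1) / sum_j d_ij, for a connected network."""
    n = len(adj)
    result = {}
    for i in adj:
        dist = shortest_path_lengths(adj, i)
        total = sum(d for j, d in dist.items() if j != i)
        result[i] = (n - 1) / total
    return result


# Hypothetical toy network: a simple chain E1 - E2 - E3 - E4.
adj = {"E1": {"E2"}, "E2": {"E1", "E3"}, "E3": {"E2", "E4"}, "E4": {"E3"}}
c = closeness(adj)
```

In the chain, the two inner enzymes get a higher value than the two end enzymes, as expected for this definition.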
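As with closeness, the slide's formula is not reproduced here. The sketch below assumes the widely used normalized form B_i = 2 * (sum over pairs j < k of g_jk(i) / g_jk) / ((N - 1)(N - 2)), and computes it with Brandes' accumulation scheme on an unweighted, undirected network. Whether the authors used this normalization or this algorithm is not stated in the source.

```python
from collections import deque


def betweenness(adj):
    """Normalized betweenness for an undirected, unweighted network (needs N >= 3)."""
    raw = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording each node's predecessors on those paths.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Walk back from the farthest nodes, accumulating path fractions.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                raw[w] += delta[w]
    n = len(adj)
    # Every unordered pair (j, k) was visited from both ends, so raw is twice the
    # pair sum; dividing by (N - 1)(N - 2) therefore gives the assumed normalization.
    return {v: value / ((n - 1) * (n - 2)) for v, value in raw.items()}


# Hypothetical toy networks: a star (hub plus three leaves) and a four-node chain.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
chain = {"E1": {"E2"}, "E2": {"E1", "E3"}, "E3": {"E2", "E4"}, "E4": {"E3"}}
b_star = betweenness(star)
b_chain = betweenness(chain)
```

The hub of the star lies on every shortest path between leaves and so reaches the maximum value, while the leaves score zero.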
Methodology Details: Rank enzymes according to their topological importance values and their frequency of occurrence, then apply a rank correlation test.
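The slides do not say which rank correlation was used. The sketch below assumes Spearman's coefficient, computed as the Pearson correlation of average ranks so that ties are handled. It returns only the coefficient, not a p-value, and all function names are illustrative.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman(frequencies, betweenness_values)` would take two equally long lists ordered by enzyme; both argument names are placeholders.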
Results Frequency of occurrence: We searched the KEGG website for enzymes involved in glycolysis for 228 bacterial species.
Results Enzyme network for glycolysis: 33 nodes (N_enzyme), 61 links (L_enzyme)
Results Enzyme Network: Topological Importance: Degree
Results Enzyme Network: Topological Importance: Closeness Centrality
Results Enzyme Network: Topological Importance: Betweenness Centrality
Results Enzyme Network: Topological Importance: Comparison with random networks. Can those topological properties tell us about the structure of the glycolysis enzyme network, i.e. is it different from random networks? We expect the importance values to be more homogeneous in a random network. For the enzyme network, the means and variances of the above three properties are calculated. We then construct 1000 random networks of the same size as our enzyme network and, for each of them, calculate the means and variances of the same three properties. One then asks how well the random networks can reproduce the means and variances of the three topological properties of our enzyme network.
Results Enzyme Network: Topological Importance: Comparison with random networks. Degree cannot discriminate our enzyme network from random ones, while both centrality properties can! Figure panels: Degree; Betweenness centrality.
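The random-network model is not specified on the slide. The sketch below assumes the simplest choice: networks with the same number of nodes (33) and links (61) as the glycolysis enzyme network, with links placed uniformly at random. It is shown only for degree, because such random networks can be disconnected and the centrality indices would then need extra care.

```python
import random
from itertools import combinations
from statistics import mean, pvariance


def random_network(n_nodes, n_links, rng):
    """Pick n_links distinct undirected links uniformly at random (assumed null model)."""
    all_pairs = list(combinations(range(n_nodes), 2))
    return rng.sample(all_pairs, n_links)


def degree_sequence(n_nodes, links):
    """Degree of every node, including isolated nodes with degree 0."""
    deg = [0] * n_nodes
    for a, b in links:
        deg[a] += 1
        deg[b] += 1
    return deg


def degree_variance_null(n_nodes, n_links, n_random=1000, seed=0):
    """Degree variance of each random network, to compare with the observed value."""
    rng = random.Random(seed)
    return [
        pvariance(degree_sequence(n_nodes, random_network(n_nodes, n_links, rng)))
        for _ in range(n_random)
    ]


null_variances = degree_variance_null(33, 61)
```

Under this null model every random network has the same mean degree, 2L/N, so only the variance of degree can differ from the observed network. The fraction of `null_variances` at or above the observed variance gives a simple empirical comparison.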
Results Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values. Frequency of occurrence vs Degree: r_rank = 0.14, p = 0.42
Results Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values. Frequency of occurrence vs Closeness centrality: r_rank = -0.56, p = 0.0009
Results Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values. Frequency of occurrence vs Betweenness centrality: r_rank = 0.63, p = 0.0003
What have we learned so far? Degree (connectivity) is not very useful, as it only considers very local information. In the enzyme network, other, semi-global measures of topological importance seem to correlate with the frequency of occurrence. Our conclusions are only from a topological point of view, and hold at least for glycolysis and 228 bacterial species. We are aware of the simplicity of our models and assumptions. We need to look at other metabolic pathways, and ultimately the whole metabolic network.
The carbohydrate enzyme network From glycolysis to carbohydrate metabolism: 339 enzymes, 1106 interactions. Blue nodes are those involved in glycolysis.
From glycolysis to carbohydrate metabolism As in the glycolysis results, closeness and betweenness centralities are good topological properties that tell us our enzyme network is far from random. Furthermore, degree can now do the same! Figure panel: Degree.
From glycolysis to carbohydrate metabolism Rank correlations between frequency of occurrence and topological properties
From glycolysis to carbohydrate metabolism In contrast to our glycolysis findings, degree here seems to be a good network property that correlates with the frequency of occurrence. Why is degree important? Probably because of preferential attachment during network evolution. Closeness and betweenness centralities still behave in the same manner as before, but to a lesser extent. Thus, the size or scale of the network might be influential.
The carbohydrate enzyme network From glycolysis to carbohydrate metabolism Looking at the glycolysis enzyme network from the perspective of carbohydrate metabolism. Glycolysis is no longer a closed system. We identify enzymes that are involved in glycolysis, and have their topological properties determined while treating glycolysis as a sub-network connected to the whole carbohydrate metabolism.
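The difference between treating glycolysis as a closed system and as an embedded sub-network can be illustrated with degree, the simplest of the three indices. In the sketch below the node names (G1 to G3 for glycolysis enzymes, X1 and X2 for other carbohydrate enzymes) and links are hypothetical. The same idea applies to closeness and betweenness: compute on the whole network, then read off the values for the glycolysis nodes only.

```python
def neighbours(links):
    """Adjacency sets for an undirected link list."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_isolated_vs_embedded(links, sub_nodes):
    """Degree of each sub-network node counted inside the sub-network only
    (isolated) and inside the whole network (embedded)."""
    whole = neighbours(links)
    isolated = {v: len(whole.get(v, set()) & sub_nodes) for v in sub_nodes}
    embedded = {v: len(whole.get(v, set())) for v in sub_nodes}
    return isolated, embedded


# Hypothetical toy network: a glycolysis chain G1 - G2 - G3 closed into a ring
# through two non-glycolysis enzymes X1 and X2.
whole_links = [("G1", "G2"), ("G2", "G3"), ("G3", "X1"), ("X1", "X2"), ("X2", "G1")]
glycolysis_nodes = {"G1", "G2", "G3"}
isolated, embedded = degree_isolated_vs_embedded(whole_links, glycolysis_nodes)
```

Here the end enzymes G1 and G3 look peripheral in the isolated view but are as well connected as G2 once the surrounding network is included, so rankings can change between the two views.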
From glycolysis to carbohydrate metabolism Rank correlations between frequency of occurrence and topological properties. Sub: glycolysis as a sub-network of the carbohydrate enzyme network.
From glycolysis to carbohydrate metabolism Thus, our results depend on whether the glycolysis enzyme network is treated as a closed/isolated system or as part of the larger carbohydrate enzyme network. One explanation is that an enzyme might appear in different pathways, so that distant enzymes are brought into the vicinity of each other in the network; or an enzyme might be topologically important in one pathway, but less so in others. Should the same enzyme appearing in different pathways be treated as the same node? Further investigation is required.
Conclusion Topological properties might play a role in the conservation of different enzymes in different bacterial species. Betweenness centrality is probably an important property! It might identify enzymes that occupy network positions such that metabolites can be converted from one to another efficiently, and also identify redundant enzymes (by-passes) that occupy more central positions in the network. Because of the size/scale issue, we need to expand our enzyme network to the whole metabolic network. Further investigations are required to examine the relationship between larger networks and their components (i.e. sub-networks).