Bioinformatics and Evolutionary Genomics: Pathway evolution
What is a pathway?
- An ordered set of proteins and substrates (boundaries)
- A graph
- A system (systems biology; includes a notion of function and regulation)
- A set of proteins that “do something together” (includes complexes, regulatory and signalling pathways), a.k.a. a functional module
- A set of proteins that are co-regulated, or behave similarly in evolution
Tracing the evolution of NADH:ubiquinone oxidoreductase (Complex I of oxidative phosphorylation), from 14 subunits (Bacteria) to 46 subunits (Mammals), by comparative genome analysis
Subunit counts: Bacteria: 14; Plants: 30; Algae: 30; Fungi: 37; Mammals: 46
Distribution of Complex I subunits among model species (Fungi, Mammals, Arthropods, Plants/Algae). Red: identified at the protein level (experimentally); yellow: identified at the gene level.
Distribution of Complex I subunits among model species (Mammals, Insects, Fungi, Plants/Algae). Red: identified at the protein level (experimentally); yellow: identified at the gene level; white: identified at the DNA level.
Reconstructing Complex I evolution by mapping the variation onto a phylogenetic tree.
• After an initial “surge” in complexity (from 14 to 35 subunits in early eukaryotic evolution), new subunits have been gradually added and incidentally lost.
• Complex I loss is not always “complete”: S. cerevisiae and S. pombe have retained 1 and 3 proteins, respectively.
• Six of the eukaryotic Complex I proteins have been “recruited” from the alpha-proteobacteria.
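Mapping a presence/absence pattern onto a tree can be sketched in a few lines. The example below is a minimal illustration using Dollo parsimony (a subunit is gained once and can only be lost afterwards); this is an assumed simplification, not necessarily the reconstruction method used for Complex I. The tree topology, the species names and the function names `leaves` and `dollo_events` are all hypothetical.

```python
# Toy species tree as nested tuples; leaves are species names.
# The topology is illustrative only, not a claim about real phylogeny.
TREE = ((("mammals", "insects"), "fungi"), ("plants", "algae"))


def leaves(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def dollo_events(tree, present):
    """Infer a single gain and the minimal set of losses for one subunit.

    Dollo assumption: the subunit arose once, at the last common ancestor
    of all species that have it, and was lost in every clade below that
    ancestor in which no species has it.
    """
    present = set(present)
    if not present:
        raise ValueError("the subunit must be present in at least one species")

    # Descend to the smallest subtree containing every species with the subunit.
    node = tree
    while not isinstance(node, str):
        holders = [child for child in node if leaves(child) & present]
        if len(holders) != 1:
            break
        node = holders[0]
    gain = node

    losses = []

    def walk(n):
        if not (leaves(n) & present):
            losses.append(n)  # the whole clade lacks the subunit: one loss
            return
        if not isinstance(n, str):
            for child in n:
                walk(child)

    walk(gain)
    return gain, losses


# Hypothetical subunit found in mammals and fungi only.
gain, losses = dollo_events(TREE, {"mammals", "fungi"})
print("gained at ancestor of:", sorted(leaves(gain)))
print("lost in:", losses)
```

Running this per subunit and counting gains and losses per branch gives the kind of "added and incidentally lost" picture described above.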
Beyond BLASTology and COGology: phylogenies for orthology prediction
The Complex I assembly protein CIA30 has been duplicated in the Fungi. This can explain the presence of a CIA30 homolog in S. pombe, which lacks Complex I.
In the eukaryotic evolution of Complex I, new subunits have been added “all over” the complex (Gabaldón et al., J. Mol. Biol. 2005).
The eukaryotic evolution of Complex I, in which individual subunits have been added to a growing complex, contrasts with its prokaryotic evolution, in which separate multi-protein complexes appear to have been assembled (T. Friedrich). An explanation for this contrast is the “operon” genome organization of prokaryotes, which facilitates the duplication of sets of interacting proteins.
Is this variation in subunits the exception or the rule for functional modules?
• ribose phosphate metabolism: not cohesive at all
• peptidoglycan biosynthesis pathway: highly cohesive, but far from perfect
Very few functional modules are perfect; cohesiveness is limited (functional units vs. evolutionary units).
Non-orthologous gene displacement / analogous proteins
Not specific to the “genome” age, but research into this topic has increased dramatically with the availability of complete genomes (people would encounter “missing links” and start hypothesizing about what could fill the gap).
First systematic analysis on M. genitalium (Koonin et al., Trends Genet. 1997)
The opposite of co-occurrence: anti-correlation / complementary patterns: predicting analogous enzymes
Genes with complementary phylogenetic profiles tend to have a similar biochemical function.
[Figure: complementary phylogenetic profiles of two genes, A and B]
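As a rough illustration of what “complementary” means, the sketch below scores two phylogenetic profiles by the fraction of genomes in which exactly one of the two genes is present. The scoring rule, the function name `complementarity` and the example profiles are assumptions made for illustration; they are not taken from a published method.

```python
def complementarity(profile_a, profile_b):
    """Fraction of genomes in which exactly one of the two genes is present.

    Profiles are equal-length sequences of 0/1 (absent/present), one entry
    per genome, in the same genome order. A score of 1.0 means the patterns
    are perfectly complementary; 0.0 means the genes always co-occur or are
    always both absent.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    if not profile_a:
        raise ValueError("profiles must not be empty")
    exclusive = sum(a != b for a, b in zip(profile_a, profile_b))
    return exclusive / len(profile_a)


# Hypothetical presence/absence profiles over six genomes.
gene_x = [1, 1, 0, 0, 1, 0]
gene_y = [0, 0, 1, 1, 0, 1]   # present exactly where gene_x is absent
gene_z = [1, 1, 0, 0, 1, 1]   # mostly co-occurs with gene_x

print(complementarity(gene_x, gene_y))
print(complementarity(gene_x, gene_z))
```

A high score alone is weak evidence: two unrelated genes with patchy distributions can also look complementary, so in practice one would also want the rest of the pathway to be present in both sets of genomes.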
Complementary patterns in thiamin biosynthesis predict analogous enzymes
(Recent) gene duplication
• Fate after duplication: neofunctionalization or subfunctionalization
• GO process / molecular function / cellular component
• Substrate vs. catalytic site / mechanism
subfunctionalization: example in terms of protein complexes (=GO cellular component)
neofunctionalization: example in terms of protein complexes (=GO cellular component)
Sub- vs. neofunctionalization in a regulatory context: old view vs. new view (Moore and Purugganan 2005)
An example of a metabolic Pathway: Histidine Metabolism (including biosynthesis) in KEGG
Pathway evolution:
• How to evolve a complex thing when the intermediates don't make sense? See the discussion regarding the evolution of the eye.
• Pathway evolution occurs at two levels:
  • deciding which substrate will be turned into which product
  • getting the proteins to catalyze the required reactions
Model of Horowitz (1945): “Retrograde evolution” (back propagation by gene duplication within the pathway)
• Given a good “soup”, first evolve the enzyme for the last step of the pathway (the other intermediates are in the soup).
• Secondly, as the substrate of the last step is the product of the preceding step, the two enzymes need similar binding sites: duplicate the gene encoding the last step to evolve the last-minus-one step.
• Iterate step 2.
[Figure: Horowitz model of pathway evolution. Enzymes, substrates and the end product, with successive gene duplications along a time axis.]
We have no time machine, but we do have data: we can test whether homologous proteins tend to cluster in pathways. Some pathways do display such clustering, e.g. tryptophan and histidine biosynthesis contain subsequent steps catalyzed by homologous proteins.
Homologous proteins are overrepresented at short distances within pathways, supporting the Horowitz model.
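The kind of test meant here can be sketched as follows, for the simplest case of a linear pathway: count, for each step distance, how many enzyme pairs belong to the same homologous family. The function name `homolog_pairs_by_distance` and the family labels are hypothetical, and the pathway is invented; a real analysis would work on pathway graphs and compare against a randomized background.

```python
from itertools import combinations


def homolog_pairs_by_distance(families):
    """Map step distance -> (homologous pairs, all pairs) for a linear pathway.

    `families` lists the homologous family of the enzyme at each consecutive
    step. Two enzymes count as homologous when they share a family label.
    """
    counts = {}
    for (i, fam_i), (j, fam_j) in combinations(enumerate(families), 2):
        distance = j - i
        homologous, total = counts.get(distance, (0, 0))
        counts[distance] = (homologous + int(fam_i == fam_j), total + 1)
    return counts


# Hypothetical six-step pathway; F1..F4 are made-up family labels.
pathway = ["F1", "F1", "F2", "F3", "F3", "F4"]

for distance, (homologous, total) in sorted(homolog_pairs_by_distance(pathway).items()):
    print(f"distance {distance}: {homologous}/{total} pairs homologous")
```

If homologs were scattered at random over the pathway, the homologous fraction would be roughly flat across distances; an excess at distance 1 is the signal the Horowitz model predicts.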
Alternative theory of pathway evolution: Jensen, 1976: Enzyme recruitment in evolution of new function
• Primordial enzymes were multifunctional (“substrate ambiguity”).
• Ordered pathways evolved from these enzymes by gene duplication followed by specialization (recruitment).
How many proteins are really multifunctional?
• Example: finding the fructose-1,6-bisphosphatase in the Archaea
• Stec B, Yang H, Johnson KA, Chen L, Roberts MF. MJ0109 is an enzyme that is both an inositol monophosphatase and the 'missing' archaeal fructose-1,6-bisphosphatase. Nat Struct Biol. 2000 Nov;7(11):1046-50.
• A number of multifunctional enzymes are being discovered, but the question remains whether multifunctional enzymes played a larger role in early evolution.
Structural assignments and sequence comparisons were used to show that 213 domain families constitute approximately 90% of the enzymes in the small-molecule metabolic pathways. Catalytic or cofactor-binding properties between family members are often conserved, while recognition of the main substrate with change in catalytic mechanism is only observed in a few cases of consecutive enzymes in a pathway. Recruitment of domains across pathways is very common, but there is little regularity in the pattern of domains in metabolic pathways. This is analogous to a mosaic in which a stone of a certain colour is selected to fill a position in the picture. (Teichmann et al., 2001)
• Pathway evolution operates mainly by recruitment, not by Horowitz's retrograde evolution.
• Notice that this is not so surprising, given what we learned on day 2: substrate specificities are relatively volatile aspects of enzyme evolution, whereas catalytic function is much better conserved. The “conservation of substrate binding, evolution of catalytic function” argument is therefore not really what one encounters in present-day evolution.
• This does not necessarily support the Jensen theory of substrate ambiguity.
Pathway duplication: multiple functionally interacting proteins are co-duplicated and together take their place in a new pathway.
Pathway duplication at the protein level: homologous (sometimes identical) proteins are used to catalyze a chain of similar reactions
Propionate route: propionate -> [propionyl-CoA synthase; ATP + CoA in, AMP + PPi out] -> propionyl-CoA -> [2-methylcitrate synthase; H2O + oxaloacetate in, CoA out] -> 2-methylcitrate -> [aconitase + PrpD; H2O] -> 2-methylisocitrate -> [2-methylisocitrate lyase] -> succinate + pyruvate
Acetate route: acetate -> [acetyl-CoA synthase; ATP + CoA in, AMP + PPi out] -> acetyl-CoA -> [citrate synthase; H2O + oxaloacetate in, CoA out] -> citrate -> [aconitase; H2O] -> isocitrate -> [isocitrate lyase] -> succinate + glyoxylate
Pathway duplication between (methyl)citric acid metabolism and amino-acid biosynthesis (lysine, leucine)
• Lys20 is homologous to LeuA, not GltA
• HacAB is homologous to LeuCD and Acn
• PH1722 is homologous to Icd and LeuB
Methods: define paraCOGs
all COGs & NOGs -> align (Muscle) -> MSAs -> create HMM profiles (HHmake) -> HMMs -> all vs. all profile-profile searches (HHsearch) -> raw output -> assign homology
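The final “assign homology” step could look like the sketch below, which groups COGs whose pairwise profile-profile hits exceed a probability cutoff. Everything specific here is an assumption: the slide does not state the cutoff or the grouping rule, so the 95% threshold, the single-linkage grouping, the function name `assign_homology_groups` and the COG names are hypothetical, and the hits are assumed to be already parsed into (query, target, probability) tuples.

```python
def assign_homology_groups(hits, min_probability=95.0):
    """Group COGs connected by hits at or above a probability cutoff.

    `hits` is an iterable of (query, target, probability) tuples.
    Grouping is single linkage via union-find: if A~B and B~C both pass
    the cutoff, then A, B and C end up in one group.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, target, probability in hits:
        root_q, root_t = find(query), find(target)
        if probability >= min_probability and root_q != root_t:
            parent[root_t] = root_q

    groups = {}
    for cog in list(parent):
        groups.setdefault(find(cog), set()).add(cog)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical parsed hits; names and probabilities are made up.
example_hits = [
    ("COG_A", "COG_B", 99.2),
    ("COG_B", "COG_C", 97.0),
    ("COG_C", "COG_D", 40.0),   # below the cutoff: no link
    ("COG_D", "COG_E", 98.5),
]
print(assign_homology_groups(example_hits))
```

Single linkage is the most permissive choice and can chain distant families together through multi-domain proteins; a stricter rule (for example, requiring reciprocal hits) would give smaller groups.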
Methods: define functional modules
Functional module: primary building block of biomolecular systems, i.e. a metabolic or signalling pathway or a protein complex
STRING dataset (all COGs & NOGs) -> functionally linked COG pairs, recalculated for genomic context links only (npf) -> clustering (CFinder) -> ‘rough’ functional modules -> iterative module subclustering (CFinder) -> specific functional modules
Duplication of NqrDE/RnfAE occurred prior to module duplication
Reconstruction of the evolution of the NQR-RNF reductases
• NQR: redox-driven Na+ pump
• RNF: reductase of proteins involved in nitrogen fixation
• Subfunctionalization at the protein complex level
Pathway duplication is prevalent in signalling and transport pathways. (Caffrey DR, O'Neill LA, Shields DC. The evolution of the MAP kinase pathways: coduplication of interacting proteins leads to new signaling cascades. J Mol Evol. 1999 Nov;49(5):567-82.)
Pathway duplication in signalling pathways is:
1) Easy, because one does not have to change the substrate specificity
2) Hard, because one does not want too much crosstalk
Is it one duplication of the entire pathway, or stepwise duplication?