Topic 8. Lecture 12. Generalizations emerging from past evolution History unfolds in time, which makes chronology of past events crucial. However, any history also has a crucial timeless aspect, which can be described by generalizations.
An example of a generalization: complexity is rapidly lost if selection stops maintaining it.
A parasitic plant, Epifagus virginiana, lost many key genes in its chloroplast genome.
Mycobacterium leprae is in the middle of massive genome degeneration.
Astyanax mexicanus, like many other cave animals, has degenerated eyes.
A crustacean parasite of fish, Lernaea carassii, has profoundly simplified morphology.
In a sense, every feature of living beings, both modern and ancient, is a generalization about the evolution of their ancestors.
Let us view evolutionary generalizations from three complementary perspectives:
1. Generalizations concerned with evolution at a particular level of organization of life - i.e., with sequences, molecules, cells, organisms, populations, and ecosystems.
2. Generalizations concerned with evolution of the diversity of life. Such generalizations describe patterns in the diversity of life at one moment of time, as well as processes that generate such diversity, i.e., evolution of individual lineages, birth and death of lineages, independent and dependent evolution of different lineages, and evolution in space.
3. Generalizations concerned with evolution of complex adaptations, the most enigmatic aspects of evolution. These generalizations describe genotypical and phenotypical mechanisms of adaptive evolution, origin of novel adaptations, and dynamics of complexity.
Because we still lack a comprehensive theory of macroevolution, generalizations about past evolution are often all that we have.
Level-specific generalizations:
1. Sequences
   a) Mutation strongly affects sequence evolution, and selfish segments are common
   b) Functionally important segments and sites of genomes usually evolve slower
   c) Complex organisms have larger genomes, mostly due to noncoding sequences
2. Molecules
   a) Life possesses fundamental unity
   b) A particular function can be performed by very dissimilar molecules
   c) Rates of evolution vary across sites of a molecule and often change with time
3. Cells
   a) Networks within a cell are modular
   b) Networks within a cell consist of a small number of common motifs
4. Multicellular organisms
   a) Cell differentiation involves combinatorial regulation of gene expression
   b) In the development of vertebrates one stage is particularly conservative
   c) Body size often increases, but declines on islands
5. Populations
   a) Reproduction almost always involves unicellular channels
   b) Amphimixis is pervasive
6. Ecosystems
   a) Natural ecosystems can be successfully invaded
Generalizations concerned with diversity of life:
1. Diversity of life at a particular moment of time
   a) Every individual belongs to a population of at least ~1000 individuals
   b) At any moment, life mostly consists of compact, disconnected forms
   c) Genotypes are incompatible if the distance between them exceeds ~1-5%
2. Evolution of a lineage
   a) Changes of a lineage are continuous, with some caveats
   b) Genomes evolve at much more uniform rates than phenotypes
3. Birth and death of lineages
   a) Cladogenesis is often, but not always, triggered by geographic isolation
   b) Cladogenesis and extinction are extremely unfair processes
   c) Overall diversity of life fluctuates, with the long-term tendency to increase
4. Independent evolution in multiple lineages
   a) Evolution is predominantly divergent, but homoplasy is common in simple traits
   b) Independent evolution eventually leads to speciation
5. Coevolution
   a) Lineages often coevolve for a long time
   b) Organisms often imitate each other to avoid being eaten
6. Diversity in space
   a) Distributions of ranges of species are strongly affected by limited dispersal
   b) Independent evolution at different localities is often parallel
Generalizations concerned with adaptation and complexity:
1. Genetical aspects of adaptive evolution
   a) Evolution of both coding and non-coding sequences is important for adaptation
   b) The target for strong positive selection is narrow at each moment
   c) Tightly related genes can perform rather different functions
2. Phenotypic aspects of adaptive evolution
   a) Adaptations can be very general and very specific
   b) Evolution is irreversible
   c) Perhaps, all adaptations are imperfect
3. Origin of novelties
   a) New non-coding regulatory sites, but not new genes, often appear from scratch
   b) Origin of phenotypic novelties is usually opportunistic and can happen fast
4. Dynamics of complexity
   a) Complex phenotypes evolve through adaptive intermediate stages
   b) Complexity is rapidly lost, if selection stops maintaining it
   c) The overall trend is for complexity to increase
Level-specific generalizations: 1. Sequences The level of sequences is the simplest of all levels of organization of life. ACGATCGACGACGATCGATCGACGATCGA Green, blue, red: targets of no, negative, and positive selection. Evolution of sequences is understood relatively well. The two key factors of Darwinian evolution, mutation and selection, are its main forces. However, this is of little help for understanding evolution at higher levels. Whether genotypes drive evolution of phenotypes or it is the other way around is a classical chicken-and-egg problem.
1a) Mutation strongly affects sequence evolution, and selfish segments are common
This sweeping generalization has many facets. The three most important of them are:
i) Evolution of sequences proceeds through individual changes that are supplied by the mutation process, first of all by point mutations - single-nucleotide substitutions and short deletions and insertions.
Sister 1: caagccag---cgtctatcatatacgcagactcggctatttacgccacgatcagcat
Sister 2: catgccagcatcgtctagcatatacacagactc-gctatttacgtcacga-cagcat
Outgroup: catgccagcatcgtgtagcatataggcagactc-gctaattacgtcacgatcagtat
                 del.                     in.              del.
ii) Long new sequences have identifiable sources, instead of appearing from scratch. Tandem duplication is the simplest manifestation of this pattern.
acagcatcgtgactagctatcgagatca -> acagcatcgtgactagctatagctatcgagatca
iii) Different genome regions evolve at similar overall rates. This is another piece of theory-based evidence for evolution. Figure: human-mouse divergence at synonymous sites of genes on chromosomes 4 (left) and 22 (right).
Simple explanation: when selection does not care, mutation reigns.
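The "del." and "in." labels in the three-way alignment above follow from comparing each gap with the outgroup. Below is a minimal Python sketch of that reasoning. It assumes simple parsimony: the outgroup is taken to show the ancestral state. The function name `classify_indels` and the lineage labels are hypothetical, chosen only for illustration.

```python
from itertools import groupby

# Gap pattern (sister1, sister2, outgroup) -> inferred event.
# Assumption: the outgroup carries the ancestral state (simple parsimony).
EVENTS = {
    (True, False, False): ("deletion", "sister1"),
    (False, True, True): ("insertion", "sister1"),
    (False, True, False): ("deletion", "sister2"),
    (True, False, True): ("insertion", "sister2"),
}


def classify_indels(sister1, sister2, outgroup):
    """Return (event, lineage, 0-based start column, length) for each gap run."""
    if not (len(sister1) == len(sister2) == len(outgroup)):
        raise ValueError("aligned sequences must have equal length")
    patterns = [
        (a == "-", b == "-", c == "-")
        for a, b, c in zip(sister1, sister2, outgroup)
    ]
    events = []
    column = 0
    # Consecutive columns with the same gap pattern form one indel event.
    for pattern, run in groupby(patterns):
        length = len(list(run))
        if pattern in EVENTS:
            kind, lineage = EVENTS[pattern]
            events.append((kind, lineage, column, length))
        column += length
    return events


# The alignment shown on the slide.
SISTER_1 = "caagccag---cgtctatcatatacgcagactcggctatttacgccacgatcagcat"
SISTER_2 = "catgccagcatcgtctagcatatacacagactc-gctatttacgtcacga-cagcat"
OUTGROUP = "catgccagcatcgtgtagcatataggcagactc-gctaattacgtcacgatcagtat"

for event in classify_indels(SISTER_1, SISTER_2, OUTGROUP):
    print(event)
```

Without the outgroup, a gap in one sister could equally be a deletion there or an insertion in the other sister, which is why the third sequence is needed.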
One important special case of this generalization is that transposable elements (TEs) accumulate in many genomes. A mammalian genome is ~50% TEs, a Drosophila genome is ~10% TEs, and bacterial genomes usually contain very few TEs and other junk. In mammals, individual TEs are usually fixed, i.e. present in every genotype within a lineage. In Drosophila, an individual TE is usually rare. Figure panels: mammals, Drosophila. Often, TEs or their segments become domesticated, i.e. start performing some function for their host. Figure: the distribution of ages of TEs in the human genome. Age is measured by divergence from the consensus sequences and grouped into bins that correspond to 25My of divergence.
1b) Functionally important segments and sites of genomes usually evolve slower
A nucleotide substitution can kill, but at another location a substitution or even a removal of 1Mb of sequence has no evident impact on the phenotype. This sweeping generalization has many facets. The three most important of them are:
i) Non-synonymous sites of coding genes evolve slower than synonymous sites
ATG TCT GGG CGA GGT AAA GGT GGC AAG GGG CTG GGT AAG GGA GGC GCC AAG CGC CAC CGG
||| ||| ||  ||| ||  ||| ||  ||| ||  ||| ||  ||  ||  ||  ||| ||  ||  ||| ||  ||
ATG TCT GGA CGA GGC AAA GGC GGC AAA GGG CTC GGA AAA GGT GGC GCT AAA CGC CAT CGT
An extreme case: the first 20 codons of histone 4 genes from human and zebrafish genomes. On average, nonsynonymous substitutions accumulate ~10 times slower than synonymous substitutions.
ii) Functional non-coding segments evolve slower than junk segments
Alignment of four genome regions upstream of the transcription start of the apolipoprotein gene. The binding site of the key transcription factor (protein) is conserved (sequence motif) and highlighted.
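The histone 4 alignment above can be checked directly: every codon that differs between the two rows should still encode the same amino acid. The following Python sketch counts differing codons as synonymous or nonsynonymous using the standard genetic code. It counts codons, not per-site rates such as Ka and Ks, which would also require normalizing by the number of sites of each kind. The names `count_codon_differences`, `TOP_ROW` and `BOTTOM_ROW` are hypothetical; the slide does not say which row is human and which is zebrafish.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a space-separated string of codons into a protein string."""
    return "".join(CODON_TABLE[codon] for codon in seq.split())


def count_codon_differences(seq1, seq2):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    codons1, codons2 = seq1.split(), seq2.split()
    if len(codons1) != len(codons2):
        raise ValueError("sequences must contain the same number of codons")
    synonymous = nonsynonymous = 0
    for c1, c2 in zip(codons1, codons2):
        if c1 == c2:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# The two rows of the histone 4 alignment from the slide.
TOP_ROW = ("ATG TCT GGG CGA GGT AAA GGT GGC AAG GGG "
           "CTG GGT AAG GGA GGC GCC AAG CGC CAC CGG")
BOTTOM_ROW = ("ATG TCT GGA CGA GGC AAA GGC GGC AAA GGG "
              "CTC GGA AAA GGT GGC GCT AAA CGC CAT CGT")

print(translate(TOP_ROW))
print(count_codon_differences(TOP_ROW, BOTTOM_ROW))
```

All differences in this alignment fall at third codon positions, which is why the protein is unchanged despite many nucleotide substitutions.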
iii) Exons evolve slower than introns Coding exons evolve much slower than introns, and this pattern can be used to determine exon locations by genome comparisons. Alignment of human (top) and mouse (bottom) orthologous genes. Lines connecting the genomes show segments where their similarity is moderate (blue) or high (red). Red boxes below the alignment show predicted exons. Simple explanation: Negative (purifying) selection, which favors already-common variants and prevents changes, is much more common than positive (Darwinian) selection, which favors initially rare variants and promotes changes. One may wonder why beneficial mutations happen at all.
1c) Complex organisms have larger genomes, mostly due to noncoding sequences
Genomes of complex organisms carry only a slightly elevated number of protein-coding genes. In Drosophila, ~50% of its non-coding DNA is apparently doing something, and in mammals this fraction is ~10%.
Organisms                Minimal genome size   Number of genes   Maximal coding fraction
                         (millions)            (thousands)       (per cent)
parasitic bacteria       0.5-1.5               0.5-1.5           85
free-living bacteria     2.5-7.5               2.5-7.0           85
unicellular eukaryotes   10-30                 7-10              50-70
flowering plants         60-120                20-30             25-40
most of animals          100-200               15-25             15-20
fishes                   400-1000              20-30             5-10
birds                    1000-1500             20                2-3
mammals                  2500-3500             20                1.5-2
Simple explanation: Complex organisms need more text to describe themselves, and the extra text comes in the form of functional non-coding sequences (we do not really understand why). Also, complex organisms have "bloated", instead of "lean", genomes.
Level-specific generalizations: 2. Molecules Molecules are the lowest functional level. A molecule is a (relatively) small but fully functional entity, and each one is incredibly complex (protein folding remains a mystery). DNA as a functioning molecule. tRNA, a non-coding RNA. Hemoglobin, a protein. Why do we ignore evolution of things like this? Discontinuous, and its connection to the genotype is much more complex.
2a) Life possesses fundamental unity This unity is most striking as far as translation machinery is concerned - genetic code, components of ribosomes, tRNAs, aminoacyl tRNA synthetases, etc. Figures: 80S ribosome of Saccharomyces cerevisiae; 70S ribosome of Escherichia coli. Many proteins not involved in translation are also universal - almost 50% of E. coli proteins have homologs among human proteins. Simple explanation: Many key features of life are probably frozen accidents, impossible to modify. This allows us to learn something about LUCA.
2b) A particular function can be performed by very dissimilar molecules Despite the fundamental unity of life, there are some cases when the same function, either simple or complex, is performed by clearly non-homologous molecules, similar only to the extent dictated by this function. Inorganic pyrophosphatases comprise two non-homologous families, I and II. Archaeal and eukaryotic replicative DNA polymerases (families A and B) and bacterial replicative DNA polymerases (family C) perhaps are non-homologous. Simple explanation: Apparently, this is a general law of nature: every complex task can be performed more or less equally well in many rather different ways. Without common ancestry, each species would probably use its own way to hydrolyze pyrophosphate.
2c) Rates of evolution vary across sites of a molecule and often change with time In almost every RNA or protein molecule there are sites that evolve very conservatively and sites that evolve as fast as junk DNA (i.e., at the mutation rate) or even faster. A typical segment of an alignment of several orthologous proteins from different species.
Distribution of amino acid replacements along the Neisseria gonorrhoeae transmembrane porin sequence. Each dot represents one replacement. Obviously, sequence segments exposed outside the cell evolve much faster, probably due to positive selection. Simple explanation: a site can be under negative selection, no selection, or positive selection.
The rate of evolution within a molecule is not only heterogeneous across sites at any moment of time, but it also can change at a particular site while the molecule evolves. This occasionally includes the most drastic, qualitative changes - a nucleotide or amino acid replacement which was forbidden by selection may become permitted, or the other way around.
Hs   1 MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTK  60
Ag   1 MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVTTVAEKTK  60
Hs  61 EQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDP 120
Ag  61 EQVTSVGGAVVTGVTAVAQKTVEGAGNIAAATGFVKKDHSGKSEEGAPQEGILEDMPVDP 120
Hs 121 DNEAYEMPSEEGYQDYEPEA 140
Ag 121 DNEAYEMPSEEGYQDYEPEA 140
In humans, T at the 53rd site of the protein alpha-synuclein is pathogenic. However, in the spider monkey normal alpha-synuclein contains this T. Probably, some other deviation of spider monkey alpha-synuclein from its human ortholog renders T at the 53rd site harmless. Thus, we can call this T a compensated pathogenic deviation (CPD). As many as 10% of deviations of a non-human protein from its human ortholog would be deleterious, if placed into the human molecule individually.
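The search for a CPD in the alignment above amounts to asking whether the ortholog carries, as its normal state, a residue known to be pathogenic in humans. Below is a minimal Python sketch under stated assumptions: the two sequences are already aligned without gaps, the "Ag" row is read as the spider monkey sequence described in the text, and the only pathogenic variant supplied is T at site 53, the one the slide mentions. The function names and the `pathogenic` mapping are hypothetical.

```python
# Alpha-synuclein sequences from the slide (rows "Hs" and "Ag").
HUMAN = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTK"
    "EQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDP"
    "DNEAYEMPSEEGYQDYEPEA"
)
SPIDER_MONKEY = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVTTVAEKTK"
    "EQVTSVGGAVVTGVTAVAQKTVEGAGNIAAATGFVKKDHSGKSEEGAPQEGILEDMPVDP"
    "DNEAYEMPSEEGYQDYEPEA"
)


def differing_sites(seq_a, seq_b):
    """Return 1-based sites at which two gap-free aligned sequences differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def find_cpd_candidates(human, ortholog, pathogenic):
    """Find sites where the ortholog carries a human-pathogenic residue.

    `pathogenic` maps a 1-based site to the residue that is pathogenic
    in humans at that site (an illustrative input format).
    Returns (site, human residue, ortholog residue) tuples.
    """
    if len(human) != len(ortholog):
        raise ValueError("sequences must be aligned and of equal length")
    hits = []
    for site, bad_residue in sorted(pathogenic.items()):
        human_residue = human[site - 1]
        ortholog_residue = ortholog[site - 1]
        if human_residue != bad_residue and ortholog_residue == bad_residue:
            hits.append((site, human_residue, ortholog_residue))
    return hits


print(differing_sites(HUMAN, SPIDER_MONKEY))
print(find_cpd_candidates(HUMAN, SPIDER_MONKEY, {53: "T"}))
```

The other differing sites are the natural place to look for the compensating change, although the sketch does not attempt to identify which one it is.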
CPDs are very common in tRNAs. Three of them are present in mitochondrial tRNA-Ser of Ursus maritimus (polar bear). Nucleotides corresponding to human pathogenic mutations are shown in red; predicted compensatory substitutions are shown in blue; and other deviations from the human ortholog, those unrelated to the pathogenic mutations or their compensations, are shown in green. Nucleotides found in healthy humans are shown in orange alongside the nonhuman sequence. At least five mechanisms of compensation are known for pathogenic mutations that destroy a Watson-Crick pair in one of the four tRNA stems.
Level-specific generalizations: 3. Cells The cell is not the lowest functional level - the molecule is - but it is the first living level. Thus, in cells we encounter a staggering degree of complexity. Figures: the unicellular green alga Acetabularia, ~5cm tall; a ciliate, Stentor; a human hippocampal neuron. A cell contains a large number of functional units - promoters, mRNAs, ribosomes, and proteins. These units interact with each other, forming networks. Networks that describe the following 3 processes are particularly important: transcription of genes, physical interactions of proteins, and functional interactions of proteins.
Transcriptional regulatory network of Saccharomyces cerevisiae. Transcription factor genes are green, regulated genes are brown, and those with both functions are red.
Network of protein complexes in S. cerevisiae. Different functions are shown by colors. The gray edges connect complexes that share protein components. Exemplar complexes from each function are expanded to show individual proteins.
A standard map of biochemical pathways, representing the metabolic network of the cell. Networks of interacting units (produced by evolution!) are the essence of cells. But can we formulate any useful generalizations about them?
3a) Networks within a cell are modular Modularity of a network simply means that interactions between some components are tight and other interactions are loose. Complexes of physically interacting proteins are modules.
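The informal definition above (tight interactions within modules, loose ones between them) can be turned into a number. The sketch below uses Newman's modularity score Q, which is a swapped-in measure and not something the lecture itself specifies. The toy network of six proteins and the function name `modularity` are hypothetical.

```python
def modularity(edges, module_of):
    """Newman's modularity Q for an undirected network.

    edges: list of (node, node) pairs, each interaction listed once.
    module_of: dict mapping every node to its module label.
    Q = sum over modules of (edges inside / m) - (module degree / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("the network has no edges")
    inside = {}   # number of edges with both ends in the module
    degree = {}   # total number of edge ends falling in the module
    for a, b in edges:
        module_a, module_b = module_of[a], module_of[b]
        degree[module_a] = degree.get(module_a, 0) + 1
        degree[module_b] = degree.get(module_b, 0) + 1
        if module_a == module_b:
            inside[module_a] = inside.get(module_a, 0) + 1
    return sum(
        inside.get(module, 0) / m - (total / (2 * m)) ** 2
        for module, total in degree.items()
    )


# Toy example: two tightly connected triples joined by one loose link.
EDGES = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
TWO_MODULES = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
ONE_MODULE = {node: 1 for node in TWO_MODULES}

print(modularity(EDGES, TWO_MODULES))
print(modularity(EDGES, ONE_MODULE))
```

A partition that matches the real modules scores well above zero, while lumping everything into one module scores exactly zero.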
The genome of yeast Saccharomyces cerevisiae encodes ~7,000 proteins. Within the cell, they form ~700 different complexes of physically interacting proteins. Such complexes are modules, but this is not the whole story. Often, a protein can participate in several complexes. Some proteins always stick together and form "cores". Other proteins form "submodules" that can attach to different cores. A protein complex, consisting of the core, 3 submodules, and other attachments.
Modularity is also pervasive in transcription and metabolic networks. Transcription factors (boxes) separately regulate genes involved in different processes.
Modules, associated with different functions, in the metabolic network in Escherichia coli. Hierarchical organization of modularity in metabolic networks. Simple explanation: Well, nothing is going to be simple here! We only might assume that, perhaps, networks within cells are modular because such networks are evolvable and designable, and not because they are optimal.
3b) Networks within a cell consist of a small number of common motifs Network motifs are patterns of interconnections that recur in many different parts of a network. All networks within cells consist mostly of a small number of motifs that evolved independently. Much of the network of transcriptional interactions in Escherichia coli is composed of repeated appearances of three motifs. Each motif has a specific function in determining gene expression. Feedforward loop: a transcription factor X regulates a second transcription factor Y, and both jointly regulate one or more operons Z1...Zn. Example of a feedforward loop (L-arabinose utilization).
SIM motif: a single transcription factor, X, regulates a set of operons Z1...Zn. X is usually autoregulatory. All regulations are of the same sign. No other transcription factor regulates the operons. Example of a SIM system (arginine biosynthesis). DOR motif: a set of operons Z1...Zm are each regulated by a combination of a set of input transcription factors, X1...Xn. DORs are detected as dense regions of connections. Example of a DOR (stationary phase response).
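Of the three motifs described above, the feedforward loop is the simplest to search for in a list of regulatory interactions. The Python sketch below finds every triple (X, Y, Z) in which X regulates Y and both X and Y regulate Z. It ignores the sign of regulation and autoregulation, and the toy network uses placeholder names rather than real E. coli genes; the function name `find_feedforward_loops` is hypothetical.

```python
def find_feedforward_loops(edges):
    """Return sorted (X, Y, Z) triples with X -> Y, X -> Z and Y -> Z.

    edges: iterable of (regulator, target) pairs of a directed network.
    Self-regulation (X -> X) is ignored.
    """
    targets = {}
    for regulator, target in edges:
        targets.setdefault(regulator, set()).add(target)
    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue
            # Z must be regulated by both X and Y and differ from both.
            for z in targets.get(y, ()):
                if z != x and z != y and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


# Toy network: X and Y jointly regulate operons Z1 and Z2,
# while W regulates Z3 alone and so forms no feedforward loop.
NETWORK = [
    ("X", "Y"),
    ("X", "Z1"), ("Y", "Z1"),
    ("X", "Z2"), ("Y", "Z2"),
    ("X", "X"),          # autoregulation, ignored
    ("W", "Z3"),
]

print(find_feedforward_loops(NETWORK))
```

Counting how often such triples occur, compared with a randomized network with the same numbers of regulators and targets, is the usual way to argue that a pattern is a motif rather than a coincidence.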
The most common motif in metabolic networks that regulate enzyme activity: negative feedback loop. Again, such loops evolved independently very many times in different metabolic pathways. Simple explanation: Apparently, there are not too many feasible solutions for each of the simple regulatory tasks that a part of the network has to perform. We may be dealing with unique optimality here, as far as the overall structure of regulatory interactions is considered.
Level-specific generalizations: 4. Multicellular organisms A cell is alive, but often cells are not independent. Multicellular organisms evolved from unicellular ones five times. Multicellular organisms are at least as complex as their constituent cells. Obviously, cell differentiation, pattern formation, and overall properties of organisms are all essential. Figure panels: cell differentiation, pattern formation, overall phenotype.
4a) Cell differentiation involves combinatorial regulation of gene expression The genome of a multicellular organism programs development of many different cell types, although it contains only slightly more genes than the genome of a unicellular organism. Greater complexity of multicellular organisms appears because, on average, their genes are regulated by a much larger number of transcription factors. a, Simple eukaryotic transcriptional unit. A simple core promoter (TATA), upstream activator sequence (UAS) and silencer element. b, Complex metazoan transcriptional control modules consisting of multiple clustered enhancer modules interspersed with silencer and insulator elements. Simple explanation: Opportunism or optimality? - we do not know.
4b) In the development of vertebrates one stage is particularly conservative The embryonic development of all vertebrates shows remarkable similarities at the early - but not the earliest - stage called the pharyngula. At this stage all vertebrates have a notochord, a dorsal hollow nerve cord, a post-anal tail, and a series of paired branchial grooves, matched on the inside by a series of paired gill pouches. The pattern has been known since the 19th century, as Von Baer's law. Simple explanation: Perhaps, early stages of development are less evolvable, because their changes affect all subsequent stages.
4c) Body size often increases with time, but declines on islands The tendency of body size to increase with time is known as Cope's rule, and has been observed repeatedly. Larger animals are apparently more prone to extinction. Figure: body size plotted against time for species of Borophaginae (a clade of extinct carnivores). Clearly, this pattern cannot be universal! Indeed, there are many exceptions. In fact, on islands the body size of many - but again not all - organisms declines, a pattern known as Foster's Rule.
The Pygmy Mammoth (Mammuthus exilis) was a dwarfed descendant of full-sized mammoths that lived on an island known as Santa Rosae. Wrangel island - the range of another dwarfed mammoth, extinct only ~3,500ya. Skeleton of a Cretan Dwarf Elephant. Simple explanation: Clearly, this is a mess!
Level-specific generalizations: 5. Populations Here the complexity of our object drops again - a part may be more complex than the whole, if we can view parts as black boxes. Populations are sets of similar individuals, and we care only about those properties of individuals that describe them as members of such sets, without looking under the hood. Figure panels: organism, individual, population of individuals. Complexity of life peaks at cells and organisms - lower and upper levels are simpler.
5a) Reproduction almost always involves single-cell channels Why recreate big organisms from single cells, every generation? Some other examples of vegetative reproduction. Even when reproduction is nominally vegetative - for example, a branch of a moss becomes an independent organism - all the cells of this branch may originate from a single apical meristematic cell. Even mitochondria in the female germline of mammals go through drastic bottlenecks - all mitochondria of a newborn are descendants of just 3-4 stem maternal mitochondria. Why? Simple explanation: Perhaps, single-cell channels make selection more efficient, and thus reduce the mutation load.
5b) Amphimixis is pervasive Why is such a crazy process - alternation of syngamy and meiosis - ubiquitous? Indeed, apomixis (asexual reproduction) is very common, but almost never represents the only mode of reproduction. The only known exception is the bdelloid rotifers. Figure: an obligately apomictic bdelloid rotifer. Simple explanation: There is no definite explanation for the ubiquity of amphimixis. Almost 20 hypotheses have been proposed, and 3 or 4 among them make sense. We will consider this issue later.
Level-specific generalizations: 6. Ecosystems As you know, ecosystems consist of interacting populations. 6a) Natural ecosystems can be successfully invaded. Purple loosestrife, Lythrum salicaria, a very successful invader of European origin in North America. Elodea canadensis, a very successful invader of North American origin in Eurasia. An invasion presents a paradox: why should an invader be successful within a new environment, to which it never had a chance to adapt? Apparently, natural ecosystems have a lot of empty niches. Simple explanation: There are several hypotheses but none is universally accepted. Still, it is clear that the problem is an evolutionary one.
Generalizations concerned with diversity of life: As long as we are ready to ignore the complexity of life, evolution of its diversity is understood reasonably well. 1. Diversity of life at a particular moment of time 1a) Every individual belongs to a population of at least ~1000 individuals This fundamental fact is so familiar that it is often taken for granted - although it should not be. The Loch Ness monster does not exist - otherwise, there would have to be at least 1000 of them. The same is probably true for the yeti. Figure: mating ball of garter snakes. Simple explanation: Population genetic theory demonstrates that a small population will soon become extinct due to inefficient selection against new deleterious mutations. We will consider this theory later.
1b) At any moment, life mostly consists of compact, disconnected forms Indeed, at least among multicellular eukaryotes, we often encounter "good species", i.e. compact sets of similar and compatible organisms. Often, a form of life is not very compact phenotypically, but it is still compatible and connected within itself, and disconnected from other forms.
Figure: Aquilegia formosa and Aquilegia pubescens. Sometimes, two compatible phenotypes are connected by only a relatively small number of hybrids, so it is not clear whether to treat them all as one form of life or not. Simple explanation: There are probably several reasons behind this generalization: (i) species might be adapted to discontinuous ecological niches, (ii) anagenesis is only rarely coupled with continuous range expansion, and such expansion cannot go on for too long, because the Earth is too small. Occasionally, connection exists even between incompatible genotypes.
1c) Genotypes are incompatible if the distance between them exceeds ~3-10% Very often, incompatible genotypes are also disconnected (again, only within modern organisms). There are no living, fit intermediates between dog and cat, or horse and donkey. Nothing to show. Two incompatible, disconnected genotypes - the most common situation. Two partially compatible, disconnected genotypes (mules are viable, but sterile).
Still, compatible genotypes (left and right) may be disconnected, due to geographical isolation (hybrid in the center was produced artificially). Simple explanation: There is no need to explain why incompatibility increases with dissimilarity. However, a likely reason for a rapid transition to incompatibility is nontrivial and is known as Orr's snowball. Occasionally, even incompatible genotypes remain connected.
Generalizations concerned with diversity of life: 2. Evolution of a lineage 2a) Changes of a lineage are continuous, with some caveats Children are similar to parents. A rare exception: WGD (whole-genome duplication). A rare exception: symbiogenesis. Simple explanation: With some exceptions, long parent-offspring leaps within the space of genotypes are just impossible: most potential genotypes are junk, and a long leap will land you in junk.
2b) Genomes evolve at much more uniform rates than phenotypes At the level of sequences, different lineages can easily accumulate changes at rates that vary within a factor of 1.5-2.0, but variation of rates is rarely larger. Figure: lengths of dog, mouse, and human branches of the unrooted phylogenetic tree, in numbers of nucleotide substitutions per synonymous site (Ks) and per nonsynonymous site (Ka). Sequences evolved almost 3 times faster on the mouse branch than on the human branch - because the number of generations was much higher in the mouse branch.
In contrast, phenotypes occasionally evolve at very different rates along different branches. Simple explanation: No law of nature prescribes a constant rate of genome evolution. Thus, its approximate uniformity is something of a mystery. We will consider this issue later. Heterogeneity in rates of phenotypical evolution must be, at least partially, due to heterogeneity of strength of Darwinian natural selection.
Generalizations concerned with diversity of life: 3. Birth and death of lineages 3a) Cladogenesis is often, but not always, triggered by geographic isolation Geographical isolation always leads to unlimited divergence. However, this is not the whole story.
A lineage can also split into two even without geographic subdivision. This process is called sympatric speciation. For example, in the crater lake Apoyo in Nicaragua a new species of cichlids evolved sympatrically in the course of ~10,000 years. Amphilophus citrinellus (left) is the ancestral form and A. zaliosus (right) is a new species. Simple explanation: If a lineage is subdivided into two isolated parts, these parts are bound to evolve independently and, eventually, will become very dissimilar, disconnected and incompatible - because evolution is primarily divergent. This is trivial. In contrast, cladogenesis without prior geographical isolation is a complex and fascinating subject, to be considered later.
3b) Cladogenesis and extinction are extremely unfair processes We already saw this many times. Simple explanation: Why should they be fair? Is life fair? Specific reasons for unfairness, however, are not clear, and may be 1) "Key innovations", 2) Ecological opportunities, 3) Chance. Questions to think about: Can we say that a clade which diversifies faster has a selective advantage over a clade which diversifies slower? Is it true that species from a more diverse clade are more advanced (derived)? Is Amborella a living fossil?