Explore the intricate world of bacterial DNA mutation and evolution. Dive into the forces of natural selection and random mutations that shape genetic instructions over generations. Discover how adaptations, neutral mutations, and evolutionary conservation play essential roles in the survival and reproduction of organisms. Gain insights into the complex dynamics of DNA changes and the impact on evolutionary fitness. Unravel the mysteries of mutations and their role in the genetic evolution of species.
Mutation and Evolution • DNA carries the genetic instructions forward in time. You might consider life from DNA’s point of view: it goes through many generations of individual organisms, slowly changing, occasionally having a major shift, and sometimes dying out. • DNA is affected by two forces: changes caused by random mutations of several kinds, and natural selection. To paraphrase someone*’s quote: Random mutation proposes new DNA, and natural selection disposes of unsuccessful ideas. • *Ludovico Ariosto (Italian poet, 1474-1533): “Man proposes and God disposes”
Evolution by Natural Selection • The basic principle of natural selection is very simple: an organism that survives and reproduces better than other organisms increases its share of the next generation. • This is especially clear from the DNA point of view: there will be more copies of DNA that causes its host organisms to produce more offspring. • It is important to realize that this is a long term thing: evolutionary success requires each generation to produce more offspring (on the average). More concretely: if you have 10 children and none of them reproduce, your evolutionary fitness is zero because your DNA stops moving forward in time with your children. • Evolutionary fitness: the ability to survive and reproduce • Malthusian principle: there’s not enough room for all descendants. Resources are always limited, which implies that some individuals die without reproducing. Survival is affected by how well an individual is adapted to the environment, that is, how well the individual’s genes match the needs and challenges imposed by the environment it lives in. • Adaptations are specific traits that help the organism function in its environment • What works well under one set of conditions may not work when conditions change.
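The "share of the next generation" idea can be sketched numerically. This is a minimal illustration, not a population-genetics model: it assumes two DNA variants, A and B, in a very large population with no mutation or chance effects, and the function names and offspring numbers are made up for the example.

```python
def next_generation_share(share_a, offspring_a, offspring_b):
    """Share of variant A after one generation.

    share_a: current fraction of the population carrying A (0..1).
    offspring_a, offspring_b: average surviving offspring per carrier
    (hypothetical numbers chosen by the caller).
    """
    a = share_a * offspring_a
    b = (1.0 - share_a) * offspring_b
    return a / (a + b)


def shares_over_time(share_a, offspring_a, offspring_b, generations):
    """Follow the share of A for a number of generations."""
    history = [share_a]
    for _ in range(generations):
        share_a = next_generation_share(share_a, offspring_a, offspring_b)
        history.append(share_a)
    return history


if __name__ == "__main__":
    # A starts rare (1%) with a 5% reproductive advantage over B.
    history = shares_over_time(0.01, 1.05, 1.0, 200)
    for generation in range(0, 201, 50):
        print(generation, round(history[generation], 3))
```

Under these assumptions even a small advantage compounds: the share of A rises every generation, while a variant whose carriers leave no offspring drops to zero in a single step.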
Neutral Mutations • Many mutations have very little or no effect on the organism • mutations in intergenic regions • synonymous mutations: affect the gene’s DNA but not the amino acid sequence • amino acid changes in non-critical regions of the protein • Neutral mutations: do not affect the evolutionary fitness of the organism. • Many genetic polymorphisms (small differences between different strains of the same species) are (probably) neutral. Polymorphisms are very useful for strain identification and for genetic mapping • The concept of pre-adaptive mutations: DNA variations that are neutral, or have only a small effect on fitness under “normal” conditions, suddenly become very useful when conditions change. • Mutations with a strong negative effect on fitness are quickly weeded out: the organism can’t survive or reproduce. But neutral (or “nearly neutral”) mutations can stay in a population for a long time.
Evolutionary Conservation • Most of gene annotation is a matter of evolutionary reasoning: • Gene A in a new species probably has the same function as the similar gene B in another species because both species need to solve the same problem and they are related by evolutionary descent from a common ancestor. • We look at gene sequences and other features and make a decision that the differences we see are not important, and therefore assign the same function to both genes • Basic principles: what is being selected is function: how well do the genes work in the organism as it lives its life. DNA sequences are conserved to the degree that changes in them affect function. Most function is based on how well enzymes and other proteins do their job. • Protein sequence is more conserved than DNA sequence. Thus most of our sequence homology searches are conducted with protein sequences. • Three-dimensional shape, the key to enzyme function, is conserved better than protein sequence. It is quite possible to produce the same structure with completely different amino acids. • Unfortunately, it is very difficult to search 3-D structures, mainly because there is no good way to determine how an amino acid sequence will fold up. This is the “protein folding problem”, one of the major unsolved problems in bioinformatics.
More on Conservation • genes are more conserved than intergenic regions. Being very loose here, a “gene” can be considered any part of the DNA that is transcribed • however, there are some functional regions in DNA that are not transcribed but which are conserved: the origin of replication and gene control regions, for example. • Protein-coding portions of genes are conserved more than untranslated regions • the middles of proteins are conserved more than the ends. It can be hard to pinpoint the translation start of a gene because it is not well conserved between species • the amino acids that make up the active site of the enzyme are the most conserved of all, often being identical across large evolutionary distances
Mutations • Any change in the DNA sequence of an organism is a mutation. • Mutations are the source of the altered versions of genes that provide the raw material for evolution. • A central tenet of biology is that the flow of information from DNA to protein is one way. DNA cannot be altered in a directed way by changing the environment. Only random DNA changes occur. • Some terminology: the genotype is the organism’s genetic constitution, at the bottom, the sequence of its DNA. The phenotype is the physical characteristics of the organism: its appearance, biochemistry, reactions to the environment, etc. • before DNA sequencing, the genotype was deduced from the phenotypes of parents and offspring. • the point of genome annotation is to deduce the phenotype that will result from a given genotype. • Most mutations have no effect on the organism, especially among the eukaryotes, because a large portion of the DNA is not in genes and thus does not affect the organism’s phenotype. • Of the mutations that do affect the phenotype, the most common effect of mutations is lethality, because most genes are necessary for life.
Base Change Mutations • The simplest mutations are base changes, where one base is converted to another. (Also called “substitutions”, or “point mutations”.) These can be classified as either: • “transitions”, where one purine is changed to another purine (A -> G, for example), or one pyrimidine is changed to another pyrimidine (T -> C, for example). • “transversions”, where a purine is substituted for a pyrimidine, or a pyrimidine is substituted for a purine. For example, A -> C. • Transitions are more common than transversions, because they are easier to create, and because transitions often have less drastic effects than transversions. • Base change mutations are the cause of single nucleotide polymorphisms (SNPs). Mapping SNPs is the current best way to locate human disease genes. • Base change mutations are the most common mutations, and they are the easiest to handle for statistics and evolutionary studies.
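The transition/transversion rule above is easy to express in code. The following is a small sketch for DNA bases only; the function name `classify_base_change` is made up for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_base_change(old, new):
    """Classify a single DNA base change as a transition or a transversion."""
    old, new = old.upper(), new.upper()
    valid = PURINES | PYRIMIDINES
    if old not in valid or new not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if old == new:
        return "no change"
    # Same chemical class (purine -> purine or pyrimidine -> pyrimidine)
    # is a transition; crossing classes is a transversion.
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for old, new in [("A", "G"), ("T", "C"), ("A", "C")]:
        print(old, "->", new, classify_base_change(old, new))
```

Of the 12 possible base changes, only 4 are transitions and 8 are transversions, so the observation that transitions are more common is not what equal chances for every change would give.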
Base Change Causes • Base changes occur naturally as errors in replication: the wrong base gets inserted. • DNA polymerase has an editing function that detects most errors, then backs up, removes the wrong base and puts in the proper base. • enzymes that replicate RNA don’t have the editing function, so their error rate is 100 x that of DNA polymerase, causing the high mutation rate of RNA viruses. • Various chemical changes in a base can cause mutation. For instance, the spontaneous loss of the amino group on cytosine converts it to uracil (which will pair with A, not G). • environmental chemicals that attach bulky groups onto bases (alkylating agents) can cause the bases to be mis-read by DNA polymerase.
Phenotypic Effects of Base Changes • Mutations can be classified according to their effects on the protein (or mRNA) produced by the gene that is mutated. • 1. Silent mutations (synonymous mutations). Since the genetic code is degenerate, several codons produce the same amino acid. In particular, third base changes often have no effect on the amino acid sequence of the protein. These mutations affect the DNA but not the protein. They are therefore usually neutral mutations, which should have no effect on the organism’s phenotype. • 2. Missense mutations. Missense mutations substitute one amino acid for another. Some missense mutations have very large effects, while others have minimal or no effect. It depends on where the mutation occurs in the protein’s structure, and how big a change in the type of amino acid it is. • 3. Nonsense mutations convert an amino acid codon into a stop codon. The effect is to shorten the resulting protein. Sometimes this has only a little effect, as the ends of proteins are often relatively unimportant to function. However, often nonsense mutations result in completely non-functional proteins. • 4. Sense mutations are the opposite of nonsense mutations. Here, a stop codon is converted into an amino acid codon. Since DNA outside of protein-coding regions contains an average of 3 stop codons per 64, the translation process usually stops after producing a slightly longer protein. • Base changes can also affect RNA initiation, splicing and termination.
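The four categories above can be told apart by translating the codon before and after the change. The sketch below assumes the standard genetic code, written as DNA codons; the function name `classify_codon_change` is made up for this example, and it looks only at the protein level, so effects on RNA initiation, splicing or termination are not covered.

```python
# Standard genetic code, DNA codons in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(old_codon, new_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    old_aa = CODON_TABLE[old_codon.upper()]
    new_aa = CODON_TABLE[new_codon.upper()]
    if old_aa == new_aa:
        return "silent"      # same amino acid (or still a stop codon)
    if new_aa == "*":
        return "nonsense"    # amino acid codon became a stop codon
    if old_aa == "*":
        return "sense"       # stop codon became an amino acid codon
    return "missense"        # one amino acid replaced by another


if __name__ == "__main__":
    for old, new in [("GAA", "GAG"), ("GAG", "GTG"), ("TGG", "TGA"), ("TAA", "CAA")]:
        print(old, "->", new, classify_codon_change(old, new))
```

The table contains exactly 3 stop codons out of 64, which is the figure behind the "3 stop codons per 64" estimate for how soon translation stops after a sense mutation.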
More on Substitution • In addition to synonymous mutations, some amino acid changes are “conservative” in that they have little or no effect on the protein’s function. • for example, isoleucine and valine are both hydrophobic and readily substitute for each other. • other amino acid substitutions are very unlikely: leucine (hydrophobic) for aspartic acid (hydrophilic and charged). This would be a non-conservative substitution. • Some amino acids play unique roles: cysteines form disulfide bridges, prolines induce kinks in the chain, etc. • However, some amino acids are critical for active sites and cannot be substituted. • Tables of substitution frequencies for all pairs of amino acids have been generated. BLOSUM62 Table: numbers on the diagonal indicate the likelihood of the amino acid staying the same. The off-diagonal numbers are relative substitution frequencies.
Indels • Another simple type of mutation is the gain or loss of one or a few bases. These mutations are called indels, which is short for “insertion/deletion”. • When comparing two species it isn’t easy to tell whether an insertion occurred in one species or a deletion occurred in the other. • Indels are thought to be generated when the DNA polymerase slips forward or backward on the template DNA it is copying. • This occurs most easily in repeated sequences, but can occur anywhere. • A second cause of short indels is chemical- or radiation-induced loss of the base portion of the nucleotide. The DNA polymerase often skips right over these sugar/phosphate stumps, leaving a missing base in the resulting DNA chain.
Frameshifts and Reversions • Translation occurs codon by codon, examining nucleotides in groups of 3. If a nucleotide or two is added or removed, the grouping of the codons is altered. This is a frameshift mutation, where the reading frame of the ribosome is altered. • Frameshift mutations result in all amino acids downstream from the mutation site being completely different from wild type. These proteins are generally non-functional. • A reversion is a second mutation that reverses the effects of an initial mutation, bringing the phenotype back to wild type (or almost). • Frameshift mutations sometimes have “second site reversions”, where a second frameshift downstream from the first frameshift reverses the effect.
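A frameshift and a second site reversion can be illustrated on a toy gene. This is a sketch only: the 24-base sequence is invented for the example, the standard genetic code is assumed, and `translate` is a made-up helper that reads codons from the first base until a stop codon or the end of the sequence.

```python
# Standard genetic code, DNA codons in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate codon by codon from the first base; stop at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented wild-type gene: ATG GCC CGT CTC ACC GGC TCC TAA
wild_type = "ATGGCCCGTCTCACCGGCTCCTAA"

# Frameshift: insert one base after the start codon.
frameshift = wild_type[:3] + "A" + wild_type[3:]

# Second site reversion: delete one base a little further downstream,
# which restores the original reading frame from that point on.
revertant = frameshift[:10] + frameshift[11:]

if __name__ == "__main__":
    print("wild type :", translate(wild_type))   # MARLTGS
    print("frameshift:", translate(frameshift))  # MSPSHRLL
    print("revertant :", translate(revertant))   # MSPFTGS
```

In this toy case every amino acid after the insertion changes and the original stop codon is no longer read, while the revertant differs from wild type only in the three amino acids between the two mutation sites.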
DNA Replication • How DNA makes copies of itself. • Involves an enzyme: DNA polymerase. • In bacteria, replication starts at a single point, the origin of replication (ori) and proceeds in both directions around the circle, meeting on the opposite side. • The DNA double helix unwinds into 2 separate strands, and a new strand is built on each old one. Thus, each new DNA molecule consists of 1 old strand plus 1 new strand. This is called “semi-conservative” replication. • DNA polymerase makes the new strands, using the old strands as a template, with normal base pairing: A with T, and G with C. • The energy for this comes from the nucleotide precursors. They all have 3 phosphates on them, like ATP, and 2 of the phosphates are removed to make the DNA. • DNA polymerase always adds new bases to the 3’ end of the new DNA strand. This makes it necessary to synthesize one strand (the lagging strand) in short pieces, then join them together. • The pieces are called Okazaki fragments. They are synthesized starting with RNA primers that are degraded and replaced with DNA in the final product. • Joining DNA pieces is done with DNA ligase
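Semi-conservative replication with normal base pairing can be sketched with strings. This is a simplified model under stated assumptions: each strand is written 5’ to 3’, replication is error-free, and leading and lagging strand synthesis, primers and ligation are ignored. The names `new_strand` and `replicate` are made up for this example.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand(template):
    """Strand built on a template by base pairing (A with T, G with C).

    Both strands are written 5' to 3', so the new strand is the
    complement of the template read in the opposite direction.
    """
    return "".join(PAIRS[base] for base in reversed(template.upper()))


def replicate(duplex):
    """Return two daughter duplexes, each with 1 old and 1 new strand."""
    top, bottom = duplex
    daughter_1 = (top, new_strand(top))        # old top, new bottom
    daughter_2 = (new_strand(bottom), bottom)  # new top, old bottom
    return daughter_1, daughter_2


if __name__ == "__main__":
    parent = ("ATGCCGTA", new_strand("ATGCCGTA"))
    for daughter in replicate(parent):
        print(daughter)
```

Both daughters carry the same sequence as the parent, and each keeps exactly one of the parent’s two strands.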
Recombination • Recombination is the breaking and rejoining of 2 DNA molecules, usually at homologous regions (regions where the sequence is the same). • Also called crossing-over • you end up with a DNA molecule that has parts from 2 parental molecules • DNA metabolism in all organisms includes enzymes that catalyze recombination. • Recombination seems to be essential to long term survival: you can remove bad mutations, and you can combine several good ones together in the same organism. • In bacteria, DNA must be circular to replicate. • If a linear piece of DNA recombines with the circular chromosome, there must be 2 crossovers to exchange a part of the DNA and keep the chromosome circular • if 2 circles recombine, the result is a single larger circle. The smaller circle has become integrated into the larger circle.
Sources of New DNA • Bacteria reproduce by binary fission: replicating their DNA, then splitting in half. Each cell has only 1 parent, and there is no regular sexual process. • Horizontal gene transfer, bringing in DNA from another species, is quite common: estimated 15% of genes. • Bacteria have 3 main ways of bringing in new DNA: • conjugation: direct transfer of DNA between 2 cells (although not necessarily of the same species) • transduction: transfer of DNA between cells using a bacteriophage (virus) as an intermediate • transformation: the cell takes up DNA molecules from the environment
Mutation Caused by Recombination • Most recombination simply breaks and reattaches DNA sequences from 2 parents without changing them. • However, one possible outcome of the recombination event, “gene conversion” causes only a very short stretch of DNA to be altered: as if a very short region of the DNA from parent A is altered to be like parent B. • Recombination within a single DNA molecule can also occur, if the two regions of the DNA are similar • if the matching regions are inverted relative to each other, recombination inverts the area between them. • If the matching regions are in the same orientation, the whole region can be deleted. • Misalignment during recombination: unequal crossing over can cause genes to duplicate into tandem arrays. Very common in eukaryotes, but also happens in bacteria.
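The two outcomes of recombination within a single DNA molecule can be sketched as string operations. This is a rough model, not a description of the enzymatic mechanism: one strand is written 5’ to 3’, an inverted repeat is taken to be the reverse complement of the first copy, an inverted segment is replaced by its reverse complement, and a deletion leaves one copy of the repeat behind. The function names and sequences are invented for the example.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(PAIRS[base] for base in reversed(seq))


def delete_between_direct_repeats(seq, repeat):
    """Recombine two copies of `repeat` in the same orientation.

    The region between the copies is lost, along with one copy.
    """
    first = seq.find(repeat)
    second = seq.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("need two direct copies of the repeat")
    return seq[:first] + seq[second:]


def invert_between_inverted_repeats(seq, repeat):
    """Recombine `repeat` with a downstream inverted copy of itself.

    The region between the two copies is flipped (reverse complemented).
    """
    first = seq.find(repeat)
    inverted = reverse_complement(repeat)
    second = seq.find(inverted, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("need the repeat and a downstream inverted copy")
    middle_start = first + len(repeat)
    middle = seq[middle_start:second]
    return seq[:middle_start] + reverse_complement(middle) + seq[second:]


if __name__ == "__main__":
    repeat = "GATTACA"
    direct = "AA" + repeat + "CCCGGGTTT" + repeat + "TT"
    print(delete_between_direct_repeats(direct, repeat))
    inverted = "AA" + repeat + "CCGT" + reverse_complement(repeat) + "TT"
    print(invert_between_inverted_repeats(inverted, repeat))
```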
Transposable Elements • Transposable elements are DNA sequences that move from place to place in the genome. Unlike genes, transposable elements don’t have a fixed location on the chromosome. • Transposable elements are essentially parasites. In general they don’t contribute to the evolutionary fitness of the organism. • Most of the genes in an organism are necessary, at least under some circumstances, for the organism’s survival. Genes avoid being destroyed by random mutations because individuals with mutated genes are less fit: don’t survive or reproduce as well as unmutated individuals. • Transposable elements avoid being destroyed by increasing their numbers by enough to keep some functional copies present even if some are destroyed. • However, too much increase in numbers will kill the organism because sometimes transposable elements insert within a gene, inactivating it.
More Transposable Elements • Two basic types: those that are strictly DNA, and those that replicate through an RNA intermediate. • Most bacterial TEs are DNA only • Most common type: Insertion Sequences (IS) • roughly 1-3 kbp long, containing a transposase gene, and are bounded by short (10-40 bp) inverted repeats • many different families, not well conserved across species • Transposons are longer TEs, usually composed of 2 IS elements and a gene(s) in between, often an antibiotic resistance gene. • RNA transposable elements are called retrotransposons in eukaryotes. • In bacteria, the common RNA TE is a “group II intron”. • When transcribed into messenger RNA they can splice themselves out without the need for proteins • group II introns contain a gene for reverse transcriptase, which copies the RNA back into DNA at a new location in the genome.
Integrons • Recently discovered in Gram negative bacteria. Involved in the spread of antibiotic resistance. • Contain a gene for integrase, a recombination site called attI, a strong transcription promoter, and a set of gene “cassettes” that code for drug resistance. • the most common type also has a sulfonamide resistance gene (sulI) at the 3’ end. • Cassettes exist as small DNA circles that don’t replicate or get transcribed. They contain a corresponding att site. When the att sites are aligned, integrase catalyzes a recombination event and incorporates them into the integron (or removes them) • Found in variable locations in the genome.
Lysogenic Bacteriophage • Bacteriophage (phage) are bacterial viruses: DNA (or RNA) surrounded by a protein coat, but with no internal metabolic activity. • Most bacteriophage enter the cell, hijack its machinery to reproduce themselves, and then kill the cell by lysing it (breaking it open). This is called the lytic cycle. • Some phage have the ability to insert themselves into the bacterial genome and remain there, inactive, for many generations: the lysogenic cycle. • First described in phage lambda • the inserted phage chromosome is called the prophage. • When conditions get harsh, the phage DNA comes out of the chromosome and enters the normal lytic pathway. It reproduces and kills the host cell. • Sometimes the prophage is inactivated by mutation and becomes a permanent part of the chromosome.
Chromosome Breaks • DNA sometimes breaks due to mechanical stress, ionizing radiation, or chemical attack. • Most organisms contain enzymes that reassemble broken DNA molecules, a process called non-homologous end joining. • If there is more than one break, ends are joined randomly, which can lead to a rearranged genome. • This breaks up blocks of genes over evolutionary time