Dive into understanding phylogenetic trees, alleles, genes, natural selection, mutations, and genetic drift. Discover the genetic intricacies that shape biological diversity and evolution.
Today’s Agenda • John M.’s presentation (15-25 min) • Phylogenetic Trees • Overview • Construction • Algorithms • Etc.
Phylogenetic Trees • The means by which biologists portray • the history of life over millions of years and • the branching events that gave rise to biological diversity
Phylogenetic Trees • Phylogenetic trees of different alleles of a particular gene are the byproduct of • many rounds of mutation, • drift and • selection • resulting in nearly unique sequences at that allele for individuals within species.
Questions You Should Be Asking: • What are Alleles? • What exactly is mutation; what causes it? • What is drift? • What is selection?
Alleles & Genes • Remember a gene is a segment of DNA that ultimately encodes a protein • A protein that performs an important biological function. • Or leads to some other important trait • Very similar organisms have the same genes • For example, we all have the gene for hemoglobin • However, a slight change in that gene might result in sickle cell anemia • Another slight change might not cause any effect • A severe change might lead to a complete failure to create hemoglobin (i.e., death) • Different versions of a gene are called alleles.
Natural Selection • A prime objective for all species is to survive and reproduce. • In doing so, species tend to produce more offspring than the environment can support. • The lack of resources to nourish these individuals places pressure on the size of the population, and • the lack of resources means increased competition; as a consequence, some organisms will not survive.
Natural Selection • The organisms that die as a consequence of this competition are not a random sample. • Darwin found that those organisms more suited to their environment were more likely to survive. • Organisms that are better suited to their environment exhibit desirable characteristics, which is a consequence of their genome being more suitable to begin with.
Natural Selection • As a particular species spreads over a large geographic area, genetic branches arise. • Different areas (geographic regions) provide different selection criteria. • One sub-population of a species might find itself isolated in a tropical environment • while another sub-population gets isolated in a desert environment.
Mutation • A mutation or polymorphism is a change in the DNA "letters" of a gene or an alteration in the chromosomes. • Most DNA variation is neutral (not beneficial or harmful), • But harmful sequence changes sometimes do occur. • Changes within genes can result in proteins that don't work normally or don't work at all. • Some of these changes can contribute to disease or affect how someone responds to a medicine.
Mutation • Mutations • may be passed down from parent to child (in the sperm or egg cells), • may occur around the time of conception or • may be acquired during a person's lifetime. • Can arise spontaneously during normal cell functions • when a cell divides, or • in response to environmental factors such as toxins, radiation, hormones, and even diet.
Mutation • Nature provides us with a system of finely tuned repair enzymes that find and fix most DNA errors. • But as our bodies change in response to age, illness and other factors, our repair systems may become less efficient. • Uncorrected mutations can accumulate, resulting in nasty stuff.
Genetic Drift • Allele frequencies can change due to chance alone. • Alleles that form the next generation's gene pool are a sample of the alleles from the current generation. • When sampled from a population, the frequency of alleles differs slightly due to chance alone. • A small percentage of alleles may continually change frequency in a single direction for several generations • just as flipping a fair coin may, on occasion, result in a string of heads or tails.
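The sampling argument above can be illustrated with a minimal simulation sketch. It assumes a simple resampling model that the slides do not name: each generation draws a fixed number of allele copies at random according to the current frequency. The function name `simulate_drift` and its parameters are hypothetical, chosen only for this illustration.

```python
import random


def simulate_drift(freq, n_copies, generations, seed=0):
    """Track one allele's frequency under pure chance (no selection, no mutation).

    Assumption: every generation is a random sample of `n_copies` allele
    copies drawn according to the previous generation's frequency.
    """
    rng = random.Random(seed)
    history = [freq]
    for _ in range(generations):
        # Each copy in the next generation carries the allele with probability freq.
        carried = sum(rng.random() < freq for _ in range(n_copies))
        freq = carried / n_copies
        history.append(freq)
    return history


# Small gene pools wander much more than large ones.
for n_copies in (20, 2000):
    final = simulate_drift(0.5, n_copies, generations=50, seed=1)[-1]
    print(f"{n_copies} copies: frequency after 50 generations = {final:.3f}")
```

Once the frequency reaches 0 or 1 it stays there in this model, which is the loss of variation that drift produces when nothing replaces it.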
Genetic Drift • [Figure: allele frequencies in a parent population and in two successive next generations]
Genetic Drift • Sharp drops in population size can change allele frequencies substantially. • When a population crashes, the alleles in the surviving sample may not be representative of the pre-crash gene pool. • A crash of this kind is called a population bottleneck. The closely related founder effect arises when a small group of organisms (founders) invades a new territory, because the founders carry only a sample of the parent gene pool. • Many biologists think the genetic changes brought about by founder effects may contribute to isolated populations developing reproductive isolation from their parent populations.
Genetic Drift • The founder effect • [Figure: a small subset of a large population survives, or invades a new territory]
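A small sketch of why a handful of survivors or founders may not represent the original gene pool. The population composition (900 copies of allele "A", 100 of allele "a") and the helper names are made up for illustration and do not come from the slides.

```python
import random


def allele_frequency(pool, allele):
    """Fraction of the pool made up of the given allele."""
    return pool.count(allele) / len(pool)


def founder_sample(pool, n_founders, seed=0):
    """Pick a small random group of allele copies from a large pool."""
    rng = random.Random(seed)
    return rng.sample(pool, n_founders)


# Hypothetical parent gene pool: allele "a" has frequency 0.1.
population = ["A"] * 900 + ["a"] * 100

for seed in range(5):
    founders = founder_sample(population, 10, seed=seed)
    print(f"founder group {seed}: frequency of 'a' = {allele_frequency(founders, 'a'):.1f}")
```

With only ten copies per founder group, the sampled frequency can only take the values 0.0, 0.1, 0.2 and so on, so individual groups often miss the rare allele entirely or over-represent it.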
Genetic Drift & Fitness • Large populations are often divided into smaller subpopulations. • Drift causes allele frequency differences between subpopulations • If a subpopulation is small enough, the population could even drift through fitness valleys in the adaptive landscape. • Then, the subpopulation could climb a larger fitness hill.
Genetic Drift & Fitness • Both natural selection and genetic drift decrease genetic variation. • If they were the only mechanisms of evolution, populations would eventually become homogeneous and further evolution would be impossible. • There are, however, mechanisms that replace variation depleted by selection and drift. • Thank God for mutation and environmental diversity.
Trees and Distance • http://babbage.clarku.edu/~djoyce/java/Phyltree/intro.html
Reconstructing Phylogenetic Trees • There are ten extant species (species currently living) • named from 1 through 10. • The lines above the extant species represent the same species, just in the past.
Reconstructing Phylogenetic Trees • When two lines converge to a point, that should be interpreted as the point when the two species diverged from a common ancestral species • the point being the common ancestral species.
Reconstructing Phylogenetic Trees • The horizontal dimension doesn't mean anything! • It is completely arbitrary whether a branch of the tree is placed to the left or to the right.
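One way to make the "left or right is arbitrary" point concrete is to store a tree as nested tuples and compare trees only after sorting the children at every branch point. This representation and the helper `canonical` are assumptions made for this sketch, not something the slides prescribe.

```python
def canonical(tree):
    """Return the tree with the children of every branch point in a fixed order.

    Leaves are species labels; internal nodes are tuples of subtrees.
    """
    if not isinstance(tree, tuple):
        return tree  # a leaf (extant species)
    children = (canonical(child) for child in tree)
    # Sorting by repr gives a stable order for mixed leaves and subtrees.
    return tuple(sorted(children, key=repr))


drawn_one_way = ((1, 2), 3)      # species 1 and 2 drawn on the left
drawn_other_way = (3, (2, 1))    # same branching, mirrored on the page
different_tree = ((1, 3), 2)     # species 1 and 3 share the recent ancestor

print(canonical(drawn_one_way) == canonical(drawn_other_way))   # same tree
print(canonical(drawn_one_way) == canonical(different_tree))    # different tree
```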
Reconstructing Phylogenetic Trees • The vertical dimension corresponds to time. • Although it's imprecise, the difference between two species can be used to estimate when they diverged.
Reconstructing Phylogenetic Trees A tree isn't always the best model. Here are some times when it isn't best. • For individuals within a species. The genetic material of an individual doesn't derive from a single earlier existing individual. • Animals and plants that multiply by sexual reproduction receive half their genetic material from each of two parents, so a tree like this is inappropriate.
Reconstructing Phylogenetic Trees Here are some other examples • For closely related species. Individuals do occasionally mate between closely related species, and their progeny survive to contribute to the gene pool of one or both of the parent species. • Hybrid species. In the plant world it occasionally happens that a new tetraploid species arises from two diploid species. The two parent species need to be somewhat related for this to happen.
Reconstructing Phylogenetic Trees Here is one last example: • Distant interaction. There are a couple of ways that genetic material from one species can find its way into unrelated species. • Sometimes a bacterium of one species can ingest the genetic material of a bacterium of another species and incorporate part of it into its own genetic material. • Sometimes viruses can inadvertently transport genetic material from one species to another. In spite of these exceptions, a tree model is usually a pretty good model to show the relations among species.
Mutation Rates & Vertical Dimension • Differences among species are the key to reconstructing the phylogenetic tree. • Species differ in their characteristics, also called characters. • The characters may be observable and measurable properties of the individuals. • For instance, among mammals, the number of each kind of tooth that individuals of a species have has been a successful character for classification. • This character has been especially important among extinct species, since fossilized teeth are commonly found.
Mutation Rates & Vertical Dimension • Any characters can be used to classify species and reconstruct a phylogenetic tree of species, • but some are more useful than others. • If a species depends on a character for its continued survival, that character will not change as any mutations of it will be eliminated. • Call such characters essential. And most visible characters are essential for the species. • This means that if we choose essential characters, any differences should count as very significant.
Mutation Rates & Vertical Dimension • There are, however, some difficulties with considering essential characters. • If one species evolves by changing an essential characteristic, whatever ecological forces supported that change may also apply to other species, and that could lead to parallel evolution. • Thus, differences or similarities in essential characters are very relevant to the reconstruction of the general shape of the phylogenetic tree, but they really can't be used to determine the relative lengths of the lines within the tree. • Some species have been stable for millions of years. Others evolve very fast.
Mutation Rates & Vertical Dimension • Irrelevant mutations. We could, on the other hand, consider nonessential characters. • Changes in nonessential characters are effected by mutations, mutations that we can call irrelevant. • The rate of change of irrelevant mutations should be fairly uniform among species, especially among species that are fairly closely related.
Mutation Rates & Vertical Dimension • Much of the genome sequence of an organism is irrelevant in this sense. • For example, there are 64 (4^3) different codons for 20 amino acids. • Some amino acids are coded by four or more different codons. • For many of these multiply coded amino acids, the third nucleotide can take any of the four possible values. • In other words, a mutation in this third nucleotide is irrelevant. The DNA can mutate at this site and the resulting protein doesn't change.
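The codon redundancy described above can be checked directly against the standard genetic code. The sketch below builds the 64-codon table from the conventional TCAG ordering and lists the two-letter prefixes for which the third nucleotide is free to vary; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Prefixes whose four possible third nucleotides all give the same amino acid.
fourfold_prefixes = [
    first + second
    for first, second in product(BASES, repeat=2)
    if len({CODON_TABLE[first + second + third] for third in BASES}) == 1
]

codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE), "codons for", len(codons_per_amino_acid), "amino acids")
print("third position is free for prefixes:", fourfold_prefixes)
```

For any codon starting with one of the listed prefixes, a mutation in the third position leaves the encoded amino acid unchanged.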
Mutation Rates & Vertical Dimension By concentrating on irrelevant mutations, • not only can the shape of the phylogenetic tree be reconstructed, but • the relative lengths of the lines within the phylogenetic tree can also be estimated.
Mutations as a measure of time • Let's concentrate on one character to begin with. • Our first questions are: • What is the probability p(t) that the character has the same value at the beginning of a time interval of length t as it does at the end? • What is the probability q(t) that the character has one value at the beginning of a time interval of length t but a particular different value at the end of the interval?
Mutations as a measure of time • Suppose that there are m different possible alternate values, and suppose that the mutation rate is r mutations per unit time interval. • Some statistical analysis (which we'll skip) gives us the answers to these questions.
Mutations as a measure of time • Note that initially, when t = 0, p(0) is 1, while q(0) is 0, since there are no mutations in no time. Also, as t approaches infinity, p(t) and q(t) both approach 1/m, which means that in the long run, each of the m alternative values is equally probable.
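The formulas themselves are not reproduced in this transcript. As a hedged sketch, here is one form that satisfies every property stated above. It assumes (this is not confirmed by the slides) that each character mutates at rate r and that each mutation replaces the current value by one of the other m − 1 values with equal probability; under a different definition of r the constant in the exponent would change.

```latex
% Assumed model: rate r of leaving the current value, uniform over the other m-1 values.
\begin{align*}
  p'(t) &= -r\,p(t) + \frac{r}{m-1}\bigl(1 - p(t)\bigr), \qquad p(0) = 1, \\
  p(t)  &= \frac{1}{m} + \frac{m-1}{m}\, e^{-\frac{m r}{m-1} t}, \\
  q(t)  &= \frac{1 - p(t)}{m-1} = \frac{1}{m} - \frac{1}{m}\, e^{-\frac{m r}{m-1} t}.
\end{align*}
% Checks: p(0) = 1, q(0) = 0, p(t) + (m-1) q(t) = 1, and p(t), q(t) \to 1/m as t \to \infty.
```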
Mutations as a measure of time • Now let's assume that there are n different characters, not just one. Then E(t), the expected number of characters that are not the same at the end of a time interval of length t as they were at the beginning, is n(m − 1) q(t).
Mutations as a measure of time • Here's the graph of that function when there are m = 4 alternate values for each character, there are n = 40 characters, and the mutation rate is r = 0.1.
Mutations as a measure of time • Time t is shown on the horizontal axis, while the vertical axis gives y, the expected number of character differences. • Note that when t gets large, the expected number of character differences approaches 30.
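The limiting value of 30 can be checked numerically. The sketch below uses E(t) = n(m − 1) q(t) from the text, together with an assumed form for q(t), namely q(t) = (1 − e^(−kt))/m with k = mr/(m − 1). That form is an assumption of this sketch (the slide's own formula is not reproduced here); any q(t) that tends to 1/m gives the same limit n(m − 1)/m.

```python
import math


def expected_differences(t, m=4, n=40, r=0.1):
    """Expected number of differing characters after time t.

    Assumption: q(t) = (1 - exp(-k t)) / m with k = m r / (m - 1),
    i.e. each mutation moves to one of the other m - 1 values uniformly.
    """
    k = m * r / (m - 1)
    q = (1.0 - math.exp(-k * t)) / m
    return n * (m - 1) * q


for t in (0, 5, 10, 20, 50, 100):
    print(f"t = {t:>3}: E(t) = {expected_differences(t):.2f}")
```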
Mutations as a measure of time • We can take the inverse function of y = E(t), that is, turn this graph around, to give us an estimate for time t in terms of the observed number of character differences. Let g denote the inverse function. • The logarithm that appears in g is the natural logarithm (base e).
Mutations as a measure of time • The graph of t = g(y) is shown to the right with the same parameter values m = 4, n = 40, and r = 0.1. • Note that as the number of expected differences approaches 30, the corresponding time approaches infinity.
Mutations as a measure of time • The observed number of differences may be near the expected number, but it is usually somewhat more or somewhat less. • So the observed number of differences could easily be greater than 30.
Mutations as a measure of time • Should that happen, the best conclusion to make is that the time is very great, but can't be estimated. • It would also be prudent not to estimate the time when the number of differences is only slightly less than 30.
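A minimal sketch of such a time estimate, including the refusal to answer once the observed count reaches the saturation level n(m − 1)/m. It assumes E(t) = L(1 − e^(−kt)) with L = n(m − 1)/m and k = mr/(m − 1), which is one model consistent with the limits stated earlier but not necessarily the exact formula on the slide. The function name `estimate_time` is hypothetical.

```python
import math


def estimate_time(observed, m=4, n=40, r=0.1):
    """Estimate elapsed time from an observed number of character differences.

    Returns None when the observation is at or above the saturation level,
    where the time is "very great" but cannot be estimated.
    """
    if observed < 0:
        raise ValueError("the number of differences cannot be negative")
    limit = n * (m - 1) / m          # 30 for the slide's parameters
    if observed >= limit:
        return None
    k = m * r / (m - 1)              # assumed decay constant
    return -math.log(1.0 - observed / limit) / k


for y in (5, 15, 25, 29, 30, 33):
    print(f"{y} differences -> estimated time: {estimate_time(y)}")
```

The estimate grows very quickly as the count nears 30, so a difference of one or two observed mutations changes the answer a great deal in that range.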
Reconstruction • How do you reconstruct the phylogenetic tree when all you know are characters of extant species? • When there are only a few species, only a few characters, and the number of mutations is small but not too small, then common sense and a little bit of logic do a pretty good job, at least for deciding on the shape of the tree.
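As a first step of that common-sense reasoning, one can count the character differences between every pair of species and look for the pair that differs least, since those two are the most plausible recent relatives. The sketch below does only that; it is not a full tree-building algorithm, and the species names and character strings are invented for illustration.

```python
from itertools import combinations


def differences(chars_a, chars_b):
    """Number of positions at which two character sequences differ."""
    return sum(a != b for a, b in zip(chars_a, chars_b))


def closest_pair(species):
    """Return the pair of species names with the fewest character differences."""
    return min(
        combinations(sorted(species), 2),
        key=lambda pair: differences(species[pair[0]], species[pair[1]]),
    )


# Hypothetical data: five characters observed in three extant species.
species = {"sp1": "AACGT", "sp2": "AACGA", "sp3": "TTCGA"}

for a, b in combinations(sorted(species), 2):
    print(a, b, differences(species[a], species[b]))
print("most similar pair:", closest_pair(species))
```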