Sequence Diversity in Evolution and Crop Improvement
Sherry Flint-Garcia, Research Geneticist, USDA-ARS, MU Division of Plant Sciences
[Photos: Teosinte, Maize Landraces, Inbreds/Hybrids. Photos courtesy J. Doebley]
Sequence Diversity • Evolution: • What are the forces that cause evolution? • Speciation & hybridization • Uncovering evolutionary history • Crop Improvement: • The teosinte-maize story
The Four Forces of Evolution • Mutation -- spontaneous changes in the DNA of gametes. Prerequisite to all other evolution. • Natural Selection -- genetically based differences in survival or reproduction that lead to genetic change in a population. • Gene flow -- movement of genes between populations. In plants this can be accomplished by pollen or seed dispersal. • Genetic drift -- random changes in gene frequency. This is very important in small populations.
Mutation: Generation of New Alleles • Mutations are the result of mistakes in DNA replication, exposure to UV or to some chemicals (mutagens) and other causes. • Point mutations • changing one nucleotide to another • e.g., C-->T
Sickle Cell Anemia A single point mutation causes a dramatic change in phenotype.
Other types of mutations
• Indels (insertions/deletions)
  • Cause frameshifts when their length is not a multiple of three, and usually premature ‘stops’
• Gene duplication
  • May lead to new functions
• Chromosomal mutations
  • Inversions, translocations, deletions
• Polyploidy
  • Very common in plants
  • May lead to new species in one step
Most point mutations have no effect or almost no effect. Why? Most of the genome seems to be ‘junk’ -- at least it doesn’t code for proteins. Many mutations within the protein-coding regions of genes don’t change the amino acid specified, because the genetic code is redundant. For example, 6 different codons specify the amino acid leucine.
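The redundancy described above can be checked directly against the standard genetic code. The following is a minimal sketch; the helper names (`CODON_TABLE`, `codons_for`, `is_synonymous`) are illustrative and not from the slides. It also looks up the well-known sickle-cell substitution (GAG to GTG in the beta-globin gene) as an example of a point mutation that does change the amino acid.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return every codon that specifies the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_synonymous(codon_a, codon_b):
    """True when two codons specify the same amino acid (a 'silent' change)."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


leucine_codons = codons_for("L")
print(len(leucine_codons), leucine_codons)      # six codons for leucine
print(is_synonymous("CTT", "CTC"))              # third-position change, still leucine
print(CODON_TABLE["GAG"], CODON_TABLE["GTG"])   # E (Glu) vs. V (Val): not silent
```

Most third-position changes behave like the CTT/CTC pair, which is one reason many point mutations in coding regions go unnoticed.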
Natural Selection
• Peppered moth (Biston betularia) evolution during the industrial revolution in England
• Early 1800s = pre-industrial
  • Tree bark was white
  • Almost all moths were of the typica form
• 1895 = Industrial Era
  • Tree bark was covered in black soot
  • 98% of moths were of the carbonaria form
• Today = Clean Air laws enforced
  • Prevalence of the carbonaria form is declining
[Photos: ‘typica’ form and ‘carbonaria’ form]
Brassica oleracea
Gene Flow • Tends to homogenize populations. • Rates of gene flow depend on the spatial arrangement of populations.
[Figure panels, one per spatial arrangement: “Directional” movement of alleles. Migration occurs at random among a group of equivalent populations. Migration along a linear set of populations. Populations are continuous.]
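The homogenizing effect of gene flow can be illustrated with a very small deterministic model. This is a sketch under stated assumptions, not something taken from the slides: two populations of equal size exchange a fraction `m` of their gene pools each generation, and no other force acts. The function names and the value `m = 0.05` are hypothetical.

```python
def migrate(p1, p2, m):
    """One generation of symmetric gene flow between two populations.

    p1, p2: frequency of the same allele in each population.
    m: fraction of each population replaced by migrants from the other.
    """
    return (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1


def simulate(p1, p2, m, generations):
    """Return the list of (p1, p2) pairs from generation 0 onward."""
    history = [(p1, p2)]
    for _ in range(generations):
        p1, p2 = migrate(p1, p2, m)
        history.append((p1, p2))
    return history


# Start with the allele fixed in one population and absent in the other.
history = simulate(p1=1.0, p2=0.0, m=0.05, generations=50)
for gen in (0, 10, 50):
    a, b = history[gen]
    print(f"generation {gen:2d}: p1={a:.3f}  p2={b:.3f}  difference={a - b:.3f}")
```

In this model the difference between the two populations shrinks by a factor of (1 - 2m) every generation while the average frequency stays constant, which is the sense in which gene flow "homogenizes".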
Founder effect: Gene flow and genetic drift are responsible for the limited genetic variation on islands, relative to mainland populations.
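Why drift matters most in small populations, such as a handful of island founders, can be sketched with the Wright-Fisher model. This is an idealized illustration, assuming a diploid population of constant size N with random mating and no selection, mutation, or migration; the function names and population sizes are hypothetical. The expected heterozygosity after t generations is H_0 (1 - 1/(2N))^t.

```python
import random


def expected_heterozygosity(h0, n_individuals, generations):
    """Expected heterozygosity under drift alone: H_t = H_0 * (1 - 1/(2N))**t."""
    return h0 * (1 - 1 / (2 * n_individuals)) ** generations


def wright_fisher(p0, n_individuals, generations, rng):
    """Simulate one allele-frequency trajectory.

    Each generation, 2N gene copies are drawn at random from the previous
    generation's allele frequency.
    """
    copies = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        count = sum(1 for _copy in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


rng = random.Random(1)  # fixed seed so the run is repeatable
for n in (10, 1000):
    final_p = wright_fisher(0.5, n, 50, rng)[-1]
    h_expected = expected_heterozygosity(0.5, n, 50)
    print(f"N={n:4d}: simulated final frequency={final_p:.3f}, "
          f"expected heterozygosity={h_expected:.3f}")
```

With N = 10 the expected heterozygosity falls steeply over 50 generations, while with N = 1000 it barely changes, which matches the contrast between island and mainland populations.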
Speciation and Hybridization • Speciation – how do new species arise? • What is a species, anyway? • Most species were originally described by their morphology. • The Problem: Convergence • Similar features in unrelated organisms due to evolution of traits that “work” in similar environments
Convergent structures in the ocotillo (left) from the American Southwest, and in Alluaudia (right) from Madagascar.
Nectar feeders have converged on this hovering long-tongued morphology.
Speciation and Hybridization • Biological Species Concept (BSC) • Based on reproductive compatibility • Natural spatial, temporal, and morphological discontinuities generally correspond to fertility barriers • The Problem: In plants, many named species can hybridize.
Most dandelions are asexual. So the biological species concept (BSC) doesn’t apply. How can you name species depending on who can mate with whom when the organisms do not mate at all?!
Scarlet and Black oaks can hybridize and inhabit the same range -- but they have different microhabitat preferences so hybridization is rare.
These pines can also hybridize, but they shed their pollen at different times of the season.
Speciation by Hybridization • Hybridization often shows how difficult it is to apply the BSC to plants. • The hybrid in this case is a new species. The rearrangements of its chromosomes make it infertile with either parent. [Figure: the two parent species and the hybrid]
As the climate becomes drier the desert splits the range of this hypothetical tree species. This reduces gene flow between the now isolated populations and sets the stage for speciation.
Modes of speciation • Allopatric: evolution of species that are geographically separated. Genetic drift plays a significant role. • Parapatric: “edge effect” where evolution of reproductive barriers occurs between neighboring populations. Requires considerable selection pressure. • Sympatric: establishment of a new population with a different ecological niche within the same geographical range as the parental population.
Uncovering Evolutionary History • Taxonomy vs. Systematics • Estimating Phylogeny • Distance Methods • Maximum Parsimony Methods • Maximum Likelihood Methods
Taxonomy vs. Systematics • Taxonomy • Discovering • Describing • Naming • Classifying • Systematics • Figuring out the evolutionary relationships of species • Summarize the evolutionary history of a group
Plant Taxonomy
• taxon - any group at any rank
• corn = common name
• kingdom Plantae (Viridiplantae)
• division (phylum) Anthophyta
• class Liliopsida
• order Commelinales (Poales in APG classifications)
• family Poaceae
• genus Zea (genus name is always capitalized)
• species Zea mays (specific epithet is never capitalized)
Plant Systematics • A phylogenetic tree is used to illustrate systematic relationships • Modern taxonomic groups generally correspond to clades on a phylogenetic tree (i.e., cladogram) • Example: phylogenetic tree of the grass family (Mathews et al. 2000, American Journal of Botany)
Angiosperm Phylogeny Group Tree • “Dicots” are not a monophyletic group.
Data Types that can be used to Estimate a Phylogeny
• Cross compatibility
  • Uses the ‘Biological Species Concept’
• Morphological
  • Continuous traits
  • Meristic (countable) traits
• Cytological
  • Chromosome number
  • Chromosome features
  • Pairing in hybrids
• Molecular data
  • Secondary chemicals
  • Proteins
  • DNA
  • Allele frequencies at many loci (isozymes, SSR)
  • DNA sequences, considered as a whole
  • DNA sequences, considered site-by-site
Maximum Parsimony (Minimum Evolution) Methods • Prefer the evolutionary pathway that requires the smallest number of mutational events. • Most effective when examining sequences with strong similarity • Underlying premises: • Mutations are exceedingly rare events. • The more unlikely events a model invokes, the less likely the model is to be correct.
Maximum parsimony example: five species scored for five traits

            Trait 1  Trait 2  Trait 3  Trait 4  Trait 5
Species 1   0        1.2      red      A        T
Species 2   0        3.4      blue     G        C
Species 3   1        3.5      red      A        T
Species 4   1        4.0      red      A        T
Species 5   1        2.8      blue     G        T

• Traits must have discrete character states.
• A character state must be shared by at least 2 taxa.
• Using only trait 1 … [Figure: tree separating sp1 and sp2 from sp3, sp4 and sp5 by a single 0<->1 change.]
• But traits 3 & 4 disagree with trait 1. [Figure: tree separating sp2 and sp5 from sp1, sp3 and sp4 by Red<->blue and A<->G changes.]
• Every possible tree is considered individually for each informative site (computationally intensive). • After all informative sites have been considered, the tree that invokes the smallest total number of substitutions is the most parsimonious.
[Figure: two candidate trees for species 1-5 with traits 1 (0/1), 3 (Red/Blue) and 4 (A/G) mapped onto them; one tree requires 4 substitutions, the other requires 5 substitutions.]
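The counting step can be sketched with Fitch's small-parsimony algorithm, which is one standard way to count the minimum number of changes a character needs on a given tree. The slides do not name the counting method, and the two topologies below are assumptions chosen to match the groupings in the example (sp1 with sp2, versus sp2 with sp5). Only the three informative discrete traits (1, 3 and 4) of the five-species example are scored.

```python
# Informative discrete traits (trait 1, trait 3, trait 4) for the five species.
TRAITS = {
    "sp1": ("0", "red", "A"),
    "sp2": ("0", "blue", "G"),
    "sp3": ("1", "red", "A"),
    "sp4": ("1", "red", "A"),
    "sp5": ("1", "blue", "G"),
}

# Hypothetical topologies written as nested pairs.
TREE_A = (("sp1", "sp2"), ("sp3", ("sp4", "sp5")))   # groups sp1 with sp2
TREE_B = (("sp2", "sp5"), ("sp1", ("sp3", "sp4")))   # groups sp2 with sp5


def fitch(tree, site):
    """Return (possible ancestral states, minimum changes) for one character."""
    if isinstance(tree, str):                      # a tip: its observed state
        return {TRAITS[tree][site]}, 0
    left_states, left_changes = fitch(tree[0], site)
    right_states, right_changes = fitch(tree[1], site)
    changes = left_changes + right_changes
    shared = left_states & right_states
    if shared:                                     # children agree: no new change
        return shared, changes
    return left_states | right_states, changes + 1  # disagreement costs one change


def parsimony_score(tree):
    """Total number of changes over all three informative traits."""
    return sum(fitch(tree, site)[1] for site in range(3))


print("tree A:", parsimony_score(TREE_A), "substitutions")
print("tree B:", parsimony_score(TREE_B), "substitutions")
```

Under these assumed topologies, trait 1 favors tree A but traits 3 and 4 both favor tree B, so tree B has the lower total and would be preferred.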
Distance-based approaches
• Compare each taxon to every other taxon to estimate a “distance matrix”
• Distances are then ‘clustered’ to estimate a phylogenetic tree.

       Sp1   Sp2   Sp3   Sp4   Sp5
Sp1    0     d12   d13   d14   d15
Sp2          0     d23   d24   d25
Sp3                0     d34   d35
Sp4                      0     d45
Sp5                            0
Distance-based approaches
• Compare each taxon to every other taxon to estimate a “distance matrix”
• Example: DNA sequence considered as a whole

             10         20         30         40         50
Sp1: GTGCTGCACG GCTCAGTATA GCATTTACCC TTCCATCTTC AGATCCTGAA
Sp2: ACGCTGCACG GCTCAGTGCG GTGCTTACCC TCCCATCTTC AGATCCTGAA
Sp3: GTGCTGCACG GCTCGGCGCA GCATTTACCC TCCCATCTTC AGATCCTATC
Sp4: GTATCACACG ACTCAGCGCA GCATTTGCCC TCCCGTCTTC AGATCCTAAA
Sp5: GTATCACATA GCTCAGCGCA GCATTTGCCC TCCCGTCTTC AGATCTAAAA

       Sp1   Sp2   Sp3   Sp4   Sp5
Sp1    0     9     8     12    15
Sp2          0     11    15    18
Sp3                0     10    13
Sp4                      0     5
Sp5                            0
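The distances in this example are simple counts of differing sites between each pair of aligned sequences. A minimal sketch of that calculation follows; the function name `hamming` and the dictionary layout are illustrative choices, and real analyses often apply a substitution-model correction that is not shown here.

```python
from itertools import combinations

# The five aligned 50-base sequences from the example (spaces are for readability).
SEQUENCES = {
    "Sp1": "GTGCTGCACG GCTCAGTATA GCATTTACCC TTCCATCTTC AGATCCTGAA",
    "Sp2": "ACGCTGCACG GCTCAGTGCG GTGCTTACCC TCCCATCTTC AGATCCTGAA",
    "Sp3": "GTGCTGCACG GCTCGGCGCA GCATTTACCC TCCCATCTTC AGATCCTATC",
    "Sp4": "GTATCACACG ACTCAGCGCA GCATTTGCCC TCCCGTCTTC AGATCCTAAA",
    "Sp5": "GTATCACATA GCTCAGCGCA GCATTTGCCC TCCCGTCTTC AGATCTAAAA",
}


def hamming(seq_a, seq_b):
    """Count the aligned positions at which two sequences differ."""
    a, b = seq_a.replace(" ", ""), seq_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Upper triangle of the distance matrix, keyed by taxon pair.
distances = {
    (s, t): hamming(SEQUENCES[s], SEQUENCES[t])
    for s, t in combinations(SEQUENCES, 2)
}

for (s, t), d in distances.items():
    print(f"{s}-{t}: {d}")
```

Run on the sequences above, this reproduces the ten pairwise distances used in the example, with Sp4-Sp5 the closest pair.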
Distance-based approaches
• Distances are then ‘clustered’ to estimate a phylogenetic tree.
• Example: UPGMA algorithm (Unweighted Pair-Group Method using Arithmetic means)
• The smallest distance is identified, the average of the two combined taxa is calculated, and the matrix is recalculated. This iteration is repeated.

       Sp1   Sp2   Sp3   Sp4   Sp5
Sp1    0     9     8     12    15
Sp2          0     11    15    18
Sp3                0     10    13
Sp4                      0     5
Sp5                            0

[Figure: Sp4 and Sp5 (distance 5) are joined first, each with a branch length of 2.5.]

Distance-based approaches

       Sp1   Sp2   Sp3   4-5
Sp1    0     9     8     13.5
Sp2          0     11    16.5
Sp3                0     11.5
4-5                      0

[Figure: Sp1 and Sp3 (distance 8) are joined next, each with a branch length of 4; the 4-5 cluster keeps branch lengths of 2.5.]

Distance-based approaches

       Sp2   1-3   4-5
Sp2    0     10    16.5
1-3          0     12.5
4-5                0

[Figure: Sp2 joins the 1-3 cluster (distance 10) at a height of 5; tips in the order 1, 3, 2, 4, 5.]
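Distance-based approaches

         1-2-3   4-5
1-2-3    0       13.83
4-5              0

The last two clusters are joined at a height of 6.92, half of 13.83, which is the mean of the six distances between Sp1, Sp2, Sp3 and Sp4, Sp5. [Figure: final UPGMA tree with tips in the order 1, 3, 2, 4, 5 and node heights 4 (Sp1-Sp3), 5 (Sp2 joining 1-3), 2.5 (Sp4-Sp5) and 6.92 (root).]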
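The clustering iterations can be sketched in a few lines. This is a minimal illustration of UPGMA rather than a production implementation: it takes the distance between two clusters to be the arithmetic mean over all pairs of their members, which is the defining rule of UPGMA, and reports each merge with its node height (half the inter-cluster distance). The function and variable names are hypothetical.

```python
from itertools import combinations

# Pairwise distances from the five-sequence example.
DISTANCES = {
    ("Sp1", "Sp2"): 9, ("Sp1", "Sp3"): 8, ("Sp1", "Sp4"): 12, ("Sp1", "Sp5"): 15,
    ("Sp2", "Sp3"): 11, ("Sp2", "Sp4"): 15, ("Sp2", "Sp5"): 18,
    ("Sp3", "Sp4"): 10, ("Sp3", "Sp5"): 13, ("Sp4", "Sp5"): 5,
}


def upgma(names, pairwise):
    """Return the merges in order as (cluster_a, cluster_b, node_height)."""

    def leaf_distance(a, b):
        return pairwise[(a, b)] if (a, b) in pairwise else pairwise[(b, a)]

    def cluster_distance(c1, c2):
        # Arithmetic mean over every pair of leaves, one from each cluster.
        total = sum(leaf_distance(a, b) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and join them.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2) / 2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


for left, right, height in upgma(["Sp1", "Sp2", "Sp3", "Sp4", "Sp5"], DISTANCES):
    print(f"join {'-'.join(left)} with {'-'.join(right)} at height {height:.2f}")
```

On this matrix the merges are Sp4 with Sp5 at 2.5, Sp1 with Sp3 at 4, and Sp2 with the 1-3 cluster at 5. Averaging over all six leaf pairs for the final join gives 83/6, about 13.83, so the root sits at a height of about 6.92.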
Maximum Likelihood Methods • Best suited for DNA and protein sequence data • Requires a model of evolution • Each nucleotide/amino acid substitution has an associated likelihood • A function is derived to represent the likelihood of the data given the tree, branch lengths and additional parameters • The function is maximized (equivalently, the negative log-likelihood is minimized)
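Maximum likelihood example
• Based on a model of nucleotide substitution (a substitution matrix that distinguishes transitions and transversions):

       A           C           G           T
A      1           10^-6       2 x 10^-6   10^-6
C      10^-6       1           10^-6       2 x 10^-6
G      2 x 10^-6   10^-6       1           10^-6
T      10^-6       2 x 10^-6   10^-6       1

• Alignment (the example scores site 6, where the four taxa have T, T, A, G):
1: ACGCGTTGGG
2: ACGCGTTGGG
3: ACGCAATGAA
4: ACACAGGGAA

[Figure: the three possible unrooted trees for taxa 1-4. Tree 1 shows one assignment for the scored site: root T (L0 = 0.25), internal nodes T and G, tips T, T, A, G; the T-to-G branch contributes 10^-6 and the G-to-A branch contributes 2 x 10^-6.]
L(Tree 1) = L0 x L1 x L2 x L3 x L4 x L5 x L6 = 5 x 10^-13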
Consider every possible base assignment to each node and calculate the likelihood.
[Figure: the same site (tips T, T, A, G) under two node assignments. Tree 1: root T (L0 = 0.25), internal nodes T and G. Tree 2: root C (L0 = 0.25), internal nodes T and G; here the C-to-T branch contributes 2 x 10^-6 and the C-to-G branch contributes 10^-6.]
L(Tree 1) = L0 x L1 x L2 x L3 x L4 x L5 x L6 = 5 x 10^-13
L(Tree 2) = L0 x L1 x L2 x L3 x L4 x L5 x L6 = 1 x 10^-18
• Repeat for each node assignment, and for each site in the alignment.
• The likelihood of a site on that unrooted tree is the sum over all node assignments; the likelihood of the whole tree is the product over sites.
• Repeat for each unrooted tree and choose the tree with the highest likelihood.
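The per-site calculation can be sketched by brute force. This is an illustration only, assuming the example's substitution values (1 for no change, 2 x 10^-6 for a transition, 10^-6 for a transversion), a prior of 0.25 for the root base, and a topology in which taxa 1 and 2 share one internal node and taxa 3 and 4 share the other. The function names are hypothetical, and practical programs use Felsenstein's pruning algorithm instead of enumerating every assignment.

```python
from itertools import product

BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def branch_likelihood(parent, child):
    """Likelihood of one branch under the example's substitution matrix."""
    if parent == child:
        return 1.0
    return 2e-6 if (parent, child) in TRANSITIONS else 1e-6


def assignment_likelihood(root, left, right, tips):
    """Likelihood of one assignment of bases to the root and two internal nodes.

    tips = (t1, t2, t3, t4): t1 and t2 descend from `left`, t3 and t4 from `right`.
    """
    t1, t2, t3, t4 = tips
    return (0.25                                   # L0: prior on the root base
            * branch_likelihood(root, left)
            * branch_likelihood(root, right)
            * branch_likelihood(left, t1)
            * branch_likelihood(left, t2)
            * branch_likelihood(right, t3)
            * branch_likelihood(right, t4))


def site_likelihood(tips):
    """Sum over all 4**3 assignments of bases to the three unobserved nodes."""
    return sum(assignment_likelihood(r, l, rt, tips)
               for r, l, rt in product(BASES, repeat=3))


site6 = ("T", "T", "A", "G")   # bases of taxa 1-4 at the scored site
print(assignment_likelihood("T", "T", "G", site6))   # the "Tree 1" assignment
print(assignment_likelihood("C", "T", "G", site6))   # the "Tree 2" assignment
print(site_likelihood(site6))                        # total for this site
```

The two single-assignment values correspond to the 5 x 10^-13 and 1 x 10^-18 shown in the example. The site total is larger than either because it adds up all 64 assignments, and multiplying site totals across the alignment gives the likelihood of the tree.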
The Teosinte-Maize Story • The practical side of sequence diversity • PLANT BREEDING! • Sequence Diversity in Teosinte • Sequence Diversity in Maize • Selection During Domestication and Improvement [Figure label: 6000 – 10,000 years ago]
Sequence Diversity and Plant Breeding • Genetic diversity within a crop species is the raw material for current plant breeding • Genetic diversity is the insurance policy to enable plant breeders to adapt crops to changing environments
The Problem • To what degree is limited genetic diversity inhibiting genetic improvement in corn? [Figure: corn yield in Bushels Per Acre plotted by Year, spanning Open Pollinated Varieties, Double Cross Hybrids and Single Cross Hybrids.]