FUNDAMENTALS OF MOLECULAR EVOLUTION. The evolutionary thinking. Alfred Russel Wallace writes to Charles Darwin (June 17th 1858). Ernst Haeckel (mid-19th century): the tree of life. The neo-synthesis (Fisher, Haldane, and Wright, 1930-1950). The molecular REvolution.
The evolutionary thinking • Alfred Russel Wallace writes to Charles Darwin (June 17th 1858) • Ernst Haeckel (mid-19th century): the tree of life • The neo-synthesis (Fisher, Haldane, and Wright, 1930-1950)
The molecular REvolution • Nuttall, 1904: Serological cross-reactions to study phylogenetic relationships among various groups of animals. • Watson and Crick's beautiful helix! • Zuckerkandl and Pauling, 1965: molecular clocks. • Fitch & Margoliash, 1967: Construction of phylogenetic trees. A method based on mutation distances as estimated from cytochrome c sequences is of general applicability (Science, 155:279-284). • Kimura, 1968: Evolutionary rate at the molecular level (Nature, 217:624-626). The birth of molecular evolution
Transitions and transversions • Transitions are purine-to-purine (A ↔ G) or pyrimidine-to-pyrimidine (C ↔ T) mutations: Pu-Pu, Py-Py • Transversions are purine-to-pyrimidine mutations or the reverse: Pu-Py or Py-Pu. [Figure: the four bases A, C, G, T connected by transition and transversion arrows]
Point mutations and the genetic code • 4 possible transitions: A↔G, C↔T • 8 possible transversions: A↔C, A↔T, G↔C, G↔T • Thus, if mutations were random, transversions would be 2 times more likely than transitions. • Due to steric hindrance (as well as negative selection!), the opposite is true: transitions generally occur more often than transversions (2-15 times more, depending on the gene region and the species).
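The 4-versus-8 count above can be checked by enumerating every directed single-base change. This is a minimal sketch; the function names are illustrative and not part of the original slides.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # A transition stays within one chemical class (Pu-Pu or Py-Py).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_directed_changes() -> dict:
    """Count all 12 directed base changes by type."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in permutations("ACGT", 2):
        counts[classify_substitution(ref, alt)] += 1
    return counts


if __name__ == "__main__":
    print(count_directed_changes())
```

If every directed change were equally likely, the expected transversion-to-transition ratio would therefore be 8/4 = 2; observed data usually show the opposite bias.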
Other mutations • Insertions and deletions (indels). Usually in multiples of 3 nucleotides in coding regions. • Recombination. Often in viruses. • Gene (or chromosome) duplication • Lateral gene transfer
Genetic variation in populations • Polymorphism: 2 (or more) variants (alleles) co-exist in a population of organisms. • Diploid organisms can be homozygous (2 identical alleles) or heterozygous (2 different alleles) at a particular locus. • For viruses, the term quasispecies is often used. • The variation in a population can be described in terms of allele frequencies or gene frequencies
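As an illustration of describing variation by allele frequencies, the sketch below counts gene copies at one diploid locus with two alleles. The genotype counts and the function name are hypothetical examples, not data from the slides.

```python
def allele_frequencies(n_hom_a: int, n_het: int, n_hom_b: int) -> tuple:
    """Return (freq_A, freq_B) from genotype counts at one diploid locus.

    n_hom_a: individuals homozygous for allele A
    n_het:   heterozygous individuals (one copy of each allele)
    n_hom_b: individuals homozygous for allele B
    """
    n_individuals = n_hom_a + n_het + n_hom_b
    if n_individuals == 0:
        raise ValueError("at least one individual is required")
    n_copies = 2 * n_individuals  # each diploid individual carries 2 copies
    freq_a = (2 * n_hom_a + n_het) / n_copies
    return freq_a, 1.0 - freq_a


if __name__ == "__main__":
    # Hypothetical sample: 30 AA, 50 AB, 20 BB individuals.
    print(allele_frequencies(30, 50, 20))
```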
Evolution and fixation of mutations Evolutionary forces work at the level of populations • Evolution is the consecutive fixation of mutations • The fixation rate of such a polymorphism is in fact the evolutionary rate. This is dependent on: • Mutation rate • Generation time • Evolutionary forces, such as fitness, selective pressure, population size
Population genetics • Selective Pressure • Random Genetic drift • Effective Population size (Ne) • Mutation rate and evolutionary rate
Selective pressure • Positive selective pressure: mutant is more fit • Negative selective pressure: mutant is less fit • Balancing selection: heterozygote is more fit • Most synonymous mutations can be considered neutral • Non-synonymous mutations are always subject to selective pressure (?)
Population dynamics [Figure: allele frequency (0 to 1) plotted against time, showing three possible outcomes for a mutation: fixed mutation, polymorphism maintained, lost mutation]
Effective population size • 1st generation: N=10, Ne=5 • 2nd generation: N=10, Ne=4 • 3rd generation: N=10, Ne=3 • 4th generation: N=5, Ne=2 • Bottleneck event • 5th generation: N=3, Ne=2 • Mutation event • 6th generation: N=9, Ne=4 • 7th generation: N=11
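The three outcomes (fixed, maintained, lost) can be reproduced with a small simulation of random genetic drift. The sketch below assumes a Wright-Fisher-style model (constant population size, non-overlapping generations, no selection), which the slides do not name; all function names and parameter values are illustrative.

```python
import random


def simulate_drift(n_individuals, p0, generations, seed=None):
    """Track one allele's frequency under pure random drift.

    Assumes a diploid population of constant size, so each generation
    resamples 2 * n_individuals gene copies from the previous one.
    Stops early once the allele is lost (0.0) or fixed (1.0).
    """
    rng = random.Random(seed)
    n_copies = 2 * n_individuals
    freq = p0
    trajectory = [freq]
    for _generation in range(generations):
        # Each new gene copy carries the allele with probability freq.
        carriers = sum(1 for _copy in range(n_copies) if rng.random() < freq)
        freq = carriers / n_copies
        trajectory.append(freq)
        if freq in (0.0, 1.0):
            break
    return trajectory


def outcome(trajectory):
    """Label the end state of a trajectory with the terms used above."""
    last = trajectory[-1]
    if last == 0.0:
        return "lost mutation"
    if last == 1.0:
        return "fixed mutation"
    return "polymorphism maintained"


if __name__ == "__main__":
    # A new mutation starts as 1 copy among 2N; most runs lose it quickly.
    n = 50
    for seed in range(5):
        run = simulate_drift(n, 1 / (2 * n), generations=500, seed=seed)
        print(seed, len(run) - 1, outcome(run))
```

Running it with different seeds gives a feel for how often a new neutral mutation is lost by chance alone.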
Mutation and evolutionary rate • New mutation in a diploid population of N individuals • Fixation time t (Kimura and Ohta 1969): • t = (2/s) ln(2N) generations (s = selective advantage) • t = 4N generations for neutral mutations • Evolutionary Rate (or substitution rate), r: • number of mutants reaching fixation per unit time • Mutation Rate, m: • rate of mutation at the DNA level (biochemical concept)
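A quick numerical reading of the two fixation-time formulas, using the slide's symbols N and s. This is a sketch only: the function names and the example values (N = 1000, s = 0.01) are hypothetical, and times are expressed in generations.

```python
import math


def fixation_time_selected(n_individuals: int, s: float) -> float:
    """Fixation time of an advantageous mutation: t = (2/s) ln(2N)."""
    if n_individuals <= 0 or s <= 0:
        raise ValueError("N and s must be positive")
    return (2.0 / s) * math.log(2 * n_individuals)


def fixation_time_neutral(n_individuals: int) -> float:
    """Fixation time of a neutral mutation: t = 4N."""
    if n_individuals <= 0:
        raise ValueError("N must be positive")
    return 4.0 * n_individuals


if __name__ == "__main__":
    n, s = 1000, 0.01  # hypothetical population size and selective advantage
    print(round(fixation_time_selected(n, s)), "generations (selected)")
    print(round(fixation_time_neutral(n)), "generations (neutral)")
```

With these example values the advantageous mutation fixes in roughly 1,500 generations, against 4,000 for a neutral one.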
Molecular clocks • In general the evolutionary rate r can be expressed as • r = f · m • f: fraction of neutral mutations • m: mutation rate • If f is constant and m is constant, the rate of evolution is constant (molecular clock) • Under neutral evolution (f = 1) • r = m: evolutionary rate = mutation rate (Kimura 1968)
A global molecular clock? The hypothesis known as the global clock was based on the observation that a linear relation seems to exist between the number of amino acid substitutions between homologous proteins of different species and the species divergence times estimated from paleontological data.
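The relation between evolutionary rate, neutral fraction, and mutation rate is a single product. The sketch below writes it out, with the mutation rate named m as on the previous slide; the numerical values are hypothetical.

```python
def evolutionary_rate(f_neutral: float, m: float) -> float:
    """Evolutionary rate r as the product of the neutral fraction f
    and the mutation rate m."""
    if not 0.0 <= f_neutral <= 1.0:
        raise ValueError("f must be a fraction between 0 and 1")
    if m < 0:
        raise ValueError("the mutation rate cannot be negative")
    return f_neutral * m


if __name__ == "__main__":
    m = 1e-8  # hypothetical mutation rate per site per generation
    print(evolutionary_rate(0.5, m))  # half of all mutations are neutral
    print(evolutionary_rate(1.0, m))  # strictly neutral evolution: r equals m
```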
Evolutionary rates of organisms [Figure: logarithmic scale of nucleotide substitutions per site per year, from 10⁻⁹ to 10⁻¹, locating cellular genes, DNA viruses, human mtDNA, and RNA viruses]
Why is the molecular clock attractive? • If macromolecules evolve at constant rates, they can be used to date species-divergence times and other types of evolutionary events, similar to the dating of geological time using radioactive elements • Phylogenetic reconstruction is much simpler under constant rates than under nonconstant rates • The degree of rate variation among lineages may provide much insight into the mechanisms of molecular evolution (e.g. Kimura 1983; Gillespie 1991; Salemi et al., 1999).
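To make the dating argument concrete: under a strict clock, two lineages that split t years ago have each accumulated r·t substitutions per site, so their pairwise distance is d = 2rt. The sketch below inverts that relation. The factor 2 and the example numbers are assumptions added here for illustration, and the function name is hypothetical.

```python
def divergence_time(distance: float, rate: float) -> float:
    """Estimate the time since two lineages split under a strict clock.

    distance: substitutions per site separating the two sequences
    rate:     substitutions per site per year along one lineage
    Both lineages accumulate changes, hence the factor 2.
    """
    if rate <= 0:
        raise ValueError("the rate must be positive")
    if distance < 0:
        raise ValueError("the distance cannot be negative")
    return distance / (2.0 * rate)


if __name__ == "__main__":
    # Hypothetical values: d = 0.02 substitutions/site, r = 1e-9 per site per year.
    print(divergence_time(0.02, 1e-9), "years")
```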
Deterministic or stochastic model of evolution • Deterministic: fixation of mutations is entirely dependent on selective pressure. Alleles do not get lost or fixed by chance (by accident). • Stochastic: fixation is dependent on chance events. The chance effect is much larger than selective pressure, and random genetic drift plays a big role. • Whether or not selective pressure plays a role can be tested by comparing synonymous with non-synonymous rates of substitution.
Neo-Darwinism - Neutral evolution • Neo-Darwinism: • Random mutations are the source of variation. • A majority of non-synonymous mutations are deleterious; there is strong negative selective pressure. • Most non-synonymous mutations become fixed because of positive selection • Most synonymous mutations become fixed because of random genetic drift • Neutral evolution (Kimura): • Random mutations are the source of variation. • A majority of non-synonymous mutations are deleterious; there is strong negative selective pressure. • Most mutations that become fixed are neutral; positive selective pressure is rarely strong enough to fix adaptive mutations.
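One common way to compare the two rates is their ratio, dN/dS: values well below 1 suggest negative selection, values near 1 are compatible with neutrality, and values well above 1 suggest positive selection. The sketch below is only a rule-of-thumb classifier. The tolerance is an arbitrary illustrative value, the function name is hypothetical, and a real analysis would use a statistical test rather than a fixed threshold.

```python
def classify_selection(dn: float, ds: float, tolerance: float = 0.1) -> str:
    """Interpret the ratio of non-synonymous (dn) to synonymous (ds)
    substitution rates.

    tolerance is a hypothetical band around 1 treated as neutral.
    """
    if ds <= 0:
        raise ValueError("the synonymous rate must be positive")
    if dn < 0:
        raise ValueError("the non-synonymous rate cannot be negative")
    ratio = dn / ds
    if ratio > 1.0 + tolerance:
        return "positive selection"
    if ratio < 1.0 - tolerance:
        return "negative selection"
    return "compatible with neutrality"


if __name__ == "__main__":
    # Hypothetical rates, in substitutions per site.
    print(classify_selection(0.01, 0.05))
    print(classify_selection(0.05, 0.01))
```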
The data used for phylogenetic analysis • Morphological characters • Fossils (not for viruses) • Genetic data: • AA or NT sequences • RFLP • Allele frequencies • ... • A combination of these data
Rooted phylogenetic tree • Branches can rotate freely. • Branching order is called topology. • External node: Operational Taxonomic Unit, OTU (or taxon) • Internal node: Hypothetical Taxonomic Unit, HTU (or ancestor) [Figure: rooted tree with external nodes A to F, internal nodes G to K (K is the root), a labeled branch, and a TIME axis]
Unrooted phylogenetic tree • Root node K disappeared • To root an unrooted tree: • root by outgroup, e.g. use F as outgroup • midpoint rooting [Figure: unrooted tree with external nodes A to F and internal nodes G to J, with a group labeled "Monophyletic taxa"]
Coalescence time on a rooted tree • Most recent common ancestor of all taxa (MRCA): O • r = OF/T1 • T2 = IE/r = ID/r … [Figure: rooted tree with taxa A to F, internal nodes G to J, and root O, on a TIME axis marking T1 and T2, labeled "Coalescence time of all taxa"]
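The two relations on this slide can be chained: first calibrate the rate r from a branch whose time span is known, then convert other branch lengths into times. The sketch below assumes T1 is the time spanned by the path from root O to tip F; the function names and branch lengths are hypothetical.

```python
def calibrate_rate(branch_length: float, time_span: float) -> float:
    """r = OF / T1: rate from a branch length (substitutions per site)
    and the time it spans."""
    if time_span <= 0:
        raise ValueError("the time span must be positive")
    return branch_length / time_span


def node_age(branch_length: float, rate: float) -> float:
    """T2 = IE / r: time spanned by a branch, given a calibrated rate."""
    if rate <= 0:
        raise ValueError("the rate must be positive")
    return branch_length / rate


if __name__ == "__main__":
    # Hypothetical values: OF = 0.10 substitutions/site over T1 = 50 years,
    # and IE = 0.04 substitutions/site.
    r = calibrate_rate(0.10, 50.0)
    print("r =", r, "substitutions per site per year")
    print("T2 =", node_age(0.04, r), "years")
```

Under a strict clock the tips D and E are equidistant from node I, which is why the slide can write IE/r = ID/r.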
Evolutionary rate estimates using viral strains of known isolation time • Fast evolving viruses (e.g. HIV, HCV) can be sampled over time from the same patients or from different patients at different time points (longitudinal sampling) • The evolutionary rate can be calculated by using the difference in evolutionary distance and the time interval of isolation • ML and Bayesian methods can estimate simultaneously branch lengths and evolutionary rate from a tree with longitudinally sampled sequences (Rambaut, 2000; Drummond et al., 2006) [Figure: tree with strains isolated in 1983, 1995, and 1997, marking the evolutionary distance d and the time interval T]
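The simple calculation in the second bullet divides a distance difference by a time interval. The sketch below assumes both distances are measured from the same reference point (for example the root of the tree); the distances and the function name are hypothetical, and this is not the ML or Bayesian approach cited above.

```python
def rate_from_serial_samples(d_early: float, d_late: float,
                             year_early: float, year_late: float) -> float:
    """Rate = (difference in evolutionary distance) / (isolation interval).

    d_early, d_late: distances (substitutions per site) of the earlier
    and later isolate from a common reference point.
    """
    interval = year_late - year_early
    if interval <= 0:
        raise ValueError("the later sample must have a later isolation date")
    return (d_late - d_early) / interval


if __name__ == "__main__":
    # Hypothetical distances for isolates sampled in 1983 and 1995.
    rate = rate_from_serial_samples(0.050, 0.074, 1983, 1995)
    print(rate, "substitutions per site per year")
```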
Phylogeny Inference and the controversy over the origin of HIV
The retroviruses • small RNA genome (9-10 Kb) • Unique replication cycle • extremely fast evolutionary rate (10⁻⁵ - 10⁻²) • Isolated from most vertebrate species • Associated with human and animal diseases
Retroviral genome [Figure: the ssRNA genome (gag, pol, env, pX, flanked by R/U5 and U3/R) is reverse transcribed into dsDNA flanked by LTRs (U3-R-U5); the mRNAs (gag-pol, env, Tax/Tat, Rex/Rev) and the major encoded proteins (Protease, Polymerase, SU, TM, Tat/Tax, Rev/Rex) are indicated]
Global HIV-1 Pandemic: 1996 vs. 2005 • 1996: ~20 million people infected • 2005: ~40 million people infected [Figure: world map with regional estimates of infected people, ranging from < 35,000 to 14 million]
No, HIV-1 came from chimpanzees… Pan troglodytes (Gao et al., Science 1999)
Where did the pandemic originate? (Gao et al., Science 1999)
The River: the “interesting” journey of E. Hooper • A controversial theory, described by Edward Hooper in his book "The River: a journey to the source of HIV and AIDS" (1999), claims that HIV-1 originated at the end of the 1950s when live oral polio vaccines (OPV), contaminated with SIV, were administered to African children. • Testing Hooper’s hypothesis by dating the most recent common ancestor (MRCA) of HIV-1/SIVcpz and of HIV-1 group M by molecular clock analysis (Korber et al. 2000, Salemi et al., 2001).
HIV-1 group M: 41 pol strains • HIV-1 group M common ancestor: 1931 (1921 - 1941) [Figure: SSCD Pol analysis; 2ΔLn(L) (0 to 140) and estimated date of the HIV-1 group M common ancestor (1870 to 1940) plotted against the number of sites removed; tree scale bar: 0.1 nucleotide substitutions per site] Salemi et al., 2001
HIV-1 group M: 61 env strains • HIV-1 group M common ancestor: 1933 (1918 - 1948) [Figure: SSCD Env analysis; 2ΔLn(L) (0 to 140) and estimated date (1880 to 1940) plotted against the number of sites removed (0 to 55); tree with SIVcpz and HIV-1 group M clades labeled A, B, C, D, G, with the group M common ancestor marked; tree scale bar: 0.1 nucleotide substitutions per site] Salemi et al., 2001
Is the OPV campaign in Africa during the late 1950s to blame for the beginning of the AIDS epidemic? • In contrast with Hooper’s theory, our method does not support the “OPV scenario”, since the radiation date for HIV-1 group M was around 1930. • An even older time for the separation of SIVcpz and HIV-1 can be calculated (~1700 A.D.) [Salemi et al. 2001]
HIV Infection in Benghazi, Libya… In May 1998, the Al-Fateh Children’s Hospital (AFH) in Benghazi, Libya, noted its first case of HIV-1 infection. In September 1998, another 111 children who had been admitted to the hospital were found to be HIV-1 positive. In total, 418 children tested HIV-1 positive and 300 HCV positive…
HIV Infection in Benghazi, Libya… The outbreak was reported by local hospital authorities, and representatives from the World Health Organization (WHO) were sent to AFH in December 1998 to examine the cause of the infections. The WHO report suggests that there were multiple nosocomial HIV-1/HCV infections. It also noted the lack of medical material in the hospital.
Pressure from Libyan families. Benghazi is Libya's second-biggest city and is rebellious toward Gaddafi…
Trial begins… Medics in Jail. In March 1998 six foreign medics (five Bulgarian nurses and a doctor from Palestine) joined the medical staff at AFH. One year later (March 1999), these individuals were accused of purposefully infecting more than 400 children with HIV-1.
Libyan court However, the Libyan court found this report to be imprecise and lacking in evidence and therefore decided not to consider its findings in the trial. In December 2003, a second scientific report produced by Libyan researchers was written for the court. In May 2004, the foreign medical staff were sentenced to death.
Scientists and Nature involvement… Nature magazine reporter Declan Butler becomes involved in the case. Six Nature editorials and news articles are published between September and November 2006. • A shocking lack of evidence (Nature 443, 888-889, 26 October 2006) • Protests mount against Libyan trial (Nature 443, 612-613, 12 October 2006) • Forgotten plights (Nature 443, 605-606, 12 October 2006) • Dirty needles, dirty dealings (Nature 443, 2 October 2006) • Libya's travesty (Nature 443, 245, 21 September 2006) • Lawyers call for science to clear AIDS nurses in Libya (Nature 443, 254, 21 September 2006)
Scientists around the world call for a fair trial based on scientific facts. A new trial begins in October 2006, with a sentence expected by December 2006.