
break



Presentation Transcript


  1. break

  2. Evolutionary rates Reference: Dan’s book chapter 4

  3. Evolutionary rates - history • The first to suggest using DNA and proteins to investigate evolutionary history. • (They discussed molecular evolution before the genetic code was established).

  4. Linus Pauling (1901-1994) • The only person ever to receive two unshared Nobel Prizes—for Chemistry (1954) and for Peace (1962). • His introductory textbook General Chemistry, revised three times since its first printing in 1947 and translated into 13 languages, has been used by generations of undergraduates.

  5. Linus Pauling (1901-1994) • Also wrote popular science books, e.g., “How to Live Longer and Feel Better” and “Vitamin C and the Common Cold”. • Published over 1,000 articles and books. • Campaigned against nuclear weapons testing.

  6. Linus Pauling (1901-1994) • He received a Ph.D. in chemistry and mathematical physics from California Institute of Technology (Caltech) in 1925 (age 24).

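  7. Evolutionary rates Rate is distance divided by time: $r = d / (2T)$, where d is the distance between two sequences (number of substitutions per site) and T is the time in years since they diverged. The time is doubled because the two sequences evolved independently along separate lineages.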

  8. Evolutionary rates This formula is not accurate for closely related taxa, in which polymorphism must be taken into account (Takahata and Satta 1997).
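
The rate calculation from slide 7 (distance divided by twice the divergence time) can be sketched as follows. This is a minimal illustration, not part of the original slides: the function name and the example values (d = 0.16, T = 80 million years) are hypothetical, chosen only so that the result lands near the mammalian average quoted on slide 9.

```python
def substitution_rate(distance, divergence_time_years):
    """Rate r = d / (2T), in substitutions per site per year.

    distance: substitutions per site between the two sequences (d).
    divergence_time_years: time since the two lineages split (T).
    The factor 2 reflects that both lineages accumulated changes independently.
    """
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_years)


# Hypothetical example values (not taken from the slides).
d = 0.16   # substitutions per site
T = 80e6   # years since divergence
rate = substitution_rate(d, T)
print(f"{rate:.2e} substitutions/site/year")
```

As slide 8 notes, this simple estimate is not reliable for closely related taxa, where polymorphism must be taken into account.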

  9. Mean Rate of Nucleotide Substitutions in Mammalian Genomes ~10^-9 substitutions/site/year. Evolution is a very slow process at the molecular level (“Nothing happens…”)

  10. Sequence alignments Alignment is needed for phylogeny and for molecular evolution. We will assume that the alignment is given. How to construct alignment is outside the scope of this course.

  11. Synonymous vs. nonsynonymous substitutions For most proteins, the rate of synonymous (silent) substitutions is observed to be much higher than the rate of nonsynonymous (amino-acid-altering) substitutions. UUU -> UUC (both encode phenylalanine): synonymous. UUU -> CUU (phenylalanine to leucine): nonsynonymous.
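
The two examples on slide 11 can be checked mechanically. The sketch below is an illustration and not part of the slides: it assumes the standard genetic code (written in the usual T, C, A, G order, with "*" for stop), and the function name `classify_substitution` is hypothetical. RNA codons are accepted by treating U as T.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid,
    otherwise 'nonsynonymous'. Accepts DNA (T) or RNA (U) codons."""
    before = codon_before.upper().replace("U", "T")
    after = codon_after.upper().replace("U", "T")
    if CODON_TABLE[before] == CODON_TABLE[after]:
        return "synonymous"
    return "nonsynonymous"


print("UUU -> UUC:", classify_substitution("UUU", "UUC"))
print("UUU -> CUU:", classify_substitution("UUU", "CUU"))
```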

  12. A lot A little

  13. Synonymous vs. nonsynonymous substitutions

  14. Synonymous vs. nonsynonymous substitutions

  15. Empirical findings: Important proteins evolve more slowly than unimportant ones.

  16. break

  17. Insulin

  18. Insulin In 1953, Frederick Sanger determined the amino-acid sequence of insulin. This was the first protein whose amino-acid sequence was determined. It demonstrated that insulin is composed only of L-amino acids.

  19. Insulin Insulin was shown to be composed of two chains, A (21 AA) and B (30 AA), linked together by disulfide (S-S) bonds.

  20. Insulin • How is the two-chain protein synthesized? • Donald Steiner (University of Chicago) gave the answer. • He studied an islet-cell adenoma of the pancreas, a rare human tumor that produces large amounts of insulin.

  21. Adenoma • An adenoma is a benign tumor (not a malignant tumor). Benign in English = harmless • Benign tumor: A tumor that does not recur locally and does not spread to other parts of the body. • An adenoma is of glandular origin (i.e., it arises from a gland). • Adenomas can grow in many organs, including the colon, adrenal, pituitary, and thyroid.

  22. Insulin • He sliced the pancreatic tumor, incubated the slices with tritiated leucine, and then analyzed them. • He found a new protein that was later proven to be the biosynthetic precursor of insulin: proinsulin.

  23. Insulin • Proinsulin has 30 residues that are absent from insulin.

  24. Insulin • There is an even earlier precursor of proinsulin, called preproinsulin. It contains an additional 19 AA at the N-terminus. This hydrophobic 19-AA stretch directs preproinsulin to the ER. • Preproinsulin -> Proinsulin (ER membrane) • From the ER, proinsulin moves on to the Golgi and then to secretory granules. • Proinsulin -> Insulin (granules)

  25. Alignment of preproinsulin
      Xenopus MALWMQCLP-LVLVLLFSTPNTEALANQHL
      Bos     MALWTRLRPLLALLALWPPPPARAFVNQHL
              **** : **.*: *:..* :. *:****
      Xenopus CGSHLVEALYLVCGDRGFFYYPKIKRDIEQ
      Bos     CGSHLVEALYLVCGERGFFYTPKARREVEG
              ***************:******* :*::*
      Xenopus AQVNGPQDNELDG-MQFQPQEYQKMKRGIV
      Bos     PQVG---ALELAGGPGAGGLEGPPQKRGIV
              .**. *********
      Xenopus EQCCHSTCSLFQLENYCN
      Bos     EQCCASVCSLYQLENYCN
              *****.***:*******

  26. Empirical findings: Functional regions evolve more slowly than nonfunctional regions.
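
One way to see this in the preproinsulin alignment of slide 25 is to compute the fraction of identical residues in each alignment block. The sketch below is an illustration, not part of the slides: the `identity` helper is hypothetical, gapped columns are simply skipped, and the block boundaries are those of the slide rather than true domain boundaries. Reading the third block as mostly the connecting segment that is removed from proinsulin, and the fourth as lying within the mature A chain, is an assumption based on the known A-chain start (GIVEQCC).

```python
def identity(seq_a, seq_b):
    """Fraction of identical residues over the aligned, ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    same = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            same += 1
    return same / compared


# Alignment blocks copied from slide 25 (Xenopus, Bos).
blocks = [
    ("MALWMQCLP-LVLVLLFSTPNTEALANQHL", "MALWTRLRPLLALLALWPPPPARAFVNQHL"),
    ("CGSHLVEALYLVCGDRGFFYYPKIKRDIEQ", "CGSHLVEALYLVCGERGFFYTPKARREVEG"),
    ("AQVNGPQDNELDG-MQFQPQEYQKMKRGIV", "PQVG---ALELAGGPGAGGLEGPPQKRGIV"),
    ("EQCCHSTCSLFQLENYCN", "EQCCASVCSLYQLENYCN"),
]

for number, (xenopus, bos) in enumerate(blocks, start=1):
    print(f"block {number}: identity = {identity(xenopus, bos):.2f}")
```

The fourth block is far more conserved than the third, which fits the finding stated on slide 26.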

  27. Rates of amino-acid replacements in different proteins

  28. Clotting – The end reaction: fibrinogen -> fibrin, catalyzed by thrombin

  29. Synonymous vs. nonsynonymous substitutions Histone H4 between human and wheat: excess of synonymous substitutions

  30. Mean nonsynonymous rate: 0.74 ± 0.67 (×10^-9 substitutions per site per year). Mean synonymous rate: 3.51 ± 1.01 (×10^-9 substitutions per site per year).

  31. The coefficient of variation is an attribute of a distribution: its standard deviation divided by its mean. Coefficient of variation of the nonsynonymous rate: 91%. Coefficient of variation of the synonymous rate: 29%.
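
The two percentages can be reproduced from the numbers on slide 30. The sketch below assumes that the second value on each line of slide 30 is the standard deviation (0.67 and 1.01, in units of 10^-9 substitutions per site per year); the function name is hypothetical.

```python
def coefficient_of_variation(mean, standard_deviation):
    """Standard deviation divided by the mean (dimensionless)."""
    return standard_deviation / mean


# (mean, standard deviation), in units of 10^-9 substitutions/site/year.
# Treating the second number as the standard deviation is an assumption.
rates = {
    "nonsynonymous": (0.74, 0.67),
    "synonymous": (3.51, 1.01),
}

for name, (mean, sd) in rates.items():
    print(f"{name}: CV = {coefficient_of_variation(mean, sd):.0%}")
```

Nonsynonymous rates vary far more among genes than synonymous rates do, relative to their mean.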

  32. Transition vs. transversion rates Transition/transversion ratio by degeneracy class: fourfold degenerate (4): 1.5; twofold degenerate (2): 4.4; nondegenerate (0): 1.1

  33. break

  34. Computing synonymous and non-synonymous rates Silent and non-silent…

  35. Computing synonymous and non-synonymous rates 3 3

  36. Ka/Ks • Our goal is to take two (or, later, more) sequences and compare the rate of neutral evolution (estimated by the synonymous rate, Ks) with the nonsynonymous rate (Ka). • The lower the ratio of nonsynonymous to synonymous substitutions, the stronger the purifying selection.

  37. Computing synonymous and non-synonymous rates p-distance of synonymous subs. = 3/6. p-distance of nonsynonymous subs. = 3/6. Problem: the p-distance does not correct for multiple substitutions at the same site. Solution: apply the Jukes-Cantor (JC) correction to the p-distance.
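
The slide does not spell out the correction. A minimal sketch, assuming the standard Jukes-Cantor formula for nucleotide sequences, d = -(3/4) ln(1 - 4p/3), applied to the p-distance of 3/6 from the slide (the function name is hypothetical):

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a nucleotide p-distance.

    d = -(3/4) * ln(1 - (4/3) * p); defined only for 0 <= p < 0.75.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p-distance must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


p = 3 / 6  # p-distance from slide 37
print(f"p = {p:.3f}, JC-corrected d = {jukes_cantor(p):.3f}")
```

The corrected distance is always at least as large as the p-distance, because it accounts for sites that changed more than once.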

  38. Computing synonymous and non-synonymous rates Assume a protein without selection (evolving neutrally). The nine single-nucleotide neighbors of AAA (Lys): CAA (Gln), GAA (Glu), TAA (Stop), ACA (Thr), AGA (Arg), ATA (Ile), AAC (Asn), AAG (Lys), AAT (Asn). Only AAG is synonymous, so the chance that a random substitution is synonymous is much smaller than the chance that it is nonsynonymous.

  39. Computing synonymous and non-synonymous rates Assume a protein without selection (evolving neutrally). The nine single-nucleotide neighbors of GCA (Ala): ACA (Thr), CCA (Pro), TCA (Ser), GAA (Glu), GGA (Gly), GTA (Val), GCC (Ala), GCG (Ala), GCT (Ala). Three of the nine changes are synonymous: the proportion differs from codon to codon.
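
The neighbor lists on slides 38 and 39 can be generated and tallied automatically. The sketch below is illustrative only: it assumes the standard genetic code and counts changes to a stop codon in a separate category, which is a choice made here and not stated on the slides. The name `neighbor_counts` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def neighbor_counts(codon):
    """Tally the nine single-nucleotide changes of a codon by their effect."""
    original = CODON_TABLE[codon]
    counts = {"synonymous": 0, "nonsynonymous": 0, "stop": 0}
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue  # not a change
            mutant = codon[:position] + base + codon[position + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid == original:
                counts["synonymous"] += 1
            elif amino_acid == "*":
                counts["stop"] += 1
            else:
                counts["nonsynonymous"] += 1
    return counts


for codon in ("AAA", "GCA"):
    print(codon, neighbor_counts(codon))
```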

  40. Computing synonymous and non-synonymous rates So when one observes 6 times more nonsynonymous substitutions than synonymous ones, does it indicate that the protein is under purifying selection? We must normalize for the potential for silent vs. non-silent mutations of the codons in question.

  41. break

  42. Nei & Gojobori (1986) method Masatoshi Nei, Takashi Gojobori

  43. Counting synonymous sites Consider a particular position in a codon (j = 1, 2, 3). Let f_j be the fraction of synonymous changes at this site.
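
The definition of f_j can be turned into a per-codon count of synonymous and nonsynonymous sites. The sketch below rests on assumptions not shown on this slide: the number of synonymous sites of a codon is taken as s = f_1 + f_2 + f_3 and the number of nonsynonymous sites as n = 3 - s; the standard genetic code is used; and changes to a stop codon are counted as nonsynonymous, a simplification that some implementations handle differently. The code indexes positions from 0, whereas the slide uses j = 1, 2, 3.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_fraction(codon, position):
    """f_j: fraction of the three possible changes at one position
    (0-based) that leave the encoded amino acid unchanged."""
    original = CODON_TABLE[codon]
    synonymous = 0
    for base in BASES:
        if base == codon[position]:
            continue  # not a change
        mutant = codon[:position] + base + codon[position + 1:]
        if CODON_TABLE[mutant] == original:
            synonymous += 1
    return synonymous / 3


def site_counts(codon):
    """Return (s, n): synonymous and nonsynonymous sites of one codon."""
    s = sum(synonymous_fraction(codon, j) for j in range(3))
    return s, 3 - s


for codon in ("AAA", "GCA", "TTA"):
    s, n = site_counts(codon)
    print(f"{codon}: s = {s:.2f}, n = {n:.2f}")
```

Summing s and n over all codons of a sequence gives the normalization that slide 40 calls for: observed synonymous and nonsynonymous differences are divided by the number of sites of each kind before they are compared.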
