1 / 54

Doug Brutlag Professor Emeritus Biochemistry & Medicine (by courtesy)

Computational Molecular Biology Biochem 218 – BioMedical Informatics 231 http://biochem218.stanford.edu/. Phylogenies. Doug Brutlag Professor Emeritus Biochemistry & Medicine (by courtesy). Cladogram Representation of Phylogenies. 100. 0. Years x 10 6. Years x 10 6. 0. 100.

etan
Download Presentation

Doug Brutlag Professor Emeritus Biochemistry & Medicine (by courtesy)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Computational Molecular BiologyBiochem 218 – BioMedical Informatics 231http://biochem218.stanford.edu/ Phylogenies Doug Brutlag Professor Emeritus Biochemistry & Medicine (by courtesy)

  2. Cladogram Representation of Phylogenies 100 0 Years x 106 Years x 106 0 100 A B C D E F G H I J L M N O P Q

  3. Dendrogram Representation of Phylogenies GrowTree Phylogram February 1, 2010 Substitutions per 100 Residues 100.00

  4. Cladogram

  5. Phenogram

  6. Curve-O-Gram

  7. Eurogram

  8. RadialGram

  9. Methods for Determining Phylogenies • Parsimony (character based) • Assigns mutations to branches • Minimize number of edits • Topology maximizes similarity of neighboring leaves • Distance methods • Branch lengths = D(i,j)/2 for sequences i, j • Distances must be at least metric • Distances can reflect time or edits • Distance must be relatively constant per unit branch length A B C D E F G H I J L M N O P Q

  10. Methods for Determining Phylogenies • Parsimony • Minimum mutation (Fitch, PAUP) • Minimal length encoding • Probabilistic • Branch and Bound • Maximum likelihood • Distance methods • Ultrametric Trees • Additive Trees • UPGMA • Neighbor Joining A B C D E F G H I J L M N O P Q

  11. Properties of Trees • Rooted or Unrooted • Nodes and Branches • Internal Nodes • External Nodes - leaves • Operational Taxonomic Units • Outgroups • Topology • One path/pair • Distances 1 X 3 Y 2 Z 4 R2 3 Y 5 4 R1 Z 1 5 X 2 1 2 X R1 R2 Z 3 Y 5 4

  12. Orthologous Evolution

  13. Paralogous Evolution hemoglobins

  14. Challenges Making Trees:Gene Duplication versus Speciation

  15. Orthology and Paralogy Orthologs:Derived by speciation Paralogs: Gene Duplications Yeast HA1 Human HA2 Human WA Worm HB Human WB Worm Thanks to Seraphim Batzoglou

  16. Challenges Making Trees:Gene Conversion A A T C G C G A T A G C A T C A A T T C C C T C Thanks to Maryellen Ruvolo

  17. Challenges Making Trees:Gene Conversion A A T C G C G A T A G C A T C C G C G A T C TC A T C A A T T C C C T C Thanks to Maryellen Ruvolo

  18. A B C D A B C D Challenges Making Trees:Gene Conversion Orthologs:Derived by speciation Paralogs: Gene Duplications Gene M Gene N Thanks to Maryellen Ruvolo

  19. Challenges Making Trees:CM Has Been Converted from CN Orthologs:Derived by speciation Paralogs: Gene Duplications AM BM DM AN BN CMCN DN Gene N Gene M Thanks to Maryellen Ruvolo

  20. Consensus CG/LH Tree Thanks to Maryellen Ruvolo

  21. Gene conversion between1st & 2nd exons of LH, CG2 Genes 15nt LH Gene 168 nt Conversion No Conversion CG2 Gene 168 nt Thank Thanks to Maryellen Ruvolo

  22. Challenges Making Trees: Varying Rates of Mutation

  23. Challenges Making Trees:Horizontal Gene Transfer

  24. Maximum Ultrametric Distance Trees Matrix D Tree T 8 5 3 3 A E D B C • Matrix D is ultrametric for tree T if: • If D is a symmetric n by n matrix of distances • T contains n leaves, one from each row or column • Each node of T labeled by one entry from D • Numbers from root to leaves strictly decrease • For any two leaves i, j, D(i,j) labels nearest common ancestor of i and j in tree

  25. Maximum Ultrametric Distance Trees A symmetric matrix D is ultrametric if and only if for every three leaves i, j, and k, there is a tie for the maximum distance between D(i,j), D(i,k) and D(j,k). U V I J K

  26. Additive Distance Trees Tree T Matrix D A C 2 2 3 1 4 B D

  27. Distance Metrics Obey theTriangle Inequality • D(i,j) ≤ D(i,k) +D(j,k) for all i, j, k • (Max Score - Smith-Waterman Score) is a Metric if • If Gap-penalty ≥ 1+ Gap-size/(n-1) • Assuming match = 1 and mismatch = -1

  28. Three Leaf Tree 1 L1,A A L3,A 3 Observe D1,2 D1,3 D2,3 L2,A 2 Calculate L1,A L2,A L3,A

  29. Three Leaf Tree 1 L1,A L3,A A 3 L2,A Observe D1,2 D1,3 D2,3 2 Calculate L1,A L2,A L3,A D1,2=L1,A+L2,A D1,3=L1,A+L3,A D2,3=L2,A+L3,A

  30. Solution to Three Species Tree 1 L1,A A L3,A 3 L2,A 2

  31. 1 3 4 Four Species Tree L1,A L3,A B A A,B L2,A L4,A Observe D1,2 D1,3 D1,4 D2,3 D2,4 D3,4 2 Calculate L1,A L2,A L3,B L4,B, LA,B

  32. Four Species Topology 1 3 L1,A L3,A B A LA,B L2,A L4,A 4 2 Label species 1, 2, 3, and 4 so that: D(1,2) + D(3,4) ≤ D(1,3) + D(2,4) = D(1,4) + D(2,3)

  33. Solution for Four Species 3 1 L1,A L3,A A LA,B B L4,A L2,A 4 2 L1,A = 1/4*(D1,3 + D1,4 - D2,3 -D2,4) + 1/2*D1,2 L2,A = 1/4*(D2,3 + D2,4 - D1,3 - D1,4) + 1/2*D1,2 LB,3 = 1/4*(D1,3 + D2,3 - D1,4 - D2,4) + 1/2*D3,4 LB,4 = 1/4*(D1,4 + D2,4 - D1,3 - D2,3) + 1/2*D3,4 LA,B = 1/4*(D1,3 + D1,4 + D2,3 + D2,4) - 1/2*(D1,2 + D3,4)

  34. 1 1 3 2 4 4 2 3 1 3 4 2 Four Species =>Three Topologies A B B A A

  35. Species, Distances, Branches & Topologies

  36. Species, Distances, Branches & Topologies

  37. Number of Topologies for n Species

  38. 1 3 2 4 1 34 2 UPGMA: Unweighted Pair GroupMethod with Arithmetic Average Where D1,(34) = (D1,3+D1,4)/2 and D2,(34) = (D2,3+D2,4)/2

  39. 1 2 3 4 UPGMA Dendrogram 1 3 4 2

  40. UPGMA Clustering

  41. Neighbor Joining Method n n-1 1 n n-2 n-1 2 n-2 i X Y n-3 3 ... j n-4 3 4 2 1 n-5 5 . . .

  42. Nearest Neighbor Dendrogram R Y X Z W D E C A B

  43. New Hampshire Standard Tree

  44. SeqWeb GrowTree Programhttp://seqweb.stanford.edu:81/gcg-bin/analysis.cgi?program=evolution-prot

  45. GrowTree Parameters

  46. GrowTree Distanceshttp://seqweb.stanford.edu:81/gcg-bin/analysis.cgi?program=evolution-prot

  47. GrowTree Phylogram (UPGMA)

  48. GrowTree Alignment

  49. GrowTree Neighbor Joining Treehttp://seqweb.stanford.edu:81/gcg-bin/analysis.cgi?program=evolution-prot

  50. GrowTree VegF Inputhttp://seqweb.stanford.edu:81/gcg-bin/analysis.cgi?program=evolution-prot

More Related