
Presented By Dr. Shazzad Hosain Asst. Prof. EECS, NSU


Presentation Transcript


  1. Phylogeny. Presented By Dr. Shazzad Hosain, Asst. Prof. EECS, NSU

  2. What is phylogenetics? Phylogenetics is the study of evolutionary relationships among and within species. birds snakes rodents primates crocodiles marsupials lizards

  3. What is phylogenetics? This is an example of a phylogenetic tree. [Figure: phylogenetic tree with tips crocodiles, birds, lizards, snakes, rodents, primates, marsupials]

  4. Applications of phylogenetics • Forensics: Did a patient’s HIV infection result from an invasive dental procedure performed by an HIV+ dentist? • Conservation: How much gene flow is there among local populations of island foxes off the coast of California? • Medicine: What are the evolutionary relationships among the various prion-related diseases? To be continued…

  5. Phylogenetic concepts: Interpreting a Phylogeny. [Figure: rooted tree of Sequences A, B, C, D and E drawn along a time axis] Which sequence is most closely related to B? A, because B diverged from A more recently than from any other sequence. Physical position in the tree is not meaningful! Only tree structure matters.

  6. Phylogenetic concepts: Rooted and Unrooted Trees. [Figure: unrooted and rooted trees relating A, B, C and D, with candidate root positions marked "?" and a time axis]

  7. Rooting and Tree Interpretation. [Figure: trees over archaea (archaebacteria), bacteria, oak, fruit fly, chicken and human, with branches annotated "– cell nuclei" / "+ cell nuclei" and "– bones" / "+ bones"]

  8. Rooting Methods: Outgroup Rooting. Given an unrooted network of relationships among four species of Carnivora [left], outgroup rooting adds a taxon (the outgroup) known from independent evidence to be less closely related to any of the other species (the ingroup) than they are to each other. The root is then placed on the branch between the outgroup and the ingroup. In this case, Lynx is a feloid carnivore in a separate superfamily from the four canoid carnivores. Inclusion of Lynx in the network analysis places it on the internode. This method requires accurate information about ingroup/outgroup relationships.

  9. How Many Trees? (assuming bifurcation only)

  10. How Many Trees?

      # sequences   # pairwise    Unrooted trees                  Rooted trees
                    distances     # trees         # branches      # trees         # branches
                                                  /tree                           /tree
      3             3             1               3               3               4
      4             6             3               5               15              6
      5             10            15              7               105             8
      6             15            105             9               945             10
      10            45            2,027,025       17              34,459,425      18
      30            435           8.69 × 10^36    57              4.95 × 10^38    58
      N             N(N - 1)/2    (2N - 5)! /     2N - 3          (2N - 3)! /     2N - 2
                                  (2^(N - 3) (N - 3)!)            (2^(N - 2) (N - 2)!)
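The closed-form counts in the last row of the table can be checked numerically. The following is a minimal Python sketch, assuming bifurcating trees with labelled tips as on the slide; the function names are illustrative and not from the presentation.

```python
from math import factorial


def num_unrooted_trees(n: int) -> int:
    """Unrooted bifurcating trees on n labelled tips: (2n-5)! / (2^(n-3) (n-3)!)."""
    if n < 3:
        raise ValueError("need at least 3 sequences")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def num_rooted_trees(n: int) -> int:
    """Rooted bifurcating trees on n labelled tips: (2n-3)! / (2^(n-2) (n-2)!)."""
    if n < 2:
        raise ValueError("need at least 2 sequences")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


# Reproduce the small rows of the table: n, pairwise distances, unrooted, rooted.
for n in (3, 4, 5, 6, 10):
    print(n, n * (n - 1) // 2, num_unrooted_trees(n), num_rooted_trees(n))
```

The rapid growth of these counts is why exhaustive search over all trees is only feasible for a handful of sequences.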

  11. Tree Properties. Ultrametricity: all tips are an equal distance from the root (in the figure, a = b + c + d + e). Additivity: the distance between any two tips equals the total branch length between them (in the figure, XY = a + b + c + d + e). In simple scenarios, evolutionary trees are ultrametric and phylograms are additive. [Figure: two rooted trees with tips X and Y and branches a, b, c, d, e]

  12. Terminology • External nodes: things under comparison; operational taxonomic units (OTUs) • Internal nodes: ancestral units; hypothetical; goal is to group current-day units • Root: common ancestor of all OTUs under study. The path from the root to a node defines an evolutionary path • Unrooted: specifies relationships but not the evolutionary path • If an outgroup is available (an external reason to believe a certain OTU branched off first), the tree can be rooted • Topology: branching pattern of a tree • Branch length: amount of difference that occurred along a branch

  13. Phylogeny Applications • Tree of Life: Analyzing changes that have occurred in evolution of different organisms http://tolweb.org/tree/phylogeny.html • Phylogenetic relationships among genes can help predict which ones might have similar functions (e.g., ortholog detection) • Follow changes occurring in rapidly changing species (e.g., HIV virus)

  14. Phylogeny Packages • PHYLIP, Phylogenetic inference package • evolution.genetics.washington.edu/phylip.html • Felsenstein • Free! • PAUP, phylogenetic analysis using parsimony • paup.csit.fsu.edu • Swofford

  15. Similarity vs. Homology • Similar • sequences resemble one another • Homolog • sequences derived from a common ancestor • Ortholog • homologous sequences in different species that diverged through speciation • Paralog • homologous sequences that diverged through gene duplication, often within the same species

  16. Ortholog vs. Paralog • Ortholog • sequences diverged as a result of a speciation event • hence can be used for phylogeny of organisms • Paralog • sequences diverged as a result of a gene duplication event • hence not suitable for phylogeny of organisms

  17. Homoplasy • Sequence similarity NOT due to common ancestry • May arise due to parallelism or convergent evolution • Parallelism or parallel evolution • the development of a similar trait in related, but distinct, species descending from the same ancestor, but from different clades • Convergent evolution

  18. Parallel evolution Parallel evolution occurs when two species that have descended from the same ancestor remain similar over long periods of time because they independently acquire the same evolutionary adaptations. Parallel evolution occurs because genetically related species adapt to similar environmental changes in similar ways. After many years, the organisms may still resemble each other, even though they speciated in the distant past.

  19. Convergent evolution. When species from different ancestors colonize the same environment, they may independently acquire the same adaptations. The evolution of species descended from different ancestors to become superficially similar because they are adapting to the same environment is called convergent evolution.

  20. Divergent Evolution

  21. Phylogeny of what? • Organisms • Whole genome phylogeny • Ribosomal RNA (surrogate for whole genome) • Strains (closely related microbes) • Individual genes (or gene families) • Repetitive DNA sequences • Metabolic pathways • Secondary Structures • Any discrete character(s) • Human languages • Microbial communities

  22. Why compute phylogenetic trees? • Understand evolutionary history • Map pathogen strain diversity for vaccines • Assist in epidemiology • Of infectious diseases • Of genetic defects • Aid in prediction of function of novel genes • Biodiversity studies • Understanding microbial ecologies

  23. Tree Building Exercises

  24. Computational Approaches to Phylogenetic Tree Computation • Distance Based Methods • UPGMA • Neighbor joining • Character State Methods • Maximum Parsimony Method • Maximum Likelihood Methods • Tree merging • Consensus trees, super-trees

  25. What data is used to build trees? • Traditionally: morphological features (e.g., number of legs, beak shape, etc.) • Today: Mostly molecular data (e.g., DNA and protein sequences)

  26. Data for Phylogeny • Can be classified into two categories: • Numerical data • Distance between objects • e.g., distance(man, mouse)=500, • distance(man, chimp)=100 • Usually derived from sequence data • Discrete characters • Each character has finite number of states • e.g., number of legs = 1, 2, 4 • DNA = {A, C, T, G}

  27. UPGMA

  28. UPGMA

  29. 2. Determine the evolutionary distances and build distance matrix - A simple example • Sequence 1: AGGCCATGAATTAAGAATAA • Sequence 2: AGCCCATGGATAAAGAGTAA • Sequence 3: AGGACATGAATTAAGAATAA • Sequence 4: AAGCCAAGAATTACGAATAA. Distance Matrix: in this example the evolutionary distance is expressed as the number of nucleotide differences for each sequence pair, divided by the sequence length. For example, sequences 1 and 2 are 20 nucleotides in length and have four differences, corresponding to an evolutionary distance of 4/20 = 0.2.
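The distance calculation described on this slide can be sketched in a few lines of Python. This is a minimal illustration that assumes equal-length, already aligned sequences; the labels seq1 to seq4 and the function name p_distance are illustrative and not from the presentation.

```python
from itertools import combinations

# The four example sequences from the slide, in order.
seqs = {
    "seq1": "AGGCCATGAATTAAGAATAA",
    "seq2": "AGCCCATGGATAAAGAGTAA",
    "seq3": "AGGACATGAATTAAGAATAA",
    "seq4": "AAGCCAAGAATTACGAATAA",
}


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


# Upper triangle of the distance matrix, keyed by sequence pair.
dist = {(i, j): p_distance(seqs[i], seqs[j]) for i, j in combinations(seqs, 2)}

for (i, j), value in dist.items():
    print(f"{i} vs {j}: {value:.2f}")
```

For sequences 1 and 2 this gives 0.20, matching the 4/20 worked out on the slide.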

  30. 3. Phylogenetic Tree Construction example (UPGMA algorithm) 1. Pick smallest entry Dij 2. Join the two intersecting species and assign branch lengths Dij/2 to each of the nodes. UPGMA (Michener & Sokal 1957) Bear Raccoon 0.13 0.13

  31. 3. Phylogenetic Tree Construction example (UPGMA algorithm) Bear Raccoon 0.13 0.13 3. Compute new distances to the other species using arithmetic means

  32. 3. Phylogenetic Tree Construction example (UPGMA algorithm) Bear Raccoon Seal 0.13 0.1825 0.1825 • 1. Pick smallest entry Dij • 2. Join the two intersecting species and assign branch lengths Dij/2 to each of the nodes

  33. 3. Phylogenetic Tree Construction example (UPGMA algorithm) Bear Raccoon Seal 0.13 0.1825 0.1825 • Compute new distances to the other species using arithmetic means

  34. 3. Phylogenetic Tree Construction example (UPGMA algorithm) Bear Raccoon Seal Weasel 0.13 0.1825 0.2 0.2 • Pick smallest entry Dij. • Join the two intersecting species and assign branch lengths Dij/2 to each of the nodes. • Done!
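The three UPGMA steps repeated on slides 30 to 34 can be sketched as a short Python loop. The distance matrix itself did not survive in the transcript, so the values below are assumed: they were chosen only so that the merges reproduce the heights shown on the slides (0.13, 0.1825, 0.2). The function name upgma and the cluster labels are also illustrative.

```python
from itertools import combinations


def upgma(names, pair_dist):
    """Return the list of (merged cluster label, node height) in merge order."""
    sizes = {name: 1 for name in names}  # number of tips in each cluster
    d = {frozenset(pair): dist for pair, dist in pair_dist.items()}
    merges = []
    while len(sizes) > 1:
        # Step 1: pick the smallest entry Dij among current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        # Step 2: join the pair; the new node sits at height Dij / 2.
        height = d[frozenset((a, b))] / 2
        merged = f"({a},{b})"
        # Step 3: new distances are arithmetic means over the original tips,
        # i.e. a size-weighted mean of the two joined clusters' distances.
        for c in sizes:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
        merges.append((merged, height))
    return merges


# ASSUMED pairwise distances (not given in the transcript).
names = ["Bear", "Raccoon", "Seal", "Weasel"]
pair_dist = {
    ("Bear", "Raccoon"): 0.26,
    ("Bear", "Seal"): 0.29,
    ("Bear", "Weasel"): 0.34,
    ("Raccoon", "Seal"): 0.44,
    ("Raccoon", "Weasel"): 0.42,
    ("Seal", "Weasel"): 0.44,
}

merges = upgma(names, pair_dist)
for label, height in merges:
    print(label, round(height, 4))
```

With these assumed inputs, Bear and Raccoon join first, Seal joins next, and Weasel joins last, in the same order as the slides.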

  35. Downside of UPGMA • Assumes a molecular clock (i.e., that the evolutionary rate is approximately constant) • Generates only rooted trees • Trees are always ultrametric • Doesn’t work in the following case:

  36. Computational Approaches to Phylogenetic Tree Computation • Distance Based Methods • UPGMA • Neighbor joining • Character State Methods • Maximum Parsimony Method • Maximum Likelihood Methods • Tree merging • Consensus trees, super-trees

  37. Neighbor-joining method • Developed in 1987 by Saitou and Nei • Works in a similar fashion to UPGMA • Still fast; works well for large datasets • Doesn’t require the data to be ultrametric • Suitable when evolutionary rates vary widely among lineages

  38. How to construct a tree with the Neighbor-joining method? • Step 1: • For each taxon x, sum all distances from x and divide by (leaves – 2) • Sx = (sum all Dx) / (leaves - 2) • Step 2: • Find the pair with the smallest M • Mij = Dij – Si – Sj • Step 3: • Create a node U that joins the pair with the lowest Mij • Branch lengths: SiU = (Dij / 2) + (Si – Sj) / 2 and SjU = (Dij / 2) + (Sj – Si) / 2

  39. How to construct a tree with the Neighbor-joining method? • Step 4: • Join i and j according to S and arrange all other taxa in the form of a star • Step 5: • Recalculate the distance matrix from all other taxa to U with: • DxU = (Dix + Djx - Dij) / 2

  40. Example of Neighbor-joining • Step 1: S calculation : Sx = (sum all Dx) / (leaves - 2) • S(A) = (5 + 4 + 7 + 6 + 8) / 4 = 7.5 • S(B) = (5 + 7 + 10 + 9 + 11) / 4 = 10.5 • S(C) = (4 + 7 + 7 + 6 + 8) / 4 = 8 • S(D) = (7+ 10 + 7 + 5 + 9) / 4 = 9.5 • S(E) = (6 + 9 + 6 + 5 + 8) / 4 = 8.5 • S(F) = (8 + 11 + 8 + 9 + 8) / 4 = 11

  41. Example of Neighbor-joining cont 1 • Step 2: Calculate pair with smallest M • Mij = Distance ij – Si – Sj • Smallest are • M(AB) = d(AB) – S(A) –S(B) = 5 – 7.5 – 10.5= -13 • M(DE) = 5 – 9.5 – 8.5 = -13
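Steps 1 and 2 of the example can be reproduced with a short Python sketch. The pairwise distances below are read off the sums listed in Step 1 (for instance, the terms of S(A) are the distances from A to B, C, D, E and F); the helper name d and the variable names are illustrative.

```python
from itertools import combinations

taxa = ["A", "B", "C", "D", "E", "F"]

# Pairwise distances recovered from the Step 1 sums on the slide.
D = {
    ("A", "B"): 5, ("A", "C"): 4, ("A", "D"): 7, ("A", "E"): 6, ("A", "F"): 8,
    ("B", "C"): 7, ("B", "D"): 10, ("B", "E"): 9, ("B", "F"): 11,
    ("C", "D"): 7, ("C", "E"): 6, ("C", "F"): 8,
    ("D", "E"): 5, ("D", "F"): 9,
    ("E", "F"): 8,
}


def d(i: str, j: str) -> float:
    """Symmetric lookup into the upper-triangular distance table."""
    return D[(i, j)] if (i, j) in D else D[(j, i)]


# Step 1: Sx = (sum of all distances from x) / (leaves - 2).
S = {x: sum(d(x, y) for y in taxa if y != x) / (len(taxa) - 2) for x in taxa}

# Step 2: Mij = Dij - Si - Sj; the pair(s) with the smallest M are joined.
M = {(i, j): d(i, j) - S[i] - S[j] for i, j in combinations(taxa, 2)}
smallest = min(M.values())
candidates = [pair for pair, value in M.items() if value == smallest]

print(S)
print(smallest, candidates)
```

The tie between (A, B) and (D, E) at M = -13 matches the slide; the example goes on to join A and B first.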

  42. Example of Neighbor-joining cont 2 • Step 3: Create a node U • S1U = (Dij / 2) + (Si – Sj) / 2 • U1 joins A and B: • S(AU1) = d(AB) / 2 + (S(A) – S(B)) / 2 • = 5 / 2 + (7.5 - 10.5) / 2 = 1 • S(BU1) = d(AB) / 2 + (S(B) – S(A)) / 2 • = 5 / 2 + (10.5 – 7.5) / 2 = 4

  43. Example of Neighbor-joining cont 3 • Step 4: Join A and B according to S, and arrange all other taxa in the form of a star. In the figure, branches in black are of unknown length and branches in red are of known length

  44. Example of Neighbor-joining cont 4 • Step 5: Calculate new distance matrix • Dxu = (Dix + Djx – Dij) / 2 • d(CU) = (d(AC) + d(BC) - d(AB)) / 2 • = (4 + 7 - 5) / 2 = 3 • d(DU) = (d(AD) + d(BD) - d(AB)) / 2 = (7 + 10 - 5) / 2 = 6 • Likewise, d(EU) = (6 + 9 - 5) / 2 = 5 and d(FU) = (8 + 11 - 5) / 2 = 7 • Then we get the new distance matrix
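Steps 3 and 5 for the first join (A with B) can be checked the same way. This is a minimal Python sketch using the distances implied by the Step 1 sums; the variable names branch_a, branch_b and new_d are illustrative.

```python
taxa = ["A", "B", "C", "D", "E", "F"]

# Pairwise distances recovered from the Step 1 sums of the example.
D = {
    ("A", "B"): 5, ("A", "C"): 4, ("A", "D"): 7, ("A", "E"): 6, ("A", "F"): 8,
    ("B", "C"): 7, ("B", "D"): 10, ("B", "E"): 9, ("B", "F"): 11,
    ("C", "D"): 7, ("C", "E"): 6, ("C", "F"): 8,
    ("D", "E"): 5, ("D", "F"): 9,
    ("E", "F"): 8,
}


def d(i: str, j: str) -> float:
    """Symmetric lookup into the upper-triangular distance table."""
    return D[(i, j)] if (i, j) in D else D[(j, i)]


S = {x: sum(d(x, y) for y in taxa if y != x) / (len(taxa) - 2) for x in taxa}

# Step 3: branch lengths from A and B to the new node U1.
i, j = "A", "B"
branch_a = d(i, j) / 2 + (S[i] - S[j]) / 2
branch_b = d(i, j) / 2 + (S[j] - S[i]) / 2

# Step 5: distance from every remaining taxon x to U1.
new_d = {x: (d(i, x) + d(j, x) - d(i, j)) / 2 for x in taxa if x not in (i, j)}

print(branch_a, branch_b)
print(new_d)
```

The two branch lengths sum to d(AB) = 5, as they must, and the reduced matrix over U1, C, D, E, F feeds the next iteration of Steps 1 to 5.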

  45. Example of Neighbor-joining cont 5 • Repeat 1 to 5 until all branches are done • In this example, we will get this at the end

  46. Downside of Neighbor-joining • Generates only one possible tree • Generates only unrooted tree

  47. Computational Approaches to Phylogenetic Tree Computation • Distance Based Methods • UPGMA • Neighbor joining • Character State Methods • Maximum Parsimony Method • Maximum Likelihood Methods • Tree merging • Consensus trees, super-trees

  48. Maximum Parsimony Method • Parsimony score: • Number of character changes (mutations) along the evolutionary tree • (tree containing labels on internal vertices) • Example: [Figure: two labelled trees over the sequences AAA, AGA, GGA and AAG, with per-branch change counts; Score = 3 and Score = 4] • Most parsimonious tree: tree with minimal parsimony score (Minimal Evolution Principle)
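The parsimony score of a tree whose internal vertices are already labelled is the sum of character changes over all branches. The Python sketch below illustrates that definition only; it does not search for the most parsimonious tree. The tree layout is hypothetical (the slide's figure did not survive in the transcript), built from the same kind of three-letter labels, and the function names are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length labels differ."""
    if len(a) != len(b):
        raise ValueError("labels must have equal length")
    return sum(x != y for x, y in zip(a, b))


def parsimony_score(tree) -> int:
    """Sum of changes along every branch of a fully labelled tree.

    A tree is a pair (label, children); a leaf has an empty child list.
    """
    label, children = tree
    return sum(
        hamming(label, child[0]) + parsimony_score(child) for child in children
    )


# HYPOTHETICAL labelled tree: root AAA with two internal vertices.
tree = (
    "AAA",
    [
        ("AAA", [("AAG", []), ("AAA", [])]),  # changes on branches: 0, then 1 and 0
        ("AGA", [("AGA", []), ("GGA", [])]),  # changes on branches: 1, then 0 and 1
    ],
)

score = parsimony_score(tree)
print(score)
```

Comparing this score across alternative topologies and internal labellings is what identifies the most parsimonious tree.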
