Building Phylogenies

Learn about methods and algorithms for building phylogenetic trees, focusing on parsimony alongside distance-based methods and maximum likelihood. Understand the distinction between small and large parsimony, and explore Fitch's and Sankoff's algorithms for solving small parsimony. Review the concepts of homology, orthology, and paralogy used in phylogenetic analysis.

lepaged

Presentation Transcript


  1. Building Phylogenies: Parsimony 1

  2. Methods • Distance-based • Parsimony • Maximum likelihood

  3. Note • Some of the following figures come from: • [S05] Swofford http://www.csit.fsu.edu/~swofford/bioinformatics_spring05 • [F05] Felsenstein http://evolution.gs.washington.edu/gs541/2005/

  4. Parsimony methods • Goal: Find the tree that allows evolution of the sequences with the fewest changes. • This is called a most parsimonious (MP) tree. • Parsimony is implemented in PAUP* http://paup.csit.fsu.edu/ • Compatibility methods are closely related to parsimony: • Goal: Find the tree that perfectly fits the most characters.

  5. G A G A G Evolutionary Steps Steps can have weights

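  6. Parsimony • Character matrix (rows: characters a to f; columns: species A to D):
       A B C D
     a 0 1 1 1
     b 0 1 1 1
     c 0 0 1 1
     d 0 1 1 0
     e 0 0 0 1
     f 1 0 0 0
     • [Figure: tree on species D, A, B, C with the changes a, b, f, c, d, d, e marked on its branches] • Typically, each site is treated separately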

  7. Some numbers • Number of unrooted trees on $n \ge 2$ species: $U_n = (2n-5)(2n-7)(2n-9)\cdots(3)(1)$ • Number of rooted trees on $n \ge 3$ species: $R_n = (2n-3)\,U_n$
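
The counts on this slide grow very quickly with n. A minimal sketch of the two formulas follows, assuming strictly bifurcating trees with labeled tips; the function names are illustrative and do not come from the slides.

```python
def num_unrooted_trees(n: int) -> int:
    """U_n = (2n-5)(2n-7)...(3)(1); the product is empty (equal to 1) for n = 2 or 3."""
    if n < 2:
        raise ValueError("need at least 2 species")
    count = 1
    for factor in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= factor
    return count


def num_rooted_trees(n: int) -> int:
    """R_n = (2n-3) * U_n: the root can sit on any of the 2n-3 branches."""
    return (2 * n - 3) * num_unrooted_trees(n)


for n in (3, 4, 5, 10):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

For 10 species this already gives 2,027,025 unrooted and 34,459,425 rooted trees, which is why exhaustive search is only practical for small data sets.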

  8. The number of rooted trees [F05]

  9. Small versus Large Parsimony • Parsimony score of a tree: The smallest (weighted) number of steps required by the tree • (Large) Parsimony: Find the tree with the lowest parsimony score • Small Parsimony: Given a tree, find its parsimony score • Small parsimony is by far the easier problem. • It is used as a subroutine when solving large parsimony: each candidate tree is scored by solving small parsimony on it.

  10. A DNA data set [F05]

  11. An example tree [F05]

  12. Most parsimonious states for site 1

  13. Most parsimonious states for site 2

  14. Most parsimonious states for site 3

  15. Most parsimonious states for sites 4 and 5

  16. Most parsimonious states for site 6

  17. Evolutionary steps on tree • Only one choice of reconstruction at each site is shown • 9 steps in all

  18. Algorithms for Small Parsimony • Fitch’s algorithm: • Based on set operations • All evolutionary steps have the same weight • Sankoff’s algorithm: • Based on dynamic programming • Allows steps to have different weights • Both algorithms compute the minimum (weighted) number of steps a tree requires at a given site.

  19. Fitch’s Algorithm • Each node v in the tree has a set X(v) • If v is a leaf (tip), X(v) is the nucleotide observed at v • If there is ambiguity, X(v) contains all possible nucleotides at v • If v is a node with descendants u and w, • Let $Y = X(u) \cap X(w)$ • If $Y \neq \emptyset$, make $X(v) = Y$ • If $Y = \emptyset$, make $X(v) = X(u) \cup X(w)$ and count one step.
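
The set rules above can be sketched as a short recursive function. This is only an illustration under stated assumptions: the tree is a rooted binary tree written as nested tuples, leaves are strings, and the leaf names, the example tree, and the example states are hypothetical rather than taken from the slides.

```python
def fitch(tree, states):
    """Return (X(v), steps) for one site on the subtree rooted at `tree`.

    tree:   a leaf name (str) or a (left, right) tuple of subtrees.
    states: dict mapping leaf name -> string of possible nucleotides at that
            leaf ("A" if observed, "AG" if ambiguous between A and G).
    """
    if isinstance(tree, str):                 # leaf: X(v) is what was observed
        return set(states[tree]), 0
    left, right = tree
    x_u, steps_u = fitch(left, states)
    x_w, steps_w = fitch(right, states)
    y = x_u & x_w                             # Y = X(u) intersect X(w)
    if y:                                     # non-empty: X(v) = Y, no new step
        return y, steps_u + steps_w
    return x_u | x_w, steps_u + steps_w + 1   # empty: take the union, count one step


# Hypothetical five-tip example for a single site.
tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
site = {"s1": "C", "s2": "A", "s3": "C", "s4": "A", "s5": "G"}
root_set, steps = fitch(tree, site)
print(steps)  # 3
```

Summing the step counts over all sites gives the unweighted parsimony score of the tree.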

  20. Fitch’s Algorithm: Example [F05]

  21. Sankoff’s Algorithm • Let $c_{ij}$ be the cost of going from state $i$ to state $j$. • E.g., transitions (A↔G or C↔T) are more probable than transversions, so give transitions a lower weight. • Let $S_v(k)$ be the smallest (weighted) number of steps needed to evolve the subtree rooted at node v (v and all its descendants), given that node v is in state k.

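  22. Sankoff’s Algorithm • If v is a leaf (tip): $S_v(k) = 0$ if state $k$ is observed (or allowed) at v, and $S_v(k) = \infty$ otherwise • If v is a node with descendants u and w: $S_v(k) = \min_i [c_{ki} + S_u(i)] + \min_j [c_{kj} + S_w(j)]$ • The minimum number of (weighted) steps is $\min_k S_{\mathrm{root}}(k)$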
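
A minimal sketch of the dynamic program, assuming the standard recurrence for Sankoff's algorithm: a leaf costs 0 for its observed state and infinity for every other state, and an internal node adds, for each child, the cheapest combination of a change cost and the child's cost. The nested-tuple tree, the leaf names, and the weights 1 (transition) and 2.5 (transversion) are illustrative assumptions, not values from the slides.

```python
import math

STATES = "ACGT"
PURINES = {"A", "G"}


def ts_tv_cost(i, j, ts=1.0, tv=2.5):
    """Assumed cost c_ij: 0 for no change, ts for a transition, tv for a transversion."""
    if i == j:
        return 0.0
    same_class = (i in PURINES) == (j in PURINES)  # A<->G or C<->T
    return ts if same_class else tv


def sankoff(tree, observed, cost):
    """Return {k: S_v(k)} for the node at the root of `tree`, for one site.

    tree:     a leaf name (str) or a (left, right) tuple of subtrees.
    observed: dict mapping leaf name -> observed nucleotide.
    cost:     function cost(i, j) giving the weight of a change from i to j.
    """
    if isinstance(tree, str):  # leaf: only the observed state is free
        return {k: 0.0 if k == observed[tree] else math.inf for k in STATES}
    left, right = tree
    s_u = sankoff(left, observed, cost)
    s_w = sankoff(right, observed, cost)
    return {
        k: min(cost(k, i) + s_u[i] for i in STATES)
        + min(cost(k, j) + s_w[j] for j in STATES)
        for k in STATES
    }


# Hypothetical five-tip example for a single site.
tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
observed = {"s1": "C", "s2": "A", "s3": "C", "s4": "A", "s5": "G"}
root = sankoff(tree, observed, ts_tv_cost)
print(min(root.values()))  # 6.0
```

With every change given the same cost of 1, the minimum at the root equals the number of steps counted by Fitch's algorithm on the same tree.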

  23. Sankoff’s Algorithm: Example

  24. Sankoff’s Algorithm: Traceback

  25. Searching for an MP tree • Exhaustive search (exact) • Branch-and-bound search (exact) • Heuristic search methods • Stepwise addition • Branch swapping • Star decomposition
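
As a rough illustration of the exhaustive (exact) option only, the sketch below enumerates every unrooted binary tree by adding one species at a time to every branch, and scores each tree with Fitch's algorithm. The data are the 0/1 characters a to f for species A to D as read from slide 6; the function names and the nested-tuple tree representation are assumptions made for this example. Branch-and-bound and the heuristic searches are not shown.

```python
def fitch(tree, states):
    """Fitch's algorithm for one site: return (state set, number of steps)."""
    if isinstance(tree, str):
        return set(states[tree]), 0
    (x_u, n_u), (x_w, n_w) = fitch(tree[0], states), fitch(tree[1], states)
    common = x_u & x_w
    return (common, n_u + n_w) if common else (x_u | x_w, n_u + n_w + 1)


def tree_score(tree, alignment):
    """Parsimony score: sum of Fitch steps over all sites (equal weights)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


def insertions(tree, leaf):
    """Yield every tree obtained by attaching `leaf` to one branch of `tree`."""
    yield (tree, leaf)  # attach on the branch above this subtree
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insertions(left, leaf):
            yield (new_left, right)
        for new_right in insertions(right, leaf):
            yield (left, new_right)


def all_trees(taxa):
    """Yield each unrooted binary tree once, rooted on the branch to taxa[0]."""
    first, second, *rest = taxa

    def grow(tree, remaining):
        if not remaining:
            yield (first, tree)
            return
        for bigger in insertions(tree, remaining[0]):
            yield from grow(bigger, remaining[1:])

    yield from grow(second, rest)


def exhaustive_search(alignment):
    """Score every tree; return the best score and all trees that reach it."""
    scored = [(tree_score(t, alignment), t) for t in all_trees(sorted(alignment))]
    best = min(score for score, _ in scored)
    return best, [t for score, t in scored if score == best]


# Characters a..f (in that order) for species A..D, as read from slide 6.
alignment = {"A": "000001", "B": "110100", "C": "111100", "D": "111010"}
best_score, best_trees = exhaustive_search(alignment)
print(best_score, best_trees)
```

On these data two of the three possible trees tie at 7 steps; one of them groups C with D, which matches the seven changes (a, b, f, c, d, d, e) listed for the tree on slide 6.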

  26. Homology, orthology, and paralogy • Homology: Similarity attributed to descent from a common ancestor. • Orthologous sequences: Homologous sequences in different species that arose from a common ancestral gene during speciation; may or may not be responsible for a similar function. • Paralogous sequences: Homologous sequences within a single species that arose by gene duplication.

  27. Orthology and Paralogy http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/Orthology.html
