Chapter 5 Character–Based Methods of Phylogenetics

Chapter 5 Character–Based Methods of Phylogenetics. Department of Computer Science and Information Engineering, National Chi Nan University. 黃光璿 (HUANG, Guan-Shieng), 2004/04/05. 5.1 Parsimony. Mutations are exceedingly rare events. The more unlikely events a model invokes, the less likely the model is to be correct.


Presentation Transcript


  1. Chapter 5 Character–Based Methods of Phylogenetics • Department of Computer Science and Information Engineering, National Chi Nan University • 黃光璿 (HUANG, Guan-Shieng) • 2004/04/05

  2. 5.1 Parsimony • Mutations are exceedingly rare events. • The more unlikely events a model invokes, the less likely the model is to be correct. • → The explanation that needs the fewest mutations to account for a state is the most likely to be correct.

  3. Ockham's Razor • the philosophic rule that entities should not be multiplied unnecessarily

  4. 5.1.1 Informative and Uninformative Sites

  5. 5.1.1 Informative and Uninformative Sites • informative sites • carry information that helps to construct a tree • uninformative sites • carry no information under the parsimony principle.

  6. uninformative

  7. uninformative

  8. informative

  9. informative

  10. To be informative, a position must have • at least two different nucleotides • each of which is present at least twice.
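
The rule on this slide can be sketched as a small check on one alignment column. This is a minimal illustration, not code from the slides: the function name `is_informative` is hypothetical, and a column is assumed to be a string with one nucleotide per sequence and no gap characters.

```python
from collections import Counter


def is_informative(column):
    """Return True if a column is parsimony-informative.

    Assumption: `column` holds one nucleotide per sequence, without gaps.
    """
    counts = Counter(column)
    # Count the nucleotides that occur in at least two sequences.
    repeated = sum(1 for c in counts.values() if c >= 2)
    return repeated >= 2


print(is_informative("AAGG"))  # two nucleotides, each seen twice
print(is_informative("AAAG"))  # G occurs only once
```

A column such as `AAAG` fails the check because every tree explains it with the same single substitution, so it cannot favor one tree over another.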

  11. informative sites • synapomorphy: supports the internal branches (true) • homoplasy: acquired as a result of parallel evolution or convergence (false) • Example: eyes in humans, flies, and mollusks

  12. 5.1.2 Unweighted Parsimony • Every possible tree is considered individually for each informative site. • The tree with the minimum overall cost is reported.

  13. There are several problems: • The number of alternative unrooted trees increases dramatically. • Calculating the number of substitutions invoked by each alternative tree is difficult.

  14. The second problem can be solved by labeling each internal node with a set of nucleotides, working from the leaves toward the root: • intersection: if the intersection of the sets of its two children is not empty • union: if it is empty. • The number of unions is the minimum number of substitutions. • For an uninformative site, it is the number of different nucleotides minus one.
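
The intersection/union rule can be sketched as follows (it is commonly known as Fitch's algorithm; the slides do not name it). The representation is an assumption made for illustration: a tree is a nested 2-tuple whose leaves are integer indices into the column, and the function name `fitch` is hypothetical.

```python
def fitch(tree, column):
    """Return (nucleotide_set, substitutions) for one alignment column.

    Assumption: `tree` is a nested 2-tuple such as ((0, 1), (2, 3)),
    where each leaf is an index into `column`.
    """
    if not isinstance(tree, tuple):
        # A leaf is labeled with its observed nucleotide; no cost yet.
        return {column[tree]}, 0
    left_set, left_cost = fitch(tree[0], column)
    right_set, right_cost = fitch(tree[1], column)
    common = left_set & right_set
    if common:
        # Non-empty intersection: no extra substitution is needed here.
        return common, left_cost + right_cost
    # Empty intersection: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


tree = ((0, 1), (2, 3))
for col in ("AAGG", "AGAG", "AAAG", "ACGT"):
    print(col, fitch(tree, col)[1])
```

On this tree the informative column `AAGG` costs one substitution and `AGAG` costs two, while the uninformative columns `AAAG` and `ACGT` cost one and three, matching the "number of different nucleotides minus one" rule.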

  15. /* the uth position in the kth sequence */

  16. 5.1.4 Weighted Parsimony • Not all mutations are equivalent. • Some sequences (e.g., non-coding sequences) are more prone to indels than others. • Functional importance differs from gene to gene. • Subtle substitution biases usually vary between genes and between species. • → Weights (scoring matrices) can be added to reflect these differences.

  17. Calculating the optimal costs
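
The slide's figure is not preserved in this transcript. One standard way to compute optimal weighted costs is a dynamic program over the tree (often attributed to Sankoff); whether the slide used exactly this method is an assumption. In the sketch below the names `sankoff` and `ts_tv_cost` are hypothetical, as are the example weights (transition 1, transversion 2), and a tree is assumed to be a nested 2-tuple with leaf indices into the column.

```python
INF = float("inf")
STATES = "ACGT"
PURINES = {"A", "G"}


def ts_tv_cost(a, b):
    """Hypothetical weights: match 0, transition 1, transversion 2."""
    if a == b:
        return 0
    return 1 if (a in PURINES) == (b in PURINES) else 2


def sankoff(tree, column, cost=ts_tv_cost):
    """Map each state to the minimum cost of the subtree rooted here,
    given that this node carries that state."""
    if not isinstance(tree, tuple):
        # A leaf can only carry its observed nucleotide.
        return {s: (0 if s == column[tree] else INF) for s in STATES}
    left = sankoff(tree[0], column, cost)
    right = sankoff(tree[1], column, cost)
    return {
        s: min(cost(s, t) + left[t] for t in STATES)
        + min(cost(s, t) + right[t] for t in STATES)
        for s in STATES
    }


table = sankoff(((0, 1), (2, 3)), "AGCT")
print(min(table.values()))
```

For the column `AGCT` on this tree the weighted optimum is 4 (two transitions and one transversion), whereas the unweighted count would be 3.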

  18. Finding the internal nodes

  19. 5.2 Inferred Ancestral Sequences • Can be derived while constructing the tree. • → No missing link! • How are the samples chosen? The sampling may be biased.

  20. 5.3 Strategies for Faster Searches • The number of different phylogenetic trees grows enormously with the number of sequences. • 10 sequences → about 2 million trees for exhaustive search
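
The figure of about 2 million can be checked with the standard count of unrooted bifurcating trees on n sequences, (2n - 5)!! = 3 × 5 × ... × (2n - 5). The slide does not state this formula, so treat the sketch below as supporting arithmetic; the function name is hypothetical.

```python
def count_unrooted_trees(n):
    """Number of unrooted bifurcating trees on n >= 3 labeled sequences."""
    if n < 3:
        raise ValueError("need at least three sequences")
    total = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n - 4, 2):
        total *= k
    return total


for n in (4, 5, 10):
    print(n, count_unrooted_trees(n))
```

For 10 sequences this gives 2,027,025 trees, which is the "2M" on the slide.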

  21. 5.3.1 Branch and Bound • Proposed by Hendy & Penny in 1982. • L: an upper bound on the cost (for a minimization problem) • obtained from random search or by heuristics (e.g., UPGMA) • Incrementally grow a tree. (branch) • Prune any branch whose cost is already greater than L. (bound)
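
A minimal sketch of the idea, under several assumptions that are not in the slides: sequences are added one at a time, the unweighted intersection/union count on the partial tree serves as the cost (adding a sequence can never lower it, so it is a valid bound), the bound L starts at infinity rather than coming from a heuristic, and trees are nested 2-tuples of sequence indices with sequence 0 fixed at the root. All function names are hypothetical.

```python
def fitch_cost(tree, column):
    """Intersection/union rule: return (nucleotide_set, substitutions)."""
    if not isinstance(tree, tuple):
        return {column[tree]}, 0
    ls, lc = fitch_cost(tree[0], column)
    rs, rc = fitch_cost(tree[1], column)
    common = ls & rs
    return (common, lc + rc) if common else (ls | rs, lc + rc + 1)


def insertions(tree, leaf):
    """Yield every tree obtained by attaching `leaf` above one node of `tree`."""
    yield (tree, leaf)
    if isinstance(tree, tuple):
        left, right = tree
        for sub in insertions(left, leaf):
            yield (sub, right)
        for sub in insertions(right, leaf):
            yield (left, sub)


def branch_and_bound(seqs):
    """Return (cost, tree) of one most-parsimonious tree for aligned seqs."""
    n = len(seqs)
    columns = list(zip(*seqs))
    best = {"cost": float("inf"), "tree": None}  # best["cost"] plays the role of L

    def grow(rest, next_seq):
        # Sequence 0 hangs off the root, so (0, rest) is one unrooted tree.
        cost = sum(fitch_cost((0, rest), col)[1] for col in columns)
        if cost >= best["cost"]:
            return  # bound: this partial tree cannot beat the best so far
        if next_seq == n:
            best["cost"], best["tree"] = cost, (0, rest)
            return
        for candidate in insertions(rest, next_seq):  # branch
            grow(candidate, next_seq + 1)

    grow(1, 2)
    return best["cost"], best["tree"]


print(branch_and_bound(["AA", "AA", "TT", "TT"]))
```

Pruning on `>=` finds a single optimal tree; pruning only on `>` would keep ties and allow all equally parsimonious trees to be collected.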

  22. Properties • complete search • more efficient than exhaustive search • 20 sequences are doable.

  23. 5.3.2 Heuristic Searches • local search • Alternative trees are not all independent of each other. • branch swapping (Fig. 5.5) • Properties • not complete, may miss the optimal solution • fast and efficient • may stop at a local minimum

  24. 5.4 Consensus Trees • Problem • Parsimony approaches may yield more than one tree. • consensus tree • an agreement or a summary of these trees • agree → bifurcation • do not agree → multifurcation
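
One simple way to summarize several trees is a strict consensus: keep only the groupings that appear in every tree, so that any disagreement collapses into a multifurcation. The slides do not say which consensus rule is meant, so this is an illustrative assumption, as are the function names and the choice to treat trees as rooted nested tuples.

```python
def clades(tree):
    """Return the set of leaf groups (frozensets) under each internal node."""
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def strict_consensus_clades(trees):
    """Groups present in every tree; everything else is left unresolved."""
    shared = clades(trees[0])
    for tree in trees[1:]:
        shared &= clades(tree)
    return shared


t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
for group in sorted(strict_consensus_clades([t1, t2]), key=len):
    print(sorted(group))
```

Here the two trees disagree about whether A pairs with B or with C, so the consensus keeps only the groups {A, B, C} and {A, B, C, D}, which corresponds to the multifurcating tree ((A, B, C), D).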

  25. 5.5 Tree Confidence • How much confidence can be attached to the overall tree and its component parts? • How much more likely is one tree to be correct than a particular or randomly chosen alternative tree?

  26. 5.5.1 Bootstrap Tests • (1) Randomly choose columns, with replacement, and combine them into a new alignment of the same size. • (2) Reconstruct the tree for the new sample. • Repeat (1) and (2) many times. • Build a consensus of the sampled trees and compare it with the tested tree.
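
The resampling steps can be sketched as below. The tree-building step is deliberately left as a caller-supplied function, here called `infer_groups` (a hypothetical name), which is assumed to take an alignment and return the set of groupings in the tree it reconstructs; the other names and the default of 1000 replicates are also illustrative assumptions.

```python
import random


def resample_columns(seqs, rng):
    """Draw as many columns as the alignment has, with replacement."""
    n_sites = len(seqs[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in seqs]


def bootstrap_support(seqs, infer_groups, group, replicates=1000, seed=0):
    """Fraction of resampled alignments whose tree contains `group`.

    Assumption: `infer_groups(alignment)` returns a set of groupings,
    for example frozensets of sequence indices.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        if group in infer_groups(resample_columns(seqs, rng)):
            hits += 1
    return hits / replicates


alignment = ["ACGT", "ACGA", "TCGA"]
print(resample_columns(alignment, random.Random(1)))
```

The returned fraction is the bootstrap value usually written on the corresponding branch of the tested tree.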

  27. Caution • Tests based on fewer than several hundred iterations are not reliable. • Bootstrap values tend to underestimate the confidence level at high values and overestimate it at low values. • Some results may appear to be statistically significant by chance, simply because so many groupings are being considered.

  28. Strategy • doing thousands of iterations • using a correction method to adjust for estimation biases • collapsing branches to multi-furcations • What happens if a tree-building algorithm always produces the same tree?

  29. 5.5.2 Parametric Tests (???) • What is the limit of the parsimony principle? • especially for distant sequences • the most parsimonious tree vs. a particular alternative (this can be used to estimate the significance of the built tree)

  30. H. Kishino & M. Hasegawa (1989) • Assume that informative sites within an alignment are both independent and equivalent. • D: the difference in the minimum number of substitutions invoked by the two trees
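
The slide defines only D. One common way to turn D into a test, assumed here rather than taken from the slides, is a paired t-test over the per-site differences, with n - 1 degrees of freedom for n informative sites. The function name and the example costs are hypothetical.

```python
import math


def paired_sites_t(site_costs_a, site_costs_b):
    """t statistic for the per-site cost differences between two trees.

    Assumption: both lists give the minimum substitutions per informative
    site, in the same site order, for tree A and tree B.
    """
    diffs = [a - b for a, b in zip(site_costs_a, site_costs_b)]
    n = len(diffs)
    mean = sum(diffs) / n  # this is D / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("all per-site differences are identical")
    return mean / math.sqrt(variance / n)


print(paired_sites_t([1, 2, 1, 2], [1, 1, 1, 1]))
```

The resulting value would be compared against a t distribution to judge whether the more parsimonious tree is significantly better than the alternative.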

  31. 5.6 Comparison of Phylogenetic Methods • If two different methods construct the same tree, that tree is very likely to be correct.

  32. 5.7 Molecular Phylogenies • Implications • medicine: drug treatment • agriculture: disease resistance factors • conservation: determining which species are extinct

  33. 5.7.1 The Tree of Life • Carl Woese and his colleagues (1970s) • 16S rRNA (all organisms possess)

  34. 5.7.2 Human Origins • mtDNA • The mean difference between two human populations is about 0.33%. • The greatest differences are found within Africa, not across the different continents! • → out-of-Africa theory • mtDNA & Y chromosome are consistent with this hypothesis

  35. They concluded • mitochondrial Eve & Y chromosome Adam • 200,000 years ago
