
A Fully Resolved Consensus Between Fully Resolved Phylogenetic Trees

José Augusto Amgarten Quitzau and João Meidanis. Scylla Bioinformatics, Brazil; University of Campinas, Brazil.


Presentation Transcript


  1. A Fully Resolved Consensus Between Fully Resolved Phylogenetic Trees • José Augusto Amgarten Quitzau, João Meidanis • Scylla Bioinformatics, Brazil; University of Campinas, Brazil

  2. Phylogeny reconstruction methods • Phylogeny reconstruction methods aim at inferring the phylogenetic tree that best describes the evolutionary history for a set of taxa.

  3. Which tree to choose? • “The field of systematics has been in considerable turmoil as various investigators developed different methods of classification and argued their merits. I guarantee you that no one method or view has all the good points.” Walter M. Fitch – 1984

  4. Consensus as tree constructor • Consensus trees have been used traditionally in tree comparison and calculation of bootstrap values • We propose the use of consensus as a tree constructor • It can be efficiently implemented as long as we keep trees fully resolved

  5. Splits • Every edge in a phylogenetic tree divides the leaves into two subgroups. • Each such pair of subgroups is a split of the tree. [Figure: tree with leaves A, B, C, D, E, F, G, H]
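The definition of a split can be made concrete with a small sketch. The edge-list representation, the function name `tree_splits`, and the 5-leaf example tree (with internal nodes `x`, `y`, `z`) are all assumptions made for illustration; none of them come from the slides.

```python
from collections import defaultdict

def tree_splits(edges):
    """Return the set of splits of an unrooted tree given as (node, node) edges."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Leaves are the nodes with exactly one neighbour.
    leaves = frozenset(n for n, nbrs in adjacency.items() if len(nbrs) == 1)

    def leaves_on_side(start, blocked):
        # Collect the leaves reachable from `start` without crossing back to `blocked`.
        found, stack = set(), [(start, blocked)]
        while stack:
            node, parent = stack.pop()
            if node in leaves:
                found.add(node)
            stack.extend((nxt, node) for nxt in adjacency[node] if nxt != parent)
        return frozenset(found)

    splits = set()
    for u, v in edges:
        side = leaves_on_side(u, v)
        # A split is the unordered pair {one side, the remaining leaves}.
        splits.add(frozenset({side, leaves - side}))
    return splits

# Hypothetical 5-leaf tree; x, y and z are internal nodes.
EDGES = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"),
         ("y", "z"), ("D", "z"), ("E", "z")]

def show(split):
    a, b = sorted("".join(sorted(side)) for side in split)
    return f"{a} | {b}"

for line in sorted(show(s) for s in tree_splits(EDGES)):
    print(line)
```

Each of the seven edges of this example yields one split, five of them trivial (a single leaf against the rest) and two of them internal (AB | CDE and ABC | DE).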

  6. Tree weight • Our method relies on weighting trees and taking the one with maximum weight • Let the frequency of a split in a collection of trees be the number of trees that contain the split divided by the total number of trees in the collection • Let the weight of an unrooted phylogenetic tree be the product of the frequencies of its splits
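A minimal sketch of these two definitions, assuming each tree is stored as a set of splits. The helper names (`split`, `tree`, `split_frequencies`, `tree_weight`) and the four-tree collection over leaves A to E are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import Counter
from fractions import Fraction
from math import prod

LEAVES = frozenset("ABCDE")  # hypothetical leaf set

def split(side):
    """Build a split as an unordered pair {side, complement}."""
    side = frozenset(side)
    return frozenset({side, LEAVES - side})

def tree(*internal_sides):
    """A tree as the set of its splits: all trivial splits plus the given internal ones."""
    trivial = {split(leaf) for leaf in LEAVES}
    return frozenset(trivial | {split(s) for s in internal_sides})

def split_frequencies(trees):
    """Frequency of a split = number of trees containing it / number of trees."""
    counts = Counter(s for t in trees for s in t)
    return {s: Fraction(c, len(trees)) for s, c in counts.items()}

def tree_weight(t, freq):
    """Weight of a tree = product of the frequencies of its splits."""
    return prod(freq.get(s, Fraction(0)) for s in t)

T1 = tree("AB", "DE")
T2 = tree("AB", "CD")
T4 = tree("AC", "DE")
COLLECTION = [T1, T2, T1, T4]  # T1 occurs twice in the collection

FREQ = split_frequencies(COLLECTION)
print(tree_weight(T1, FREQ))  # 3/4 * 3/4 = 9/16
```

Trivial splits occur in every tree, so their frequency is 1 and they do not change the product. A split that never occurs in the collection gets frequency 0 here, which forces the weight of any tree containing it to 0.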

  7. Most probable tree • A most probable tree for a collection of fully resolved phylogenetic trees is a tree that maximizes the weight:
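The formula on this slide did not survive in the transcript. Following the definitions on the Tree weight slide, it can plausibly be written as below. The notation is an assumption made here: $T_1, \dots, T_t$ are the input trees, $\Sigma(T)$ is the set of splits of a tree $T$, $f(s)$ is the frequency of a split $s$, and $w(T)$ is the weight of $T$.

```latex
f(s) = \frac{\bigl|\{\, i : s \in \Sigma(T_i) \,\}\bigr|}{t},
\qquad
w(T) = \prod_{s \in \Sigma(T)} f(s),
\qquad
T^{*} \in \arg\max_{T} \; w(T)
```

Here the maximum is taken over fully resolved trees on the same leaf set as the input trees.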

  8. Example

  9. Solution w = 0.0703125

  10. Running time • The tree weight formula can be written as a product of the frequencies of the small subgroups • We designed an algorithm that finds all most probable trees for a given set of fully resolved phylogenetic trees • The complexity of the algorithm is $O(l^3 t^2 \log(lt))$, where $l$ is the number of leaves and $t$ is the number of trees

  11. Experiments • Data sets used to test the new method: • Synthetic data: from Gascuel’s LIRMM site • K2P – Kimura 2 Parameter, no MC • K2Pm – Kimura 2 Parameter, with MC • COV – Covarion model, no MC • COVm – Covarion model, with MC • Real data: Ribosomal RNA

  12. Experiments • Programs used to test the new method (19):

  13. Most probable = Median

  14. Reflects general tendency

  15. Results: average split distance • The consensus consistently yields the minimum average split distance
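The slides do not define the split distance. The sketch below assumes the common symmetric-difference count (the number of splits present in exactly one of the two trees) and averages it over a collection. The function names and the small 5-leaf collection are hypothetical.

```python
def make_split(side, leaves="ABCDE"):
    """Build a split as an unordered pair {side, complement} over a hypothetical leaf set."""
    side = frozenset(side)
    return frozenset({side, frozenset(leaves) - side})

def split_distance(tree_a, tree_b):
    """Assumed definition: number of splits found in exactly one of the two trees."""
    return len(tree_a ^ tree_b)

def average_split_distance(candidate, trees):
    """Mean split distance from a candidate tree to every tree in the collection."""
    return sum(split_distance(candidate, t) for t in trees) / len(trees)

# Trees are given by their internal splits only; trivial splits are shared
# by all trees on the same leaves, so they never contribute to the distance.
T1 = frozenset({make_split("AB"), make_split("DE")})
T2 = frozenset({make_split("AB"), make_split("CD")})
T4 = frozenset({make_split("AC"), make_split("DE")})
COLLECTION = [T1, T2, T1, T4]

for name, candidate in [("T1", T1), ("T2", T2), ("T4", T4)]:
    print(name, average_split_distance(candidate, COLLECTION))
```

In this toy collection T1 has the smallest average distance (1.0, against 2.0 for T2 and T4), which is the kind of comparison the slide reports for the consensus tree.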

  16. May result in better tree

  17. Results: distance to “real” tree • The consensus is consistently no worse than the majority of the input trees

  18. Theoretical foundations [Figure: tree with leaves A, B, C, D, E, F, G, H]

  19. All splits of a tree • A | BCDEFGH • B | ACDEFGH • H | ABCDEFG • AB | CDEFGH • ABCD | EFGH • EFG | ABCDH • CD | ABEFGH • G | ABCDEFH • C | ABDEFGH • EF | ABCDGH • D | ABCEFGH • F | ABCDEGH • E | ABCDFGH

  20. Small subgroup of each split (the left-hand side of each pair) • A | BCDEFGH • B | ACDEFGH • H | ABCDEFG • AB | CDEFGH • ABCD | EFGH • EFG | ABCDH • CD | ABEFGH • G | ABCDEFH • C | ABDEFGH • EF | ABCDGH • D | ABCEFGH • F | ABCDEGH • E | ABCDFGH
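Picking the small subgroup of a split can be sketched as follows, using the thirteen splits of the example tree. The function name `small_subgroup` is hypothetical, and the tie-break for equal-sized sides (alphabetical order, which selects ABCD over EFGH as on the slides) is an assumption; the slides do not say how ties are resolved.

```python
SPLITS = [
    "A | BCDEFGH", "B | ACDEFGH", "H | ABCDEFG", "AB | CDEFGH",
    "ABCD | EFGH", "EFG | ABCDH", "CD | ABEFGH", "G | ABCDEFH",
    "C | ABDEFGH", "EF | ABCDGH", "D | ABCEFGH", "F | ABCDEGH",
    "E | ABCDFGH",
]

def small_subgroup(split_text):
    """Return the smaller side of a split written as 'X | Y'.

    Ties are broken alphabetically (an assumption, not taken from the slides).
    """
    sides = ["".join(sorted(part.strip())) for part in split_text.split("|")]
    return min(sides, key=lambda s: (len(s), s))

SMALL = [small_subgroup(s) for s in SPLITS]
print(" ".join(SMALL))
```

Because each split is identified by its small subgroup, a product over splits can equally be indexed by small subgroups, which is the rewriting mentioned on the Running time slide.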

  21. Small subgroups • A, B, H, AB, ABCD, EFG, CD, G, C, EF, D, F, E

  22. Maximal clusters (n-trees) • A, B, H, AB, ABCD, EFG, CD, G, C, EF, D, F, E

  23. Fundamental theoretical result • The small subgroup set of a phylogenetic tree is always a finite set of n-trees • There are exactly three n-trees in this set, and all n-trees are maximal if and only if the phylogenetic tree is fully resolved [Figure: three n-trees with top clusters ABCD (containing AB, CD, A, B, C, D), H, and EFG (containing EF, G, E, F)]
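This result can be checked on the example tree with a short sketch. It makes two assumptions that the slides do not state: an n-tree is taken to be a nested family of clusters under one top cluster, and "maximal" is read as "binary", so that an n-tree over k leaves has 2k - 1 clusters. The function names are hypothetical.

```python
SMALL_SUBGROUPS = ["A", "B", "H", "AB", "ABCD", "EFG", "CD",
                   "G", "C", "EF", "D", "F", "E"]
CLUSTERS = [frozenset(c) for c in SMALL_SUBGROUPS]

def is_laminar(clusters):
    """True if any two clusters are either nested or disjoint."""
    return all(a <= b or b <= a or a.isdisjoint(b)
               for a in clusters for b in clusters)

def group_by_maximal_cluster(clusters):
    """Group clusters under the maximal clusters (those contained in no other cluster)."""
    roots = [c for c in clusters if not any(c < other for other in clusters)]
    return {
        "".join(sorted(root)): sorted("".join(sorted(c)) for c in clusters if c <= root)
        for root in roots
    }

GROUPS = group_by_maximal_cluster(CLUSTERS)
print(is_laminar(CLUSTERS))
for root, members in sorted(GROUPS.items()):
    # Under the "binary" reading, a group over k leaves should have 2k - 1 clusters.
    print(root, members, len(members) == 2 * len(root) - 1)
```

For the example tree this yields three groups, with top clusters ABCD, EFG and H, and each group has the 2k - 1 clusters expected of a binary hierarchy.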

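  24. Implementation details [Figure: clusters E, F, G, EF, GH, D, ABC]

  25. Dynamic programming [Figure: clusters E, F, G, EF, GH, D, ABC; marker a]

  26. Dynamic programming [Figure: clusters E, F, G, EF, GH, D, ABC; markers a, b]

  27. Dynamic programming [Figure: clusters E, F, G, EF, GH, D, ABC; markers a, b]

  28. Implementation details [Figure: clusters D, E, F, G, DE, EF, GH, ABC, FGH, DEF; markers a, b; label "L \ba"]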

  29. Implementation details

  30. To Do List • Rooted trees • Polytomies • Non-uniform weights for input trees

  31. Acknowledgments • Scylla Bioinformatics and Institute of Computing, Unicamp, for machine time, infrastructure, and support • Brazilian Research Financing Agency CNPq, grant 470420/2004-9
