Chapter 5 Character-Based Methods of Phylogenetics. Department of Computer Science and Information Engineering, National Chi Nan University (暨南大學資訊工程學系), 黃光璿 (HUANG, Guan-Shieng), 2004/04/05.
5.1 Parsimony • Mutations are exceedingly rare events. • The more unlikely events a model invokes, the less likely the model is to be correct. • The explanation that requires the fewest mutations to account for the observed states is the most likely to be correct.
Ockham's Razor • The philosophical rule that entities should not be multiplied unnecessarily.
5.1.1 Informative and Uninformative Sites • informative sites: carry information that helps to construct a tree • uninformative sites: carry no information in the sense of the parsimony principle.
To be informative, a position must have • at least two different nucleotides, • each of which is present at least twice.
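The two conditions above can be checked mechanically for each column of an alignment. The following is a minimal sketch, assuming a gap-free alignment given as equal-length strings; the function names `is_informative` and `informative_sites` and the toy alignment are illustrative and do not come from the slides.

```python
from collections import Counter

def is_informative(column):
    """Return True if an alignment column is parsimony-informative."""
    counts = Counter(column)
    # Keep only the nucleotides that occur in at least two sequences.
    shared = [base for base, n in counts.items() if n >= 2]
    # Informative: at least two such nucleotides exist.
    return len(shared) >= 2

def informative_sites(alignment):
    """Return the 0-based indices of the informative columns."""
    columns = zip(*alignment)
    return [i for i, col in enumerate(columns) if is_informative(col)]

# Toy alignment of four sequences (made up for illustration).
alignment = [
    "AAGT",
    "AAGC",
    "ATCA",
    "ATCG",
]
print(informative_sites(alignment))  # [1, 2]
```

Column 0 is invariant and column 3 has four different nucleotides that each occur once, so neither helps to choose between trees.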
informative sites • synapomorphy: supports the internal branches (true) • homoplasy: acquired as a result of parallel evolution or convergence (false) • Example: eyes in humans, flies, and mollusks
5.1.2 Unweighted Parsimony • Every possible tree is considered individually for each informative site. • The tree with the minimum overall cost is reported.
There are two problems: • The number of alternative unrooted trees increases dramatically with the number of sequences. • Calculating the number of substitutions invoked by each alternative tree is difficult.
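The second problem can be solved by assigning a set of candidate nucleotides to each internal node, working from the leaves toward the root: • intersection: if the intersection of the sets of its two children is not empty, the node takes that intersection • union: if it is empty, the node takes the union of the two sets. • The number of unions is the minimum number of substitutions. • For an uninformative site, the minimum is the number of different nucleotides minus one.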
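The intersection/union rule above is commonly known as Fitch's algorithm. A minimal sketch for a single site follows, assuming the tree is written as nested pairs of leaf names and rooted at an arbitrary point (the root placement does not change the count). The function name `fitch`, the tree encoding, and the example data are assumptions made for illustration.

```python
def fitch(tree, states):
    """Return (candidate_set, min_substitutions) for one site.

    tree:   a leaf name (str) or a pair (left, right) of subtrees.
    states: dict mapping each leaf name to its nucleotide at this site.
    """
    if isinstance(tree, str):
        # A leaf contributes its observed nucleotide and no substitutions.
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Non-empty intersection: no extra substitution is needed here.
        return common, left_cost + right_cost
    # Empty intersection: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1

site = {"s1": "A", "s2": "A", "s3": "G", "s4": "G"}
tree_a = (("s1", "s2"), ("s3", "s4"))
tree_b = (("s1", "s3"), ("s2", "s4"))
print(fitch(tree_a, site)[1])  # 1
print(fitch(tree_b, site)[1])  # 2
```

For this informative site, the tree that groups s1 with s2 needs one substitution and the alternative needs two, so parsimony prefers the first. Summing the count over all informative sites gives the overall cost of a tree.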
5.1.4 Weighted Parsimony • Not all mutations are equivalent. • Some sequences (e.g., non-coding sequences) are more prone to indels than others. • Functional importance differs from gene to gene. • Subtle substitution biases usually vary between genes and between species. • Weights (scoring matrices) can be added to reflect these differences.
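One way to use such weights is a dynamic program over the tree, often called Sankoff's algorithm, which the slides do not name. The sketch below assumes a hypothetical scoring scheme (0 for a match, 1 for a transition, 2 for a transversion) and the same nested-pair tree encoding as before; the weights, function names, and data are illustrative assumptions only.

```python
BASES = "ACGT"
PURINES = {"A", "G"}

def cost(a, b):
    """Hypothetical weights: 0 match, 1 transition, 2 transversion."""
    if a == b:
        return 0
    same_class = (a in PURINES) == (b in PURINES)
    return 1 if same_class else 2

def sankoff(tree, states):
    """Map each base to the minimum weighted cost of the subtree
    when the subtree's root is assigned that base."""
    if isinstance(tree, str):
        # A leaf is free for its observed base and impossible otherwise.
        return {b: (0 if b == states[tree] else float("inf")) for b in BASES}
    left = sankoff(tree[0], states)
    right = sankoff(tree[1], states)
    return {
        b: min(cost(b, x) + left[x] for x in BASES)
           + min(cost(b, y) + right[y] for y in BASES)
        for b in BASES
    }

def weighted_score(tree, states):
    """Minimum weighted cost of one site over all root assignments."""
    return min(sankoff(tree, states).values())

site = {"s1": "A", "s2": "G", "s3": "C", "s4": "C"}
tree = (("s1", "s2"), ("s3", "s4"))
print(weighted_score(tree, site))  # 3
```

Here the cheapest explanation is one transition between A and G (cost 1) plus one transversion to C (cost 2). With all off-diagonal weights set to 1, the score reduces to the unweighted substitution count.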
5.2 Inferred Ancestral Sequences • Can be derived while constructing the tree. • No missing link! • How should the samples be chosen? The sampling may be biased.
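As a sketch of how ancestral states can fall out of tree construction, the candidate sets from the intersection/union pass can be followed by a top-down pass that picks one nucleotide per internal node. The version below is an assumption-laden illustration for a single site: the names `fitch_sets` and `assign` are hypothetical, ties are broken arbitrarily with `min`, and only one of possibly several equally parsimonious reconstructions is returned.

```python
def fitch_sets(tree, states):
    """Bottom-up pass: return (candidate_set, left_node, right_node)."""
    if isinstance(tree, str):
        return ({states[tree]}, None, None)
    left = fitch_sets(tree[0], states)
    right = fitch_sets(tree[1], states)
    common = left[0] & right[0]
    # Intersection if non-empty, otherwise union.
    return (common or (left[0] | right[0]), left, right)

def assign(node, parent_state=None):
    """Top-down pass: choose one state per node, keeping the parent's
    state whenever it is among the node's candidates."""
    candidates, left, right = node
    if parent_state in candidates:
        state = parent_state
    else:
        state = min(candidates)  # arbitrary tie-break (an assumption)
    if left is None:
        return state  # leaf: its observed nucleotide
    return (state, assign(left, state), assign(right, state))

site = {"s1": "A", "s2": "A", "s3": "A", "s4": "G"}
tree = (("s1", "s2"), ("s3", "s4"))
# Each internal node is printed as (inferred state, left subtree, right subtree).
print(assign(fitch_sets(tree, site)))
# ('A', ('A', 'A', 'A'), ('A', 'A', 'G'))
```

Repeating this for every site and joining the chosen states gives an inferred sequence for each internal node.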
5.3 Strategies for Faster Searches • The number of different phylogenetic trees grows enormously with the number of sequences. • 10 sequences: about 2 million (2M) trees for an exhaustive search
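The figure of about 2 million can be reproduced from the standard count of unrooted bifurcating trees on n labeled sequences, (2n-5)!! = 1 × 3 × 5 × ... × (2n-5). The sketch below assumes that this is the count the slide refers to; the function name `num_unrooted_trees` is illustrative.

```python
def num_unrooted_trees(n):
    """Number of unrooted bifurcating trees on n labeled sequences,
    computed as the double factorial (2n - 5)!! for n >= 3."""
    if n < 3:
        raise ValueError("need at least 3 sequences")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
# 10 sequences give 2027025 trees
```

Each added sequence multiplies the count by the next odd number, which is why exhaustive search quickly becomes impractical.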
References and image sources • Dan E. Krane and Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin/Cummings, 2003. • R. Durbin, S. Eddy, A. Krogh, G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998. • Sylvia S. Mader, Biology, 8th edition, McGraw-Hill, 2003.