Predicting Evolutionary Ancestors Based on Current Data
Methods of Prediction • Three major types: • Maximum likelihood • Parsimony • Distance matrix
Maximum Likelihood • The scientist provides a model specifying the probability of one state changing to a different state • All state changes in the tree are combined to produce the likelihood that this tree is the actual evolutionary history • Uses a great deal of floating-point math, so it is very slow
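A minimal sketch of combining state changes into a tree likelihood. It assumes every node already has a known state, so the likelihood is simply the product of the per-edge probabilities; real maximum likelihood programs also sum over unknown internal states and account for branch lengths. The two-state model and its probabilities in `TRANSITION` are made up for illustration.

```python
import math

# Hypothetical model: probability of the parent state becoming the child
# state along one edge. These numbers are illustrative, not measured.
TRANSITION = {
    ("uni", "uni"): 0.90, ("uni", "multi"): 0.10,
    ("multi", "multi"): 0.95, ("multi", "uni"): 0.05,
}

def log_likelihood(edges):
    """Sum the log-probability of every (parent_state, child_state) edge."""
    return sum(math.log(TRANSITION[edge]) for edge in edges)

# One edge per branch of a small, fully labelled tree.
edges = [("uni", "uni"), ("uni", "multi"), ("multi", "multi"), ("multi", "multi")]
likelihood = math.exp(log_likelihood(edges))
```

Working in log space avoids underflow when many small probabilities are multiplied, which is one reason the method is heavy on floating-point math.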
Parsimony • Each node on the tree represents a split in the development of traits • Trees are evaluated based on the flow of traits across descendants • It is possible for different branches to converge on a trait independently • Traits can revert back to previous states
Example of Parsimony • [Figure: tree rooted at a single-celled ancestor with leaves Amoeba, Human, and Sponge; branch labels: Multicellular, Bilateral Symmetry, Hasn't Evolved]
Distance Matrix • General case of the parsimony tree • Create a tree connecting weighted nodes such that the tree uses the minimum possible total branch length • Useful in: • Predicting evolution • Planning delivery routes • Laying out wires/pipes between cities
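One way to illustrate "minimum possible length of branches" is a minimum spanning tree built with Prim's algorithm, sketched below. This is a swapped-in technique chosen for illustration: it connects the given nodes directly, whereas phylogenetic distance methods typically also infer internal ancestor nodes. The distance values are invented.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns a list of (from_node, to_node, length) edges.
    """
    n = len(dist)
    in_tree = {0}          # start from node 0
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that links the tree to a node not yet in it.
        length, i, j = min(
            (dist[i][j], i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j, length))
        in_tree.add(j)
    return edges

# Hypothetical pairwise distances between three nodes.
dist = [
    [0, 2, 6],
    [2, 0, 3],
    [6, 3, 0],
]
mst = minimum_spanning_tree(dist)
total_length = sum(length for _, _, length in mst)
```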
Assigning States to Parent Nodes • [Figure: tree with leaves Amoeba, Human, and Sponge; node labels: Unicellular Asymmetric, Multicellular Asymmetric, Multicellular Bilateral Symmetry]
Program Design • Uses a heuristic similar to that of FastDNAml • Doesn’t explore the whole tree space • Evaluates parsimony (the differences between the input data) instead of maximum likelihood • Master/Slave processing model
Master • Creates initial 2-leaf tree • Generates list of all possible trees with the original 2-leaf base plus a new leaf • Communicates with slaves to evaluate trees • Uses best response as the base to generate the next generation of topologies • Loop until all leaves are used
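The master's loop can be sketched as a greedy stepwise addition. This sketch assumes rooted binary trees stored as nested tuples, and it takes the rating as a plain function argument (`rate`, a hypothetical name) rather than farming the work out to slave processes.

```python
def add_leaf(tree, leaf):
    """Yield every tree made by attaching `leaf` above one subtree of `tree`."""
    yield (tree, leaf)                      # attach above this subtree
    if isinstance(tree, tuple):             # internal node: recurse into children
        left, right = tree
        for new_left in add_leaf(left, leaf):
            yield (new_left, right)
        for new_right in add_leaf(right, leaf):
            yield (left, new_right)

def stepwise_addition(base, remaining, rate):
    """Add leaves one at a time, keeping the lowest-rated tree each round."""
    for leaf in remaining:
        base = min(add_leaf(base, leaf), key=rate)
    return base

# All topologies reachable from the 2-leaf base by adding one leaf.
candidates = list(add_leaf(("Human", "Sponge"), "Amoeba"))
```

Because only the best tree of each round is kept, the search visits a small part of the tree space, which matches the "doesn't explore whole tree space" design but can miss the globally best tree.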
Slave • Waits for trees from the master • Processes each tree using some algorithm • Returns the tree's rating to the master • Loops
Tree Processing • Creating the first tree • Communicating tree topologies between processors • Rating tree topologies
Creating the first tree • Master processor compares every input string to every other input string • Strings are evaluated based on common characters at each array position • Master processor builds the first 2-leaf tree from the two closest input strings
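A minimal sketch of this step, assuming the input is a dictionary of equal-length strings keyed by organism name. The sequences below are invented, and `similarity` and `closest_pair` are illustrative names.

```python
from itertools import combinations

def similarity(a, b):
    """Count positions where both strings hold the same character."""
    return sum(x == y for x, y in zip(a, b))

def closest_pair(sequences):
    """Return the two names whose strings share the most characters."""
    return max(
        combinations(sequences, 2),
        key=lambda pair: similarity(sequences[pair[0]], sequences[pair[1]]),
    )

# Hypothetical input data.
sequences = {
    "Human": "ACGTAC",
    "Sponge": "ACGTTC",
    "Amoeba": "TCGATG",
}
first_tree = closest_pair(sequences)   # the 2-leaf base tree
```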
Tree Communication • The pointer representation of a tree is translated to a Lisp-like parenthesized string • Example: ((Human,Sponge),Amoeba) • The slave processor parses this parenthesized string to reproduce the original tree
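The round trip might look like the sketch below, which uses nested tuples in place of a pointer-based tree. It assumes well-formed input with no whitespace and leaf names that contain no commas or parentheses.

```python
def to_string(tree):
    """Serialize a nested-tuple tree to a parenthesized string."""
    if isinstance(tree, tuple):
        return "(" + ",".join(to_string(child) for child in tree) + ")"
    return tree

def parse(text):
    """Rebuild the nested-tuple tree from its parenthesized string."""
    def node(pos):
        if text[pos] == "(":
            children = []
            pos += 1                          # skip "("
            while True:
                child, pos = node(pos)
                children.append(child)
                if text[pos] == ",":
                    pos += 1                  # another child follows
                else:
                    return tuple(children), pos + 1   # skip ")"
        end = pos
        while end < len(text) and text[end] not in ",()":
            end += 1                          # scan to the end of the leaf name
        return text[pos:end], end

    tree, _ = node(0)
    return tree

message = to_string((("Human", "Sponge"), "Amoeba"))
rebuilt = parse(message)
```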
Rating Trees • The slave only has sequence information for the leaves • The slave recursively creates a description for every internal node of the tree based on the sequences of its child nodes • The slave then recurses through the tree a second time and counts the number of differences across each edge • The tree rating is the total number of differences across all edges
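The slides do not say how the internal descriptions are built, so the sketch below assumes a Fitch-style rule: per position, take the intersection of the children's candidate characters, or the union when they share none. The trait encoding (U/M for unicellular or multicellular, A/B for asymmetric or bilateral) is also an assumption, loosely based on the earlier example.

```python
def rate_tree(tree, sequences):
    """Return the total number of differences across all edges of `tree`."""

    def state_sets(node):
        # Pass 1 (bottom-up): candidate characters for each position.
        if not isinstance(node, tuple):
            return {"sets": [{c} for c in sequences[node]], "children": []}
        left, right = (state_sets(child) for child in node)
        sets = [a & b or a | b for a, b in zip(left["sets"], right["sets"])]
        return {"sets": sets, "children": [left, right]}

    def count(node, parent_seq):
        # Pass 2 (top-down): fix one character per position, preferring the
        # parent's choice, then count mismatches on the edge to the parent.
        seq = [p if p in s else min(s) for p, s in zip(parent_seq, node["sets"])]
        diffs = sum(p != c for p, c in zip(parent_seq, seq))
        return diffs + sum(count(child, seq) for child in node["children"])

    root = state_sets(tree)
    root_seq = [min(s) for s in root["sets"]]   # arbitrary but deterministic
    return sum(count(child, root_seq) for child in root["children"])

# Hypothetical encoding: position 0 is cellularity, position 1 is symmetry.
traits = {"Amoeba": "UA", "Sponge": "MA", "Human": "MB"}
rating = rate_tree((("Human", "Sponge"), "Amoeba"), traits)
```

A lower rating means fewer trait changes are needed to explain the leaves, so the master would prefer the topology with the smallest value.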