
The Algorithm for Constructing Phylogenetic Tree

The Algorithm for Constructing Phylogenetic Tree, by MYZ. What is a phylogenetic tree? A phylogenetic tree expresses the evolutionary relationships among species that descend from a common ancestor. Example species: siamang, Hylobatidae (gibbons), orangutan, human, chimpanzee.

myra-boone

Presentation Transcript


  1. The Algorithm for Constructing Phylogenetic Tree ---by MYZ

  2. What is a phylogenetic tree? A phylogenetic tree expresses the evolutionary relationships among species that descend from a common ancestor. [Figure: The Evolutionary Tree for Some Primates, with leaves siamang, Hylobatidae (gibbons), orangutan, human, and chimpanzee.] Generally speaking, a phylogenetic tree is a binary tree.

  3. The value of the research: 1) infer evolutionary history; 2) estimate the evolutionary times of the existing species; 3) use molecular information to compensate for gaps in the fossil record.

  4. The common methods: 1) maximum parsimony; 2) maximum likelihood; 3) distance matrix.

  5. The framework of this presentation: an introduction to maximum parsimony; an introduction to maximum likelihood, leading to how a heuristic algorithm can cooperate with maximum likelihood; a detailed explanation of the distance matrix method; transforming the problem into the TSP, so that heuristic and approximation algorithms can be used to construct the phylogenetic tree.

  6. Maximum parsimony. Basic principle: construct the phylogenetic tree that requires the minimum number of substitutions (nucleotide or amino acid). Example sequences: a: GAAATTGC; b: GAACTTGT; c: GCACTTGT; d: GCCCTTGT; e: GCCATTGT. [Figure: tree over a to e whose internal nodes are labelled GAACTTGT, GCACTTGT, and GCCCTTGT, with branches labelled 1.]
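The parsimony principle can be made concrete by scoring candidate trees on the five example sequences a to e. The sketch below uses Fitch's small-parsimony algorithm, a standard technique that the slides do not name; the two topologies `tree_1` and `tree_2` are hypothetical and chosen only for illustration.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, substitution count) for one site.

    `tree` is either a leaf name or a (left, right) tuple of subtrees.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: one substitution is needed on this split.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Sum the per-site Fitch costs over all alignment columns."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(length)
    )


# Sequences from the slide.
seqs = {
    "a": "GAAATTGC",
    "b": "GAACTTGT",
    "c": "GCACTTGT",
    "d": "GCCCTTGT",
    "e": "GCCATTGT",
}

# Hypothetical candidate topologies (not taken from the slides).
tree_1 = (("a", "b"), ("c", ("d", "e")))
tree_2 = (("a", "e"), ("c", ("b", "d")))

print(parsimony_score(tree_1, seqs))  # 5
print(parsimony_score(tree_2, seqs))  # 6
```

Under this scoring, `tree_1` needs fewer substitutions than `tree_2`, so maximum parsimony would prefer it among these two candidates.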

  7. Maximum likelihood. Basic principle: compute the probability of a particular set of sequences on a given tree, and maximize this probability over all trees. Input: a set of sequences and a given candidate tree. Output: the likelihood value of the tree. Target: the tree structure with the maximum likelihood value. [Figure: example tree with root 0, internal nodes 6, 7, 8, leaves 1 to 5, and branch lengths t1 to t8.]

  8. How to compute the likelihood value. The probability of a given set of data arising on a given tree can be computed site by site, giving per-site values L(T | S1), L(T | S2), …, L(T | Sn); if the sites are assumed to evolve independently, the likelihood of the whole data set is the product of these per-site values. Example sequences: 1: ATCGGGTGTGTGCAGTGCTG; 2: ATGCCTTGTGTGCAGTGCTG; 3: ATGCCTTACTGTGCAGTGCT; 4: GTCAAATCGTGATCGATAGCT; 5: ATGCTAGTTGCTAGCATAGAT.

  9. The per-site likelihood L(T | S[i]). [Formula not preserved in this transcript.] In the formula, i and j correspond to the four bases A, T, G, C; the transition term is the probability that a lineage which is initially in state i will be in state j after t units of time have elapsed; the remaining term is the prior probability.

  10. The per-site likelihood L(T | S[i]), continued. In the formula, x0, x6, x7, and x8 are unknown variables (the states at the internal nodes). The expression has 256 terms; in general, a tree with n leaves has n-1 internal nodes, so the expression has 4^(n-1) terms.

  11. The per-site likelihood L(T | S[i]), continued. [Formula not preserved in this transcript.] Notice that the pattern of parentheses in the rearranged formula corresponds exactly to the topology of the tree.
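A small sketch may help connect the 256-term sum to the nested-parentheses form. It computes one site likelihood twice: by brute force over all states of the internal nodes 0, 6, 7, 8, and by evaluating the nested form bottom-up (often called Felsenstein's pruning algorithm, a name the slides do not use). Everything specific here is an assumption for illustration: the Jukes-Cantor transition probabilities, the uniform prior of 1/4, the branch lengths, and the topology ((1,2),(3,(4,5))) guessed from the figure labels.

```python
import itertools
import math

BASES = "ACGT"


def jc_prob(i, j, t):
    """Jukes-Cantor probability of going from base i to base j in time t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


# parent -> [(child, branch length)]; topology and lengths are assumed.
TREE = {
    0: [(6, 0.1), (8, 0.2)],
    6: [(1, 0.1), (2, 0.1)],
    8: [(3, 0.3), (7, 0.1)],
    7: [(4, 0.1), (5, 0.1)],
}
INTERNAL = [0, 6, 7, 8]


def site_likelihood_bruteforce(leaf_states):
    """Sum over every assignment of bases to the internal nodes."""
    total, terms = 0.0, 0
    for combo in itertools.product(BASES, repeat=len(INTERNAL)):
        state = dict(leaf_states)
        state.update(zip(INTERNAL, combo))
        term = 0.25  # uniform prior on the root state
        for parent, children in TREE.items():
            for child, t in children:
                term *= jc_prob(state[parent], state[child], t)
        total += term
        terms += 1
    return total, terms


def conditional(node, leaf_states):
    """Likelihood of the data below `node`, for each possible base at `node`."""
    if node in leaf_states:
        return {b: 1.0 if b == leaf_states[node] else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, t in TREE[node]:
        below = conditional(child, leaf_states)
        for b in BASES:
            result[b] *= sum(jc_prob(b, c, t) * below[c] for c in BASES)
    return result


def site_likelihood_pruning(leaf_states):
    """Evaluate the nested form: combine subtrees from the leaves upward."""
    root = conditional(0, leaf_states)
    return sum(0.25 * root[b] for b in BASES)


leaves = {1: "A", 2: "A", 3: "A", 4: "G", 5: "A"}  # first column of the slide's sequences
brute, n_terms = site_likelihood_bruteforce(leaves)
print(n_terms)  # 256
print(math.isclose(brute, site_likelihood_pruning(leaves)))  # True
```

Both routes give the same value, but the nested evaluation does work proportional to the number of nodes instead of 4^(n-1).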

  12. Heuristic algorithm. L(T | S) is used as the fitness function, the structure of the tree is the solution, and the target is a tree structure with the maximum likelihood value. Flowchart: initialize; record the maximum value; generate neighbour solutions; compute L(T | S); if the stopping requirement is not met, repeat; if it is met, output. The number of possible tree structures is [formula not preserved in this transcript].
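To give a feel for why a heuristic search is used instead of enumerating every tree, the sketch below evaluates the standard counting results for binary trees: (2n-5)!! unrooted and (2n-3)!! rooted topologies on n labelled leaves. These are textbook formulas and may or may not be the expression shown on the slide, which was not preserved; the function names are hypothetical.

```python
def double_factorial(k):
    """Return k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_binary_trees(n_leaves):
    """Number of unrooted binary topologies on n labelled leaves (n >= 3)."""
    return double_factorial(2 * n_leaves - 5)


def count_rooted_binary_trees(n_leaves):
    """Number of rooted binary topologies on n labelled leaves (n >= 2)."""
    return double_factorial(2 * n_leaves - 3)


for n in (5, 10):
    print(n, count_unrooted_binary_trees(n), count_rooted_binary_trees(n))
# 5 15 105
# 10 2027025 34459425
```

Already at ten species there are millions of candidate topologies, which is why the search only visits neighbouring solutions.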

  13. Distance matrix: neighbour joining. Neighbour joining seeks to build a tree which minimizes the sum of all branch lengths. [Figure: trees over nodes 1 to 8 with internal nodes X and Y.]

  14. Step 1: obtain a table of distances between each pair of sequences, using for example the Jukes-Cantor one-parameter model or the Kimura two-parameter model. [Table: distance matrix of five sequences; values not preserved in this transcript.]
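As an illustration of step 1, the following sketch computes pairwise distances with the usual closed forms of the two models: d = -3/4 ln(1 - 4p/3) for Jukes-Cantor, and d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)) for Kimura's two-parameter model, where p is the fraction of differing sites, P the fraction of transitions, and Q the fraction of transversions. It assumes equal-length, gap-free sequences that are not too divergent (otherwise the logarithm is undefined); the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}


def jukes_cantor_distance(s1, s2):
    """Jukes-Cantor distance from the fraction p of differing sites."""
    p = sum(a != b for a, b in zip(s1, s2)) / len(s1)
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p_distance(s1, s2):
    """Kimura two-parameter distance from transition/transversion fractions."""
    diffs = [(a, b) for a, b in zip(s1, s2) if a != b]
    # A transition stays within purines (A, G) or within pyrimidines (C, T).
    transitions = sum((a in PURINES) == (b in PURINES) for a, b in diffs)
    transversions = len(diffs) - transitions
    p = transitions / len(s1)
    q = transversions / len(s1)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def distance_table(seqs, distance):
    """Build a symmetric dict-of-dicts distance table for named sequences."""
    names = list(seqs)
    return {
        a: {b: distance(seqs[a], seqs[b]) for b in names if b != a}
        for a in names
    }


example = {"b": "GAACTTGT", "c": "GCACTTGT", "d": "GCCCTTGT"}
table = distance_table(example, jukes_cantor_distance)
print(round(table["b"]["c"], 4), round(table["b"]["d"], 4))
```

The resulting table can be fed to any distance-based method, including the neighbour-joining steps that follow.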

  15. Step 2: select the minimum distance and merge the nodes. In the distance matrix of five sequences, nodes 1 and 2 are selected as a branch. Create a new node 6 as the parent of 1 and 2, add 6 to the structure, and compute the distance from 6 to each of the other nodes. [Figure: tree with nodes 1 to 6.]

  16. Step 3: the distance from the new node to the remaining nodes. If we select two nodes i and j with the minimum distance and create a new node x as their parent, we compute the distance from x to each of the other nodes with a formula [not preserved in this transcript], in which z stands for all nodes except i and j. We should also set the distances from i and j to x as the lengths of the new branches. [Figure: tree with nodes 1 to 6.]

  17. Use the rate-corrected distance (1). [Formula not preserved in this transcript.]

  18. Use the rate-corrected distance (2). [Tables: the distance matrix and the corresponding rate-corrected distances; values not preserved in this transcript.]
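For reference, one commonly used formulation of the neighbour-joining quantities is written out below. The slide formulas themselves were not preserved, so this is an assumed reconstruction in the symbols the transcript uses (A_i, M_ij, D_xk); the slide's own definitions may differ, for example by a constant factor in A_i. N is the current number of nodes in the table, and L_ix, L_jx are names introduced here for the new branch lengths.

```latex
\begin{align}
A_i    &= \sum_{k \neq i} D_{ik}
        && \text{(total distance from node } i\text{)} \\
M_{ij} &= D_{ij} - \frac{A_i + A_j}{N - 2}
        && \text{(rate-corrected distance; join the pair with minimum } M_{ij}\text{)} \\
D_{xk} &= \frac{D_{ik} + D_{jk} - D_{ij}}{2}
        && \text{(distance from the new node } x \text{ to each remaining node } k\text{)} \\
L_{ix} &= \frac{D_{ij}}{2} + \frac{A_i - A_j}{2\,(N - 2)}, \qquad
L_{jx} = D_{ij} - L_{ix}
        && \text{(branch lengths from } i \text{ and } j \text{ to } x\text{)}
\end{align}
```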

  19. Summary of the process. [Figure with nodes i, j, and k; not preserved in this transcript.]

  20. The pseudocode of the NJ algorithm (the formulas referenced in each step were not preserved in this transcript):
      while N > 2 do
          compute A_i for every node i in the table
          for i = 0 to m-1 do
              for j = i+1 to m do
                  compute M_ij
          select the minimum M_ij and cluster i and j into a new node x
          compute D_xk for every remaining node k
          set the branch lengths from i and j to x
          delete i and j from the table and add x to the table
          N = N - 1
      end while
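The pseudocode can be turned into a short runnable sketch. Because the slide formulas are missing, the version below assumes a commonly used formulation: A_i is the sum of distances from i, M_ij = D_ij - (A_i + A_j)/(N - 2), D_xk = (D_ik + D_jk - D_ij)/2, and the branch from i to x has length D_ij/2 + (A_i - A_j)/(2(N - 2)). The function name, the `X0`, `X1` labels for new nodes, and the example distances are all hypothetical.

```python
def neighbor_joining(names, dist):
    """Return the tree as a list of (node, node, branch_length) edges.

    `dist` is a symmetric dict of dicts: dist[a][b] is the distance a to b.
    """
    d = {a: dict(dist[a]) for a in names}  # working copy of the table
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # A_i: total distance from i to all other nodes in the table.
        a = {i: sum(d[i][k] for k in nodes if k != i) for i in nodes}
        # Find the pair with the minimum rate-corrected distance M_ij.
        best = None
        for idx, i in enumerate(nodes):
            for j in nodes[idx + 1:]:
                m = d[i][j] - (a[i] + a[j]) / (n - 2)
                if best is None or m < best[0]:
                    best = (m, i, j)
        _, i, j = best
        x = f"X{next_id}"
        next_id += 1
        # Branch lengths from i and j to their new parent x.
        len_i = d[i][j] / 2 + (a[i] - a[j]) / (2 * (n - 2))
        len_j = d[i][j] - len_i
        edges += [(i, x, len_i), (j, x, len_j)]
        # Distances from x to every remaining node k.
        d[x] = {}
        for k in nodes:
            if k not in (i, j):
                d[x][k] = d[k][x] = (d[i][k] + d[j][k] - d[i][j]) / 2
        # Delete i and j from the table, add x.
        nodes = [k for k in nodes if k not in (i, j)] + [x]
    # Two nodes remain: connect them with the remaining distance.
    i, j = nodes
    edges.append((i, j, d[i][j]))
    return edges


# Hypothetical distances that fit a tree exactly:
# A and B hang off X0 (lengths 1, 2), C and D off X1 (4, 5), X0 to X1 is 3.
example = {
    "A": {"B": 3, "C": 8, "D": 9},
    "B": {"A": 3, "C": 9, "D": 10},
    "C": {"A": 8, "B": 9, "D": 9},
    "D": {"A": 9, "B": 10, "C": 9},
}
nj_edges = neighbor_joining(["A", "B", "C", "D"], example)
for edge in nj_edges:
    print(edge)
```

On this example the sketch recovers the branch lengths used to build the distances. Recomputing A_i inside the loop matters, because the table shrinks by one node per iteration.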

  21. Merits and demerits. Maximum parsimony: makes full use of the nucleotide information; when there are few species, MP will find the globally optimal tree; when there are many species, its performance is limited. Maximum likelihood: makes full use of the nucleotide information; highly dependent on the nucleotide substitution model; its computational performance is the worst of the three. Neighbour joining: the fastest algorithm of all, but it sometimes produces the wrong topology.

  22. Transform the problem to TSP. [Figure: trees over leaves A, B, C, D with internal nodes x, y, z and branch lengths 1 and 2.]

  23. Transform the problem to TSP: add the edges of the unshaded nodes. [Figure: tree over A, B, C, D with internal nodes x, y, z, and the weighted graph derived from it.]

  24. Transform the problem to TSP: the cycle shown is one of the Hamiltonian circuits of the complete graph. [Figure: complete graph on A, B, C, D with weighted edges.]

  25. Transform the problem to TSP: now assume that we have obtained a Hamiltonian circuit; can we construct the phylogenetic tree from it? [Figure: Hamiltonian circuit on A, B, C, D and trees with internal nodes x, y, z.]

  26. Transform the problem to TSP: so the question becomes finding the minimum Hamiltonian circuit of a given complete graph. Step 1: distance matrix. Step 2: TSP. Step 3: construct the tree. Algorithms for the TSP step: 1) ant colony optimization (ACO); 2) particle swarm optimization (PSO); 3) genetic algorithm (GA); 4) simulated annealing (SA); 5) artificial bee colony algorithm (ABC); 6) approximation algorithms: NN, shortest link, insertion heuristic. [Figure: weighted complete graph on A, B, C, D and a tree with internal nodes x, y, z.]
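The TSP step can be sketched with the simplest of the listed approximation algorithms, assuming that "NN" stands for the nearest-neighbour heuristic. The distance values are hypothetical, not the ones in the slide figure: they were generated from a tree with total branch length 15. A tour that visits the leaves in an order consistent with such a tree walks every branch exactly twice, so its length is 30, which is the link between short Hamiltonian circuits and tree structure.

```python
def nearest_neighbour_tour(dist, start):
    """Greedy tour: always move to the closest unvisited node."""
    tour = [start]
    unvisited = set(dist) - {start}
    while unvisited:
        last = tour[-1]
        # sorted() makes tie-breaking deterministic (alphabetical).
        nxt = min(sorted(unvisited), key=lambda k: dist[last][k])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(dist, tour):
    """Length of the closed circuit, including the edge back to the start."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


# Hypothetical distances derived from a tree with total branch length 15.
dist = {
    "A": {"B": 3, "C": 8, "D": 9},
    "B": {"A": 3, "C": 9, "D": 10},
    "C": {"A": 8, "B": 9, "D": 9},
    "D": {"A": 9, "B": 10, "C": 9},
}

tour = nearest_neighbour_tour(dist, "A")
print(tour, tour_length(dist, tour))  # ['A', 'B', 'C', 'D'] 30
print(tour_length(dist, ["A", "C", "B", "D"]))  # 36
```

Nearest neighbour is only a heuristic and can return a non-optimal circuit on larger inputs; the metaheuristics in the list (ACO, PSO, GA, SA, ABC) search the same space of tours.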

  27. Transform the problem to TSP. [Figure: trees over leaves A, B, C, D with internal nodes x, y, z and branch lengths 1 and 2.]

  28. Thank you! maoyaozong
