Bioinformatics Algorithms, Lecture 12
R90725054 呂育恩
Outline
• Distance-Based Evolutionary Trees
• Ultrametric trees
• Additive trees
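Ultrametric Tree
• Rooted
• Every internal node encodes a branching point and carries a number
• Every leaf represents a species
• The numbers on any path from the root to a leaf must strictly decrease
[Figure: an example ultrametric tree with internal-node labels 8, 6, 5, 5, 3 and leaves A to G]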
D(i,j) = the number on lca(i,j) in T
[Figure: the tree T (internal-node labels 8, 6, 5, 5, 3; leaves A to G) next to its matrix D]
Problem
• Input: a matrix D
• Objective:
  • Determine whether D is ultrametric
  • Construct the ultrametric tree T for D
• DEF: D is ultrametric if there is an ultrametric tree whose corresponding matrix is D
• Key: what is a necessary and sufficient condition?
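No unique maximum (不獨大)
• A matrix D has the no-unique-max property if, among D(i,j), D(i,k), D(j,k), at least two equal the maximum of the three
• This must hold for any three distinct indices i, j, k
• For example, the triples (A,C,E) and (C,D,E)
• For example, the previous matrix D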
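The three-index condition above can be checked by brute force over all triples. The following is a minimal Python sketch, assuming D is a symmetric list of lists indexed from 0; the function name `has_no_unique_max` and the 4-species example matrix are illustrative and do not come from the slides.

```python
from itertools import combinations

def has_no_unique_max(D):
    """Return True if, for every three distinct indices i, j, k,
    the two largest of D[i][j], D[i][k], D[j][k] are equal."""
    n = len(D)
    for i, j, k in combinations(range(n), 3):
        smallest, middle, largest = sorted((D[i][j], D[i][k], D[j][k]))
        if middle != largest:  # a unique maximum violates the condition
            return False
    return True

# Hypothetical example: a root labeled 8 with two children,
# one labeled 3 (leaves 0, 1) and one labeled 5 (leaves 2, 3).
D = [
    [0, 3, 8, 8],
    [3, 0, 8, 8],
    [8, 8, 0, 5],
    [8, 8, 5, 0],
]
print(has_no_unique_max(D))
```

The check visits every triple once, so it takes O(n^3) time; it uses exact equality, which is only appropriate for exact (for example integer) distances.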
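Thm: D has no unique maximum (不獨大) ⇐ D is ultrametric
• This is the easier direction
• For any leaves i, j, k in T there are two possible configurations:
  • lca(i,j) lies strictly below lca(i,k) = lca(j,k): D(i,k) = D(j,k) > D(i,j), by the last property of an ultrametric tree (the numbers decrease from root to leaf)
  • i, j, k all share the same lowest common ancestor: D(i,j) = D(j,k) = D(i,k)
• In both configurations at least two of the three values equal the maximum
[Figure: the two configurations of leaves i, j, k in T]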
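Thm: D has no unique maximum (不獨大) ⇒ D is ultrametric
• This direction is harder; we show it by a constructive proof
• Re-index species 2, …, n such that
  D(1,2) = … = D(1,i_1)
  < D(1,i_1+1) = … = D(1,i_2)
  < D(1,i_2+1) = … = D(1,i_3)
  < …
  < D(1,i_{l-1}+1) = … = D(1,i_l)
• Build a path from leaf 1 up to the root whose internal nodes are labeled D(1,i_1), D(1,i_2), …, D(1,i_l) from bottom to top
• Recursively build the subtree for each group of species and hang it off the path node carrying that group's distance to species 1
[Figure: the path from leaf 1 to the root, with the subtree for 2 … i_1 at node D(1,i_1), the subtree for i_1+1 … i_2 at node D(1,i_2), and so on up to i_{l-1}+1 … i_l at node D(1,i_l)]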
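The constructive proof can be read as a recursive procedure. Below is a rough Python sketch of that reading, not the lecture's own code: it assumes D is a symmetric list of lists with a zero diagonal that already satisfies the three-index condition, uses index 0 where the slides use species 1, and represents a tree as either a leaf index or a tuple `(label, children)`. The names `build_ultrametric_tree` and `tree_to_matrix` are hypothetical.

```python
def build_ultrametric_tree(D, leaves=None):
    """Return a leaf index, or (label, children) for an internal node."""
    if leaves is None:
        leaves = list(range(len(D)))
    first, rest = leaves[0], leaves[1:]
    if not rest:
        return first
    # Group the remaining leaves by their distance to `first`.
    groups = {}
    for j in rest:
        groups.setdefault(D[first][j], []).append(j)
    # Build the path above `first` bottom-up, smallest distance first.
    tree = first
    for d in sorted(groups):
        sub = build_ultrametric_tree(D, groups[d])
        if isinstance(sub, tuple) and sub[0] == d:
            # Same label as the new path node: splice the children in,
            # so labels keep strictly decreasing from root to leaf.
            children = [tree] + sub[1]
        else:
            children = [tree, sub]
        tree = (d, children)
    return tree

def tree_to_matrix(tree, n):
    """Recover the matrix: entry (i, j) is the label of lca(i, j)."""
    M = [[0] * n for _ in range(n)]

    def collect(t):
        if not isinstance(t, tuple):
            return [t]
        label, children = t
        leaf_sets = [collect(c) for c in children]
        # Leaves in different child subtrees meet exactly at this node.
        for a in range(len(leaf_sets)):
            for b in range(a + 1, len(leaf_sets)):
                for i in leaf_sets[a]:
                    for j in leaf_sets[b]:
                        M[i][j] = M[j][i] = label
        return [leaf for s in leaf_sets for leaf in s]

    collect(tree)
    return M

# Hypothetical 4-species example (not the matrix from the slides).
D = [
    [0, 3, 8, 8],
    [3, 0, 8, 8],
    [8, 8, 0, 5],
    [8, 8, 5, 0],
]
T = build_ultrametric_tree(D)
print(T)
print(tree_to_matrix(T, len(D)) == D)
```

Converting the tree back into a matrix and comparing it with D mirrors the claim proved next: D(i,j) equals the number on lca(i,j) in T.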
On the Correctness of the Recursion
• We show its correctness by induction on n
• Let T be the ultrametric tree built by our recursive procedure. We claim:
  • D(i,j) = the number on lca(i,j) in T
• If i = 1, the claim is immediate from the construction (refer to the construction figure)
• When i ≠ 1 and j ≠ 1, there are 3 cases:
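On the Correctness of the Recursion (cont'd)
• Case 1: D(i,1) = D(j,1), so i and j lie in the same recursively built subtree: easy, directly from the induction hypothesis
• Case 2: D(i,1) < D(j,1) ⇒ D(i,j) = D(j,1) (since D has no unique maximum, 不獨大) ⇒ D(i,j) = the number on lca(i,j), because lca(i,j) is the path node labeled D(j,1)
• Case 3: D(j,1) < D(i,1) ⇒ D(i,j) = D(i,1) (since D has no unique maximum) ⇒ D(i,j) = the number on lca(i,j), because lca(i,j) is the path node labeled D(i,1)
[Figure: the positions of i, j, and leaf 1 in T for each of the three cases]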
Additive Tree
• Unrooted
• Every leaf represents a species
• Distances are placed on edges
[Figure: an unrooted tree with leaves A, B, C, D and edge weights 4, 2, 3, 2, 1]
Problem
• Input: a matrix D
• Objective:
  • Determine whether D is additive
  • Construct an additive tree T for an additive matrix D
• DEF: A matrix D is additive if there is an additive tree T whose distance matrix is D
• Trick: reduce this problem to the ultrametric tree problem
The Reduction
• Transform: D (an additive matrix) → D' (an ultrametric matrix)
• Restore: T^ (an ultrametric tree) → T (an additive tree)
• We show that
  • If D is additive ⇒ the restored T is an additive tree whose distance matrix is D
  • If D is not additive ⇒ D' is NOT ultrametric
Transform 1
• Assumptions
  • We know T (which, of course, we do not actually have)
  • The largest entry u of D is in the 1st row
• Step 1: Use node 1 (A) as the root, then attach a new node 1 to the root with an edge of length u
• We call the resulting tree T^
[Figure: the additive tree T (leaves A, B, C, D; edge weights 4, 2, 3, 2, 1) and the rooted tree T^ built from it]
Transform 1 (cont'd)
• Step 2
  • Label each internal node v with u - (distance from v to the root)
  • Then create D' from the values on these internal nodes
• Clearly, D'(1,j) = u for all j!
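Transform 2: We do not need T!
• For i ≠ 1 and j ≠ 1, let y = lca(i,j) in T^
• D'(i,j) = label of y
  = u - D(root, y) in T^ (or T)
  = u - ½(D(i,1) + D(j,1) - D(i,j))
• The last step holds because the paths from the root to i and from the root to j share exactly the segment from the root to y
• This equation also holds when i = 1 or j = 1!
[Figure: the root, y = lca(i,j), and the leaves i, j in T^]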
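Since the formula only uses entries of D, the transform can be computed directly. Here is a small Python sketch under the slide's assumption that the largest entry u sits in the first row (index 0 in the code); the function name `transform` and the example matrix are illustrative. The example D is the distance matrix of a hypothetical tree with edges A-x = 2, B-x = 1, x-y = 3, C-y = 2, D-y = 4, not the tree drawn on the slides.

```python
def transform(D):
    """Compute D'(i,j) = u - (D(i,1) + D(j,1) - D(i,j)) / 2,
    with species 1 stored at index 0. Only off-diagonal entries
    are filled in; the diagonal is left at 0."""
    n = len(D)
    u = max(max(row) for row in D)
    if u not in D[0]:
        raise ValueError("re-index the species so the largest entry is in row 0")
    Dp = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                Dp[i][j] = u - (D[i][0] + D[j][0] - D[i][j]) / 2
    return Dp

# Species order: A, B, C, D (hypothetical additive matrix).
D = [
    [0, 3, 7, 9],
    [3, 0, 6, 8],
    [7, 6, 0, 6],
    [9, 8, 6, 0],
]
for row in transform(D):
    print(row)
```

Every off-diagonal entry in the first row and column of the result equals u, matching the observation that D'(1,j) = u for all j.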
Now… D is additive ⇔ D' is ultrametric
• The ⇒ direction is trivial by how we construct D' from D
• However, the ⇐ direction is a problem…
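D' is ultrametric ⇒ D is additive
• D' is ultrametric, so let T^ be an ultrametric tree whose corresponding matrix is D'
• For each edge from a node x down to its child y, set the edge length to (x's label - y's label), where a leaf counts as label 0
• Then the distance of i,j on T^ = 2D'(i,j)
[Figure: an edge between a node x and its child y in T^]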
Converting T^ to T
• For every leaf i, subtract (u - D(i,1)) from the length of the edge between i and its parent (in the ultrametric tree, this length is exactly the parent's node label)
• Distance of i,j on T
  = distance of i,j on T^ - (u - D(i,1)) - (u - D(j,1))
  = 2D'(i,j) - 2u + D(i,1) + D(j,1)
  = 2u - D(i,1) - D(j,1) + D(i,j) - 2u + D(i,1) + D(j,1)
  = D(i,j)
• So T is an additive tree and D is the distance matrix of T
[Figure: leaf i with its parent edge shortened by u - D(i,1)]
Conclusion
• Define D'(i,j) = u - ½(D(i,1) + D(j,1) - D(i,j))
• The resulting D' satisfies: D is additive ⇔ D' is ultrametric
• If D is additive, then its additive tree T can be obtained as follows:
  • Given T^, for every leaf i
  • Subtract u - D(i,1) from the length of the edge between i and its parent
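Putting the two halves together gives a simple additivity test: transform D and check the three-index condition on D'. The sketch below is one possible reading, not the lecture's implementation. As a deviation from the slides, it does not re-index the species; it picks the row v that holds the largest entry and uses it in place of row 1. It assumes a symmetric matrix with a zero diagonal and exact arithmetic, and it only decides additivity without building the tree. The name `is_additive` and both example matrices are hypothetical.

```python
from itertools import combinations

def is_additive(D):
    """Test additivity by checking the three-index condition on D'."""
    n = len(D)
    # Row v holds the largest entry u; it plays the role of row 1.
    v = max(range(n), key=lambda r: max(D[r]))
    u = max(D[v])

    def d_prime(i, j):
        return u - (D[i][v] + D[j][v] - D[i][j]) / 2

    for i, j, k in combinations(range(n), 3):
        _, middle, largest = sorted(
            (d_prime(i, j), d_prime(i, k), d_prime(j, k))
        )
        if middle != largest:  # unique maximum: D' is not ultrametric
            return False
    return True

# Distances of a hypothetical tree: A-x = 2, B-x = 1, x-y = 3, C-y = 2, D-y = 4.
D_tree = [
    [0, 3, 7, 9],
    [3, 0, 6, 8],
    [7, 6, 0, 6],
    [9, 8, 6, 0],
]
# Same matrix with D(B,C) changed from 6 to 4, which no tree can realize.
D_broken = [
    [0, 3, 7, 9],
    [3, 0, 4, 8],
    [7, 4, 0, 6],
    [9, 8, 6, 0],
]
print(is_additive(D_tree), is_additive(D_broken))
```

With measured (noisy) distances the exact equality test would need to be replaced by a tolerance, which is outside what the slides cover.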