Minimum Routing Cost Spanning Trees (MRCTs)
Dean L. Zeller
Kent State University
October 4th, 2005
Routing Load
• l(T, e)
• Measures the number of vertex pairs that use edge e of tree T for routing purposes.
Routing Load, cont.
• Let T be a tree
• Let X and Y be the subtrees created by removing edge e
• Then l(T, e) = 2·|V(X)|·|V(Y)|, where |V(X)| and |V(Y)| are the numbers of vertices in X and Y
Routing Load, cont.
Routing load on e
• X has 3 vertices. Y has 4 vertices.
• Vertex pairs using e for routing: (1,4), (1,5), (1,6), (1,7), (2,4), (2,5), (2,6), (2,7), (3,4), (3,5), (3,6), (3,7), (4,1), (4,2), (4,3), (5,1), (5,2), (5,3), (6,1), (6,2), (6,3), (7,1), (7,2), (7,3)
• That is 24 ordered pairs, so l(T, e) = 2·3·4 = 24
Routing Cost
• The routing cost C(T) of a tree T is the sum, over all edges of T, of each edge's routing load times its weight.
• For a tree T with edge lengths w, C(T) can be computed in O(n) time.
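The O(n) bound follows from computing every subtree size in a single traversal. A minimal sketch, assuming the tree is given as a list of (u, v, w) triples on vertices 0..n-1 and that the routing load of an edge is 2·|X|·|Y|; the function name and input format are illustrative, not taken from the slides:

```python
from collections import defaultdict


def routing_cost(n, edges, root=0):
    """Return C(T), the sum over tree edges of routing load times weight.

    edges: list of (u, v, w) triples describing a tree on vertices 0..n-1.
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # Iterative DFS: every vertex appears in `order` after its parent.
    parent = {root: None}
    parent_weight = {root: 0}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, w in adj[u]:
            if v not in parent:
                parent[v] = u
                parent_weight[v] = w
                stack.append(v)

    # Walk the order backwards so children are finished before their parents.
    size = {u: 1 for u in order}
    cost = 0
    for u in reversed(order):
        p = parent[u]
        if p is not None:
            # Removing edge (p, u) splits T into size[u] and n - size[u] vertices.
            load = 2 * size[u] * (n - size[u])
            cost += load * parent_weight[u]
            size[p] += size[u]
    return cost
```

For a path 0-1-2 with unit weights, `routing_cost(3, [(0, 1, 1), (1, 2, 1)])` evaluates to 8: each edge has load 2·1·2 = 4.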
MRCT Problem
• The goal of the MRCT problem is to find, among all spanning trees T of a graph G, one that minimizes the routing cost C(T).
Routing Cost and Sum of Distances
• “Sum of distances between all pairs of vertices in T”
• “Sum of the routing load for each edge in T times its weight”
Claim: the two quantities are equal, $\sum_{u,v \in V} d_T(u,v) = \sum_{e \in E(T)} l(T,e)\,w(e)$.
Proof by Formula
• Sum table columns first
• Sum table rows first
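Proof by Formula, cont.
Rows are edges, columns are vertex pairs; an entry is the edge's weight if the edge lies on the path between the pair, and 0 otherwise.
Edge | v1,v2 | v1,v3 | v1,v4 | … | vn-1,vn
e1   | 0     | 3     | 0     | … | 0
e2   | 2     | 0     | 2     | … | 0
e3   | 5     | 0     | 5     | … | 0
…    |       |       |       |   |
em   | 0     | 0     | 0     | … | 3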
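The double-counting argument behind the table can be written out as follows. This is a sketch: $P_T(u,v)$, the set of edges on the tree path from u to v, and $d_T$, the distance in T, are notation assumed here rather than taken from the slides.

```latex
\begin{align*}
\sum_{u,v \in V} d_T(u,v)
  &= \sum_{u,v \in V} \; \sum_{e \in P_T(u,v)} w(e)
     && \text{(sum table columns first)} \\
  &= \sum_{e \in E(T)} w(e)\,\bigl|\{(u,v) : e \in P_T(u,v)\}\bigr|
     && \text{(sum table rows first)} \\
  &= \sum_{e \in E(T)} l(T,e)\,w(e) = C(T).
\end{align*}
```

Each column of the table sums to one pairwise distance, and each row sums to one edge's routing load times its weight, so both orders of summation give the same total.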
Proof by Example
Distances dB(vi, vj) between all pairs of vertices:
      v1   v2   v3   v4   v5
v1     0   10    5   13   11
v2    10    0   15    3    1
v3     5   15    0   18   16
v4    13    3   18    0    4
v5    11    1   16    4    0
Sum of all entries: 192
Proof by Example, cont.
l(T,a) = 2·1·4 = 8
l(T,b) = 2·2·3 = 12
l(T,c) = 2·1·4 = 8
l(T,d) = 2·1·4 = 8
C(T) = 8·5 + 12·10 + 8·3 + 8·1 = 192
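The example can be checked mechanically. The sketch below assumes the tree has edges a = (v1, v3) of weight 5, b = (v1, v2) of weight 10, c = (v2, v4) of weight 3, and d = (v2, v5) of weight 1. These endpoints are inferred from the distance table and the edge weights in C(T), since the slide's figure is not available; the helper names are illustrative.

```python
from itertools import product

# Example tree: name -> (endpoint, endpoint, weight).
# Endpoints are an assumption inferred from the distance table.
EDGES = {"a": (1, 3, 5), "b": (1, 2, 10), "c": (2, 4, 3), "d": (2, 5, 1)}
VERTICES = [1, 2, 3, 4, 5]


def neighbors(u, skip=None):
    """Yield (neighbor, weight) pairs of u, ignoring the edge named `skip`."""
    for name, (x, y, w) in EDGES.items():
        if name == skip:
            continue
        if x == u:
            yield y, w
        elif y == u:
            yield x, w


def distances_from(src, skip=None):
    """Tree distances from src to every vertex reachable without edge `skip`."""
    dist = {src: 0}
    stack = [src]
    while stack:
        u = stack.pop()
        for v, w in neighbors(u, skip):
            if v not in dist:
                dist[v] = dist[u] + w
                stack.append(v)
    return dist


def sum_of_distances():
    """Sum of tree distances over all ordered vertex pairs."""
    return sum(distances_from(u)[v] for u, v in product(VERTICES, repeat=2))


def routing_load(name):
    """2 * |X| * |Y| for the two parts left after removing edge `name`."""
    x, _, _ = EDGES[name]
    side = len(distances_from(x, skip=name))  # vertices on x's side of the cut
    return 2 * side * (len(VERTICES) - side)


def routing_cost():
    """Sum over edges of routing load times weight."""
    return sum(routing_load(name) * EDGES[name][2] for name in EDGES)
```

Under these assumptions both `sum_of_distances()` and `routing_cost()` evaluate to 192, matching the slide.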
Approximations
• Naïve algorithms will find a solution to any problem, given enough time.
• Time is money: solutions must come quickly.
• Oftentimes a “good enough” solution will serve just as well.
• By choosing a good starting point, one can obtain a good MRCT approximation.
Theorem
• A shortest-paths tree rooted at the median of a graph is a 2-approximation of an MRCT of the graph.
Proof
• G = (V,E,w)
• r = median of graph G (a vertex minimizing the total shortest-path distance to all other vertices)
• Y = shortest-paths tree rooted at r
Proof, cont.
d(x,y) is a metric…
• d(x,y) = 0 iff x = y
• d(x,y) = d(y,x)
• d(x,y) ≤ d(x,z) + d(z,y)
Proof, cont.
• Since r is the median of G…
• Dividing by n…
• In a shortest-paths tree…
• And thus…
• Y is a 2-approximation of an MRCT of G.
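One way the steps above can be filled in, offered as a sketch rather than as the slide's original formulas. Here $T^{*}$ denotes an MRCT of G, and $d_G$ and $d_Y$ denote distances in G and in Y; all three symbols are assumptions introduced for illustration.

```latex
\begin{align*}
\sum_{v} d_G(r,v) &\le \sum_{v} d_G(u,v) \quad \text{for every } u \in V
  && \text{($r$ is the median)} \\
n \sum_{v} d_G(r,v) &\le \sum_{u,v} d_G(u,v)
  && \text{(sum over all $u$)} \\
\sum_{v} d_G(r,v) &\le \frac{1}{n} \sum_{u,v} d_G(u,v) \le \frac{1}{n}\,C(T^{*})
  && \text{(divide by $n$)} \\
d_Y(u,v) &\le d_Y(u,r) + d_Y(r,v) = d_G(u,r) + d_G(r,v)
  && \text{(shortest-paths tree)} \\
C(Y) = \sum_{u,v} d_Y(u,v) &\le 2n \sum_{v} d_G(r,v) \le 2\,C(T^{*})
\end{align*}
```

The third line uses $d_G(u,v) \le d_{T^{*}}(u,v)$, since a path in a spanning tree is also a path in G.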
Solution Decomposition
• Analysis technique used widely in approximation algorithms
• Serves as a justification for why the researchers used this method to find a solution
• Suppose X is the optimal solution
• Decompose X to compose a feasible solution Y
• Accuracy of Y depends on how “feasible” is defined.
• Y is a good approximation of X by belonging to a restricted subset of feasible solutions
Solution Decomposition, cont.
• Cut a tree at its centroid r. Each resulting subtree has no more than half of the tree's vertices.
• Suppose r is also the centroid of the optimal MRCT.
• Construct a shortest-paths tree Y rooted at r. The routing cost of Y will be at most twice that of the optimal MRCT.
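Solution Decomposition, cont.
• If u and v are nodes not in the same branch, then the path between them in the optimal MRCT passes through r, so their distance is the distance from u to r plus the distance from r to v.
• In calculating the total distance for all pairs of nodes on the optimal MRCT, the distance from v to r will be counted at least n times (n/2 times from v to others, n/2 times from others to v).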
Solution Decomposition, cont.
• Since…
• And…
• It follows that…
• Thus, Y is a 2-approximation of the optimal MRCT.
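A sketch of how the counting argument leads to the factor of 2, under the assumption that r is a centroid of the optimal tree. The symbols $T^{*}$ (the optimal MRCT), $d_{T^{*}}$, $d_Y$, and $d_G$ are introduced here for illustration and are not taken from the slides.

```latex
\begin{align*}
C(T^{*}) = \sum_{u,v} d_{T^{*}}(u,v)
  &\ge n \sum_{v} d_{T^{*}}(v,r)
   \ge n \sum_{v} d_G(v,r), \\
C(Y) = \sum_{u,v} d_Y(u,v)
  &\le \sum_{u,v} \bigl( d_G(u,r) + d_G(r,v) \bigr)
   = 2n \sum_{v} d_G(v,r), \\
C(Y) &\le 2\,C(T^{*}).
\end{align*}
```

The first line is the "counted at least n times" observation: every v has at least n/2 nodes outside its own branch, and each such pair contributes the distance from v to r in both directions.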
Time Requirements
• A shortest-paths tree can be constructed in O(n log n + m) time, for example with Dijkstra's algorithm using a Fibonacci heap.
• The routing cost of a tree can be computed in O(n) time.
• Thus, the 2-approximation algorithm can be completed in O(n^2 log n + mn) time, since finding the median requires shortest-path distances from each of the n vertices.
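The algorithm can be sketched as: run a shortest-paths computation from every vertex, keep the root with the smallest total distance (the median), and return its shortest-paths tree. The version below is a minimal sketch that assumes a connected graph with non-negative weights, given as a dict mapping each vertex to a list of (neighbor, weight) pairs. It uses Python's binary heap (`heapq`), so each run costs O(m log n) rather than the O(n log n + m) quoted above; the function names are illustrative.

```python
import heapq


def dijkstra(adj, src):
    """Return (dist, parent) for shortest paths from src."""
    dist = {src: 0}
    parent = {src: None}
    heap = [(0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


def approx_mrct(adj):
    """Return (median, tree_edges) for a shortest-paths tree rooted at a median.

    tree_edges is a list of (parent, child, weight) triples.
    """
    best = None
    for r in adj:
        dist, parent = dijkstra(adj, r)
        total = sum(dist.values())
        if best is None or total < best[0]:
            best = (total, r, dist, parent)
    _, r, dist, parent = best
    # On a shortest-paths tree, the edge weight is the difference in distance.
    edges = [(p, v, dist[v] - dist[p]) for v, p in parent.items() if p is not None]
    return r, edges
```

On a triangle with weights w(0,1) = 1, w(1,2) = 1, w(0,2) = 3, vertex 1 is the median and the returned tree drops the heavy edge (0,2).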
Application: Computational Biology
• DNA is represented by a string of characters involving only A, C, G, and T.
• Similarity of DNA strands is determined through multiple sequence alignments.
• Mutations can be introduced by inserting gaps into the string.
Computational Biology, cont.
The second fact of biological sequence comparison:
Evolutionarily and functionally related molecular strings can differ significantly throughout much of the string and yet preserve the same three-dimensional structure(s).
Computational Biology, cont.
• Consider the following three strings:
  T C C G A T G
  C C G G A C G
  T C G A C G
  + ▲ + + +   +
• One column match (▲) and five pairwise matches (+)
• Total of eight matches: 4 between strings 1 and 2, 2 between strings 1 and 3, 2 between strings 2 and 3
Computational Biology, cont.
• Mutations, represented as gaps, can be added within the strings to bring more characters into alignment:
  T C C - G A T - G
  - C C G G A - C G
  T C - - G A - C G
  + ▲ +   ▲ ▲   + ▲
• Four column matches (▲) and three pairwise matches (+)
• 15 total matches: 5 between each pair of strings
• Goal: find a minimum-cost mutation path to maximize multiple alignments
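The match counts on these two slides can be reproduced with a few lines of code. This sketch assumes the three strings are TCCGATG, CCGGACG, and TCGACG (the row boundaries are inferred from the gapped version) and that a match is a column where two rows hold the same non-gap character; the function name is illustrative.

```python
from itertools import combinations


def pairwise_matches(rows):
    """Count matching non-gap characters for every pair of rows."""
    width = max(len(r) for r in rows)
    padded = [r.ljust(width, "-") for r in rows]  # pad short rows with gaps
    counts = {}
    for i, j in combinations(range(len(padded)), 2):
        counts[(i, j)] = sum(
            a == b and a != "-" for a, b in zip(padded[i], padded[j])
        )
    return counts


unaligned = ["TCCGATG", "CCGGACG", "TCGACG"]
aligned = ["TCC-GAT-G", "-CCGGA-CG", "TC--GA-CG"]

before = pairwise_matches(unaligned)  # {(0, 1): 4, (0, 2): 2, (1, 2): 2}
after = pairwise_matches(aligned)     # {(0, 1): 5, (0, 2): 5, (1, 2): 5}
```

Summing the values gives 8 matches before inserting gaps and 15 after, in agreement with the slides.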
Computational Biology, cont.
• The naïve solution to this problem takes O(2^n · l^n) time for n strings of length l.
• Infeasible for all but the smallest of computational biology problems.
• Example: 5 strings, 500 characters in length
• 2^5 · 500^5 = 1,000,000,000,000,000 = 10^15
• By creating a mutation decision tree, an approximation can be created in a fraction of the time.
Computational Biology, cont.
Definition: Let S be a set of strings, and let T be a tree where each node is labeled with a distinct string from S. Then, a multiple alignment M of S is called consistent with T if the induced pairwise alignment of Si and Sj has score D(Si,Sj) for each pair of strings (Si,Sj) that label adjacent nodes in T.
Theorem: For any set of strings S and for any tree T whose nodes are labeled by distinct strings of S, we can efficiently find a multiple alignment M(T) of S that is consistent with T.
Computational Biology, cont.
[Figure: a tree with its nodes labeled by a (multi)set of strings.]
A multiple alignment of those strings that is consistent with the tree:
  A X X _ Z
  A X _ _ Z
  A _ X _ Z
  A Y _ _ Z
  A Y X X Z
Computational Biology, cont.
• The time needed to compute M(T) is dominated by the time to compute k-1 pairwise alignments, where k is the number of strings. If each string has length n, then each pairwise alignment takes O(n^2) time and the time to construct M(T) is O(kn^2).
• This theorem is considered “folklore” in the algorithm research community.
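The O(n^2) cost per pairwise alignment comes from the standard dynamic program over prefix pairs. The sketch below assumes D is the unit-cost edit distance (insertions, deletions, and substitutions each cost 1); the slides do not state the scoring scheme, so that choice, the function names, and the edge-list format are assumptions. It computes only the pairwise scores along the tree edges, not the merged alignment M(T).

```python
def edit_distance(s, t):
    """Unit-cost edit distance between s and t via the O(len(s) * len(t)) DP."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_alignment_cost(labels, tree_edges):
    """Sum D(Si, Sj) over the k-1 edges (i, j) of a tree on the strings."""
    return sum(edit_distance(labels[i], labels[j]) for i, j in tree_edges)
```

With the five strings from the previous slide and a hypothetical tree, `tree_alignment_cost(["AXXZ", "AXZ", "AXZ", "AYZ", "AYXXZ"], [(0, 1), (1, 2), (1, 3), (3, 4)])` runs four pairwise computations, one per tree edge, which is where the k-1 factor comes from.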
Bibliography
• Gusfield, Dan. Algorithms on Strings, Trees, and Sequences. Cambridge, 1997.
• Wu, B.Y. & K.M. Chao. Spanning Trees and Optimization Problems. Chapman & Hall/CRC, 2004.