Final Exam Review
• The final exam will have the same format and requirements as the mid-term exam:
  • Closed book, no computer, no smartphone.
  • A calculator is OK.
• Final exam questions are drawn from:
  • Questions in Homework 2 and Programming Assignment 2.
  • Content listed in the following slides.
String Similarity
• How similar are two strings?
• ocurrance vs. occurrence

    o c u r r a n c e -
    o c c u r r e n c e      6 mismatches, 1 gap

    o c - u r r a n c e
    o c c u r r e n c e      1 mismatch, 1 gap

    o c - u r r - a n c e
    o c c u r r e - n c e    0 mismatches, 3 gaps
Edit Distance
• Applications.
  • Basis for Unix diff.
  • Speech recognition.
  • Computational biology.
• Edit distance. [Levenshtein 1966, Needleman-Wunsch 1970]
  • Gap penalty $\delta$; mismatch penalty $\alpha_{pq}$.
  • Cost = sum of gap and mismatch penalties.

    C T G A C C T A C C T
    C C T G A C T A C A T      cost: $\alpha_{TC} + \alpha_{GT} + \alpha_{AG} + 2\alpha_{CA}$

    - C T G A C C T A C C T
    C C T G A C - T A C A T    cost: $2\delta + \alpha_{CA}$
Sequence Alignment
• Goal: Given two strings $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$, find an alignment of minimum cost.
• Def. An alignment M is a set of ordered pairs $x_i$-$y_j$ such that each item occurs in at most one pair and there are no crossings.
• Def. The pairs $x_i$-$y_j$ and $x_{i'}$-$y_{j'}$ cross if $i < i'$ but $j > j'$.
• Ex: CTACCG vs. TACATG. Sol: M = $x_2$-$y_1$, $x_3$-$y_2$, $x_4$-$y_3$, $x_5$-$y_4$, $x_6$-$y_6$.

    x1 x2 x3 x4 x5    x6
    C  T  A  C  C  -  G
    -  T  A  C  A  T  G
       y1 y2 y3 y4 y5 y6
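The minimum-cost alignment can be computed with the standard dynamic program over prefixes. The sketch below is one way to write it, not necessarily the version used in class; the function name `alignment_cost` and the default unit penalties (gap penalty 1, mismatch penalty 1) are assumptions chosen for illustration.

```python
def alignment_cost(x, y, delta=1, alpha=None):
    """Minimum alignment cost of strings x and y.

    delta: gap penalty. alpha(p, q): mismatch penalty (0 when p == q).
    Both defaults are illustrative assumptions, not values from the slides.
    """
    if alpha is None:
        alpha = lambda p, q: 0 if p == q else 1  # assumed unit mismatch penalty
    m, n = len(x), len(y)
    # opt[i][j] = minimum cost of aligning x[:i] with y[:j]
    opt = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        opt[i][0] = i * delta          # x[:i] aligned against gaps only
    for j in range(1, n + 1):
        opt[0][j] = j * delta          # y[:j] aligned against gaps only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            opt[i][j] = min(
                alpha(x[i - 1], y[j - 1]) + opt[i - 1][j - 1],  # pair x_i with y_j
                delta + opt[i - 1][j],                          # leave x_i unmatched
                delta + opt[i][j - 1],                          # leave y_j unmatched
            )
    return opt[m][n]
```

With unit penalties, `alignment_cost("ocurrance", "occurrence")` is 2, which matches the "1 mismatch, 1 gap" alignment on the String Similarity slide.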
Cuts
• Def. An s-t cut is a partition (A, B) of V with $s \in A$ and $t \in B$.
• Def. The capacity of a cut (A, B) is: $cap(A, B) = \sum_{e \text{ out of } A} c(e)$
• [Figure: flow network with source s, sink t, intermediate nodes 2-7, and a highlighted cut A.] Capacity = 10 + 5 + 15 = 30
Cuts
• Def. An s-t cut is a partition (A, B) of V with $s \in A$ and $t \in B$.
• Def. The capacity of a cut (A, B) is: $cap(A, B) = \sum_{e \text{ out of } A} c(e)$
• [Figure: the same flow network with a different highlighted cut A.] Capacity = 9 + 15 + 8 + 30 = 62
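A short sketch of the capacity computation may help: only edges leaving A count, and edges entering A are ignored. The network below is a hypothetical four-node example (the edge list of the slide's figure is not recoverable here), and the name `cut_capacity` is an assumption.

```python
def cut_capacity(capacity, A):
    """Sum of c(u, v) over edges that leave A (u in A, v not in A)."""
    A = set(A)
    return sum(c for (u, v), c in capacity.items() if u in A and v not in A)


# Hypothetical network, NOT the one drawn on the slide.
capacity = {
    ("s", "a"): 10,
    ("s", "b"): 5,
    ("a", "b"): 15,
    ("a", "t"): 4,
    ("b", "t"): 8,
}
```

For this example, the cut A = {s, b} has capacity 10 + 8 = 18: the edge (a, b) enters A, so it does not contribute.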
Residual Graph
• Original edge: $e = (u, v) \in E$.
  • Flow f(e), capacity c(e).
• Residual edge.
  • "Undo" flow sent.
  • $e = (u, v)$ and $e^R = (v, u)$.
  • Residual capacity: $c_f(e) = c(e) - f(e)$ for a forward edge $e \in E$, and $c_f(e^R) = f(e)$ for its reverse edge.
• Residual graph: $G_f = (V, E_f)$.
  • Residual edges with positive residual capacity.
  • $E_f = \{e : f(e) < c(e)\} \cup \{e^R : f(e) > 0\}$.
• Ex. Edge (u, v) with capacity 17 and flow 6: residual capacity 11 on (u, v) and 6 on (v, u).
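The definition of $E_f$ translates almost directly into code. This is a minimal sketch, assuming capacities and flows are stored as dictionaries keyed by edge; the name `residual_edges` is hypothetical.

```python
def residual_edges(capacity, flow):
    """Return {(u, v): residual capacity} for every edge of the residual graph."""
    residual = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        if f < c:   # forward edge e keeps its leftover capacity c(e) - f(e)
            residual[(u, v)] = residual.get((u, v), 0) + (c - f)
        if f > 0:   # reverse edge e^R can "undo" up to f(e) units of flow
            residual[(v, u)] = residual.get((v, u), 0) + f
    return residual
```

For the slide's example, `residual_edges({("u", "v"): 17}, {("u", "v"): 6})` gives 11 on (u, v) and 6 on (v, u).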
Ford-Fulkerson Algorithm
• [Figure: example flow network G with source s, sink t, intermediate nodes 2, 3, 4, 5, and edge capacities.]
Augmenting Path Algorithm

    Augment(f, c, P) {
       b ← bottleneck(P)
       foreach e ∈ P {
          if (e ∈ E) f(e) ← f(e) + b          // forward edge
          else       f(e^R) ← f(e^R) - b      // reverse edge
       }
       return f
    }

    Ford-Fulkerson(G, s, t, c) {
       foreach e ∈ E
          f(e) ← 0
       G_f ← residual graph
       while (there exists augmenting path P) {
          f ← Augment(f, c, P)
          update G_f
       }
       return f
    }
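The pseudocode can be sketched in Python as follows. This version is an illustration under stated assumptions: it stores residual capacities directly instead of the flow f, finds augmenting paths with depth-first search (the slides do not fix a search order), assumes integer capacities, and returns only the flow value. The name `ford_fulkerson` and the test network are hypothetical.

```python
from collections import defaultdict


def ford_fulkerson(capacity, s, t):
    """Max-flow value from s to t. capacity: {(u, v): c} with integer c >= 0."""
    residual = defaultdict(int)   # residual capacities of G_f
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)             # the reverse edge may gain capacity later

    def find_path():
        """Depth-first search for an s-t path in G_f; list of edges or None."""
        parent = {s: None}
        stack = [s]
        while stack:
            u = stack.pop()
            if u == t:
                break
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return None
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        return path[::-1]

    value = 0
    path = find_path()
    while path is not None:
        b = min(residual[e] for e in path)   # bottleneck(P)
        for (u, v) in path:
            residual[(u, v)] -= b            # consume residual capacity
            residual[(v, u)] += b            # allow this flow to be undone
        value += b
        path = find_path()
    return value
```

On the hypothetical network `{("s","a"): 10, ("s","b"): 5, ("a","b"): 15, ("a","t"): 4, ("b","t"): 8}` the result is 12, which equals the capacity of the cut A = {s, a, b}.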
Certifiers and Certificates: 3-Satisfiability (3-SAT)
• SAT. Given a CNF formula $\Phi$, is there a satisfying assignment?
• Certificate. An assignment of truth values to the n boolean variables.
• Certifier. Check that each clause in $\Phi$ has at least one true literal.
• Ex. [Figure: instance s (a CNF formula) and certificate t (a truth assignment).]
• Conclusion. SAT is in NP.
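The certifier is a single pass over the clauses, so it runs in time polynomial in the formula size. A minimal sketch, assuming a hypothetical encoding in which the integer k stands for variable $x_k$ and -k for its negation:

```python
def certifier(clauses, assignment):
    """Return True iff every clause has at least one true literal.

    clauses: list of clauses, each a list of nonzero ints
             (k means x_k, -k means NOT x_k; an assumed encoding).
    assignment: dict {k: bool} giving a truth value to every variable.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

For example, with the made-up formula `[[1, -2, 3], [-1, 2, 3]]`, the assignment `{1: True, 2: True, 3: False}` is accepted and `{1: True, 2: False, 3: False}` is rejected.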
Subset Sum
• SUBSET-SUM. Given natural numbers $w_1, \ldots, w_n$ and an integer W, is there a subset that adds up to exactly W?
• Ex: { 1, 4, 16, 64, 256, 1040, 1041, 1093, 1284, 1344 }, W = 3754.
• Yes. 1 + 16 + 64 + 256 + 1040 + 1093 + 1284 = 3754.
• Remark. With arithmetic problems, input integers are encoded in binary. A polynomial reduction must be polynomial in the size of the binary encoding.
• Claim. 3-SAT $\le_P$ SUBSET-SUM.
• Pf. Given an instance $\Phi$ of 3-SAT, we construct an instance of SUBSET-SUM that has a solution iff $\Phi$ is satisfiable.
Subset Sum
• Construction. Given a 3-SAT instance $\Phi$ with n variables and k clauses, form 2n + 2k decimal integers, each of n + k digits, as illustrated below.
• Claim. $\Phi$ is satisfiable iff there exists a subset that sums to W.
• Pf. No carries possible.

              x   y   z   C1  C2  C3     value
    x         1   0   0   0   1   0      100,010
    ¬x        1   0   0   1   0   1      100,101
    y         0   1   0   1   0   0       10,100
    ¬y        0   1   0   0   1   1       10,011
    z         0   0   1   1   1   0        1,110
    ¬z        0   0   1   0   0   1        1,001
    dummy     0   0   0   1   0   0          100
    dummy     0   0   0   2   0   0          200
    dummy     0   0   0   0   1   0           10
    dummy     0   0   0   0   2   0           20
    dummy     0   0   0   0   0   1            1
    dummy     0   0   0   0   0   2            2
    W         1   1   1   4   4   4      111,444

  The dummies are there so that each clause column can sum to 4.
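The construction can be sketched in code. The encoding (k for $x_k$, -k for its negation) and the name `sat_to_subset_sum` are assumptions, and each clause is assumed to contain three distinct variables. The clauses used in the usage line are read off the C1, C2, C3 columns of the table: C1 = ¬x ∨ y ∨ z, C2 = x ∨ ¬y ∨ z, C3 = ¬x ∨ ¬y ∨ ¬z.

```python
def sat_to_subset_sum(n, clauses):
    """Build (numbers, W) from a 3-SAT instance over variables 1..n.

    clauses: list of 3-literal clauses; k means x_k, -k means NOT x_k.
    Each number has n variable digits followed by one digit per clause.
    """
    k = len(clauses)

    def number(var, sign):
        digits = [0] * (n + k)
        digits[var - 1] = 1                    # marks which variable this row is for
        for j, clause in enumerate(clauses):
            if sign * var in clause:           # this literal appears in clause j
                digits[n + j] = 1
        return int("".join(map(str, digits)))

    numbers = []
    for var in range(1, n + 1):
        numbers.append(number(var, +1))        # row for x_var
        numbers.append(number(var, -1))        # row for NOT x_var
    for j in range(k):
        place = 10 ** (k - 1 - j)              # digit position of clause j
        numbers += [place, 2 * place]          # dummies worth 1 and 2 in that column
    W = int("1" * n + "4" * k)
    return numbers, W


numbers, W = sat_to_subset_sum(3, [[-1, 2, 3], [1, -2, 3], [-1, -2, -3]])
```

The assignment x = false, y = false, z = true satisfies all three clauses; the matching subset is the rows ¬x, ¬y, z plus the dummies 200, 20, 2, and 100,101 + 10,011 + 1,110 + 222 = 111,444.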
Weighted Vertex Cover
• Definition. Given a graph G = (V, E), a vertex cover is a set $S \subseteq V$ such that each edge in E has at least one end in S.
• Weighted vertex cover. Given a graph G with vertex weights, find a vertex cover of minimum weight. (NP-hard problem.)
• Setting every node weight to 1 reduces the problem to the standard (unweighted) vertex cover problem.
• [Figure: a graph with vertex weights 2, 4, 2, 9, shown with two vertex covers: weight = 2 + 2 + 4 and weight = 11.]
Pricing Method
• Pricing method. Set prices and find a vertex cover simultaneously.
• Why is S a vertex cover? (Prove by contradiction.)

    Weighted-Vertex-Cover-Approx(G, w) {
       foreach e in E
          p_e = 0
       while (∃ edge e = (i, j) such that neither i nor j is tight) {
          select such an edge e
          increase p_e as much as possible until i or j is tight
       }
       S ← set of all tight nodes
       return S
    }
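A possible Python rendering of the pseudocode is sketched below. It assumes a node i is tight when the prices of its incident edges sum to exactly $w_i$, uses integer weights so the equality test is exact, and processes the edges in list order. A single pass suffices because a tight node stays tight, so an edge skipped once never qualifies again. The name `pricing_vertex_cover` and the example graph are hypothetical.

```python
def pricing_vertex_cover(weights, edges):
    """Pricing method. weights: {node: integer w_i >= 0}; edges: list of (i, j).

    Returns (S, price), where S is the set of tight nodes and price maps
    each edge to its final price p_e.
    """
    paid = {v: 0 for v in weights}   # total price currently charged to each node
    price = {}
    for (i, j) in edges:
        # Largest raise that keeps both endpoints fair (paid <= weight).
        slack = min(weights[i] - paid[i], weights[j] - paid[j])
        price[(i, j)] = slack        # 0 if i or j is already tight
        paid[i] += slack
        paid[j] += slack
    S = {v for v in weights if paid[v] == weights[v]}   # all tight nodes
    return S, price


# Hypothetical example graph (a 4-cycle), not the one in the figures.
weights = {"a": 2, "b": 4, "c": 2, "d": 9}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
S, price = pricing_vertex_cover(weights, edges)
```

Here the method returns S = {a, b, c} with weight 8, while {a, c} is a cover of weight 4, so the output is a valid cover but not a minimum-weight one.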
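Approximation method: Pricing Method
• Pricing method. Each edge must be covered by some vertex. Edge e = (i, j) pays price $p_e \ge 0$ to use vertices i and j.
• Fairness. Edges incident to vertex i should pay at most $w_i$ in total: $\sum_{e = (i, j)} p_e \le w_i$.
• Lemma. For any vertex cover S and any fair prices $p_e$: $\sum_{e} p_e \le w(S)$.
• Pf. $\sum_{e \in E} p_e \le \sum_{i \in S} \sum_{e = (i, j)} p_e \le \sum_{i \in S} w_i = w(S)$. ▪
  • First inequality: each edge e is covered by at least one node in S.
  • Second inequality: sum the fairness inequalities for each node in S.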
Pricing Method
• Figure 11.8. Example showing that the pricing method does not always produce an optimal weighted vertex cover. [Figure labels: price of edge a-b; vertex weight.]
Weighted Vertex Cover: IP Formulation
• Weighted vertex cover. Given an undirected graph G = (V, E) with vertex weights $w_i \ge 0$, find a minimum weight subset of nodes S such that every edge is incident to at least one vertex in S.
• Integer programming formulation.
  • Model inclusion of each vertex i using a 0/1 variable $x_i$. Vertex covers are in 1-1 correspondence with 0/1 assignments: $S = \{i \in V : x_i = 1\}$.
  • Objective function: minimize $\sum_i w_i x_i$.
  • Constraints: …
  • Must take either i or j: $x_i + x_j \ge 1$.
Weighted Vertex Cover: IP Formulation • Weighted vertex cover. Integer programming formulation. • Task: Show the concrete ILP equation set for an example graph.
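As a worked illustration of the task, consider a hypothetical graph (not one from the slides) with vertices 1, 2, 3, 4, weights $w = (2, 4, 2, 9)$, and edges (1,2), (2,3), (1,3), (3,4). Under those assumptions, the concrete ILP has one 0/1 variable per vertex and one constraint per edge:

```latex
\begin{align*}
\text{minimize}   \quad & 2x_1 + 4x_2 + 2x_3 + 9x_4 \\
\text{subject to} \quad & x_1 + x_2 \ge 1 && \text{edge } (1,2) \\
                        & x_2 + x_3 \ge 1 && \text{edge } (2,3) \\
                        & x_1 + x_3 \ge 1 && \text{edge } (1,3) \\
                        & x_3 + x_4 \ge 1 && \text{edge } (3,4) \\
                        & x_1, x_2, x_3, x_4 \in \{0, 1\}
\end{align*}
```

For this small instance the optimum is $x_1 = x_3 = 1$, $x_2 = x_4 = 0$, that is S = {1, 3} with weight 4: the triangle on vertices 1, 2, 3 forces at least two vertices into any cover, and vertices 1 and 3 are the cheapest pair.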
Weighted Vertex Cover
• Weighted vertex cover. Given an undirected graph G = (V, E) with vertex weights $w_i \ge 0$, find a minimum weight subset of nodes S such that every edge is incident to at least one vertex in S.
• [Figure: example graph on vertices A-J with vertex weights; the highlighted vertex cover has total weight = 55.]