Dynamic Programming I HKOI2005 Training (Advanced Group) Liu Chi Man, cx
Prerequisites • Functions • Recursion • Divide-and-conquer • Asymptotic notations – O, Θ
Go, Go, Go! • Recurrence relation • Dynamic programming
Recurrence Relation • A mathematical relationship expressing $f_n$ as some combination of $f_i$ with $i < n$ • Eric W. Weisstein. "Recurrence Relation." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/RecurrenceRelation.html • An equation which defines a sequence recursively • "Recurrence relation." Wikipedia: The Free Encyclopedia. 10 May 2005, 20:00 UTC. 13 May 2005 <http://en.wikipedia.org/wiki/Recurrence_relation>. • Examples • The Fibonacci numbers • Base conditions: $F_0 = 0$; $F_1 = 1$ • Recurrence: $F_n = F_{n-1} + F_{n-2}$ (for n = 2, 3, 4, …) • The sequence is (0, 1, 1, 2, 3, 5, 8, 13, 21, …)
Recurrence Relation • Examples • The positive integers • $K_1 = 1$; $K_n = \min_{1 \le i < n} \{ K_i K_{n-i} \} + 1$ (for n = 2, 3, 4, …) • The sequence is (1, 2, 3, 4, 5, …) • The product of two factorials • $H_{0,0} = 1$; • $H_{0,i} = H_{0,i-1} \times i$ (for i = 1, 2, 3, …); • $H_{i,0} = H_{i-1,0} \times i$ (for i = 1, 2, 3, …); • $H_{i,j} = H_{i-1,j-1} \times i \times j$ (for i = 1, 2, 3, …; j = 1, 2, 3, …)
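Each of these recurrences can be evaluated mechanically once the base conditions are stored. The following is a minimal Python sketch that tabulates the three example sequences bottom-up; the function names `fib`, `K` and `H` are illustrative and do not come from the slides.

```python
def fib(n):
    """F_0 = 0, F_1 = 1, F_n = F_{n-1} + F_{n-2}."""
    f = [0, 1]
    for i in range(2, n + 1):
        f.append(f[i - 1] + f[i - 2])
    return f[n]


def K(n):
    """K_1 = 1, K_n = (min over 1 <= i < n of K_i * K_{n-i}) + 1."""
    k = [0, 1]  # k[0] is unused padding so that k[i] holds K_i
    for m in range(2, n + 1):
        k.append(min(k[i] * k[m - i] for i in range(1, m)) + 1)
    return k[n]


def H(i, j):
    """H_{i,j} = i! * j!, filled in from the three recurrences."""
    h = [[1] * (j + 1) for _ in range(i + 1)]  # h[0][0] = 1 is the base condition
    for b in range(1, j + 1):
        h[0][b] = h[0][b - 1] * b
    for a in range(1, i + 1):
        h[a][0] = h[a - 1][0] * a
        for b in range(1, j + 1):
            h[a][b] = h[a - 1][b - 1] * a * b
    return h[i][j]


print([fib(n) for n in range(9)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21]
print([K(n) for n in range(1, 6)])  # [1, 2, 3, 4, 5]
print(H(3, 2))                      # 3! * 2! = 12
```

In every case a value is computed only after all the values it depends on, which is the same ordering concern that comes up later for dynamic programming.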
A Shortest Path Problem • Consider the network shown below. What is the shortest path distance from S to T? [Figure: a layered network with source S, nodes (1,1) to (3,3) arranged in three layers of three nodes each, and sink T; every arrow carries a cost]
A Shortest Path Problem • Solutions • Enumeration • How many different paths are there? • 3 × 3 × 3 = 27 • Very small~ but what if we have (1,1) up to (7,7)? • 7 × 7 × 7 × 7 × 7 × 7 × 7 = 823543 • Still not too big~ How about (100,100)? • $100^{100}$ = (1 googol)$^2$ [Note: 1 googol = $10^{100}$] • Exponential growth!
A Shortest Path Problem • Solutions • Any shortest path algorithm on general graphs • Dijkstra • Bellman-Ford • Warshall-Floyd • Standard enough, but we can solve this problem more efficiently by exploiting some of its special properties • The graph shown is a so-called “layered network” • Each arrow goes from one layer to the next layer
A Shortest Path Problem • Notation (for convenience) • c(u,v) = the cost of the arrow from node u to node v • Definition • D(k) = the shortest path distance from S to node k • Base condition • D(S) = 0 • Recurrences • D(1,1) = D(S) + c(S, (1,1)); • D(1,2) = D(S) + c(S, (1,2)); • D(1,3) = D(S) + c(S, (1,3));
A Shortest Path Problem • Recurrences • D(2,1) = min { D(1,1) + c((1,1), (2,1)), D(1,2) + c((1,2), (2,1)), D(1,3) + c((1,3), (2,1)) }; • D(2,2) = min { D(1,1) + c((1,1), (2,2)), D(1,2) + c((1,2), (2,2)), D(1,3) + c((1,3), (2,2)) }; • Similar for D(2,3), D(3,1), D(3,2), D(3,3) and D(T) • Final answer • D(T)
A Shortest Path Problem • Running time analysis • For each of the (roughly) 3 × 3 nodes, we perform • 3 additions • 1 "min" operation on 3 numbers • Let's generalize to n × n nodes • For each of the (roughly) n × n nodes, we perform • n additions • 1 "min" operation on n numbers • Overall time complexity = $\Theta(n^3)$, assuming that the D(.) in min{…} can be retrieved in constant time
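The recurrences for D(.) can be sketched in Python as a layer-by-layer sweep. The arrow costs in the figure are not recoverable from this transcript, so the numbers below are made-up placeholders, and the function name and argument layout are assumptions for illustration only.

```python
def layered_shortest_path(from_s, between, to_t):
    """Shortest S-to-T distance in a layered network.

    from_s[k]        : cost of the arrow S -> (1, k)
    between[m][i][k] : cost of the arrow from node i of one layer to node k of the next
    to_t[i]          : cost of the arrow from node i of the last layer to T
    """
    # First layer: D(1,k) = D(S) + c(S, (1,k)), with D(S) = 0.
    dist = list(from_s)
    for cost in between:
        width = len(cost[0])
        # D(m,k) = min over i of D(m-1,i) + c((m-1,i), (m,k))
        dist = [min(dist[i] + cost[i][k] for i in range(len(dist)))
                for k in range(width)]
    # D(T) = min over i of D(last,i) + c((last,i), T)
    return min(d + c for d, c in zip(dist, to_t))


# Hypothetical costs for a 3 x 3 network (NOT the values in the slide's figure).
from_s = [2, 5, 3]
between = [
    [[9, 4, 7], [2, 1, 6], [3, 2, 2]],
    [[8, 3, 1], [4, 2, 2], [1, 6, 5]],
]
to_t = [4, 2, 3]
print(layered_shortest_path(from_s, between, to_t))  # 9
```

Only the distances of the previous layer are kept, because the recurrence for layer m never looks further back than layer m-1. Each of the roughly n × n nodes costs n additions and one min over n numbers, matching the analysis above.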
Dynamic Programming • A method for reducing the runtime of algorithms exhibiting the properties of overlapping subproblems and optimal substructure • "Dynamic programming." Wikipedia: The Free Encyclopedia. 7 May 2005, 20:01 UTC. 13 May 2005 <http://en.wikipedia.org/wiki/Dynamic_programming>. • In the previous example, why doesn't our algorithm need to enumerate all possible paths from S to T? • Why can we compute D(2,1) from D(1,1), D(1,2) and D(1,3) so easily?
Optimal Substructure [Figure: an optimal solution to a problem is built from optimal subsolutions to its subproblems] • A problem is said to have optimal substructure if its optimal solution can be constructed efficiently from optimal solutions to its subproblems • "Optimal substructure." Wikipedia: The Free Encyclopedia. 25 Mar 2005, 00:48 UTC. 13 May 2005 <http://en.wikipedia.org/wiki/Optimal_substructure>. • In other words, the subsolution of an optimal solution is an optimal solution of the corresponding subproblem
Optimal Substructure [Figure: a path P from a to b containing a subpath Q from x to y, and an alternative path Q' from x to y] • A subpath of a shortest path is again a shortest path (between its two endpoints) • Proof: • Suppose P is a shortest path from a to b. Let Q be any subpath of P, going from x to y. • What if Q is not a shortest path from x to y? That means there exists a shorter path Q' from x to y. • Replacing Q with Q' in P gives a path from a to b (the blue path in the figure) that is shorter than P, which contradicts the assumption "P is a shortest path from a to b". Hence Q is a shortest path from x to y
Optimal Substructure • Thus the shortest path problem exhibits an optimal substructure • That's why we can compute D(2,1) from D(1,1), D(1,2) and D(1,3) • Suppose we now want to compute D(m,k) • A path from S to (m,k) must pass through one of (m-1,1), (m-1,2), … , (m-1,n) • This implies that, in particular, a SHORTEST path from S to (m,k) must pass through one of those n nodes • But which one??
Optimal Substructure • How long is the shortest path from S to (m,k) passing through (m-1,i)? • From the optimal substructure, we know that the distance is D(m-1,i) + c((m-1,i), (m,k)) • D(m,k) is the minimum of these n distances, and we have obtained our recurrence
Optimal Substructure [Figure: a graph on the nodes s, b, c, d, e] • Some problems do not exhibit optimal substructures • For example, finding a longest simple path in a graph between two nodes • A longest simple path from s to e is sbcde • Is it true that bc is a longest simple path from b to c?
Optimal Substructure • Let’s try (and fail) to set up a recurrence relation for the longest simple path problem • Definition • L(v) = the longest simple path distance from node s to node v • Base condition • L(s) = 0 • For the recurrence part, we use our previous reasoning…
Optimal Substructure • Let $u_1, u_2, …, u_k$ be the nodes preceding v • Then a longest simple path from s to v must pass through one of $u_1, u_2, …, u_k$ • How long is the longest simple path from s to v passing through $u_1$? • $L(u_1) + 1$ … Wait! What's wrong? • v may lie on the longest simple path from s to $u_1$ • We cannot set up a recurrence easily • This is because of the absence of an optimal substructure
Overlapping Subproblems • Two subproblems may share a smaller subproblem • Consider our shortest path problem • Two of the subproblems are • Compute D(3,1) • Compute D(3,2) • They share some common subproblems • Compute D(2,1) • Compute D(1,3) • etc.
Overlapping Subproblems • Suppose we want to compute D(2,1) and D(2,2) • Computing D(2,1) requires computing D(1,1), D(1,2) and D(1,3), and each of them requires computing D(S) • Then we compute D(2,2) • Computing D(2,2) requires computing D(1,1), D(1,2) and D(1,3), and each of them requires computing D(S) • Note the repetitions of computations! • Try to compute D(3,1) and D(3,2)
Overlapping Subproblems • We store every D(.) in memory after computing its value • When computing a D(.), we can simply retrieve previously computed D(.)s from memory, avoiding repeated computations of those D(.)s • So, for our shortest path problem, the D(.)s in min{…} can be "computed" in constant time • Our algorithm runs in $\Theta(n^3)$ time!
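To make the repetition visible, the sketch below counts how many times D(.) is evaluated with and without storing results. It is an illustration under assumptions: the cost function is an arbitrary made-up formula (not the figure's weights), layer 0 stands for S, and `functools.lru_cache` plays the role of "memory".

```python
from functools import lru_cache

N_LAYERS, WIDTH = 3, 3
calls = {"naive": 0, "memo": 0}


def cost(m, i, k):
    """Hypothetical cost of the arrow into node (m, k) from node i of the previous layer."""
    return (3 * m + 5 * i + 7 * k) % 9 + 1


def d_naive(m, k):
    """D(m,k) by plain recursion; every call recomputes its subproblems."""
    calls["naive"] += 1
    if m == 0:
        return 0  # D(S) = 0
    prev = range(WIDTH) if m > 1 else range(1)  # layer 1 has S as its only predecessor
    return min(d_naive(m - 1, i) + cost(m, i, k) for i in prev)


@lru_cache(maxsize=None)
def d_memo(m, k):
    """Same recurrence, but each D(m,k) is computed once and then looked up."""
    calls["memo"] += 1
    if m == 0:
        return 0
    prev = range(WIDTH) if m > 1 else range(1)
    return min(d_memo(m - 1, i) + cost(m, i, k) for i in prev)


naive_answers = [d_naive(N_LAYERS, k) for k in range(WIDTH)]
memo_answers = [d_memo(N_LAYERS, k) for k in range(WIDTH)]
print(naive_answers == memo_answers, calls)
```

With three layers of three nodes this prints `True {'naive': 66, 'memo': 10}`: the stored version evaluates each of the ten distinct D(.) values exactly once, while the plain recursion grows exponentially with the number of layers.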
How to Solve a Problem by DP? • Determine whether the problem exhibits an optimal substructure and has overlapping subproblems • If so, then it MAY be solvable by DP • Try to formulate a recurrence relation for the optimal solution (this is the most difficult step) • Based on the recurrence relation, design an algorithm that computes the function values in the correct order • You are done
How to Solve a Problem by DP? • If you are experienced enough, you may write up the recurrence before you identify any optimal substructure or overlapping subproblems • A problem may have many different formulations; some are easier to reach, while others are easier to implement • Sometimes when you identify a problem as a DP-able problem, you are 90% done • We are going to see a few classical DP examples
Longest Common Subsequence • Given two strings A and B of length n and m respectively, find their longest common subsequence (NOT substring) • Example • A: aabcaabcaadyyyefg • B: cdfehgjaefazadxex • LCS: caaade • Explanation • A: a a b c a a b c a a d y y y e f g • B: c d f e h g j a e f a z a d x e x
Longest Common Subsequence • Solution • Exhaustion • Number of subsequences of A = $2^n$ • Exponential time! • Optimal substructure • A: a a b c a a b c a a d y y y e f g • B: c d f e h g j a e f a z a d x e x • caa is the LCS of caabca and cdfehgjaefa • Proof: similar to the "shortest path" proof • Overlapping subproblems • Obvious
Longest Common Subsequence • Definition • $F_{i,j}$ = length of LCS of A[1..i] and B[1..j] • Base conditions • $F_{i,0} = 0$ for all i • $F_{0,j} = 0$ for all j • Recurrence • $F_{i,j} = F_{i-1,j-1} + 1$ (if A[i] = B[j]); $F_{i,j} = \max\{ F_{i-1,j}, F_{i,j-1} \}$ (otherwise) • Answer • $F_{n,m}$
Longest Common Subsequence • Explanation of the recurrence • If A[i] = B[j], then matching A[i] and B[j] as a pair has no bad effect • A: ????????x??????? • B: ??????????x?????? • How about the blue portion (the parts before the matched x, i.e. A[1..i-1] and B[1..j-1])? • I don't care, but according to the optimal substructure, we should use the LCS of the two blue strings • Therefore we have $F_{i,j} = F_{i-1,j-1} + 1$
Longest Common Subsequence • Explanation of the recurrence • If A[i] ≠ B[j], then either A[i] or B[j] (or both) must not appear in an LCS of A[1..i] and B[1..j] • If A[i] does not appear (B[j] MAY appear) • A: ????????x??????? • B: ??????????y?????? • LCS of A[1..i] and B[1..j] = LCS of blue strings (A[1..i-1] and B[1..j]) • $F_{i,j} = F_{i-1,j}$ • If B[j] does not appear (A[i] MAY appear) • A: ????????x??????? • B: ??????????y?????? • LCS of A[1..i] and B[1..j] = LCS of blue strings (A[1..i] and B[1..j-1]) • $F_{i,j} = F_{i,j-1}$
Longest Common Subsequence • Can we compute $F_{n,m}$ directly? • No, we need to compute $F_{n-1,m}$, $F_{n,m-1}$, $F_{n-1,m-1}$ first • Therefore we must compute the F values in a "safe" order • For example, we can compute the $F_{i,j}$ values in increasing order of i and j • Now I know the length of the LCS, but how do I obtain the LCS itself? • Exercise
Longest Common Subsequence • Algorithm:
    Store base conditions in memory
    for i = 1 to n do
        for j = 1 to m do
            Compute $F_{i,j}$
            Store $F_{i,j}$ in memory
    The answer is $F_{n,m}$
• Time complexity = $\Theta(nm)$ • Space complexity = $\Theta(nm)$ • Can be reduced to $\Theta(\min\{n, m\})$ • Wait for DP2
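The pseudocode translates almost line for line into Python. This is a minimal sketch, with the function name `lcs_length` chosen for illustration; it returns only the length, leaving the reconstruction of the LCS itself as the exercise posed earlier.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of strings a and b."""
    n, m = len(a), len(b)
    # f[i][j] = length of an LCS of a[:i] and b[:j].
    # Row 0 and column 0 hold the base conditions F_{i,0} = F_{0,j} = 0.
    f = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:  # Python strings are 0-indexed
                f[i][j] = f[i - 1][j - 1] + 1
            else:
                f[i][j] = max(f[i - 1][j], f[i][j - 1])
    return f[n][m]


print(lcs_length("aabcaabcaadyyyefg", "cdfehgjaefazadxex"))  # 6, the length of "caaade"
```

The two nested loops visit the table in increasing order of i and j, so the three entries each cell depends on are always filled in already.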
Stones • There are n piles of stones in a row • The piles have, in order, $a_1, a_2, …, a_n$ stones respectively • You can merge two adjacent piles of stones, paying a cost which is equal to the number of stones in the resulting pile • Find the minimum cost to merge all stones into one single pile
Stones • Sample play • 4 1 2 7 5 • 4 3 7 5 • 7 7 5 • 7 12 • 19 • Total cost = 3 + 7 + 12 + 19 = 41 • Does a greedy strategy always work?
Stones • Suppose after some moves, there are m piles left and the numbers of stones in the piles are $b_1, b_2, …, b_m$ • Each $b_i$ corresponds to a contiguous sequence of original (unmerged) piles • Example: • 8 7 6 1 2 3 1 2 3 4 3 5 • 15 12 1 2 7 8
Stones • Optimal substructure • Suppose in an optimal (minimum cost) solution, after some steps, there are m piles left • Choose one of these piles, say p • Suppose p corresponds to original piles s, s+1, …, t • We claim that in the optimal solution, the sequence of moves that transforms original piles s, s+1, …, t into p is an optimal solution to the subproblem $[a_s, a_{s+1}, …, a_t]$ • The proof is trivial
Stones • Overlapping subproblems • Obvious • Definition • $C_{i,j}$ = minimum cost to merge original piles i, i+1, …, j-1, j into a single pile • Base conditions • $C_{i,i} = 0$ for all i • Recurrence (for i < j) • $C_{i,j} = \min_{i \le k < j} \{ C_{i,k} + C_{k+1,j} + \sum_{x=i}^{j} a_x \}$ • The answer is $C_{1,n}$
Stones • Explanation of the recurrence • To merge original piles i, i+1, …, j, there must be a (final) step in which we merge two piles into one • Let the two piles be p and q, where p corresponds to original piles i, i+1, …, k and q corresponds to original piles k+1, k+2, …, j • The cost to merge p and q = $\sum_{x=i}^{j} a_x$ • The minimum cost to construct p = $C_{i,k}$ • The minimum cost to construct q = $C_{k+1,j}$ • By the optimal substructure, the total cost is $C_{i,k} + C_{k+1,j} + \sum_{x=i}^{j} a_x$
Stones • In what order should we compute $C_{i,j}$? • Increasing i and j? No! • Recall: what does $C_{i,j}$ depend on? • We can compute $C_{i,j}$ in increasing (j-i) order [Figure: piles i, i+1, …, j-1, j in a row; computing the value for the whole interval needs the values for its shorter sub-intervals]
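A bottom-up sketch in Python that fills the table in increasing (j-i) order. The function name and the use of prefix sums to obtain the interval sum in constant time are illustrative choices, not taken from the slides; indices are 0-based here.

```python
def min_merge_cost(a):
    """Minimum total cost to merge the piles in list a into one pile."""
    n = len(a)
    if n == 0:
        return 0
    # prefix[t] = a[0] + ... + a[t-1], so the sum of piles i..j is prefix[j+1] - prefix[i].
    prefix = [0]
    for stones in a:
        prefix.append(prefix[-1] + stones)
    # c[i][j] = minimum cost to merge piles i..j; c[i][i] = 0 is the base condition.
    c = [[0] * n for _ in range(n)]
    for length in range(1, n):          # length = j - i, increasing
        for i in range(n - length):
            j = i + length
            best_split = min(c[i][k] + c[k + 1][j] for k in range(i, j))
            c[i][j] = best_split + prefix[j + 1] - prefix[i]
    return c[0][n - 1]


print(min_merge_cost([4, 1, 2, 7, 5]))  # 41, the cost of the sample play
```

When the loop reaches an interval of a given length, both sub-intervals on either side of every split point k are strictly shorter, so their entries have already been filled in.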
Stones • To avoid all the fuss in the previous slide, we may use a top-down implementation •
function getC(i, j) {
    if $C_{i,j}$ is already computed
        Return $C_{i,j}$
    otherwise
        Compute $C_{i,j}$ using getC() calls
        Store $C_{i,j}$ in memory
        Return $C_{i,j}$
}
• No need to care about the order of computations!
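One possible Python rendering of the top-down idea is sketched below. Here `functools.lru_cache` stands in for the "already computed / store in memory" bookkeeping of getC; the outer function name and 0-based indexing are assumptions for illustration.

```python
from functools import lru_cache
from itertools import accumulate


def min_merge_cost_top_down(a):
    """Minimum total merge cost, computed by memoized recursion."""
    if not a:
        return 0
    prefix = [0] + list(accumulate(a))  # prefix[t] = sum of the first t piles

    @lru_cache(maxsize=None)            # remembers every C(i, j) already computed
    def get_c(i, j):
        if i == j:                      # base condition: a single pile costs nothing
            return 0
        best_split = min(get_c(i, k) + get_c(k + 1, j) for k in range(i, j))
        return best_split + prefix[j + 1] - prefix[i]

    return get_c(0, len(a) - 1)


print(min_merge_cost_top_down([4, 1, 2, 7, 5]))  # 41
```

The recursion discovers a valid evaluation order by itself. The price is recursion depth, which grows with n, so very long inputs may need the bottom-up order instead.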
Stones • Time complexity = $\Theta(n^3)$ • By using a really clever trick (quadrangle inequality) we can modify the algorithm to run in $\Theta(n^2)$, but that's too far from our discussion • Space complexity = $\Theta(n^2)$ • You may try another formulation • Define $C'_{i,h}$ = minimum cost to merge original piles i, i+1, i+2, …, i+h-1 into a single pile • What are the advantages and disadvantages of this formulation?
Speeding Up Computation of R.F. • The computation of some recursive (mathematical) functions (e.g. the Fibonacci numbers) can be sped up by “dynamic programming” • However, this computation does not exhibit an optimal substructure • In fact, optimality doesn’t mean anything here • We make use of memo(r)ization (storing computed values in memory) to deal with overlapping subproblems
Memo(r)ization • To be precise, we use the term memoization or memorization to refer to the method for speeding up computations by storing previously computed results in memory • The term memoization comes from the noun memo • The term dynamic programming refers to the process of setting up and evaluating a recurrence relation efficiently by employing memo(r)ization
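As a small illustration of memo(r)ization on its own, here is a Python sketch for the Fibonacci numbers. The dictionary `fib_table` and the function name `fib_memo` are illustrative names, not from the slides.

```python
fib_table = {0: 0, 1: 1}  # the base conditions are stored up front


def fib_memo(n):
    """F_n, computing each value at most once and storing it in fib_table."""
    if n not in fib_table:
        fib_table[n] = fib_memo(n - 1) + fib_memo(n - 2)
    return fib_table[n]


print(fib_memo(50))  # 12586269025
```

Without the table, the plain recursion would recompute the same values exponentially many times; with it, each F_n is evaluated once. No optimality is involved, only overlapping subproblems.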
Memo(r)ization • However, in reality, memo(r)ization is often replaced by dynamic programming • Let’s start the good(?) practice in HKOI • HKOI DP Revolution of 2005 • We: Orz
0-1 Knapsack • The notion of subproblems is not very obvious in some problems • There are n items in a shop. The i-th item has weight $w_i$ and value $v_i$. A thief has a knapsack which can carry at most a weight of W. What should she steal to maximize the total value of the stolen items? • Assumption: all numbers in this problem are positive integers • Bonus exercise: what does 0-1 mean?
0-1 Knapsack • What is a subproblem? • Less weight, fewer items • Identify the optimal substructure • Exercise • Definition • $T_{i,j}$ = maximum value she gains if she is allowed to choose from items 1, 2, …, i and the weight limit is j • $T'_{i,j}$ = maximum value she gains if item i is the largest indexed item chosen and the weight limit is j • Which one is better?
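0-1 Knapsack • Base conditions • $T_{i,0} = 0$ for all i • $T_{0,j} = 0$ for all j • $T'_{i,0} = 0$ for all i • Recurrence (for i, j > 0) • $T_{i,j} = \max\{ T_{i-1,j},\ T_{i-1,j-w_i} + v_i \}$ (if $w_i \le j$); $T_{i,j} = T_{i-1,j}$ (otherwise) • $T'_{i,j} = \max_{1 \le k < i,\ T_{k,j-w_i} > 0} \{ T_{k,j-w_i} + v_i \}$ (if $w_i < j$); $T'_{i,j} = v_i$ (if $w_i = j$); $T'_{i,j} = 0$ (otherwise)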
0-1 Knapsack • Time complexity • T version – O(nW) • T' version – $O(n^2 W)$ • T' is much more difficult to formulate and results in a higher time complexity • This shows that the definition of the optimal function should be chosen carefully • Again, it is your exercise to give an algorithm to construct the optimal set of chosen items
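The T version can be sketched in Python as follows. The function name and the sample items are made up for illustration, and the sketch assumes that row 0 of the table (no items available) is all zeros.

```python
def knapsack(weights, values, limit):
    """Maximum total value of a subset of items whose total weight is at most limit."""
    n = len(weights)
    # t[i][j] = best value choosing from items 1..i with weight limit j.
    # Row 0 (no items) and column 0 (no capacity) are all zeros.
    t = [[0] * (limit + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]  # item i, stored 0-indexed
        for j in range(1, limit + 1):
            t[i][j] = t[i - 1][j]             # item i is not taken
            if w <= j:                        # item i fits: try taking it
                t[i][j] = max(t[i][j], t[i - 1][j - w] + v)
    return t[n][limit]


# Hypothetical shop: three items with weights 3, 4, 2 and values 4, 5, 3.
print(knapsack([3, 4, 2], [4, 5, 3], 6))  # 8 (take the items of weight 4 and 2)
```

The table has (n+1) × (W+1) entries and each is filled in constant time, which gives the O(nW) bound stated above.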
0-1 Knapsack • In the Branch-and-Bound lecture I was told that the 0-1 Knapsack problem is NP-complete, so why does a polynomial-time DP solution exist? • Polynomial in which variables? • What assumptions have we made in the problem statement?
Phidias • An easy problem at IOI2004 • Phidias (an ancient Greek sculptor) has a big rectangle of size W × H. He wants to cut the rectangle into small rectangles. Each small rectangle should have size $W_1 \times H_1$, or $W_2 \times H_2$, …, or $W_n \times H_n$. If any piece left is not of any of those sizes, then it is wasted. Phidias can only cut straight through a rectangle (parallel to a side of the rectangle), leaving two smaller rectangles. What is the minimum possible wasted area?
Phidias • Subproblem: cutting a smaller rectangle • Optimal substructure: Suppose Phidias cuts according to an optimal solution. After some cuts, choose any one rectangle. In the optimal solution, this rectangle is cut in a way that minimizes its wasted area (concerning only this rectangle). • Proof: trivial, because how this rectangle is cut is independent of the other rectangles left