COSC 3101A - Design and Analysis of Algorithms. 8. Elements of DP: Memoization, Longest Common Subsequence, Greedy Algorithms. Many of these slides are taken from Monica Nicolescu, Univ. of Nevada, Reno, monica@cs.unr.edu
Elements of Dynamic Programming • Optimal Substructure • An optimal solution to a problem contains within it optimal solutions to subproblems • The optimal solution to the entire problem is built in a bottom-up manner from optimal solutions to subproblems • Overlapping Subproblems • If a recursive algorithm revisits the same subproblems over and over, the problem has overlapping subproblems
Optimal Substructure - Examples • Assembly line • The fastest way of going through station j contains: the fastest way of going through station j-1 on either line • Matrix multiplication • An optimal parenthesization of Ai Ai+1 … Aj that splits the product between Ak and Ak+1 contains: an optimal solution to the problem of parenthesizing Ai..k and Ak+1..j
Discovering Optimal Substructure • Show that a solution to the problem consists of making a choice that leaves one or more similar subproblems to be solved • Suppose that for a given problem you are given the choice that leads to an optimal solution • Given this choice, determine which subproblems result • Show that the solutions to the subproblems used within the optimal solution must themselves be optimal • Cut-and-paste approach: if a subproblem solution were not optimal, we could "cut" it out and "paste in" a better one, contradicting the optimality of the overall solution
Parameters of Optimal Substructure • How many subproblems are used in an optimal solution for the original problem? • Assembly line: one subproblem (the line that gives the best time) • Matrix multiplication: two subproblems (subproducts Ai..k, Ak+1..j) • How many choices do we have in determining which subproblems to use in an optimal solution? • Assembly line: two choices (line 1 or line 2) • Matrix multiplication: j – i choices for k (splitting the product)
Parameters of Optimal Substructure • Intuitively, the running time of a dynamic programming algorithm depends on two factors: • Number of subproblems overall • How many choices we look at for each subproblem • Assembly line: • Θ(n) subproblems (n stations) • 2 choices for each subproblem • Θ(n) overall • Matrix multiplication: • Θ(n2) subproblems (1 ≤ i ≤ j ≤ n) • At most n-1 choices • Θ(n3) overall
Memoization • A top-down approach with the efficiency of the typical (bottom-up) dynamic programming approach • Maintain an entry in a table for the solution to each subproblem • memoize the inefficient recursive algorithm • When a subproblem is first encountered, its solution is computed and stored in that table • Subsequent "calls" to the subproblem simply look up that value
Memoized Matrix-Chain Alg.: MEMOIZED-MATRIX-CHAIN(p) • n ← length[p] – 1 • for i ← 1 to n • do for j ← i to n • do m[i, j] ← ∞ • return LOOKUP-CHAIN(p, 1, n) Initialize the m table with large values (∞) that indicate that the values of m[i, j] have not yet been computed Top-down approach
Memoized Matrix-Chain Running time is O(n3) Alg.: LOOKUP-CHAIN(p, i, j) • if m[i, j] < ∞ • then return m[i, j] • if i = j • then m[i, j] ← 0 • else for k ← i to j – 1 • do q ← LOOKUP-CHAIN(p, i, k) + LOOKUP-CHAIN(p, k+1, j) + pi-1pkpj • if q < m[i, j] • then m[i, j] ← q • return m[i, j]
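To make the memoized algorithm concrete, here is a minimal Python sketch of MEMOIZED-MATRIX-CHAIN and LOOKUP-CHAIN (the function names and the 1-indexed m table mirror the pseudocode above; the dimensions in the final comment are the standard textbook instance):

import math

def memoized_matrix_chain(p):
    # p[0..n] holds the dimensions: matrix A_i is p[i-1] x p[i]
    n = len(p) - 1
    # m[i][j] = infinity marks "not yet computed" (1-indexed, as in the slides)
    m = [[math.inf] * (n + 1) for _ in range(n + 1)]

    def lookup_chain(i, j):
        if m[i][j] < math.inf:      # already computed: just look it up
            return m[i][j]
        if i == j:
            m[i][j] = 0             # a single matrix needs no multiplications
        else:
            for k in range(i, j):   # try every split point A_i..k, A_k+1..j
                q = (lookup_chain(i, k) + lookup_chain(k + 1, j)
                     + p[i - 1] * p[k] * p[j])
                if q < m[i][j]:
                    m[i][j] = q
        return m[i][j]

    return lookup_chain(1, n)

# memoized_matrix_chain([30, 35, 15, 5, 10, 20, 25]) returns 15125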
Dynamic Programming vs. Memoization • Advantages of dynamic programming vs. memoized algorithms • No overhead for recursion, less overhead for maintaining the table • The regular pattern of table accesses may be used to reduce time or space requirements • Advantages of memoized algorithms vs. dynamic programming • Some subproblems do not need to be solved at all
Matrix-Chain Multiplication - Summary • Both the dynamic programming approach and the memoized algorithm can solve the matrix-chain multiplication problem in O(n3) time • Both methods take advantage of the overlapping subproblems property • There are only Θ(n2) different subproblems • Solutions to these problems are computed only once • Without memoization, the natural recursive algorithm runs in exponential time
Longest Common Subsequence • Given two sequences X = ⟨x1, x2, …, xm⟩ and Y = ⟨y1, y2, …, yn⟩, find a maximum-length common subsequence (LCS) of X and Y • E.g.: X = ⟨A, B, C, B, D, A, B⟩ • Subsequences of X: • A subset of the elements of the sequence, taken in order • ⟨A, B, D⟩, ⟨B, C, D, B⟩, etc.
Example X = ⟨A, B, C, B, D, A, B⟩, Y = ⟨B, D, C, A, B, A⟩ • ⟨B, C, B, A⟩ and ⟨B, D, A, B⟩ are longest common subsequences of X and Y (length = 4) • ⟨B, C, A⟩, however, is not an LCS of X and Y: it is a common subsequence, but not of maximum length
Brute-Force Solution • For every subsequence of X, check whether it is a subsequence of Y • There are 2m subsequences of X to check • Each subsequence takes Θ(n) time to check • scan Y for the first letter, from there scan for the second, and so on • Running time: Θ(n2m)
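As a sanity check on the analysis above, here is a minimal Python sketch of the brute-force method (illustrative only; the helper names are ours, and the running time is exponential in m, so it is usable only for tiny inputs):

from itertools import combinations

def is_subsequence(s, y):
    it = iter(y)                      # scan y left to right,
    return all(c in it for c in s)    # consuming up to each next match

def brute_force_lcs(x, y):
    # enumerate subsequences of x, longest first; the first hit is an LCS
    for r in range(len(x), 0, -1):
        for idx in combinations(range(len(x)), r):
            cand = "".join(x[i] for i in idx)
            if is_subsequence(cand, y):
                return cand
    return ""

# brute_force_lcs("ABCBDAB", "BDCABA") returns a length-4 LCS, e.g. "BCBA"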
Notations • Given a sequence X = ⟨x1, x2, …, xm⟩, we define the i-th prefix of X, for i = 0, 1, 2, …, m, as Xi = ⟨x1, x2, …, xi⟩ • c[i, j] = the length of an LCS of the sequences Xi = ⟨x1, x2, …, xi⟩ and Yj = ⟨y1, y2, …, yj⟩
A Recursive Solution Case 1: xi = yj (e.g.: Xi = ⟨A, B, D, E⟩, Yj = ⟨Z, B, E⟩) • Append xi = yj to an LCS of Xi-1 and Yj-1 • Must find an LCS of Xi-1 and Yj-1: the optimal solution to a problem includes optimal solutions to subproblems • c[i, j] = c[i - 1, j - 1] + 1
A Recursive Solution Case 2: xi ≠ yj (e.g.: Xi = ⟨A, B, D, G⟩, Yj = ⟨Z, B, D⟩) • Must solve two problems • find an LCS of Xi-1 and Yj: Xi-1 = ⟨A, B, D⟩ and Yj = ⟨Z, B, D⟩ • find an LCS of Xi and Yj-1: Xi = ⟨A, B, D, G⟩ and Yj-1 = ⟨Z, B⟩ • The optimal solution to a problem includes optimal solutions to subproblems • c[i, j] = max { c[i - 1, j], c[i, j - 1] }
Overlapping Subproblems • To find an LCS of X and Y, we may need to find the LCS of X and Yn-1 and the LCS of Xm-1 and Y • Both of these subproblems have the subproblem of finding the LCS of Xm-1 and Yn-1 • Subproblems share subsubproblems
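The overlap is easy to see in code: a direct Python transcription of the recurrence (a sketch, with illustrative names) recomputes the same (i, j) subproblems exponentially often, and a single memoization decorator, exactly the idea of the memoization slides above, brings it down to Θ(mn):

from functools import lru_cache

def lcs_length(x, y):
    @lru_cache(maxsize=None)       # remove this line to see the exponential blowup
    def lcs_len(i, j):             # length of an LCS of x[:i] and y[:j]
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:   # Case 1: last symbols match
            return lcs_len(i - 1, j - 1) + 1
        return max(lcs_len(i - 1, j), lcs_len(i, j - 1))   # Case 2
    return lcs_len(len(x), len(y))

# lcs_length("ABCBDAB", "BDCABA") == 4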
3. Computing the Length of the LCS

c[i, j] = 0 if i = 0 or j = 0
c[i, j] = c[i-1, j-1] + 1 if xi = yj
c[i, j] = max(c[i, j-1], c[i-1, j]) if xi ≠ yj

[Figure: the c table, rows indexed by the first sequence (xi, i = 0..m) and columns by the second (yj, j = 0..n); row 0 and column 0 hold the base case 0, and each remaining entry is filled from its upper-left, upper, or left neighbor.]
Additional Information • A matrix b[i, j]: for a subproblem [i, j] it tells us what choice was made to obtain the optimal value • If xi = yj: b[i, j] = "↖" • Else, if c[i-1, j] ≥ c[i, j-1]: b[i, j] = "↑", else b[i, j] = "←"

c[i, j] = 0 if i = 0 or j = 0; c[i-1, j-1] + 1 if xi = yj; max(c[i, j-1], c[i-1, j]) if xi ≠ yj

[Figure: the b and c tables for an example with X = ⟨A, B, C, D⟩ down the rows and Y = ⟨A, C, D, F⟩ across the columns.]
LCS-LENGTH(X, Y, m, n) • for i ← 1 to m • do c[i, 0] ← 0 • for j ← 0 to n • do c[0, j] ← 0 • for i ← 1 to m • do for j ← 1 to n • do if xi = yj • then c[i, j] ← c[i - 1, j - 1] + 1 • b[i, j] ← "↖" (Case 1: xi = yj) • else if c[i - 1, j] ≥ c[i, j - 1] • then c[i, j] ← c[i - 1, j] • b[i, j] ← "↑" (Case 2: xi ≠ yj) • else c[i, j] ← c[i, j - 1] • b[i, j] ← "←" • return c and b The length of the LCS is zero if one of the sequences is empty Running time: Θ(mn)
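A minimal Python transcription of LCS-LENGTH (a sketch: strings are 0-indexed, the c and b tables 1-indexed, matching the pseudocode above):

def lcs_length_table(x, y):
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]     # row 0 and column 0 stay 0
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:              # Case 1: x_i = y_j
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "↖"
            elif c[i - 1][j] >= c[i][j - 1]:      # Case 2: x_i != y_j
                c[i][j] = c[i - 1][j]
                b[i][j] = "↑"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "←"
    return c, b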
Example X = ⟨A, B, C, B, D, A, B⟩, Y = ⟨B, D, C, A, B, A⟩

c[i, j] = 0 if i = 0 or j = 0; c[i-1, j-1] + 1 if xi = yj; max(c[i, j-1], c[i-1, j]) if xi ≠ yj
If xi = yj: b[i, j] = "↖"; else if c[i-1, j] ≥ c[i, j-1]: b[i, j] = "↑", else b[i, j] = "←"

The filled c table:

        j    0   1   2   3   4   5   6
        yj       B   D   C   A   B   A
i=0          0   0   0   0   0   0   0
i=1  A       0   0   0   0   1   1   1
i=2  B       0   1   1   1   1   2   2
i=3  C       0   1   1   2   2   2   2
i=4  B       0   1   1   2   2   3   3
i=5  D       0   1   2   2   2   3   3
i=6  A       0   1   2   2   3   3   4
i=7  B       0   1   2   2   3   4   4
4. Constructing an LCS • Start at b[m, n] and follow the arrows • When we encounter a "↖" in b[i, j], xi = yj is an element of the LCS

Tracing back from b[7, 6] through the table above yields the LCS ⟨B, C, B, A⟩ (its elements are discovered in reverse order).
PRINT-LCS(b, X, i, j) • if i = 0 or j = 0 • then return • if b[i, j] = "↖" • then PRINT-LCS(b, X, i - 1, j - 1) • print xi • elseif b[i, j] = "↑" • then PRINT-LCS(b, X, i - 1, j) • else PRINT-LCS(b, X, i, j - 1) Initial call: PRINT-LCS(b, X, length[X], length[Y]) Running time: O(m + n)
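And a matching sketch of PRINT-LCS, returning the subsequence as a string rather than printing it (an illustrative variation):

def print_lcs(b, x, i, j):
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "↖":               # x_i = y_j is an element of the LCS
        return print_lcs(b, x, i - 1, j - 1) + x[i - 1]
    if b[i][j] == "↑":
        return print_lcs(b, x, i - 1, j)
    return print_lcs(b, x, i, j - 1)

# c, b = lcs_length_table("ABCBDAB", "BDCABA")
# print_lcs(b, "ABCBDAB", 7, 6) == "BCBA"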
Improving the Code • What can we say about how each entry c[i, j] is computed? • It depends only on c[i-1, j-1], c[i-1, j], and c[i, j-1] • Eliminate table b and compute in O(1) which of the three values was used to compute c[i, j] • We save Θ(mn) space from table b • However, we do not asymptotically decrease the auxiliary space requirements: we still need table c • If we only need the length of the LCS • LCS-LENGTH needs only two rows of c at a time: the row being computed and the previous row • We can reduce the asymptotic space requirements by storing only these two rows
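A sketch of the two-row space optimization just described, for when only the length is needed:

def lcs_length_two_rows(x, y):
    m, n = len(x), len(y)
    prev = [0] * (n + 1)               # row i - 1 of the c table
    for i in range(1, m + 1):
        curr = [0] * (n + 1)           # row i, being computed
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]                     # Θ(n) auxiliary space instead of Θ(mn)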
Greedy Algorithms • Similar to dynamic programming, but simpler approach • Also used for optimization problems • Idea: When we have a choice to make, make the one that looks best right now • Make a locally optimal choice in hope of getting a globally optimal solution • Greedy algorithms don’t always yield an optimal solution • When the problem has certain general characteristics, greedy algorithms give optimal solutions
Activity Selection • Schedule n activities that require exclusive use of a common resource • S = {a1, . . . , an} – set of activities • ai needs the resource during the period [si, fi) • si = start time and fi = finish time of activity ai • 0 ≤ si < fi < ∞ • Activities ai and aj are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e., fi ≤ sj or fj ≤ si)
Activity Selection Problem Select the largest possible set of nonoverlapping (mutually compatible) activities. E.g.:

i:   1   2   3   4   5   6   7   8   9   10  11
si:  1   3   0   5   3   5   6   8   8   2   12
fi:  4   5   6   7   9   9   10  11  12  14  16

• Activities are sorted in increasing order of finish times • A subset of mutually compatible activities: {a3, a9, a11} • Maximal sets of mutually compatible activities: {a1, a4, a8, a11} and {a2, a4, a9, a11}
Optimal Substructure • Define the space of subproblems: Sij = { ak ∈ S : fi ≤ sk < fk ≤ sj } • activities that start after ai finishes and finish before aj starts • Activities that are compatible with the ones in Sij: • All activities that finish by fi • All activities that start no earlier than sj
Representing the Problem • Add fictitious activities • a0 = [-∞, 0) • an+1 = [∞, "∞ + 1") • Range for Sij is 0 ≤ i, j ≤ n + 1 • In a set Sij we assume that activities are sorted in increasing order of finish times: f0 ≤ f1 ≤ f2 ≤ … ≤ fn < fn+1 • What happens if i ≥ j? • For an activity ak ∈ Sij: fi ≤ sk < fk ≤ sj < fj, contradicting fi ≥ fj! So Sij = ∅ (the set Sij must be empty!) • We only need to consider sets Sij with 0 ≤ i < j ≤ n + 1 • S = S0,n+1 is the entire space of activities
Optimal Substructure • Subproblem: select a maximum-size subset of mutually compatible activities from set Sij • Assume that a solution to the above subproblem includes activity ak (Sij is non-empty). Then: Solution to Sij = (Solution to Sik) ∪ {ak} ∪ (Solution to Skj) |Solution to Sij| = |Solution to Sik| + 1 + |Solution to Skj|
Optimal Substructure (cont.) Let Aij = an optimal solution to Sij that contains ak • Claim: the sets Aik = Aij ∩ Sik and Akj = Aij ∩ Skj must be optimal solutions to Sik and Skj • Assume there is a solution Aik′ to Sik that includes more activities than Aik. Then Size[Aij′] = Size[Aik′] + 1 + Size[Akj] > Size[Aij] • Contradiction: we assumed that Aij is a maximum-size subset of activities taken from Sij
Recursive Solution • Any optimal solution (associated with a set Sij) contains within it optimal solutions to subproblems Sik and Skj • c[i, j] = size of a maximum-size subset of mutually compatible activities in Sij • If Sij = ∅ (i.e., i ≥ j), then c[i, j] = 0
Recursive Solution If Sij ≠ ∅ and we consider that ak is used in an optimal solution (a maximum-size subset of mutually compatible activities of Sij), then c[i, j] = c[i, k] + c[k, j] + 1
Recursive Solution

c[i, j] = 0 if Sij = ∅
c[i, j] = max over i < k < j with ak ∈ Sij of { c[i, k] + c[k, j] + 1 } if Sij ≠ ∅

• There are j – i – 1 possible values for k: k = i+1, …, j – 1 • ak cannot be ai or aj (from the definition of Sij = { ak ∈ S : fi ≤ sk < fk ≤ sj }) • We check all the values and take the best one • We could now write a dynamic programming algorithm
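For concreteness, a minimal Python sketch of that dynamic program (illustrative names; activities are (start, finish) pairs, and the two fictitious activities are added internally):

def dp_activity_selection(activities):
    # add fictitious a_0 = [-inf, 0) and a_{n+1} = [inf, inf)
    acts = ([(float("-inf"), 0)]
            + sorted(activities, key=lambda a: a[1])
            + [(float("inf"), float("inf"))])
    n = len(acts)
    c = [[0] * n for _ in range(n)]
    for size in range(2, n):                   # subproblems by increasing j - i
        for i in range(0, n - size):
            j = i + size
            for k in range(i + 1, j):
                # a_k is in S_ij iff f_i <= s_k and f_k <= s_j
                if acts[i][1] <= acts[k][0] and acts[k][1] <= acts[j][0]:
                    c[i][j] = max(c[i][j], c[i][k] + c[k][j] + 1)
    return c[0][n - 1]   # size of a maximum set of compatible activities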
Theorem Let Sij ≠ ∅, and let am be the activity in Sij with the earliest finish time: fm = min { fk : ak ∈ Sij }. Then: 1. am is used in some maximum-size subset of mutually compatible activities of Sij • There exists some optimal solution that contains am 2. Sim = ∅ • Choosing am leaves Smj as the only nonempty subproblem
Proof (part 2: Sim = ∅) • Assume there is some ak ∈ Sim. Then fi ≤ sk < fk ≤ sm < fm, so fk < fm: contradiction! am would not have the earliest finish time • Hence there is no ak ∈ Sim, and Sim = ∅
Proof (part 1): Greedy Choice Property • am is used in some maximum-size subset of mutually compatible activities of Sij • Let Aij = an optimal solution for activity selection from Sij • Order the activities in Aij in increasing order of finish time, and let ak be the first activity in Aij = {ak, …} • If ak = am: done! • Otherwise, replace ak with am (resulting in a set Aij′) • Since fm ≤ fk, the activities in Aij′ are still mutually compatible • Aij′ has the same size as Aij, so am is used in some maximum-size subset of Sij
Why is the Theorem Useful? Making the greedy choice (the activity with the earliest finish time in Sij): • Reduces the number of subproblems: 1 subproblem (Smj, since Sim = ∅) instead of 2 subproblems (Sik, Skj) • Reduces the number of choices: 1 choice (the activity with the earliest finish time in Sij) instead of j – i – 1 choices • Lets us solve each subproblem in a top-down fashion
Greedy Approach • To select a maximum-size subset of mutually compatible activities from set Sij: • Choose am ∈ Sij with the earliest finish time (greedy choice) • Add am to the set of activities used in the optimal solution • Solve the same problem for the set Smj • From the theorem: by choosing am we are guaranteed to have used an activity included in an optimal solution, so we do not need to solve the subproblem Smj before making the choice! • The problem has the GREEDY CHOICE property
Characterizing the Subproblems • The original problem: find the maximum subset of mutually compatible activities for S = S0, n+1 • Activities are sorted by increasing finish time a0, a1, a2, a3, …, an+1 • We always choose an activity with the earliest finish time • Greedy choice maximizes the unscheduled time remaining • Finish time of activities selected is strictly increasing
A Recursive Greedy Algorithm Alg.: REC-ACT-SEL(s, f, i, j) • m ← i + 1 • while m < j and sm < fi ► Find the first activity in Sij • do m ← m + 1 • if m < j • then return {am} ∪ REC-ACT-SEL(s, f, m, j) • else return ∅ • Activities are ordered in increasing order of finish time • Running time: Θ(n) – each activity is examined only once • Initial call: REC-ACT-SEL(s, f, 0, n+1)
Example Tracing REC-ACT-SEL(s, f, 0, 12) on the 11 activities above: the algorithm selects a1 (m = 1), skips a2 and a3 (they start before a1 finishes), selects a4 (m = 4), skips a5, a6, a7, selects a8 (m = 8), skips a9 and a10, and finally selects a11 (m = 11), returning {a1, a4, a8, a11}.
An Iterative Greedy Algorithm Alg.: GREEDY-ACTIVITY-SELECTOR(s, f) • n ← length[s] • A ← {a1} • i ← 1 • for m ← 2 to n • do if sm ≥ fi ► activity am is compatible with ai • then A ← A ∪ {am} • i ← m ► ai is the most recent addition to A • return A • Assumes that activities are ordered in increasing order of finish time • Running time: Θ(n) – each activity is examined only once
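A Python sketch of the iterative algorithm (0-based indices here, otherwise a direct transcription; the example data in the comments is the activity table from earlier):

def greedy_activity_selector(s, f):
    n = len(s)
    selected = [0]           # the first activity is always chosen
    i = 0                    # index of the most recent addition
    for m in range(1, n):
        if s[m] >= f[i]:     # a_m is compatible with the last chosen activity
            selected.append(m)
            i = m
    return selected

# s = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
# f = [4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
# greedy_activity_selector(s, f) returns [0, 3, 7, 10], i.e. {a1, a4, a8, a11}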
Steps Toward Our Greedy Solution • Determine the optimal substructure of the problem • Develop a recursive solution • Prove that one of the optimal choices is the greedy choice • Show that all but one of the subproblems resulting from the greedy choice are empty • Develop a recursive algorithm that implements the greedy strategy • Convert the recursive algorithm to an iterative one
Designing Greedy Algorithms • Cast the optimization problem as one for which: we make a choice and are left with only one subproblem to solve • Prove that there is always an optimal solution to the original problem that makes the greedy choice • Making the greedy choice is always safe • Demonstrate that after making the greedy choice: the greedy choice + an optimal solution to the resulting subproblem leads to an optimal solution
Correctness of Greedy Algorithms • Greedy Choice Property • A globally optimal solution can be arrived at by making a locally optimal (greedy) choice • Optimal Substructure Property • We know that we have arrived at a subproblem by making a greedy choice • Optimal solution to subproblem + greedy choice ⇒ optimal solution for the original problem
Activity Selection • Greedy Choice Property: there exists an optimal solution that includes the greedy choice, the activity am with the earliest finish time in Sij • Optimal Substructure: if an optimal solution to subproblem Sij includes activity ak, it must contain optimal solutions to Sik and Skj • Similarly, am + an optimal solution to Smj ⇒ an optimal solution to Sij
Dynamic Programming vs. Greedy Algorithms • Dynamic programming • We make a choice at each step • The choice depends on solutions to subproblems • Bottom up solution, from smaller to larger subproblems • Greedy algorithm • Make the greedy choice and THEN • Solve the subproblem arising after the choice is made • The choice we make may depend on previous choices, but not on solutions to subproblems • Top down solution, problems decrease in size
The Knapsack Problem • The 0-1 knapsack problem • A thief robbing a store finds n items: the i-th item is worth vi dollars and weighs wi pounds (vi, wi integers) • The thief can only carry W pounds in his knapsack • Items must be taken entirely or left behind • Which items should the thief take to maximize the value of his load? • The fractional knapsack problem • Similar to the above, but the thief can take fractions of items
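The slide above stops at the problem statements; as a preview, here is a hedged sketch of the standard greedy solution to the fractional variant (take items in decreasing value-per-pound order, splitting the last one), the classic example of a greedy choice that works for the fractional problem but not for the 0-1 problem:

def fractional_knapsack(values, weights, W):
    items = sorted(zip(values, weights),
                   key=lambda it: it[0] / it[1],   # value per pound
                   reverse=True)
    total = 0.0
    for v, w in items:
        if W <= 0:
            break
        take = min(w, W)          # whole item, or whatever capacity remains
        total += v * take / w
        W -= take
    return total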