CS 3343: Analysis of Algorithms

More Examples on Dynamic Programming

Review of Dynamic Programming
• We've learned how to use DP to solve
  • a special shortest path problem
  • the longest common subsequence problem
  • a general sequence alignment problem
• When should I use dynamic programming?
  • Theory is a little hard to apply
  • More examples would help

Two steps to dynamic programming
• Formulate the solution as a recurrence relation over solutions to subproblems.
• Specify an order in which to solve the subproblems so that you always have what you need.

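A special shortest path problem
[Figure: a grid with S at one corner and G at the opposite corner; m and n label the grid dimensions]
• Each edge has a length (cost).
• We need to get to G from S.
• We can only move right or down.
• Aim: find a path with the minimum total length.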

Recursive thinking
• Suppose we've found the shortest path
• It must use one of the two edges:
  • (m, n-1) to (m, n): Case 1
  • (m-1, n) to (m, n): Case 2
• If case 1
  • find the shortest path from (0, 0) to (m, n-1)
  • SP(0, 0, m, n-1) + dist(m, n-1, m, n) is the length of the overall shortest path
• If case 2
  • find the shortest path from (0, 0) to (m-1, n)
  • SP(0, 0, m-1, n) + dist(m-1, n, m, n) is the length of the overall shortest path
• We don't know which case is true
  • But if we've found the two paths, we can compare
  • The real shortest path is the one with the shorter overall length

Recursive formulation
• Let F(i, j) = SP(0, 0, i, j). => F(m, n) is the length of the SP from (0, 0) to (m, n)

$$F(m, n) = \min \begin{cases} F(m-1, n) + \mathrm{dist}(m-1, n, m, n) \\ F(m, n-1) + \mathrm{dist}(m, n-1, m, n) \end{cases}$$

• Generalize:

$$F(i, j) = \min \begin{cases} F(i-1, j) + \mathrm{dist}(i-1, j, i, j) \\ F(i, j-1) + \mathrm{dist}(i, j-1, i, j) \end{cases} \qquad i = 1, \dots, m,\ j = 1, \dots, n$$

• Boundary condition: i = 0 or j = 0. Easy to figure out manually.
• Number of subproblems = m * n; this determines the structure of the DP table
• Data dependency determines the order in which to compute the entries (i, j)
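The recurrence for F(i, j) can be sketched in Python as follows. This is a minimal sketch, not the course's reference code. The input layout is an assumption made for illustration: `h[i][j]` holds the length of the edge from (i, j) to (i, j+1), and `v[i][j]` the length of the edge from (i, j) to (i+1, j). The name `grid_shortest_path` is hypothetical.

```python
def grid_shortest_path(h, v):
    """Length of the shortest right/down path from (0, 0) to (m, n).

    Assumed layout (not from the slides):
      h[i][j] = length of edge (i, j) -> (i, j+1), for i = 0..m, j = 0..n-1
      v[i][j] = length of edge (i, j) -> (i+1, j), for i = 0..m-1, j = 0..n
    """
    m = len(v)       # number of steps along the first coordinate
    n = len(h[0])    # number of steps along the second coordinate
    F = [[0] * (n + 1) for _ in range(m + 1)]

    # Boundary: only one path reaches any node with i = 0 or j = 0.
    for j in range(1, n + 1):
        F[0][j] = F[0][j - 1] + h[0][j - 1]
    for i in range(1, m + 1):
        F[i][0] = F[i - 1][0] + v[i - 1][0]

    # Row-by-row order guarantees F[i-1][j] and F[i][j-1] are ready.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = min(F[i - 1][j] + v[i - 1][j],
                          F[i][j - 1] + h[i][j - 1])
    return F[m][n]


if __name__ == "__main__":
    h = [[3, 1], [1, 9], [2, 2]]
    v = [[1, 5, 1], [4, 1, 7]]
    print(grid_shortest_path(h, v))
```

Filling the table row by row is one valid order; column by column would satisfy the same data dependency.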

Longest Common Subsequence
• Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both.
  • "a", not "the": a longest common subsequence need not be unique
• Example: x: A B C B D A B, y: B D C A B A
  • BCBA = LCS(x, y)
  • LCS(x, y) is functional notation, but LCS is not a function

Recursive thinking
• Case 1: x[m] = y[n]. There is an optimal LCS that matches x[m] with y[n].
  • Find LCS(x[1..m-1], y[1..n-1])
• Case 2: x[m] ≠ y[n]. At most one of them is in the LCS
  • Case 2.1: x[m] not in LCS. Find LCS(x[1..m-1], y[1..n])
  • Case 2.2: y[n] not in LCS. Find LCS(x[1..m], y[1..n-1])

Recursive thinking
• Case 1: x[m] = y[n] (reduce both sequences by 1 char)
  • LCS(x, y) = LCS(x[1..m-1], y[1..n-1]) || x[m], where || means concatenate
• Case 2: x[m] ≠ y[n] (reduce either sequence by 1 char)
  • LCS(x, y) = LCS(x[1..m-1], y[1..n]) or LCS(x[1..m], y[1..n-1]), whichever is longer

Recursive formulation
• Let c[i, j] be the length of LCS(x[1..i], y[1..j]) => c[m, n] is the length of LCS(x, y)

$$c[m, n] = \begin{cases} c[m-1, n-1] + 1 & \text{if } x[m] = y[n], \\ \max\{c[m-1, n],\ c[m, n-1]\} & \text{otherwise.} \end{cases}$$

• Generalize:

$$c[i, j] = \begin{cases} c[i-1, j-1] + 1 & \text{if } x[i] = y[j], \\ \max\{c[i-1, j],\ c[i, j-1]\} & \text{otherwise,} \end{cases} \qquad i = 1, \dots, m,\ j = 1, \dots, n$$

• Boundary condition: i = 0 or j = 0. Easy to figure out manually.
• Number of subproblems = m * n
• Order in which to compute the entries (i, j)?
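As an illustration, the recurrence for c[i, j] can be coded directly. The sketch below fills the table row by row and then walks back from c[m, n] to recover one LCS. The function names `lcs` and `is_subsequence` are hypothetical, and the tie-breaking in the traceback is an arbitrary choice, so a different but equally long LCS may be returned than the one shown on the slide.

```python
def lcs(x, y):
    """Return one longest common subsequence of strings x and y."""
    m, n = len(x), len(y)
    # c[i][j] = length of an LCS of x[:i] and y[:j]; row 0 and column 0 stay 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Trace back from c[m][n] to collect the matched characters.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def is_subsequence(s, t):
    """True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


if __name__ == "__main__":
    print(lcs("ABCBDAB", "BDCABA"))
```

Both the time and the memory of this version are proportional to m * n.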

Another DP example
• You work in the fast food business
• Your company plans to open new restaurants in Texas along I-35
• The towns along the highway are called t1, t2, …, tn
• A restaurant at ti has an estimated annual profit pi
• No two restaurants can be located within 10 miles of each other due to some regulation
• Your boss wants to maximize the total profit
• You want a big bonus

Brute-force
• Each town is either selected or not selected
• Test each of the 2^n subsets
  • Eliminate subsets that violate the constraints
  • Compute the total profit for each remaining subset
  • Choose the one with the highest profit
• Θ(n 2^n)

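Natural greedy 1
• Take the first town. Then repeatedly take the next town that is ≥ 10 miles from the last town taken
• Can you give an example for which this algorithm doesn't return the correct solution?
[Figure: example with three towns, profits 100k, 100k, 500k]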

Natural greedy 2
• Always take the town with the highest profit that is not within 10 miles of another selected town
• Can you give an example for which this algorithm doesn't return the correct solution?
[Figure: example with three towns, profits 300k, 300k, 500k]

A DP algorithm
• Suppose you've already found the optimal solution
• It will either include tn or not include tn
• Case 1: tn not included in the optimal solution
  • Best solution is the same as the best solution for t1, …, tn-1
• Case 2: tn included in the optimal solution
  • Best solution is pn + best solution for t1, …, tj, where j < n is the largest index such that dist(tj, tn) ≥ 10

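Recurrence formulation
• Let S(i) be the total profit of the optimal solution when the first i towns are considered (not necessarily selected)
• S(n) is the optimal solution to the complete problem

$$S(n) = \max \begin{cases} S(n-1) \\ S(j) + p_n, & j < n \text{ and } \mathrm{dist}(t_j, t_n) \ge 10 \end{cases}$$

• Generalize:

$$S(i) = \max \begin{cases} S(i-1) \\ S(j) + p_i, & j < i \text{ and } \mathrm{dist}(t_j, t_i) \ge 10 \end{cases}$$

• Here j is the largest index that satisfies the condition; if no town qualifies, j = 0
• Dependency: S(i) depends on S(i-1) and S(j)
• Number of sub-problems: n. Boundary condition: S(0) = 0.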

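Example
• A dummy town with profit 0 lies 100 miles before t1

i                            1   2   3   4   5   6   7   8   9  10
Distance from t(i-1) (mi)  100   5   2   2   6   6   3   6  10   7
Profit pi (100k)             6   7   9   8   3   3   2   4  12   5
S(i)                         6   7   9   9  10  12  12  14  26  26

• Natural greedy 1: 6 + 3 + 4 + 12 = 25
• Natural greedy 2: 12 + 9 + 3 = 24
• Optimal: 26 (7 + 3 + 4 + 12)

$$S(i) = \max \begin{cases} S(i-1) \\ S(j) + p_i, & j < i \text{ and } \mathrm{dist}(t_j, t_i) \ge 10 \end{cases}$$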

Complexity
• Time: O(nk), where k is the maximum number of towns that are within 10 miles to the left of any town
  • In the worst case, O(n^2)
  • Can be improved to O(n) with some preprocessing tricks
• Memory: Θ(n)
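One possible way to reach the linear bound, sketched under the assumption that towns are given as sorted positions in miles: as i grows, the index j of the last town at least 10 miles to the left never decreases, so a single pointer can track it across the whole loop. The function name `restaurant_profits`, the parameter `min_gap`, and the input format are hypothetical. The sample data is one reading of the example slide (positions accumulated from the consecutive distances) and should be treated as an assumption.

```python
def restaurant_profits(pos, profit, min_gap=10):
    """Return the list S, where S[i] is the best total profit using towns 1..i.

    pos:    town positions in miles, sorted in increasing order (assumed input format)
    profit: profit[i - 1] is the profit of town i
    """
    n = len(pos)
    S = [0] * (n + 1)   # S[0] = 0 is the boundary condition
    j = 0               # largest index with dist(t_j, t_i) >= min_gap, or 0 if none
    for i in range(1, n + 1):
        # Advance j while town j + 1 is still far enough to the left of town i.
        while j + 1 < i and pos[i - 1] - pos[j] >= min_gap:
            j += 1
        S[i] = max(S[i - 1],               # town i not selected
                   S[j] + profit[i - 1])   # town i selected
    return S


if __name__ == "__main__":
    pos = [100, 105, 107, 109, 115, 121, 124, 130, 140, 147]
    profit = [6, 7, 9, 8, 3, 3, 2, 4, 12, 5]
    print(restaurant_profits(pos, profit))
```

The pointer j moves forward at most n times in total, so the loop runs in O(n) time once the positions are sorted.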

Knapsack problem
• Each item has a value and a weight
• Objective: maximize value
• Constraint: knapsack has a weight limitation
• Three versions:
  • 0-1 knapsack problem: take each item or leave it
  • Fractional knapsack problem: items are divisible
  • Unbounded knapsack problem: unlimited supplies of each item
• Which one is easiest to solve?
• We study the 0-1 problem today.

Formal definition (0-1 problem)
• Knapsack has weight limit W
• Items labeled 1, 2, …, n (arbitrarily)
• Items have weights w1, w2, …, wn
  • Assume all weights are integers
  • For practical reasons, only consider wi < W
• Items have values v1, v2, …, vn
• Objective: find a subset of items, S, such that $\sum_{i \in S} w_i \le W$ and $\sum_{i \in S} v_i$ is maximal among all such (feasible) subsets

Naïve algorithms
• Enumerate all subsets.
  • Optimal, but exponential time
• Greedy 1: take the item with the largest value
  • Not optimal
  • Give an example
• Greedy 2: take the item with the largest value/weight ratio
  • Not optimal
  • Give an example

A DP algorithm
• Suppose you've found the optimal solution S
• Case 1: item n is included
  • Find an optimal solution using items 1, 2, …, n-1 with weight limit W - wn
• Case 2: item n is not included
  • Find an optimal solution using items 1, 2, …, n-1 with weight limit W

Recursive formulation
• Let V[i, w] be the optimal total value when items 1, 2, …, i are considered for a knapsack with weight limit w => V[n, W] is the optimal solution

$$V[n, W] = \max \begin{cases} V[n-1, W-w_n] + v_n \\ V[n-1, W] \end{cases}$$

• Generalize:

$$V[i, w] = \begin{cases} \max \begin{cases} V[i-1, w-w_i] + v_i & \text{item } i \text{ is taken} \\ V[i-1, w] & \text{item } i \text{ not taken} \end{cases} & \text{if } w_i \le w \\ V[i-1, w] & \text{if } w_i > w \text{ (item } i \text{ not taken)} \end{cases}$$

• Boundary condition: V[i, 0] = 0, V[0, w] = 0.
• Number of sub-problems = ?

Example
• n = 6 (# of items)
• W = 10 (weight limit)
• Items (weight, value): (2, 2), (4, 3), (3, 3), (5, 6), (2, 4), (6, 9)

Example (continued)
• Row i = 0 and column w = 0 are initialized to 0; each entry V[i, w] is computed from V[i-1, w] and V[i-1, w-wi]

$$V[i, w] = \begin{cases} \max \begin{cases} V[i-1, w-w_i] + v_i & \text{item } i \text{ is taken} \\ V[i-1, w] & \text{item } i \text{ not taken} \end{cases} & \text{if } w_i \le w \\ V[i-1, w] & \text{if } w_i > w \text{ (item } i \text{ not taken)} \end{cases}$$

V[i, w] for w = 0, 1, …, 10:

 i  wi  vi |   0   1   2   3   4   5   6   7   8   9  10
 0   -   - |   0   0   0   0   0   0   0   0   0   0   0
 1   2   2 |   0   0   2   2   2   2   2   2   2   2   2
 2   4   3 |   0   0   2   2   3   3   5   5   5   5   5
 3   3   3 |   0   0   2   3   3   5   5   6   6   8   8
 4   5   6 |   0   0   2   3   3   6   6   8   9   9  11
 5   2   4 |   0   0   4   4   6   7   7  10  10  12  13
 6   6   9 |   0   0   4   4   6   7   9  10  13  13  15

• Optimal value: 15
• Items: 6, 5, 1
• Weight: 6 + 2 + 2 = 10
• Value: 9 + 4 + 2 = 15
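The table-filling procedure can be sketched as below. This is a minimal bottom-up version, with the hypothetical name `knapsack_01`, that also traces back through the table to report which items were taken. The traceback rule used here (an item was taken exactly when V[i][w] differs from V[i-1][w]) is one common choice, not necessarily the one used in class.

```python
def knapsack_01(items, W):
    """items: list of (weight, value) pairs, item i is items[i - 1]; W: integer limit.

    Returns (optimal value, sorted list of chosen item numbers).
    """
    n = len(items)
    # V[i][w] = best value using items 1..i with weight limit w.
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, vi = items[i - 1]
        for w in range(W + 1):
            V[i][w] = V[i - 1][w]                    # item i not taken
            if wi <= w:                              # item i fits: try taking it
                V[i][w] = max(V[i][w], V[i - 1][w - wi] + vi)

    # Trace back: item i was taken iff it changed the value in its row.
    chosen = []
    w = W
    for i in range(n, 0, -1):
        if V[i][w] != V[i - 1][w]:
            chosen.append(i)
            w -= items[i - 1][0]
    return V[n][W], sorted(chosen)


if __name__ == "__main__":
    items = [(2, 2), (4, 3), (3, 3), (5, 6), (2, 4), (6, 9)]
    print(knapsack_01(items, 10))
```

On the six-item example with W = 10 this should agree with the result stated on the slide: value 15 using items 1, 5, and 6.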

Time complexity
• Θ(nW)
• Polynomial?
  • Pseudo-polynomial
  • Works well if W is small
• Consider the following items (weight, value): (10, 5), (15, 6), (20, 5), (18, 6)
  • Weight limit 35
  • Optimal solution: items 2 and 4 (value = 12)
  • Enumeration: 2^4 = 16 subsets
  • Dynamic programming: fill in a table of 4 x 35 = 140 entries
• What's the problem?
  • Many entries are unused: no such weight combination
  • Top-down may be better
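To illustrate the top-down idea, here is a memoized sketch that only evaluates the (i, w) pairs actually reachable from (n, W). The function name `knapsack_top_down` and the choice to also return the number of stored entries are assumptions made for illustration.

```python
def knapsack_top_down(items, W):
    """items: list of (weight, value) pairs; W: integer weight limit.

    Returns (optimal value, number of memoized sub-problems).
    """
    memo = {}

    def best(i, w):
        # Best value using items 1..i with weight limit w.
        if i == 0:
            return 0
        if (i, w) not in memo:
            wi, vi = items[i - 1]
            skip = best(i - 1, w)                 # item i not taken
            if wi <= w:
                take = best(i - 1, w - wi) + vi   # item i taken
                memo[(i, w)] = max(skip, take)
            else:
                memo[(i, w)] = skip
        return memo[(i, w)]

    value = best(len(items), W)
    return value, len(memo)


if __name__ == "__main__":
    value, states = knapsack_top_down([(10, 5), (15, 6), (20, 5), (18, 6)], 35)
    print(value, states)
```

For the four-item example, the number of stored entries stays well below the 140 cells of the full table, because only weight limits that some combination of items can actually produce are ever visited. Note that recursion depth grows with n, so very long item lists would need an iterative variant.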

A few more examples

Longest increasing subsequence
• Given a sequence of numbers: 1 2 5 3 2 9 4 9 3 5 6 8
• Find a longest subsequence that is non-decreasing
  • E.g. 1 2 5 9
• It has to be a subsequence of the original list
• It has to be in sorted order
  • => It is a subsequence of the sorted list
• So it is an LCS of the original list and the sorted list:
  • Original list: 1 2 5 3 2 9 4 9 3 5 6 8
  • Sorted: 1 2 2 3 3 4 5 5 6 8 9 9
  • LCS: 1 2 3 4 5 6 8
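The reduction to LCS can be sketched as follows: sort a copy of the list, then run the LCS recurrence on the original and the sorted copy. The function name `longest_nondecreasing_subsequence` is hypothetical, and ties in the traceback are broken arbitrarily, so the returned sequence may differ from the one on the slide while having the same length.

```python
def longest_nondecreasing_subsequence(a):
    """Return one longest non-decreasing subsequence of the list a,
    computed as an LCS of a and sorted(a)."""
    b = sorted(a)
    n = len(a)
    # c[i][j] = length of an LCS of a[:i] and b[:j].
    c = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Trace back to recover the subsequence itself.
    out = []
    i = j = n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    out.reverse()
    return out


if __name__ == "__main__":
    print(longest_nondecreasing_subsequence([1, 2, 5, 3, 2, 9, 4, 9, 3, 5, 6, 8]))
```

This takes time and memory proportional to n^2. Keeping duplicates in the sorted copy is what makes the result non-decreasing rather than strictly increasing.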

Events scheduling problem
• A list of events to schedule (or shows to see)
  • ei has start time si and finishing time fi
  • Indexed such that fi < fj if i < j
• Each event has a value vi
• Schedule to make the largest value
  • You can attend only one event at any time
• Very similar to the new restaurant location problem
  • Sort events according to their finish time
  • Consider: is the last event included or not?
[Figure: events e1, …, e9 drawn as intervals along a time axis]

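Events scheduling problem
• V(i) is the optimal value that can be achieved when the first i events are considered

$$V(n) = \max \begin{cases} V(n-1) & e_n \text{ not selected} \\ V(j) + v_n & e_n \text{ selected},\ j < n \text{ and } f_j < s_n \end{cases}$$

• As in the restaurant problem, j is the largest index that satisfies the condition (j = 0 if there is none)
[Figure: events e1, …, e9 along a time axis, with s7, f7, s8, f8, s9, f9 marked]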
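A possible implementation of this recurrence is sketched below. It assumes events are given as hypothetical `(start, finish, value)` tuples with start < finish, sorts them by finish time, and uses binary search from the standard `bisect` module to find j. The function name `max_event_value` and the sample events are made up for illustration.

```python
from bisect import bisect_left


def max_event_value(events):
    """events: list of (start, finish, value) tuples with start < finish.

    Returns the largest total value of pairwise compatible events, where
    event j is compatible with a later event i only if f_j < s_i.
    """
    events = sorted(events, key=lambda e: e[1])   # index by finish time
    finishes = [f for _, f, _ in events]
    n = len(events)
    V = [0] * (n + 1)                             # V[0] = 0
    for i in range(1, n + 1):
        s, _, v = events[i - 1]
        # j = number of events among the first i - 1 that finish strictly before s,
        # which is also the largest index j with f_j < s_i.
        j = bisect_left(finishes, s, 0, i - 1)
        V[i] = max(V[i - 1],      # event i not selected
                   V[j] + v)      # event i selected
    return V[n]


if __name__ == "__main__":
    sample = [(1, 3, 5), (2, 5, 6), (4, 6, 5), (6, 7, 4), (5, 8, 11), (7, 9, 2)]
    print(max_event_value(sample))
```

With the sort and one binary search per event, the running time is O(n log n).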

Coin change problem
• Given some denominations of coins (e.g., 2, 5, 7, 10), decide if it is possible to make change for a value (e.g., 13), or minimize the number of coins
• Version 1: Unlimited number of coins for each denomination
  • Unbounded knapsack problem
• Version 2: Use each denomination at most once
  • 0-1 knapsack problem
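Both versions can be sketched with a one-dimensional table indexed by the amount. The function names `min_coins_unlimited` and `min_coins_once` are hypothetical, and returning `None` when no change can be made is a convention chosen here. The only difference between the two is the loop order: iterating amounts downward inside the coin loop ensures each denomination is used at most once.

```python
INF = float("inf")


def min_coins_unlimited(coins, target):
    """Fewest coins summing to target, unlimited supply of each denomination."""
    best = [0] + [INF] * target          # best[a] = fewest coins that make amount a
    for amount in range(1, target + 1):
        for c in coins:
            if c <= amount:
                best[amount] = min(best[amount], best[amount - c] + 1)
    return None if best[target] == INF else best[target]


def min_coins_once(coins, target):
    """Fewest coins summing to target, each denomination used at most once."""
    best = [0] + [INF] * target
    for c in coins:
        # Downward order: best[amount - c] does not yet include coin c.
        for amount in range(target, c - 1, -1):
            best[amount] = min(best[amount], best[amount - c] + 1)
    return None if best[target] == INF else best[target]


if __name__ == "__main__":
    print(min_coins_unlimited([2, 5, 7, 10], 13))
    print(min_coins_once([2, 5, 7, 10], 13))
```

For the denominations 2, 5, 7, 10 and the value 13, version 1 has a solution (for instance 7 + 2 + 2 + 2), while version 2 does not, since no subset of distinct denominations sums to 13.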

Use DP algorithms to solve new problems
• Directly map a new problem to a known problem
• Modify an algorithm for a similar task
• Design your own
  • Think about the problem recursively
  • The optimal solution to a larger problem can be computed from the optimal solutions of one or more subproblems
  • These subproblems can be solved in a certain manageable order
  • Works nicely for naturally ordered data such as strings, trees, and some special graphs
  • Trickier for general graphs
• The textbook has some very good exercises.
