CS 361 – Chapter 5 • Section 5.1 – “Greedy algorithms” • A strategy for solving certain problems • Need to make a series of choices • Each choice is made to maximize current benefit, ignoring the future. • Often, some sorting is necessary. • How do we know a greedy technique works? We try to show that no other algorithm can achieve a better result. • Several examples
(1) Maximizing jobs • Given a set of jobs, with specific start and finish times, what is the maximum # of jobs we can schedule? • Assume that you can’t do 2 jobs at the same time. • Assume time is represented as a whole number, and you may start a job as soon as another is finished. • Try this: Sort list of jobs ascending by finish time. W = set of jobs we’ll do: initialize to empty. for j = 1 to n: if (job j doesn’t conflict with W) add j to W return W
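In Python, the greedy rule might look like this (a quick sketch, not from the original slides). Because the jobs are sorted by finish time, checking “doesn’t conflict with W” only requires comparing against the finish time of the last job added:

def max_jobs(jobs):
    # Greedy interval scheduling: sort by finish time, then take every job
    # that starts no earlier than the last accepted job's finish.
    W = []
    last_finish = float("-inf")
    for start, finish in sorted(jobs, key=lambda j: j[1]):
        if start >= last_finish:          # no conflict with W
            W.append((start, finish))
            last_finish = finish
    return W

print(max_jobs([(1,4), (3,5), (0,6), (4,7), (3,8), (5,9), (6,10), (8,11)]))
# [(1, 4), (4, 7), (8, 11)] -- the same 3 jobs found in the example that follows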
Example • Suppose we have this list of jobs, with (start, finish) times. j1=(1,4) j2=(3,5) j3=(0,6) j4=(4,7) j5=(3,8) j6=(5,9) j7=(6,10) j8=(8,11) • Jobs are already sorted. • Add j1 to W. • We see that j2 and j3 conflict. (They don’t get reconsidered later.) • Add j4 to W. • We see that j5, j6 and j7 conflict. • Finally, add j8 to W.
Optimal • This greedy algorithm produces an optimal schedule. • Recall the goal is to maximize the # of jobs done, given fixed start and finish times. • If the algorithm schedules k jobs, no other algorithm (an optimal solution) can schedule k+1. • Proof by contradiction • Let i1, i2, i3, … ik be the jobs chosen by the greedy algorithm. • Let j1, j2, j3, … jm be the jobs in an optimal solution. (m > k) • Define r: the largest integer such that the first r jobs in each list match. • The jobs in position r+1 differ. Since the greedy algorithm always picks the earliest available finish time, its job #r+1 finishes no later than the optimal solution’s job #r+1. • So in the optimal solution, we can replace its job #r+1 with greedy’s job #r+1. The set of jobs is still legal and optimal. But we have a contradiction, because now the first r+1 jobs match.
(2) Minimizing rooms • A conference consists of many events: lectures and workshops of various lengths. • Like a job, each event has a known start and finish time. • To simplify the model, again we’ll assume time is expressed in whole numbers. • Produce a schedule that minimizes the number of rooms. • Assume that no 2 events may take place at the same time in the same room. • Inside a room, an event may begin as soon as another one ends. • Similar to the previous problem, except we have to schedule all events. We no longer care whether any single room hosts a maximal number of events.
Algorithm Sort list of events ascending by start time. numRooms = 0 for i = 1 to n: if event #i is compatible with some room r schedule event #i in room r. else ++numRooms schedule event #i in the new room return room list schedule • Example e1(0,7) e2(0,3) e3(4,7) e4(4,10) e5(8,11) e6(8,11) e7(10,15) e8(12,15)
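One possible Python sketch of this algorithm (my own rendering, not the slides’ code). Keeping each room’s latest finish time in a min-heap makes “compatible with some room” a single check: the only room worth testing is the one that frees up earliest:

import heapq

def min_rooms(events):
    # Greedy room assignment: process events in order of start time and
    # reuse the room that becomes free the earliest, if compatible.
    rooms = []                            # heap of per-room finish times
    for start, finish in sorted(events):
        if rooms and rooms[0] <= start:   # earliest-free room is compatible
            heapq.heapreplace(rooms, finish)
        else:
            heapq.heappush(rooms, finish) # open a new room
    return len(rooms)

events = [(0,7), (0,3), (4,7), (4,10), (8,11), (8,11), (10,15), (12,15)]
print(min_rooms(events))                  # 3 rooms for the example events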
Optimal • We can scan the list of events to see what the minimum number of rooms is: at each time t, see how many events overlap. • We want to show the greedy algorithm generates a schedule with this optimal number of rooms. • Let d = # of rooms the greedy algorithm schedules. • Room #d was created because some event #i conflicted with events in all of the other d – 1 rooms. • Those d – 1 conflicting events all finish after #i’s start time, or else there would have been no conflict. • They also started no later than event #i did (we sorted by start time). • Therefore, at the time event #i starts, we indeed have d events taking place at the same time. So even the optimal schedule must use d rooms.
(3) Fractional knapsack • You discover a treasure chest. But you can only carry away some of the treasure. Each type of treasure has some weight and some value. • Subject to a total weight constraint, we need to maximize the total value of treasure to haul away. • Formally: • For each item i in some set, we have a value v_i and a weight w_i. We may take an amount x_i of the item, where 0 ≤ x_i ≤ w_i. • Limits: the total weight W you may take away, and the available weight w_i of each item. • We wish to maximize Σ x_i v_i / w_i. You can think of x_i / w_i as the proportion of item i’s total weight w_i that you are taking. • For example, the chest may contain silver and gold; nickels and dimes, etc.
Algorithm We’re given L, list of items to loot. Sort L descending by v_i/w_i. This ratio is a rate, such as value per pound. w = 0 while w < weightLimit: x_i = Take as much of L[0] as you can. w += x_i If L[0] depleted, remove it from list. • Not hard to see that there is no other way to return with more valuable loot given our weight limit. • Final note: what is the complexity of the algorithms we’ve seen today?
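A minimal Python sketch of this greedy loop (names and structure are my own, not the slides’):

def fractional_knapsack(items, W):
    # items: list of (value, weight) pairs; W: total weight we can carry.
    # Take items in decreasing order of value/weight, splitting the last one.
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if W <= 0:
            break
        take = min(weight, W)             # as much of this item as fits
        total += value * take / weight
        W -= take
    return total

# e.g. 50 lb capacity; the item with the best rate is taken first
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))   # 240.0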
Divide & Conquer • (See section 5.2) • Master theorem • A shortcut way of solving many, but not all, recurrences. • The matrix multiplication problem
Master theorem • What is the complexity of an algorithm if we have this type of recurrence: T(n) = a T(n / b) + f(n) • For example, splitting a problem into “a” instances of the problem, each with size 1/b of the original. • 3 cases, basically comparing n^(log_b a) with f(n) to see which is “larger”:
How to use • If we are given T(n) = a T(n / b) + f(n) • Compare n^(log_b a) with f(n) • Case 1: If n^(log_b a) is larger, then T(n) = O(n^(log_b a)). • Case 2: If they are equal, then T(n) = O(f(n) log2 n). Incidentally, if f(n) is exactly (log2 n)^k times n^(log_b a), then T(n) = O(n^(log_b a) (log2 n)^(k+1)). • Case 3: If f(n) is larger, then T(n) = O(f(n)). But we also need to verify the regularity condition a f(n/b) ≤ c f(n) for some constant c < 1, or else the master theorem can’t guarantee an answer.
Examples • T(n) = 9 T(n/3) + n • T(n) = T((2/3) n) + 1 • T(n) = 3 T(n/4) + n log2 n • T(n) = 4 T(n/2) + n • T(n) = 4 T(n/2) + n^2 • T(n) = 4 T(n/2) + n^3 • T(n) = 2 T(n/2) + n^3 • T(n) = T(9n/10) + n • T(n) = 16 T(n/4) + n^2 • T(n) = 7 T(n/3) + n^2 • T(n) = 7 T(n/2) + n^2 • T(n) = 2 T(n/4) + n^0.5 • T(n) = 3 T(n/2) + n log2 n • T(n) = 2 T(n/2) + n / log2 n • T(n) = 2 T(n/2) + n log2 n • T(n) = 8 T(n/2) + n^3 (log2 n)^5
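For recurrences whose driving function has the polylog form f(n) = n^c (log n)^k, the case analysis can even be mechanized. A rough Python sketch of my own (it covers only that form; it does not check case 3’s regularity condition, and it can’t handle an f(n) like n / log2 n from the list above):

import math

def master(a, b, c, k=0):
    # Classify T(n) = a T(n/b) + n^c (log n)^k by the master theorem.
    e = math.log(a) / math.log(b)         # critical exponent log_b(a)
    if math.isclose(c, e):
        return f"Case 2: O(n^{e:g} (log n)^{k + 1})"
    if c < e:
        return f"Case 1: O(n^{e:g})"
    return f"Case 3: O(n^{c:g}" + (f" (log n)^{k})" if k else ")")

print(master(9, 3, 1))    # example 1 above: Case 1: O(n^2)
print(master(4, 2, 2))    # example 5 above: Case 2: O(n^2 (log n)^1)
print(master(2, 2, 3))    # example 7 above: Case 3: O(n^3)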
Var substitution • Sometimes the recurrence is not expressed as a fraction of n. (See bottom of page 269.) • Example: T(n) = T(n^0.5) + 1 • Let x = log2 n. This means that n = 2^x and n^0.5 = 2^(x/2) • The T(n) formula becomes T(2^x) = T(2^(x/2)) + 1 • If you don’t like powers of 2 inside T, temporarily define a similar function S(x) = T(2^x). • Now it is: S(x) = S(x/2) + 1. We know how to solve it. • S(x) = O(log2 x). • Thus, T(2^x) = O(log2 x). Replace x with log2 n. • Final answer: T(n) = O(log2 (log2 n)).
Finding inversions • Two judges have ranked n movies. We want to measure how similar their rankings are. • Approach: • First, arrange the movies in the order of the first judge’s ranking. • Then look for numerical “inversions” in the second judge’s list. • Example:
How to do it • We’re given (3, 1, 5, 2, 4). Compare with (1, 2, 3, 4, 5). • One obvious solution is O(n^2): look at all possible pairs of rankings and count those that are out of order. • For each # in the list, see if subsequent values are smaller. Each such case is an inversion. • 3 > 1, 2 • 1 > [nothing] • 5 > 2, 4 • 2 > [nothing] • 4 > [nothing] • The answer is 4 inversions. • But we can devise an O(n log2 n) solution that operates like merge sort!
Algorithm • Split up the list into halves until you have singleton lists. • Our merge procedure will • Take as input 2 sorted lists, along with the number of inversions found within each • Return a (sorted) merged list and the total number of inversions • A singleton list has no inversions: return 0. • How to combine: • Let i point to the front/smallest value in A. • Let j point to the front/smallest value in B. • Which is smaller, A[i] or B[j]? Remove the smaller to C. • If it came from B, increment count by the number of elements remaining in A (each of them forms an inversion with B[j]). • When one list is empty, append the rest of the other; you are essentially done. • Return # of inversions = count + A.count + B.count
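Here is the merge-based counter as a runnable Python sketch (returning the sorted list alongside the count, as described above):

def count_inversions(a):
    # Returns (sorted copy of a, number of inversions in a).
    if len(a) <= 1:
        return a, 0                       # singleton: no inversions
    mid = len(a) // 2
    left, inv_l = count_inversions(a[:mid])
    right, inv_r = count_inversions(a[mid:])
    merged, count = [], inv_l + inv_r
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            count += len(left) - i        # every remaining left value beats B[j]
    merged += left[i:] + right[j:]        # one side is empty now
    return merged, count

print(count_inversions([5, 4, 7, 3, 2, 6, 1, 8])[1])   # 15, as in the worked example that follows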
Example • Count inversions in: 5, 4, 7, 3, 2, 6, 1, 8 • Along the way, we find that: • (5, 4) has 1 inversion • (7, 3) has 1 inversion • ((4, 5), (3, 7)) has 2 additional inversions, subtotal 1+1+2 = 4 • ((2, 6), (1, 8)) has 2 additional inversions, subtotal 0+0+2 = 2 • ((3, 4, 5, 7), (1, 2, 6, 8)) has 9 additional inversions, bringing our total to 4 + 2 + 9 = 15. • We can verify that there are in fact 15 inversions. • Interesting that we have an O(n log2 n) algorithm to find inversions, even though our answer can be as high as C(n, 2), which is O(n^2). Sorting helped!
Matrix operations • Many numerical problems in science & engineering involve manipulating values in 2-d arrays called matrices. • Let’s assume the dimensions are n x n. • Adding and subtracting matrices is simple: just add corresponding values. O(n^2). • Multiplication is more complicated. For example, if C = AB: • C[1,2] = A[1,1]*B[1,2] + A[1,2]*B[2,2], etc. • In general, C[i,j] = sum from k=1 to n of A[i,k]*B[k,j]
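As a sketch, the definition translates directly into a triple loop in Python:

def mat_mult(A, B):
    # Textbook multiplication of two n x n matrices: O(n^3) scalar multiplies.
    # C[i][j] = sum over k of A[i][k] * B[k][j].
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

print(mat_mult([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]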
Multiplication • Matrix multiplication can thus be done in O(n^3) time, compared to O(n^2) for addition and subtraction. • Actually, we are overstating the complexity, because usually “n” means the size of the input. The complexities n^2 and n^3 here assume the input consists of n^2 values. • But in the realm of linear algebra, “n” means the number of rows/columns of data. • How can we devise a divide-and-conquer algorithm for matrix multiplication? • Partition the matrices into 4 quadrants each. • Compute the quadrant products independently.
Example Multiply quadrants this way: [A B; C D] x [E F; G H] = [AE+BG AF+BH; CE+DG CF+DH] Compute the 8 products AE, BG, etc. and then combine the results with matrix additions.
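A Python sketch of the quadrant scheme (using numpy slices for the bookkeeping, and assuming n is a power of 2 to keep it short):

import numpy as np

def block_multiply(X, Y):
    # Divide-and-conquer multiply via the 8 quadrant products above.
    n = X.shape[0]
    if n == 1:
        return X * Y
    h = n // 2
    A, B, C, D = X[:h, :h], X[:h, h:], X[h:, :h], X[h:, h:]
    E, F, G, H = Y[:h, :h], Y[:h, h:], Y[h:, :h], Y[h:, h:]
    top = np.hstack([block_multiply(A, E) + block_multiply(B, G),
                     block_multiply(A, F) + block_multiply(B, H)])
    bottom = np.hstack([block_multiply(C, E) + block_multiply(D, G),
                        block_multiply(C, F) + block_multiply(D, H)])
    return np.vstack([top, bottom])

X = np.arange(16).reshape(4, 4)
print(np.array_equal(block_multiply(X, X), X @ X))    # True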
Divide/conquer, cont’d • Finish matrix multiplication example • Closest pair problem • Commitment: • Please read section 5.3
Complexity • The divide-and-conquer approach for matrix multiplication gives us: T(n) = 8 T(n/2) + O(n^2). • This works out to a total of O(n^3). • No better than the classic definition of matrix multiplication. • But, very useful in parallel computation! • Strassen’s algorithm: By doing some messy matrix algebra, it’s possible to compute the n x n product with only 7 quadrant multiplications instead of 8. • T(n) = 7 T(n/2) + O(n^2) implies T(n) = O(n^(log2 7)) ≈ O(n^2.81). • Further optimizations exist. Conjecture: matrix multiplication is only slightly over O(n^2).
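For reference, the “messy matrix algebra” consists of Strassen’s 7 products; these are the standard textbook formulas, not spelled out on the slide. A sketch, again assuming n is a power of 2:

import numpy as np

def strassen(X, Y):
    n = X.shape[0]
    if n == 1:
        return X * Y
    h = n // 2
    A, B, C, D = X[:h, :h], X[:h, h:], X[h:, :h], X[h:, h:]
    E, F, G, H = Y[:h, :h], Y[:h, h:], Y[h:, :h], Y[h:, h:]
    M1 = strassen(A + D, E + H)           # 7 recursive products instead of 8
    M2 = strassen(C + D, E)
    M3 = strassen(A, F - H)
    M4 = strassen(D, G - E)
    M5 = strassen(A + B, H)
    M6 = strassen(C - A, E + F)
    M7 = strassen(B - D, G + H)
    return np.vstack([np.hstack([M1 + M4 - M5 + M7, M3 + M5]),
                      np.hstack([M2 + M4, M1 - M2 + M3 + M6])])

X = np.arange(16).reshape(4, 4)
print(np.array_equal(strassen(X, X), X @ X))          # True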
Closest pair • Here is another divide-and-conquer problem. • Given a list of (x,y) points, find which 2 points are closest to each other. • (First, think about how you’d do it in 1 dimension.) • Divide & conquer: repeatedly divide the points into left and right halves. Once you only have a set of 2 or 3 points, finding the closest is easy. • Convenient to have list of points sorted by x & by y. • Combine is a little tricky because it’s possible that the closest pair may straddle the dividing line between left and right halves.
Combining solutions • Given a list of points P, we divided this into a “left” half Q and a right half R. • Thru divide and conquer we now know the closest pair in Q [q1,q2] and the closest pair in R [r1,r2], and we can determine which is better. • Let δ = min(dist(q1,q2), dist(r1,r2)) • But, there might be a pair of points [q,r] with q ∈ Q and r ∈ R whose dist(q,r) < δ. How would we find this mysterious pair of points? • These 2 points must be within distance δ of the vertical line passing thru the rightmost point in Q. Do you see why?
Boundary case • Let’s call S the list of points within horizontal (i.e. “x”) distance δ of the dividing line L. In effect, a thin vertical strip of territory. • We can restate our earlier observation as follows: There are two points q ∈ Q and r ∈ R with dist(q,r) < δ if and only if there are two points s1, s2 ∈ S with dist(s1,s2) < δ. • We can restrict our search for the boundary case even more. • If we sort S by y-coordinate, then s1, s2 are within 15 positions of each other on the list. • This means that searching for the 2 closest points in S can be done in O(n) time, even though it’s a nested loop. One loop is bounded by 15 iterations.
Why 15? • Take a closer look at the boundary region. • Let Z be the portion of the plane within x-distance δ of L. Partition Z into squares of side length δ/2. We will have many rows of squares, with 4 squares per row. • Each square contains at most 1 point from S. Why? • Suppose one square contained 2 points from S. Both are either in Q or in R, so their distance is at least δ. But the diagonal of the square is only δ/√2, which is shorter. Contradiction. • s1 and s2 are within 15 positions in S sorted by y. Why? • Suppose they were separated by 16 or more positions. Then there have to be at least 3 rows of squares separating them, so their distance apart must be at least 3δ/2. However, we had chosen s1 and s2 because their distance was < δ. Contradiction.
Algorithm closestPair(list of points P): px and py are P sorted by x and y respectively if |px| <= 3, return the simple base case solution Create Q and R: the left & right halves of P. closestQ = closestPair(Q) closestR = closestPair(R) delta = min(dist(closestQ), dist(closestR)) S = points in P within delta of rightmost Q sy = S sorted by y for each s in sy: find dist from s to each of next 15 in sy let [s,s’] be this shortest pair in sy return closest of: closestQ, closestR, or [s,s’]
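The same scheme as runnable Python (a sketch: it re-sorts the strip by y inside every call, giving O(n (log2 n)^2) rather than the O(n log2 n) you get by threading py through the recursion as the pseudocode suggests):

from math import dist, inf

def closest_pair(points):
    return _closest(sorted(points))       # sort once by x

def _closest(px):
    n = len(px)
    if n <= 3:                            # base case: brute force
        pairs = [(dist(px[i], px[j]), (px[i], px[j]))
                 for i in range(n) for j in range(i + 1, n)]
        return min(pairs, default=(inf, None))
    mid = n // 2
    mid_x = px[mid][0]
    best = min(_closest(px[:mid]), _closest(px[mid:]))
    delta = best[0]
    strip = sorted((p for p in px if abs(p[0] - mid_x) < delta),
                   key=lambda p: p[1])    # S, sorted by y
    for i, s in enumerate(strip):
        for t in strip[i + 1 : i + 16]:   # only the next 15 can matter
            d = dist(s, t)
            if d < best[0]:
                best, delta = (d, (s, t)), d
    return best

print(closest_pair([(0, 0), (3, 4), (1, 1), (5, 2), (2, 3)]))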
Dynamic Programming • Alternative to recursion. • Sometimes a recursive algorithm repeats earlier calls. • Replace a recursive algorithm with a bottom-up solution, for sake of efficiency. • Examples: • Fibonacci numbers • Playoff probabilities • Figuring out the best way to multiply several matrices • Commitment: • Please read pp. 278-281.
Fibonacci • fib(n) = fib(n – 1) + fib(n – 2), with base cases fib(1) = fib(2) = 1 • Consider computing fib(10). • We can draw a tree depicting the necessary recursive calls. • There is repetition, and not just with the base cases. • A better way: as you compute each return value, write it into a global table to be used by later calls with the same parameter. • Or better yet, implement fib() bottom up. • Write a loop that starts with low values of n and works up to 10. Given known values of fib(n – 1) and fib(n – 2), fib(n) is easy.
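For instance, bottom-up in Python (a sketch):

def fib(n):
    # Bottom-up Fibonacci: O(n) time, no repeated subproblems.
    if n <= 2:
        return 1                          # base cases fib(1) = fib(2) = 1
    a, b = 1, 1
    for _ in range(3, n + 1):
        a, b = b, a + b                   # (fib(i-2), fib(i-1)) -> (fib(i-1), fib(i))
    return b

print(fib(10))                            # 55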
Playoff winning • Consider the problem of winning a best-of-7 series, or any best-of-(2n+1) series, where the probability of winning any single game is known, say 60%. • Let P(i, j) be the probability of winning the series if you need i more games to win, and the opposing team needs j games to win… where 0 ≤ i, j ≤ 4. What is P(4,4)? • Base cases: P(0, j) = 1 and P(i, 0) = 0. However, P(0, 0) is undefined. • The series begins. After the first game, we go from P(4, 4) to either case P(3, 4) or P(4, 3) with probability 0.6 or 0.4, respectively. So, P(4, 4) = .6 P(3, 4) + .4 P(4, 3). • In general, P(i, j) = 0.6 P(i – 1, j) + 0.4 P(i, j – 1).
Work out probability • P(i, j) = 0.6 P(i – 1, j) + 0.4 P(i, j – 1). • If you work recursively, you encounter many repeated calls that have a lot of work to do, e.g. P(3, 3). • Can work “bottom up” instead of recursively. For empty squares (i, j ≥ 1), multiply the number above by 0.6 and the number to the left by 0.4, and add them.
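The table can be filled bottom-up in a few lines of Python (my own rendering of the recurrence):

def series_win_prob(p=0.6, n=4):
    # P[i][j] = probability of winning the series when we need i more
    # wins and the opponent needs j, with per-game win probability p.
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        P[0][j] = 1.0                     # we have already won
    # P[i][0] stays 0: the opponent has already won
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            P[i][j] = p * P[i - 1][j] + (1 - p) * P[i][j - 1]
    return P[n][n]

print(round(series_win_prob(), 6))        # 0.710208 for a best-of-7 at 60%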
Matrix chain mult • Famous computational problem: given a chain of matrices to multiply, how should we parenthesize the product to minimize the number of scalar multiplications? • Matrix multiplication is not commutative, but it is associative. • If matrix A has p rows and q columns, and B has q rows and r columns: • The product AB has p rows and r columns • The number of scalar multiplications is pqr. • These multiplications dominate the total complexity of the matrix multiplication: O(n^3) for square matrices.
Multiplying 3 • Suppose we had 3 matrices, A, B, C. Should we multiply them as (AB)C or as A(BC) ? Does it matter? Try this: • A’s dimensions are 10x100 • B’s dimensions are 100x5 • C’s dimensions are 5x50. • Multiply as (AB)C. • AB requires 10*100*5 multiplications, and result is 10x5. • (AB)C requires 10*5*50 multiplications. • Total = 5000 + 2500 = 7500. • Multiply as A(BC). • BC requires 100*5*50 multiplications, and result is 100x50. • A(BC) requires 10*100*50 multiplications. • Total = 25000 + 50000 = 75000.
General problem • Given matrices A1, A2, A3, … An, figure out the optimal parenthesized grouping. • The problem can be characterized simply by stating the list of dimensions d0, d1, d2, d3, … dn, where A1 has d0 rows and d1 columns, etc. • In general: matrix Ai has d[i–1] rows and d[i] columns. • Ex. For 6 matrices we could have d = 30, 35, 15, 5, 10, 20, 25. • We want to know the minimum # of scalar multiplications in the product A1 through A6. We can do subproblems: multiplying Ai through Aj. • Let m[i, j] be this minimum number of multiplications.
Recursive calculation • To multiply Ai through Aj you would first multiply Ai through Ak, then multiply Ak+1 through Aj, and finally multiply the two products. • m[i, j] = m[i, k] + m[k+1, j] + d[i–1] d[k] d[j] • For example, we could try k = 5 • This means A2…A8 = (A2…A5)(A6…A8). • (A2…A5) has d[1] rows and d[5] columns. • (A6…A8) has d[5] rows and d[8] columns. • But, we don’t know which value of k to use. We must examine all k between i and j to see which gives the smallest m[i, j].
Recursive calc (2) • So, we can compute m[i, j] as follows: if i = j, m[i, j] = 0 if i != j, m[i, j] = min (m[i, k] + m[k+1, j] + d[i–1] d[k] d[j]) among values of k in the range i ≤ k < j. • To keep track of which value of k we should split each product at, we can define a similar array of values s[i, j]. This is because knowing the optimal value m[i, j] alone doesn’t tell us how to multiply.
Dynamic programming (2) • Finish multiplying several matrices • 0/1 knapsack problem • Commitment: • Please read section 9.4.
Compute m,s bottom up for i = 1 to n: m[i,i] = 0 for length = 2 to n: for i = 1 to (n – length + 1): j = i + length - 1 m[i,j] = Maxint for k = i to j-1: numMult = m[i,k] + m[k+1,j] + d[i-1]d[k]d[j] if numMult < m[i,j] m[i,j] = numMult s[i,j] = k
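The same loops as runnable Python (a sketch; the indexing follows the slides, with matrix Ai being d[i-1] x d[i]):

from math import inf

def matrix_chain_order(d):
    # Returns tables m and s, where m[i][j] = min scalar multiplications
    # for the product A_i..A_j, and s[i][j] = the split k achieving it.
    n = len(d) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = inf
            for k in range(i, j):
                cost = m[i][k] + m[k + 1][j] + d[i - 1] * d[k] * d[j]
                if cost < m[i][j]:
                    m[i][j], s[i][j] = cost, k
    return m, s

m, s = matrix_chain_order([30, 35, 15, 5, 10, 20, 25])
print(m[1][6], s[1][6])   # 15125 3: best split is (A1..A3)(A4..A6), per the Big Example below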
Little example • Suppose we have 4 matrices. The sequence of dimensions is 4, 5, 3, 2, 4. • Multiplying ABCD could proceed in 3 possible ways • k = 1: (A)(BCD) • k = 2: (AB)(CD) • k = 3: (ABC)(D) • m[1, 4] is the smallest among: • m[1, 1] + m[2, 4] + d[0] d[1] d[4] • m[1, 2] + m[3, 4] + d[0] d[2] d[4] • m[1, 3] + m[4, 4] + d[0] d[3] d[4] • Note that the last term always takes the form d[i–1] d[k] d[j]
Fill in table • Table shows values of m[i, j]. • i > j doesn’t make sense, so we leave these blank • If i = j, m[i, j] = 0 because there’s nothing to multiply • If i + 1 = j, then there is only one way to multiply, so we can immediately compute d[i–1] d[k] d[j], which equals d[i–1] d[i] d[i+1] • So far we have: m[1,2] = 4*5*3 = 60, m[2,3] = 5*3*2 = 30, m[3,4] = 3*2*4 = 24, and 0 on the diagonal.
Compute m[1, 3] • k varies in this range: i ≤ k < j • So, in this case, k could be 1 or 2 because there are 2 ways to split the multiplication. • k = 1: m[1, 1] + m[2, 3] + d[0] d[1] d[3] = 0 + 30 + 4*5*2 = 70 • k = 2: m[1, 2] + m[3, 3] + d[0] d[2] d[3] = 60 + 0 + 4*3*2 = 84 • So m[1, 3] = 70, with s[1, 3] = 1. What do we do next?
Big Example • Let’s look at d = [ 30, 35, 15, 5, 10, 20, 25 ]. The m table works out to (row i, columns j = i..6):
m[1, j]: 0, 15750, 7875, 9375, 11875, 15125
m[2, j]: 0, 2625, 4375, 7125, 10500
m[3, j]: 0, 750, 2500, 5375
m[4, j]: 0, 1000, 3500
m[5, j]: 0, 5000
m[6, j]: 0
• How was m[2, 5] calculated? k = 2, 3, or 4: • m[2,2] + m[3,5] + d[1]d[2]d[5] = 0 + 2500 + 35*15*20 = 13000 • m[2,3] + m[4,5] + d[1]d[3]d[5] = 2625 + 1000 + 35*5*20 = 7125 • m[2,4] + m[5,5] + d[1]d[4]d[5] = 4375 + 0 + 35*10*20 = 11375 • So m[2, 5] = 7125.
Knapsack problem • A thief wants to rob a store. Item i has value vi and weight wi. Knapsack has max capacity W. • Want to take as valuable a load as possible. Which items to take? • “0/1” means each item is taken or not. No fractions. • Greedy algorithm doesn’t work. For example (W=5): • Item 1 has value 3, weight 1 (ratio 3.0) • Item 2 has value 5, weight 2 (ratio 2.5) • Item 3 has value 6, weight 3 (ratio 2.0) • Taking items 1+2 (no room for #3): total value 3 + 5 = 8. • Taking items 1+3 (no room for #2): total value 3 + 6 = 9. • Taking items 2+3 (no room for #1): total value 5 + 6 = 11. • Greedy algorithm actually gave worst solution.
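Although greedy fails, 0/1 knapsack can be solved exactly with dynamic programming, in the spirit of section 9.4. A minimal Python sketch (my own, not from the slides):

def knapsack_01(values, weights, W):
    # best[c] = max value achievable with capacity c using items seen so far.
    best = [0] * (W + 1)
    for v, w in zip(values, weights):
        for c in range(W, w - 1, -1):     # descending: each item used at most once
            best[c] = max(best[c], best[c - w] + v)
    return best[W]

print(knapsack_01([3, 5, 6], [1, 2, 3], 5))   # 11: items 2 and 3, as above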