Prof. Swarat Chaudhuri

COMP 482: Design and Analysis of Algorithms Prof. Swarat Chaudhuri Spring 2013 Lecture 7

Recap: Interval Scheduling • Interval scheduling. • Job j starts at s_j and finishes at f_j. • Two jobs are compatible if they don't overlap. • Goal: find a maximum subset of mutually compatible jobs. [Figure: jobs a through h on a timeline from 0 to 11.]

Recap: Interval Scheduling • Algorithm. Greedy algorithm; jobs are considered in order of earliest finish time. • Argument for optimality: greedy stays ahead. • Also applicable to the Truck Driver’s problem (selection of breakpoints). [Figure: Greedy picks i_1, i_2, …, i_r, i_{r+1}; OPT picks j_1, j_2, …, j_r, j_{r+1}. Job i_{r+1} finishes before j_{r+1}, so why not replace job j_{r+1} with job i_{r+1}?]
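
The earliest-finish-time rule can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name and the (start, finish) tuple representation are assumptions, and two jobs are treated as compatible when one starts exactly as the other finishes.

```python
def earliest_finish_first(jobs):
    """Greedy interval scheduling.

    jobs: list of (start, finish) pairs (assumed representation).
    Returns a maximum-size list of mutually compatible jobs.
    """
    selected = []
    last_finish = float("-inf")
    # Consider jobs in ascending order of finish time.
    for start, finish in sorted(jobs, key=lambda job: job[1]):
        # Accept the job if it starts no earlier than the last accepted finish.
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected


# Hypothetical instance with eight jobs on a timeline from 0 to 11.
example = [(0, 6), (1, 4), (3, 5), (3, 8), (4, 7), (5, 9), (6, 10), (8, 11)]
print(earliest_finish_first(example))
```

Sorting dominates the running time, so the sketch runs in O(n log n).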

Variant: 24-7 Interval Scheduling • You have a processor that can operate 24-7. People submit requests to run daily jobs on the processor. Each such job comes with a start time and an end time; if the job is accepted it must run continuously for the period between the start and end times, EVERY DAY. (Note that some jobs can start before midnight and end after midnight.) • Given a list of n such jobs, your goal is to accept as many jobs as possible (regardless of length), subject to the constraint that the processor can run at most one job at any given point of time. Give an algorithm to do this. • For example, here you have four jobs (6pm, 6am), (9pm, 4am), (3am, 2pm), (1pm, 7pm). • The optimal solution is to pick the second and fourth jobs.

Solution • Let I_1, …, I_n be the n intervals. We call a solution I_j-restricted if it contains the interval I_j. • Here’s an algorithm, for fixed j, to compute an I_j-restricted solution of maximum size. Let x be a point in I_j. First delete I_j and all intervals that overlap it. The remaining intervals do not contain the point x, so we can cut the timeline at x and produce an instance of the Interval Scheduling Problem from class. This takes O(n) time, assuming the intervals are sorted by ending time. • Now, the algorithm for the full problem is to compute an I_j-restricted solution of maximum size for each j = 1, …, n. This takes a total of O(n^2) time. We then pick the largest of these solutions and claim that it is optimal. • Why? Consider an optimal solution S to the full problem. S must contain SOME interval I_j, so S is an I_j-restricted solution of maximum size. Our algorithm computes a solution of that same size for this j, so the largest solution it finds is optimal.
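
A possible Python sketch of this solution follows. It rests on several assumptions that are not in the slides: times are whole hours 0 to 23, a job is a (start, end) pair that wraps past midnight when end < start, every job is longer than 0 and shorter than 24 hours, and a job may start at the instant another ends. The name max_daily_jobs is made up for illustration. For simplicity the sketch cuts the timeline at the end of I_j and re-sorts for every j, so it runs in O(n^2 log n) rather than the O(n^2) obtained by sorting once.

```python
DAY = 24  # hours in a day (assumed time unit)


def max_daily_jobs(jobs):
    """Return indices of a maximum set of non-overlapping daily jobs."""
    best = []
    for j, (sj, ej) in enumerate(jobs):
        # Force job j into the solution; the free window starts where j ends.
        window = DAY - (ej - sj) % DAY
        candidates = []
        for k, (sk, ek) in enumerate(jobs):
            if k == j:
                continue
            # Measure job k relative to the end of job j.
            offset = (sk - ej) % DAY
            finish = offset + (ek - sk) % DAY
            # Keep k only if it fits in the free window, i.e. does not overlap j.
            if finish <= window:
                candidates.append((finish, offset, k))
        # Ordinary interval scheduling on the cut timeline: earliest finish first.
        candidates.sort()
        chosen, last_finish = [j], 0
        for finish, offset, k in candidates:
            if offset >= last_finish:
                chosen.append(k)
                last_finish = finish
        if len(chosen) > len(best):
            best = chosen
    return sorted(best)


# The four jobs from the problem statement, on a 24-hour clock.
jobs = [(18, 6), (21, 4), (3, 14), (13, 19)]
print(max_daily_jobs(jobs))  # 0-based indices of the accepted jobs
```

On the four-job example the sketch accepts the second and fourth jobs (indices 1 and 3), matching the stated optimum.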

Recap: Interval Partitioning • Interval partitioning. • Lecture j starts at s_j and finishes at f_j. • Goal: find the minimum number of classrooms to schedule all lectures so that no two occur at the same time in the same room. • Ex: This schedule uses 4 classrooms to schedule 10 lectures. [Figure: lectures a through j on a timeline from 9:00 to 4:30.]

Recap: Interval Partitioning • Interval partitioning. • Lecture j starts at s_j and finishes at f_j. • Goal: find the minimum number of classrooms to schedule all lectures so that no two occur at the same time in the same room. • Greedy algorithm. Consider lectures in increasing order of start time. Assign each lecture to any compatible existing classroom. If that is not possible, open a new classroom. [Figure: a schedule of lectures a through j on a timeline from 9:00 to 4:30.]
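
One way to implement the greedy rule is to keep the classrooms in a min-heap keyed by the finish time of their last lecture, so a compatible room (if any) is found in O(log n) time. The sketch below is illustrative only: the function name and the (start, finish) representation are assumptions, and a lecture is allowed to start at the moment the previous one in the same room finishes.

```python
import heapq


def partition_lectures(lectures):
    """Assign each lecture a classroom index using the greedy rule.

    lectures: list of (start, finish) pairs (assumed representation).
    Returns a list whose i-th entry is the room given to lecture i.
    """
    order = sorted(range(len(lectures)), key=lambda i: lectures[i][0])
    rooms = []  # min-heap of (finish time of last lecture, room index)
    assignment = [None] * len(lectures)
    for i in order:
        start, finish = lectures[i]
        if rooms and rooms[0][0] <= start:
            # The room that frees up earliest is compatible: reuse it.
            _, room = heapq.heappop(rooms)
        else:
            # No existing room is free: open a new one.
            room = len(rooms)
        heapq.heappush(rooms, (finish, room))
        assignment[i] = room
    return assignment


print(partition_lectures([(0, 3), (1, 4), (3, 5), (4, 6)]))
```

The number of classrooms used is max(assignment) + 1 for a non-empty input.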

Interval Partitioning: Lower Bound on Optimal Solution • Def. The depth of a set of open intervals is the maximum number that contain any given time. • Key observation. Number of classrooms needed by any algorithm ≥ depth. • Argument for optimality. Show that depth ≥ number of classrooms opened by the greedy algorithm. [Figure: a schedule of lectures a through j on a timeline from 9:00 to 4:30.]
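
The depth itself is easy to compute with a sweep over the interval endpoints. The following is a small sketch under the slide's convention that intervals are open, so an interval ending at time t does not overlap one starting at t; the function name is an assumption.

```python
def depth(intervals):
    """Maximum number of open intervals (start, finish) containing a common time."""
    events = []
    for start, finish in intervals:
        events.append((start, 1))    # an interval opens
        events.append((finish, -1))  # an interval closes
    # Tuples sort by time first; at equal times -1 comes before +1,
    # so closings are processed before openings (open intervals).
    events.sort()
    best = current = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


print(depth([(0, 3), (1, 4), (3, 5), (4, 6)]))
```

Any schedule for this small instance needs at least depth-many classrooms, which is the structural lower bound used in the optimality argument.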

Greedy Analysis Strategies • Greedy algorithm stays ahead. Show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's. • Structural. Discover a simple "structural" bound asserting that every possible solution must have a certain value. Then show that your algorithm always achieves this bound. • Exchange argument. Gradually transform any solution to the one found by the greedy algorithm without hurting its quality.

Q1: Coin Changing • Goal. Given currency denominations: 1, 5, 10, 25, 100, devise a method to pay a given amount to a customer using the fewest coins. • Ex: 34¢. • Cashier's algorithm. At each iteration, add a coin of the largest value that does not take us past the amount to be paid. • Ex: $2.89.

Q1: Coin-Changing: Greedy Algorithm • Cashier's algorithm. At each iteration, add a coin of the largest value that does not take us past the amount to be paid. • Q1. Is cashier's algorithm optimal?

    Sort coin denominations by value: c_1 < c_2 < … < c_n.
    S ← ∅                                  (S = coins selected)
    while (x ≠ 0) {
        let k be the largest integer such that c_k ≤ x
        if (k = 0) return "no solution found"
        x ← x - c_k
        S ← S ∪ {k}
    }
    return S
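
A runnable version of the cashier's algorithm might look as follows. It is a sketch, not the lecture's code: the function name is an assumption, amounts are in cents, and repeated subtraction of the same coin is collapsed into one divmod per denomination, which selects the same coins.

```python
def cashier(amount, denominations):
    """Pay `amount` greedily; return the list of coins used, or None if stuck."""
    coins = []
    # Try denominations from largest to smallest.
    for c in sorted(denominations, reverse=True):
        count, amount = divmod(amount, c)  # take as many of coin c as fit
        coins.extend([c] * count)
    if amount != 0:
        return None  # corresponds to "no solution found"
    return coins


US_COINS = [1, 5, 10, 25, 100]
print(cashier(34, US_COINS))   # the 34-cent example
print(cashier(289, US_COINS))  # the $2.89 example
```

For 34¢ the sketch pays with 25, 5, 1, 1, 1, 1 (six coins); for $2.89 it uses ten coins.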

Coin-Changing: Analysis of Greedy Algorithm • Theorem. Greed is optimal for U.S. coinage: 1, 5, 10, 25, 100. • Pf. (by induction on x) • Consider the optimal way to change c_k ≤ x < c_{k+1}: greedy takes coin k. • We claim that any optimal solution must also take coin k. • If not, it needs enough coins of type c_1, …, c_{k-1} to add up to x. • The table below indicates that no optimal solution can do this. • The problem reduces to coin-changing x - c_k cents, which, by induction, is optimally solved by the greedy algorithm. ▪

(P, N, D, Q = number of pennies, nickels, dimes, quarters.)

    k   c_k   All optimal solutions must satisfy   Max value of coins 1, 2, …, k-1 in any OPT
    1   1     P ≤ 4                                -
    2   5     N ≤ 1                                4
    3   10    N + D ≤ 2                            4 + 5 = 9
    4   25    Q ≤ 3                                20 + 4 = 24
    5   100   no limit                             75 + 24 = 99

Coin-Changing: Analysis of Greedy Algorithm • Observation. Greedy algorithm is sub-optimal for US postal denominations: 1, 10, 21, 34, 70, 100, 350, 1225, 1500. • Counterexample. 140¢. • Greedy: 100, 34, 1, 1, 1, 1, 1, 1. • Optimal: 70, 70.
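
Both claims can be checked numerically by comparing the greedy coin count with the true minimum. The minimum below is computed by a standard dynamic program, which is not part of this lecture and is used here only as an independent reference; all function names are assumptions.

```python
def greedy_count(amount, denominations):
    """Number of coins the cashier's algorithm uses, or None if it gets stuck."""
    count = 0
    for c in sorted(denominations, reverse=True):
        used, amount = divmod(amount, c)
        count += used
    return count if amount == 0 else None


def min_count(amount, denominations):
    """True minimum number of coins, by dynamic programming over 0..amount."""
    INF = float("inf")
    best = [0] + [INF] * amount
    for x in range(1, amount + 1):
        for c in denominations:
            if c <= x and best[x - c] + 1 < best[x]:
                best[x] = best[x - c] + 1
    return best[amount]


US_COINS = [1, 5, 10, 25, 100]
POSTAL = [1, 10, 21, 34, 70, 100, 350, 1225, 1500]

print(greedy_count(140, POSTAL), min_count(140, POSTAL))
print(all(greedy_count(x, US_COINS) == min_count(x, US_COINS) for x in range(1, 300)))
```

For 140¢ with the postal denominations, greedy uses 8 stamps while 2 suffice; for the U.S. coins the two counts agree on every amount tried.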


4.2 Scheduling to Minimize Lateness

Scheduling to Minimize Lateness • Minimizing lateness problem. • Single resource processes one job at a time. • Job j requires t_j units of processing time and is due at time d_j. • If j starts at time s_j, it finishes at time f_j = s_j + t_j. • Lateness: ℓ_j = max { 0, f_j - d_j }. • Goal: schedule all jobs to minimize maximum lateness L = max_j ℓ_j. • Ex:

    j     1   2   3   4   5    6
    t_j   3   2   1   4   3    2
    d_j   6   8   9   9   14   15

[Figure: the jobs scheduled back to back from time 0 to 15 in the order 3, 2, 6, 1, 5, 4; job 1 has lateness 2, job 5 has lateness 0, and job 4 has lateness 6, so max lateness = 6.]

Minimizing Lateness: Greedy Algorithms • Greedy template. Consider jobs in some order. • [Shortest processing time first] Consider jobs in ascending order of processing time t_j. • [Earliest deadline first] Consider jobs in ascending order of deadline d_j. • [Smallest slack] Consider jobs in ascending order of slack d_j - t_j.

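Minimizing Lateness: Greedy Algorithms • Greedy template. Consider jobs in some order. • [Shortest processing time first] Consider jobs in ascending order of processing time t_j. Counterexample (running job 1 first makes job 2 late by 1, while running job 2 first makes no job late):

    j     1     2
    t_j   1     10
    d_j   100   10

• [Smallest slack] Consider jobs in ascending order of slack d_j - t_j. Counterexample (job 2 has the smaller slack, but running it first makes job 1 late by 9, while running job 1 first gives max lateness 1):

    j     1   2
    t_j   1   10
    d_j   2   10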
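
The two counterexamples can be checked by evaluating each ordering rule directly. In this sketch a job is a (t_j, d_j) pair and jobs run back to back from time 0; the helper names are assumptions, and the two-job instances are the ones read off the tables on this slide.

```python
def max_lateness(jobs, order):
    """Maximum lateness when jobs (t_j, d_j) run back to back in `order`."""
    t = 0
    worst = 0  # starting at 0 implements max{0, f_j - d_j}
    for j in order:
        tj, dj = jobs[j]
        t += tj  # finish time of job j
        worst = max(worst, t - dj)
    return worst


def order_by(jobs, key):
    """Job indices sorted ascending by key(t_j, d_j)."""
    return sorted(range(len(jobs)), key=lambda j: key(*jobs[j]))


def shortest_first(t, d):
    return t


def earliest_deadline(t, d):
    return d


def smallest_slack(t, d):
    return d - t


spt_instance = [(1, 100), (10, 10)]
slack_instance = [(1, 2), (10, 10)]

print(max_lateness(spt_instance, order_by(spt_instance, shortest_first)),
      max_lateness(spt_instance, order_by(spt_instance, earliest_deadline)))
print(max_lateness(slack_instance, order_by(slack_instance, smallest_slack)),
      max_lateness(slack_instance, order_by(slack_instance, earliest_deadline)))
```

On each instance the earliest-deadline order does strictly better than the rule being refuted.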

Minimizing Lateness: Greedy Algorithm • Greedy algorithm. Earliest deadline first.

    Sort n jobs by deadline so that d_1 ≤ d_2 ≤ … ≤ d_n
    t ← 0
    for j = 1 to n
        Assign job j to interval [t, t + t_j]
        s_j ← t, f_j ← t + t_j
        t ← t + t_j
    output intervals [s_j, f_j]

[Figure: the six example jobs scheduled in deadline order (d_1 = 6, d_2 = 8, d_3 = 9, d_4 = 9, d_5 = 14, d_6 = 15) on a timeline from 0 to 15; max lateness = 1.]
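
In Python, earliest deadline first could be sketched as below. The function name and the (t_j, d_j) tuple representation are assumptions; job indices in the output are 0-based.

```python
def edf_schedule(jobs):
    """Earliest deadline first.

    jobs: list of (t_j, d_j) pairs (assumed representation).
    Returns (schedule, max_lateness), where schedule lists
    (job index, s_j, f_j) in execution order.
    """
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][1])
    t = 0
    schedule = []
    max_lateness = 0  # starting at 0 implements max{0, f_j - d_j}
    for j in order:
        tj, dj = jobs[j]
        start, finish = t, t + tj
        schedule.append((j, start, finish))
        max_lateness = max(max_lateness, finish - dj)
        t = finish  # no idle time: the next job starts immediately
    return schedule, max_lateness


# The six-job example: (t_j, d_j) for j = 1..6.
jobs = [(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)]
schedule, worst = edf_schedule(jobs)
print(schedule)
print(worst)
```

On the six-job example the only late job is job 4, which finishes at time 10 against a deadline of 9, giving max lateness 1.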

Minimizing Lateness: No Idle Time • Observation. There exists an optimal schedule with no idle time. • Observation. The greedy schedule has no idle time. [Figure: two schedules of three jobs with deadlines d = 4, d = 6, d = 12 on a timeline from 0 to 11.]

Minimizing Lateness: Inversions • Def. An inversion in schedule S is a pair of jobs i and j such that d_i < d_j but j is scheduled before i. • Observation. Greedy schedule has no inversions. • Observation. All schedules with no idle time and no inversions have the same maximum lateness. • Observation. If a schedule (with no idle time) has an inversion, it has one with a pair of inverted jobs scheduled consecutively. [Figure: before the swap, j is scheduled immediately before i, forming an inversion.]

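Minimizing Lateness: Inversions • Def. An inversion in schedule S is a pair of jobs i and j such that d_i < d_j but j is scheduled before i. • Claim. Swapping two adjacent, inverted jobs reduces the number of inversions by one and does not increase the max lateness. • Pf. Let ℓ be the lateness before the swap, and let ℓ' be the lateness afterwards. • ℓ'_k = ℓ_k for all k ≠ i, j. • ℓ'_i ≤ ℓ_i, since i finishes earlier after the swap. • If job j is late after the swap: ℓ'_j = f'_j - d_j = f_i - d_j ≤ f_i - d_i ≤ ℓ_i, because j now finishes when i used to finish (f'_j = f_i) and d_i < d_j. ▪ [Figure: before the swap, j runs immediately before i and i finishes at f_i; after the swap, i runs before j and j finishes at f'_j = f_i.]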

Minimizing Lateness: Analysis of Greedy Algorithm • Theorem. Greedy schedule S is optimal. • Pf. Define S* to be an optimal schedule that has the fewest inversions, and let's see what happens. • Can assume S* has no idle time. • If S* has no inversions, then S and S* have the same maximum lateness (both have no idle time and no inversions), so S is optimal. • If S* has an inversion, let i-j be an adjacent inversion. • Swapping i and j does not increase the maximum lateness and strictly decreases the number of inversions. • This contradicts the definition of S*. ▪
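
The exchange argument can be watched in action: starting from any schedule with no idle time, repeatedly swap an adjacent inverted pair and record the maximum lateness after each swap. The sketch below is an illustration rather than part of the proof; the function names and the (t_j, d_j) representation are assumptions, and job indices are 0-based.

```python
def max_lateness(jobs, order):
    """Maximum lateness when jobs (t_j, d_j) run back to back in `order`."""
    t = 0
    worst = 0
    for j in order:
        tj, dj = jobs[j]
        t += tj
        worst = max(worst, t - dj)
    return worst


def remove_inversions(jobs, order):
    """Swap adjacent inverted pairs until none remain.

    Returns the final order and the max lateness recorded
    before the first swap and after every swap.
    """
    order = list(order)
    history = [max_lateness(jobs, order)]
    swapped = True
    while swapped:
        swapped = False
        for p in range(len(order) - 1):
            a, b = order[p], order[p + 1]
            if jobs[a][1] > jobs[b][1]:  # a runs first but has the later deadline
                order[p], order[p + 1] = b, a
                history.append(max_lateness(jobs, order))
                swapped = True
    return order, history


jobs = [(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)]
start = [2, 1, 5, 0, 4, 3]  # the order 3, 2, 6, 1, 5, 4 in 0-based indices
final, history = remove_inversions(jobs, start)
print(final)
print(history)
```

The recorded values never increase, falling from 6 for the starting order to 1 for the inversion-free order, as the claim about adjacent swaps predicts.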

Q2: Subsequences • Suppose you have a collection of possible events (e.g., possible transactions) and a sequence S of n events. A given event may occur multiple times; e.g., the event “buy Google stock” could appear multiple times in a log of transactions. • A sequence S’ is a subsequence of a sequence S if there’s a way to delete certain events from S such that the remaining sequence equals S’. One reason to test for this is pattern matching. • Give an algorithm that takes two sequences of events, S’ of length m and S of length n, and decides in time O(m+n) whether S’ is a subsequence of S.

Solution • Greedy algorithm: Let the i-th event of S be S(i). Find the first event in S that matches S’(1), then the first event after it that matches S’(2), and so on. The running time is O(m+n). • It is easy to show that if the algorithm finds a match, then S’ is in fact a subsequence of S. • More difficult direction: if the algorithm does not find a match, then no match exists. • The proof is by contradiction. Suppose S’ matches the subsequence S(l_1), S(l_2), …, S(l_m) with l_1 < l_2 < … < l_m, and suppose GREEDY matches S’(1), S’(2), … at positions k_1, k_2, …. Show by induction on i that greedy produces a match at least up to k_i and that k_i ≤ l_i for all i, so greedy reaches k_m and finds a full match. This is done in a way similar to the proof in interval scheduling (greedy stays ahead).
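
The greedy matcher is a single pass over S with a pointer into S’. A minimal sketch, assuming events are any values comparable with == (strings are used below) and with a made-up function name:

```python
def is_subsequence(pattern, events):
    """Return True if `pattern` (S') is a subsequence of `events` (S)."""
    i = 0  # index of the next event of the pattern still to be matched
    for event in events:
        # Greedy: match the next pattern event at its first opportunity.
        if i < len(pattern) and event == pattern[i]:
            i += 1
    return i == len(pattern)


log = ["buy Yahoo", "buy Google", "sell Yahoo", "buy Google", "sell Google"]
print(is_subsequence(["buy Google", "sell Google"], log))
print(is_subsequence(["sell Google", "buy Google"], log))
```

Each event of S is examined once and the pointer into S’ only moves forward, which gives the O(m+n) bound.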
