Algorithm Design Techniques: Greedy Algorithms
Introduction • Algorithm Design Techniques • Design of algorithms • Algorithms commonly used to solve problems • Greedy, Divide and Conquer, Dynamic Programming, Randomized, Backtracking • General approach • Examples • Time and space complexity (where appropriate)
Greedy Algorithms • Choose the best option during each phase • Dijkstra, Prim, Kruskal • Making change • Choose largest bill at each round • Does this always work? • Bad examples where greedy does not work?
Greedy Algorithms • Must have • Greedy-choice property: a globally optimal solution can be arrived at by making a locally optimal choice • Optimal substructure: an optimal solution to a problem contains optimal solutions to its subproblems
Making Change • Greedy-choice property • The highest-denomination coin ≤ n will be in the solution: if not, it would have to be replaced by two or more smaller coins, which uses more coins and is not optimal • Is this also true for denominations 1, 7, 10? (No: for n = 14, greedy gives 10+1+1+1+1, five coins, while 7+7 uses two) • Optimal substructure • The solution for (n – highest-denomination coin) is optimal
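The question about denominations 1, 7, 10 can be checked directly. The sketch below compares the greedy rule with a dynamic-programming baseline; the function names `greedy_change` and `optimal_change` are illustrative, not from the slides, and the greedy version assumes a 1-unit coin exists.

```python
def greedy_change(amount, denominations):
    """Repeatedly take the largest coin that does not exceed the remaining amount."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            coins.append(coin)
            amount -= coin
    return coins  # assumption: a 1-unit coin exists, so amount always reaches 0


def optimal_change(amount, denominations):
    """Dynamic-programming baseline: a fewest-coin solution for every value up to amount."""
    best = [[]] + [None] * amount
    for value in range(1, amount + 1):
        for coin in denominations:
            if coin <= value and best[value - coin] is not None:
                candidate = best[value - coin] + [coin]
                if best[value] is None or len(candidate) < len(best[value]):
                    best[value] = candidate
    return best[amount]


print(greedy_change(14, [1, 7, 10]))   # [10, 1, 1, 1, 1]
print(optimal_change(14, [1, 7, 10]))  # [7, 7]
```

With denominations such as 1, 5, 10, 25 the two functions return the same number of coins, which is why the greedy rule is safe there but not for 1, 7, 10.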
Scheduling • Given jobs j1, j2, j3, ..., jn with known running times t1, t2, t3, ..., tn – what is the best way to schedule the jobs to minimize average completion time?
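Scheduling • Running times: j1 = 15, j2 = 8, j3 = 3, j4 = 10 • Order j1, j2, j3, j4: completion times 15, 23, 26, 36; average completion time = (15+23+26+36)/4 = 25 • Order j3, j2, j4, j1 (shortest first): completion times 3, 11, 21, 36; average completion time = (3+11+21+36)/4 = 17.75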
Scheduling • Greedy-choice property: if the shortest job (j3, 3 time units) does not go first, the y jobs scheduled before it each complete 3 time units sooner, but j3 is postponed by the total time needed to complete all the jobs before it • Optimal substructure: if the shortest job is removed from an optimal solution, the remaining schedule for the n-1 jobs is optimal
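A small sketch of the average-completion-time calculation. The running times 15, 8, 3 and 10 are inferred from the completion times in the scheduling example, and the function name `average_completion_time` is illustrative.

```python
from itertools import accumulate


def average_completion_time(times):
    """Jobs run back to back, so each completion time is a running total."""
    completions = list(accumulate(times))
    return sum(completions) / len(completions)


# Running times inferred from the example's completion times (an assumption).
jobs = {"j1": 15, "j2": 8, "j3": 3, "j4": 10}

in_given_order = [jobs[name] for name in ("j1", "j2", "j3", "j4")]
shortest_first = sorted(jobs.values())

print(average_completion_time(in_given_order))  # 25.0
print(average_completion_time(shortest_first))  # 17.75
```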
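Optimality Proof • Total cost of a schedule $i_1, i_2, \ldots, i_N$ is $C = \sum_{k=1}^{N} (N-k+1)\,t_{i_k} = t_{i_1} + (t_{i_1}+t_{i_2}) + (t_{i_1}+t_{i_2}+t_{i_3}) + \cdots + (t_{i_1}+t_{i_2}+\cdots+t_{i_N})$ • Equivalently, $C = (N+1)\sum_{k=1}^{N} t_{i_k} - \sum_{k=1}^{N} k\,t_{i_k}$ • The first term is independent of the ordering; as the second term increases, the total cost becomes smaller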
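Scheduling • Suppose there is a job ordering with positions $x > y$ and $t_{i_x} < t_{i_y}$ • Swapping the two jobs (smaller first) increases the second term, decreasing the total cost • Show: $x\,t_{i_x} + y\,t_{i_y} < y\,t_{i_x} + x\,t_{i_y}$ • $x\,t_{i_x} + y\,t_{i_y} = x\,t_{i_x} + y\,t_{i_x} + y\,(t_{i_y} - t_{i_x}) = y\,t_{i_x} + x\,t_{i_x} + y\,(t_{i_y} - t_{i_x}) < y\,t_{i_x} + x\,t_{i_x} + x\,(t_{i_y} - t_{i_x}) = y\,t_{i_x} + x\,t_{i_x} + x\,t_{i_y} - x\,t_{i_x} = y\,t_{i_x} + x\,t_{i_y}$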
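The exchange argument can be sanity-checked by brute force on a small instance. This sketch evaluates both forms of the total cost over every ordering; the job times and the function names `total_cost` and `total_cost_split` are illustrative assumptions.

```python
from itertools import permutations


def total_cost(times):
    """Sum of completion times: the job in position k is counted N-k+1 times."""
    n = len(times)
    return sum((n - k + 1) * t for k, t in enumerate(times, start=1))


def total_cost_split(times):
    """Same cost written as (N+1) * sum(t) minus the position-weighted sum."""
    n = len(times)
    return (n + 1) * sum(times) - sum(k * t for k, t in enumerate(times, start=1))


times = [15, 8, 3, 10]  # illustrative running times
best = min(permutations(times), key=total_cost)

print(best, total_cost(best))  # (3, 8, 10, 15) 71
```

Enumerating all orderings is only feasible for a handful of jobs; it is meant as a check on the proof, not as a scheduling method.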
More Scheduling • Multiple processor case • Algorithm?
More Scheduling • Multiple processor case • Algorithm: • order jobs shortest first • schedule jobs round-robin across the processors • Minimizing final completion time • When is this useful? • How is this different? • This version of the problem is NP-complete!
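A minimal sketch of the shortest-first, round-robin rule for several processors. The nine job times, the processor count, and the function names are hypothetical values chosen for illustration.

```python
def schedule_round_robin(times, processors):
    """Sort jobs shortest first, then deal them out to processors in rotation."""
    assignment = [[] for _ in range(processors)]
    for index, t in enumerate(sorted(times)):
        assignment[index % processors].append(t)
    return assignment


def average_completion_time(assignment):
    """Average over all jobs of the time at which each job finishes."""
    completions = []
    for queue in assignment:
        elapsed = 0
        for t in queue:
            elapsed += t
            completions.append(elapsed)
    return sum(completions) / len(completions)


times = [3, 5, 6, 10, 11, 14, 15, 18, 20]  # hypothetical job lengths
assignment = schedule_round_robin(times, 3)

print(assignment)                         # [[3, 10, 15], [5, 11, 18], [6, 14, 20]]
print(max(sum(q) for q in assignment))    # final completion time: 40
```

The last line reports the final completion time of this schedule; the rule above targets average completion time and makes no attempt to minimize that final value.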
Huffman Codes • 100 ASCII characters • Need ⌈log₂ 100⌉ = 7 bits to represent each character • Large file = lots of bits! • Would like to reduce number of bits
Huffman Codes • Idea – encode frequently occurring characters using fewer bits • Need to make sure all characters are distinguishable • 01 = A, 0101 = B • 010101 = ? Could be AAA, AB, or BA • No character code should be a prefix of another character code
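The ambiguity on this slide can be reproduced with a short sketch. The helper names `is_prefix_free` and `decodings`, and the second code table, are illustrative assumptions.

```python
from itertools import permutations


def is_prefix_free(codes):
    """True when no codeword is a prefix of a different character's codeword."""
    return not any(b.startswith(a) for a, b in permutations(codes.values(), 2))


def decodings(bits, codes):
    """Return every way the bit string can be split into codewords."""
    if not bits:
        return [""]
    results = []
    for char, word in codes.items():
        if bits.startswith(word):
            results += [char + rest for rest in decodings(bits[len(word):], codes)]
    return results


ambiguous = {"A": "01", "B": "0101"}          # the code from the slide
prefix_free = {"A": "0", "B": "10", "C": "11"}  # hypothetical prefix-free code

print(is_prefix_free(ambiguous), decodings("010101", ambiguous))  # False ['AAA', 'AB', 'BA']
print(is_prefix_free(prefix_free), decodings("01011", prefix_free))  # True ['ABC']
```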
Huffman Codes • Goal: find a full binary tree of minimum cost where characters are stored in the leaves • Cost of tree: sum across all characters of the frequency of the character times its depth in the tree • Frequently occurring characters should be closest to the root (highest in the tree)
Huffman Codes • [Figure: code tree whose leaves are the characters e, i, sp, t, s, nl]
Huffman’s Algorithm • How do we produce a code? • Maintain a forest of trees • weight of a tree is the sum of the frequencies of its leaves • start with C trees, one for each character • weight of each is the frequency of that character • Until there is only 1 tree • choose the 2 trees with the smallest weights and merge them by creating a new root and making each tree a left or right subtree • Running time – O(C log C) when the forest is kept in a priority queue
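The merging loop above can be sketched with Python's `heapq` as the priority queue. The frequency table is a hypothetical example, and the names `huffman_codes` and `tree_cost` are illustrative; which child gets 0 or 1 is an arbitrary choice.

```python
import heapq
from itertools import count


def huffman_codes(frequencies):
    """Build a prefix code by repeatedly merging the two lightest trees."""
    tiebreak = count()  # unique counter so heap entries never compare tree nodes
    heap = [(weight, next(tiebreak), char) for char, weight in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # An internal node is a (left, right) tuple; leaves are characters.
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))

    codes = {}

    def assign(node, prefix):
        if isinstance(node, tuple):
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"  # a one-character alphabet still needs one bit

    assign(heap[0][2], "")
    return codes


def tree_cost(frequencies, codes):
    """Sum over characters of frequency times depth (codeword length)."""
    return sum(frequencies[c] * len(codes[c]) for c in frequencies)


# Hypothetical frequencies, not taken from the slides.
frequencies = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, "sp": 13, "nl": 1}
codes = huffman_codes(frequencies)
print(tree_cost(frequencies, codes))  # 146
```

Each of the C - 1 merges does a constant number of heap operations, which is where the O(C log C) bound comes from.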
Optimality Proof – Idea • The tree must be full • if it is not, move leaf with no siblings to its parent • Least frequent characters are the deepest nodes • if not, a node can be swapped with an ancestor • Characters at the same depth can be swapped • As trees are merged, optimality holds
Optimality Proof – Idea • Greedy choice property: given x and y -- characters with lowest frequency in alphabet C, there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit • Take an arbitrary optimal prefix code and modify it to make it a tree representing another optimal prefix code such that x and y are sibling leaves of max depth
Optimality Proof – Idea • Optimal substructure: C’ = (C – {x, y}) ∪ {z}, where f[z] = f[x] + f[y] • T’ is an optimal tree for C’ • Replace z in T’ with an internal node having x and y as children • Result is an optimal prefix code for C
Approximate Bin Packing • N items of sizes s1, s2, ..., sN • 0 < si <= 1 • Goal: pack into fewest number of bins of size 1 • NP-complete problem, but we can use greedy algorithms to produce solutions not too far from optimal • Knapsack problem • Examples? • Saving data to external media
Example – Optimal Packing • Input: .2, .5, .4, .7, .1, .3, .8 • Optimal packing uses 3 bins, each completely full: (.8, .2) (.7, .3) (.5, .4, .1)
On-line vs Off-line • On-line • Process one item at a time • Cannot move an item once it is placed • Off-line • Look at all items before you place first item
On-line Algorithms • On-line algorithms cannot guarantee an optimal solution • Problem: cannot know when input will end • Input: M small items of size ½-ε followed by M large items of size ½+ε • Can fit into M bins with 1 large and 1 small in each bin • If all small items come first, an algorithm that plans for the large items places them in M separate bins • If the input is then only the M small items, we have used twice as many bins as necessary • There are inputs that force any on-line bin-packing algorithm to use at least 4/3 the optimal number of bins.
On-line Bin Packing Algorithms • Next fit • First fit • Best fit
On-line Bin Packing Algorithms • Next fit • Algorithm • if item fits in the bin holding the last item – place it there • else – place it in a new bin • (.2, .5) (.4) (.7, .1) (.3) (.8) • Running time? • Let M be the optimal number of bins required to pack a list I of items. Then next fit never uses more than 2M bins. • At most half of the space is wasted, since any two adjacent bins satisfy Bj + Bj+1 > 1
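A minimal sketch of next fit. To avoid floating-point rounding, sizes are given as integers in tenths with a bin capacity of 10; that scaling and the function name `next_fit` are assumptions made for the example.

```python
def next_fit(sizes, capacity):
    """Place each item in the most recent bin if it fits, otherwise open a new bin."""
    bins = []
    remaining = 0  # free space in the most recently opened bin
    for size in sizes:
        if bins and size <= remaining:
            bins[-1].append(size)
            remaining -= size
        else:
            bins.append([size])
            remaining = capacity - size
    return bins


items = [2, 5, 4, 7, 1, 3, 8]  # the slide's input, in tenths
print(next_fit(items, 10))  # [[2, 5], [4], [7, 1], [3], [8]]
```

Only the last bin is ever inspected, so this version does a constant amount of work per item.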
On-line Bin Packing Algorithms • First fit • Algorithm • Scan all bins in order and place item in the first bin large enough to hold it • if no bin is large enough, create a new bin • (.2, .5, .1) (.4, .3) (.7) (.8) • Running time? • Let M be the optimal number of bins required to pack a list I of items. Then first fit never uses more than ⌈(17/10)M⌉ bins.
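A minimal sketch of first fit using a straightforward linear scan. Sizes are integers in tenths with capacity 10 to avoid floating-point rounding; that scaling and the name `first_fit` are assumptions made for the example.

```python
def first_fit(sizes, capacity):
    """Place each item in the earliest bin with enough free space."""
    bins, free = [], []  # free[i] is the space left in bins[i]
    for size in sizes:
        for index, space in enumerate(free):
            if size <= space:
                bins[index].append(size)
                free[index] -= size
                break
        else:
            # No existing bin could hold the item.
            bins.append([size])
            free.append(capacity - size)
    return bins


items = [2, 5, 4, 7, 1, 3, 8]  # the slide's input, in tenths
print(first_fit(items, 10))  # [[2, 5, 1], [4, 3], [7], [8]]
```

This scan is quadratic in the worst case because every item may be compared against every open bin.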
On-line Bin Packing Algorithms • Best fit • Algorithm • Scan all bins and place item in bin with tightest fit (will be fullest after item is placed there) • if no bin is large enough, create new bin • (.2, .5, .1) (.4) (.7, .3) (.8) • Running time? • Same performance as first fit.
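A minimal sketch of best fit. Sizes are integers in tenths with capacity 10 to avoid floating-point rounding, and ties between equally tight bins go to the earliest bin; the scaling, the tie-break rule, and the name `best_fit` are assumptions made for the example.

```python
def best_fit(sizes, capacity):
    """Place each item in the bin that will have the least free space left."""
    bins, free = [], []  # free[i] is the space left in bins[i]
    for size in sizes:
        candidates = [i for i, space in enumerate(free) if size <= space]
        if candidates:
            # min() returns the first minimum, so ties go to the earliest bin.
            target = min(candidates, key=lambda i: free[i])
            bins[target].append(size)
            free[target] -= size
        else:
            bins.append([size])
            free.append(capacity - size)
    return bins


items = [2, 5, 4, 7, 1, 3, 8]  # the slide's input, in tenths
print(best_fit(items, 10))  # [[2, 5, 1], [4], [7, 3], [8]]
```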
Off-line Bin Packing • Sort items (in decreasing order) first for easier placement of large items • Apply first fit or best fit algorithm • First fit decreasing – (.8, .2) (.7, .3) (.5, .4, .1) • Let M be the optimal number of bins required to pack a list I of items. Then first fit decreasing never uses more than (11/9)M + 4 bins.
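A minimal sketch of first fit decreasing: sort, then run the first-fit scan. Sizes are integers in tenths with capacity 10 to avoid floating-point rounding; that scaling and the name `first_fit_decreasing` are assumptions made for the example.

```python
def first_fit_decreasing(sizes, capacity):
    """Sort items largest first, then place each in the earliest bin that fits."""
    bins, free = [], []  # free[i] is the space left in bins[i]
    for size in sorted(sizes, reverse=True):
        for index, space in enumerate(free):
            if size <= space:
                bins[index].append(size)
                free[index] -= size
                break
        else:
            bins.append([size])
            free.append(capacity - size)
    return bins


items = [2, 5, 4, 7, 1, 3, 8]  # the slide's input, in tenths
print(first_fit_decreasing(items, 10))  # [[8, 2], [7, 3], [5, 4, 1]]
```

On this input the result uses 3 bins, all completely full, so it cannot be improved.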