1 / 26

CSE 326: Data Structures Topic 18: The Algorhythmics

CSE 326: Data Structures Topic 18: The Algorhythmics. Luke McDowell Summer Quarter 2003. Today’s Outline. Greedy Divide & Conquer Dynamic Programming. Greedy Algorithms. Repeat until problem is solved: Consider possible next steps Choose best-looking alternative and commit to it

Download Presentation

CSE 326: Data Structures Topic 18: The Algorhythmics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE 326: Data StructuresTopic 18: The Algorhythmics Luke McDowell Summer Quarter 2003

  2. Today’s Outline • Greedy • Divide & Conquer • Dynamic Programming

  3. Greedy Algorithms Repeat until problem is solved: • Consider possible next steps • Choose best-looking alternative and commit to it Greedy algorithms are normally fast and simple. Sometimes appropriate as a heuristic solution or to approximate the optimal solution.

  4. Greed in Action • Best First Search • A* Search • Huffman Encodings • Kruskal’s Algorithm • Prim’s Algorithm • Dijkstra’s Algorithm • Scheduling

  5. Items (mostly junk food) Grocery bags Sizes s1, s2,…, sN (0 < si 1) Size of each bag = 1 The Grocery Bagging Problem • You are an environmentally-conscious grocery bagger at QFC • You would like to minimize the total number of bags needed to pack each customer’s items.

  6. Optimal Grocery Bagging: An Example • Example: Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3 • How may bags of size 1 are required? • Can find optimal solution through exhaustive search • Search all combinations of N items using 1 bag, 2 bags, etc. • Runtime? Only 3 bags needed: (0.2,0.8) (0.3,0.7) (0.1,0.4,0.5) Exponential!

  7. Bin Packing • General problem: Given N items of sizes s1, s2,…, sN (0 < si 1), pack these items in the least number of bins of size 1. • The general bin packing problem is NP-complete • Reductions: All NP-problems  SAT  3SAT  3DM  PARTITION  Bin Packing (see Garey & Johnson, 1979) Items Bins Size of each bin = 1 Sizes s1, s2,…, sN (0 < si 1)

  8. Greedy Solution? • Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3 FirstFit BestFit Both use at most 1.7M M = optimal number. Best Fit better on average Best = tightest

  9. Another Greedy Solution? • Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3 FirstFit Decreasing Both use at most 1.2M + 4 Offline algorithm

  10. Divide & Conquer • Divide problem into multiple smaller parts • Solve smaller parts • Solve base cases directly • Otherwise, solve subproblems recursively • Merge solutions together (Conquer!) Often leads to elegant and simple recursive implementations.

  11. Divide & Conquer in Action • MergeSort • QuickSort • buildTree • Closest points • Find median

  12. Fibonacci Numbers F(n) = F(n - 1) + F(n - 2) F(0) = 1 F(1) = 1 Divide & Conquer int fib(int n) { if (n <= 1) return 1; else return fib(n - 1) + fib(n - 2); }

  13. Fibonacci Numbers F(n) = F(n - 1) + F(n - 2) F(0) = 1 F(1) = 1 Divide & Conquer F6 F5 int fib(int n) { if (n <= 1) return 1; else return fib(n - 1) + fib(n - 2); } F4 F4 F3 F3 F2 F3 F2 F2 F1 F2 F1 F1 F0 F2 F1 F1 F0 F1 F0 F1 F0 F1 F0 Runtime: > 1.5N

  14. Fibonacci Numbers F(n) = F(n - 1) + F(n - 2) F(0) = 1 F(1) = 1 Memoized int fib(int n) { // Create a “static” array however your favorite language allows int fibs[n]; if (n <= 1) return 1; if (fibs[n] == 0) fibs[n] = fib(n - 1) + fib(n - 2); return fibs[n]; } O(n) Runtime:

  15. Memoizing/Dynamic Programming • Define problem in terms of smaller subproblems • Solve and record solution for base cases • Build solutions for subproblems up from solutions to smaller subproblems Can improve runtime of divide & conquer algorithms that have shared subproblems with optimal substructure. Usually involves a table of subproblem solutions.

  16. Dynamic Programming in Action • Sequence Alignment • Fibonacci numbers • All pairs shortest path • Optimal Binary Search Tree • Matrix multiplication

  17. Sequence Alignment Biology 101:

  18. DNA Sequence • String using letters (nucleotides): A,C,G,T • For example: ACGGGCATTATCGTA • DNA can mutate! • Change a letter: ACGGGCAT → ACGTGCAT • Insert a letter: ACGGGCAT → ACGGGCAAT • Delete a letter: ACGGGCAT → ACGGGAT • A few mutations makes sequences “different”, but “similar” • Similar sequences often have similar functions

  19. What is Sequence Alignment? • Use underscores (_) or wildcards to match up 2 sequences • The “best alignment” of 2 sequences is an alignment which minimizes the number of “underscores” • For example: ACCCGTTT and TCCCTTT A_CCCGTTT _TCCC_TTT Best alignment: (3 underscores)

  20. Solutions • Naïve solution • Try all possible alignments • Running time: exponential • Dynamic Programming Solution • Create a table • Table(x,y): # errors in best alignment for first x letters of string 1, and first y letters of string 2 • Running time: polynomial

  21. Example Alignment Match ACCGTTAG with ACTGTTAA (1) match ‘G’ with ‘_’: 1 + align(ACCGTTA,ACTGTTAA) (2) match ‘A’ with ‘_’: 1 + align(ACCGTTAG,ACTGTTA) • Suppose we have already determined the best alignment for: • First x letters of string1 with first y-1 letters of string2 • First x-1 letters of string1 with first y-1 letters of string2 • First x-1 letters of string1 with first y letters of string2 If (string1[x] == string2[y]) then TABLE[x,y] = TABLE[x-1,y-1] Else TABLE[x,y] = min(1+TABLE[x,y-1], 1+TABLE[x-1,y])

  22. Example GGCAT and TGCAA (empty) T G C A A (empty) 0 1 2 3 4 5 G 1 G 2 C 3 A 4 T 5

  23. Example GGCAT and TGCAA (empty) T G C A A (empty) 0 1 2 3 4 5 G 1 2 1 2 3 4 G 2 3 2 3 4 5 C 3 4 3 2 3 4 A 4 5 4 3 2 3 T_GCAA_ _GGCA_T T 5 4 5 4 3 4

  24. Pseudocode (bottom-up) int align(String X, String Y, TABLE[1..x,1..y]) { int i,j; // Initialize top row and leftmost column for (i=1; i<=x, ++i) TABLE[i,1] = i; for (j=1; j<=y; ++j) TABLE[1,j] = j; for (i=2; i<=x, ++i) { for (j=2; j<=y; ++j) { if (X[i] == Y[j]) TABLE[i,j] = TABLE[i-1,j-1] else TABLE[i,j] = min(TABLE[i-1,j], TABLE[i,j-1])+1 } } return TABLE[x,y]; } O(N2) runtime:

  25. Pseudocode (top-down) int align(String X,String Y,TABLE[1..x,1..y]) { Compute TABLE[x-1,y-1] if necessary Compute TABLE[x-1,y] if necessary Compute TABLE[x,y-1] if necessary if (X[x] == Y[y]) TABLE[x,y] = TABLE[x-1,y-1]; else TABLE[x,y] = min(TABLE[x-1,y], TABLE[x,y-1]) + 1; return TABLE[x,y]; }

  26. Dynamic Programming Wrap-Up • Re-use expensive computations • Store optimal solutions to sub-problems in table • Use optimal solutions to sub-problems to determine optimal solution to slightly larger problem

More Related