
CSE 326: Data Structures: Advanced Topics


  1. CSE 326: Data Structures: Advanced Topics Lecture 26: Wednesday, March 12th, 2003

  2. Today • Dynamic programming for ordering matrix multiplication • Very similar to Query Optimization in databases • String processing • Final review

  3. Ordering Matrix Multiplication • Need to compute A × B × C × D • (matrix pictures omitted; from the costs on the next slides, the dimensions are A: 3×2, B: 2×4, C: 4×2, D: 2×3)

  4. Ordering Matrix Multiplication • One solution: (A × B) × (C × D) • Cost: (3 × 2 × 4) + (4 × 2 × 3) + (3 × 4 × 3) = 84

  5. Ordering Matrix Multiplication • Another solution: (A × (B × C)) × D • Cost: (2 × 4 × 2) + (3 × 2 × 2) + (3 × 2 × 3) = 46
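The two costs can be checked with a few lines of Python. A minimal sketch, assuming the dimensions implied by the cost terms (A: 3×2, B: 2×4, C: 4×2, D: 2×3):

```python
# Multiplying a p x q matrix by a q x r matrix takes p*q*r scalar multiplications.
def mult_cost(p, q, r):
    return p * q * r

# Dimensions implied by the slides: A: 3x2, B: 2x4, C: 4x2, D: 2x3.
# Ordering 1: (A x B) x (C x D)
cost1 = mult_cost(3, 2, 4) + mult_cost(4, 2, 3) + mult_cost(3, 4, 3)
# Ordering 2: (A x (B x C)) x D
cost2 = mult_cost(2, 4, 2) + mult_cost(3, 2, 2) + mult_cost(3, 2, 3)
print(cost1, cost2)  # 84 46
```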

  6. Ordering Matrix Multiplication Problem: • Given A1 × A2 × . . . × An, compute the optimal ordering Solution: • Dynamic programming • Compute cost[i][j] • the minimum cost to compute Ai × Ai+1 × . . . × Aj • Proceed iteratively, increasing the gap = j – i

  7. Ordering Matrix Multiplication
  /* initialize */
  for i = 1 to n do cost[i][i] = 0 /* why ? */
  /* dynamic programming */
  for gap = 1 to n-1 do {
    for i = 1 to n – gap do {
      j = i + gap;
      c = ∞;
      for k = i to j-1 do
        /* how much would it cost to do (Ai × . . . × Ak) × (Ak+1 × . . . × Aj) ? */
        c = min(c, cost[i][k] + cost[k+1][j] + A[i].rows * A[k].columns * A[j].columns)
      cost[i][j] = c;
    }
  }
  /* note: A[k].columns = A[k+1].rows */
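The pseudocode above translates almost line-for-line into Python. A sketch, with the cost table kept 1-indexed to match the slide; representing the chain as a list dims where Ai has shape dims[i-1] × dims[i] is an assumption about the input format:

```python
import math

def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications to compute A1 x A2 x ... x An,
    where matrix Ai has shape dims[i-1] x dims[i]."""
    n = len(dims) - 1
    # cost[i][j] = minimum cost to compute Ai x ... x Aj (1-indexed, as on the slide)
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for gap in range(1, n):
        for i in range(1, n - gap + 1):
            j = i + gap
            c = math.inf
            for k in range(i, j):
                # cost of splitting as (Ai ... Ak) x (Ak+1 ... Aj)
                c = min(c, cost[i][k] + cost[k + 1][j]
                        + dims[i - 1] * dims[k] * dims[j])
            cost[i][j] = c
    return cost[1][n]

# The four matrices from the example: 3x2, 2x4, 4x2, 2x3
print(matrix_chain_cost([3, 2, 4, 2, 3]))  # 46
```

It reproduces the 46 from the (A × (B × C)) × D ordering, confirming that ordering is optimal for this chain.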

  8. Ordering Matrix Multiplication • Running time: O(n³) Important variation: • Database systems do join reordering • A very similar algorithm • Come to CSE 544...

  9. String Matching • The problem • Given a text T[1], T[2], ..., T[n] and a pattern P[1], P[2], ..., P[m] • Find all positions s such that P “occurs” in T at position s: (T[s], T[s+1], ..., T[s+m-1]) = (P[1], ..., P[m]) • Where do we need this ? • text editors (e.g. emacs) • grep • XML processing

  10. String Matching • Example: (figure omitted)

  11. Naive String Matching
  for i = 1 to n-m+1 do
    if (T[i], T[i+1], ..., T[i+m-1]) = (P[1], P[2], ..., P[m]) then print i
  Running time: O(mn)
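A direct Python rendering of the naive matcher (0-based indexing, so reported positions start at 0 rather than 1):

```python
def naive_match(T, P):
    """Return all 0-based positions where pattern P occurs in text T."""
    n, m = len(T), len(P)
    # try every possible starting position and compare m characters
    return [i for i in range(n - m + 1) if T[i:i + m] == P]

print(naive_match("abababc", "abab"))  # [0, 2]
```

Each of the O(n) candidate positions can take up to m character comparisons, which is where the O(mn) bound comes from.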

  12. Knuth-Morris-Pratt String Matching • main idea: reuse the work after a failure • fail ! • precompute π on P • reuse !

  13. Knuth-Morris-Pratt String Matching • The Prefix Function: π[q] = the largest k < q s.t. (P[1], P[2], ..., P[k-1]) = (P[q-k+1], P[q-k+2], ..., P[q-1])

  14. [8] = 2 [7] = 1 [6] = 4 [5] = 3 [3] = [2] = [1] = 1 [4] = 2

  15. Knuth-Morris-Pratt String Matching
  /* compute π */
  . . . .
  /* do the matching */
  q = 1; /* q = where we are in P */
  for i = 1 to n do {
    while (q > 1 and P[q] != T[i]) q = π[q];
    if (P[q] = T[i]) q = q+1;
    if (q = m+1) { print(i – m + 1); q = π[q]; }
  }
  Time = O(n) (why ?)

  16. Knuth-Morris-Pratt String Matching
  /* compute π */
  π[1] = 0;
  for q = 2 to m+1 do {
    k = π[q – 1];
    while (k > 0 and P[k] != P[q – 1]) k = π[k];
    π[q] = k + 1;
  }
  /* do the matching */
  . . .
  Time = O(m) (why ?)
  Total running time of KMP algorithm: O(m+n)
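A runnable sketch of the whole algorithm in Python. It uses the common 0-based failure function (fail[q] = length of the longest proper prefix of P[0..q] that is also its suffix), an equivalent reformulation of the slides' 1-indexed π; the precompute-then-reuse structure is the same:

```python
def kmp_match(T, P):
    """Knuth-Morris-Pratt: all 0-based positions where P occurs in T."""
    n, m = len(T), len(P)
    # precompute the failure function on P in O(m)
    fail = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and P[k] != P[q]:
            k = fail[k - 1]
        if P[k] == P[q]:
            k += 1
        fail[q] = k
    # scan the text in O(n), never moving backwards in T
    out = []
    q = 0  # number of pattern characters currently matched
    for i in range(n):
        while q > 0 and P[q] != T[i]:
            q = fail[q - 1]  # reuse the work done so far
        if P[q] == T[i]:
            q += 1
        if q == m:
            out.append(i - m + 1)
            q = fail[q - 1]  # allow overlapping occurrences
    return out

print(kmp_match("abababc", "abab"))  # [0, 2]
```

Note how the text index i only moves forward; all backtracking happens inside P via the failure function, giving the O(m+n) total.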

  17. Final Review • Basic math • logs, exponents, summations • proof by induction • asymptotic analysis • big-oh, theta, omega • how to estimate running times • need sums • need recurrences

  18. Final Review • Lists, stacks, queues • ADT definition • Array vs. pointer implementation • variations: headers, doubly linked, etc • Trees: • definitions/terminology (root, parent, child, etc) • relationship between depth and size of a tree • depth is between O(log N) and O(N)

  19. Final Review • Binary Search Trees • basic implementations of find, insert, delete • worst case performance: O(N) • average case performance: O(log N) (inserts only) • AVL trees • balance factor +1, 0, -1 • know single and double rotations to keep it balanced • all operations are O(log N) worst case time • Splay trees • good amortized performance • single operation may take O(N) • know the zig-zig, zig-zag, etc • B-trees: know basic idea behind insert/delete

  20. Final Review • Priority Queues • binary heaps: insert/deleteMin, percolate up/down • array implementation • buildheap takes only O(N) !! Used in HeapSort • Binomial queues • merge is fast: O(log N) • insert, deleteMin are based on merge
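The O(N) buildheap mentioned above can be sketched as: percolate down every internal node, starting from the last one and ending at the root. A minimal min-heap illustration (not the course's exact code):

```python
def percolate_down(heap, i, n):
    """Restore the min-heap property at index i (0-based array layout)."""
    while 2 * i + 1 < n:
        child = 2 * i + 1
        # pick the smaller of the two children
        if child + 1 < n and heap[child + 1] < heap[child]:
            child += 1
        if heap[i] <= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child

def build_heap(items):
    """O(N) buildheap: percolate down from the last internal node to the root."""
    heap = list(items)
    n = len(heap)
    for i in range(n // 2 - 1, -1, -1):
        percolate_down(heap, i, n)
    return heap

h = build_heap([5, 3, 8, 1, 9, 2])
print(h[0])  # 1  (the minimum ends up at the root)
```

Most nodes sit near the bottom and percolate only a short distance, which is why the total work sums to O(N) rather than O(N log N).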

  21. Final Review • Hashing • hash functions based on the mod function • collision resolution strategies • chaining, linear and quadratic probing, double hashing • load factor of a hash table

  22. Final Review • Sorting • elementary sorting algorithms: bubble sort, selection sort, insertion sort • heapsort O(N log N) • mergesort O(N log N) • quicksort O(N log N) average • fastest in practice, but O(N²) worst case performance • pivot selection – median-of-three works well • know which of these are stable and in-place • lower bound on sorting • bucket sort, radix sort • external memory sort

  23. Final Review • Disjoint sets and Union-Find • up-trees and their array-based implementation • know how union-by-size and path compression work • know the running time (not the proof)
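The array-based up-trees with both optimizations can be sketched compactly; the negative-size root encoding below is one common convention, not necessarily the one used in lecture:

```python
class DisjointSets:
    """Array-based up-trees with union-by-size and path compression."""
    def __init__(self, n):
        # a negative entry marks a root and stores -(size of its set)
        self.up = [-1] * n

    def find(self, x):
        root = x
        while self.up[root] >= 0:
            root = self.up[root]
        # path compression: point every node on the path directly at the root
        while self.up[x] >= 0:
            self.up[x], x = root, self.up[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # union-by-size: make the larger set's root the new root
        if self.up[ra] > self.up[rb]:  # ra's set is smaller (less negative)
            ra, rb = rb, ra
        self.up[ra] += self.up[rb]
        self.up[rb] = ra

ds = DisjointSets(5)
ds.union(0, 1); ds.union(1, 2)
print(ds.find(2) == ds.find(0))  # True
```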

  24. Final Review • graph algorithms • adjacency matrix vs. adjacency list representation • topological sort in O(n+m) time using a queue • Breadth-First Search (BFS) for unweighted shortest path • Dijkstra’s shortest path algorithm • DFS • minimum spanning trees: Prim, Kruskal

  25. Final Review • Graph algorithms (cont’d) • Euler vs. Hamiltonian circuits • Know what P, NP and NP-completeness mean

  26. Final Review • Algorithm design techniques • greedy: bin packing • divide and conquer • solving various types of recurrence relations for T(N) • dynamic programming (memoization) • DP-Fibonacci • Ordering matrix multiplication • randomized data structures • treaps • primality testing • string matching • Backtracking and game trees

  27. The Final • Details: • covers chapters 1-10, 12.5, and some extra material • closed book, closed notes except: • you may bring one sheet of notes • time: 1 hour and 50 minutes • Monday, 3/17/2003, 2:30 – 4:20, this room • bring pens/pencils/etc • sleep well the night before

  28. What About Friday ? • I will cover some of the problems on the website • I will take your questions
