CSE 326: Data Structures: Advanced Topics • Lecture 26: Wednesday, March 12th, 2003
Today • Dynamic programming for ordering matrix multiplication • Very similar to Query Optimization in databases • String processing • Final review
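Ordering Matrix Multiplication • Need to compute A B C D, where A is 3×2, B is 2×4, C is 4×2, and D is 2×3 (dimensions as implied by the cost terms that follow)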
Ordering Matrix Multiplication • One solution: (A B) (C D) • Cost: (3 × 2 × 4) + (4 × 2 × 3) + (3 × 4 × 3) = 84
Ordering Matrix Multiplication • Another solution: (A (B C)) D • Cost: (2 × 4 × 2) + (3 × 2 × 2) + (3 × 2 × 3) = 46
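The two costs above can be checked mechanically. The sketch below assumes the shapes A = 3×2, B = 2×4, C = 4×2, D = 2×3, which are inferred from the cost terms on the slides rather than stated there; `multiply_cost` and the two `cost_*` helpers are illustrative names, not part of the lecture.

```python
def multiply_cost(left, right):
    """Return (scalar multiplications, result shape) for left x right.

    Each shape is a (rows, columns) pair.
    """
    rows, inner = left
    inner2, cols = right
    assert inner == inner2, "inner dimensions must agree"
    return rows * inner * cols, (rows, cols)


# Shapes inferred from the cost terms on the slides (an assumption).
A, B, C, D = (3, 2), (2, 4), (4, 2), (2, 3)


def cost_ab_cd():
    """Cost of the ordering (A B) (C D)."""
    c1, ab = multiply_cost(A, B)
    c2, cd = multiply_cost(C, D)
    c3, _ = multiply_cost(ab, cd)
    return c1 + c2 + c3


def cost_a_bc_d():
    """Cost of the ordering (A (B C)) D."""
    c1, bc = multiply_cost(B, C)
    c2, abc = multiply_cost(A, bc)
    c3, _ = multiply_cost(abc, D)
    return c1 + c2 + c3
```

Under these shapes, `cost_ab_cd()` and `cost_a_bc_d()` reproduce the 84 and 46 from the slides.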
Ordering Matrix Multiplication Problem: • Given A1 A2 . . . An, compute optimal ordering Solution: • Dynamic programming • Compute cost[i][j] • the minimum cost to compute Ai Ai+1 . . . Aj • Proceed iteratively, increasing the gap = j – i
Ordering Matrix Multiplication
/* initialize */
for i = 1 to n do cost[i][i] = 0 /* why ? */
/* dynamic programming */
for gap = 1 to n-1 do {
  for i = 1 to n – gap do {
    j = i + gap; c = ∞;
    for k = i to j-1 do
      /* how much would it cost to do (Ai . . . Ak ) (Ak+1 . . . Aj) ? */
      /* note: A[k].columns = A[k+1].rows */
      c = min(c, cost[i][k] + cost[k+1][j] + A[i].rows * A[k].columns * A[j].columns)
    cost[i][j] = c;
  }
}
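A minimal runnable sketch of the dynamic program, written in Python with 0-based indices instead of the 1-based indices on the slide. The function name `matrix_chain_cost` and the representation of each matrix as a `(rows, columns)` pair are assumptions made for illustration.

```python
import math


def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications to compute the product.

    dims[i] is the (rows, columns) shape of the i-th matrix, 0-based.
    Assumes dims is non-empty and adjacent shapes are compatible.
    """
    n = len(dims)
    # cost[i][j] = minimum cost to compute A_i ... A_j; cost[i][i] = 0.
    cost = [[0] * n for _ in range(n)]
    for gap in range(1, n):              # gap = j - i
        for i in range(n - gap):
            j = i + gap
            best = math.inf
            for k in range(i, j):        # split as (A_i..A_k)(A_k+1..A_j)
                candidate = (cost[i][k] + cost[k + 1][j]
                             + dims[i][0] * dims[k][1] * dims[j][1])
                best = min(best, candidate)
            cost[i][j] = best
    return cost[0][n - 1]
```

For the four shapes implied by the earlier example, (3, 2), (2, 4), (4, 2), (2, 3), this returns 46, so the second ordering shown on the slides is optimal for that instance.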
Ordering Matrix Multiplication • Running time: O(n^3) Important variation: • Database systems do join reordering • A very similar algorithm • Come to CSE 544...
String Matching • The problem • Given a text T[1], T[2], ..., T[n] and a pattern P[1], P[2], ..., P[m] • Find all positions s such that P “occurs” in T at position s: (T[s], T[s+1], ..., T[s+m-1]) = (P[1], ..., P[m]) • Where do we need this ? • text editors (e.g. emacs) • grep • XML processing
String Matching • Example:
Naive String Matching
/* try every shift */
for i = 1 to n-m+1 do
  if (T[i], T[i+1], ..., T[i+m-1]) = (P[1], P[2], ..., P[m]) then print i
running time: O(mn)
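A short Python sketch of the naive matcher. It reports 1-based positions to agree with the slide's indexing; the name `naive_match` is illustrative only.

```python
def naive_match(text, pattern):
    """Return all 1-based positions s where pattern occurs in text.

    Tries each of the n - m + 1 possible shifts and compares up to m
    characters at each one, hence O(mn) time in the worst case.
    """
    n, m = len(text), len(pattern)
    return [s + 1 for s in range(n - m + 1) if text[s:s + m] == pattern]
```

For example, `naive_match("abababa", "aba")` returns `[1, 3, 5]`; overlapping occurrences are all reported.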
Knuth-Morris-Pratt String Matching • main idea: after a failure, reuse the work already done • this requires a precomputation on P
Knuth-Morris-Pratt String Matching • The Prefix-Function: π[q] = the largest k < q s.t. (P[1], P[2], ..., P[k-1]) = (P[q-k+1], P[q-k+2], ..., P[q-1])
π[1] = π[2] = π[3] = 1, π[4] = 2, π[5] = 3, π[6] = 4, π[7] = 1, π[8] = 2
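The prefix function can be evaluated directly from its definition by brute force, which is slow but useful as a check. The pattern `"ababaca"` used below is an assumption: the slide's own example pattern did not survive, and this one happens to reproduce the listed values for q = 2, ..., 8. The entry for q = 1 is left out because no k satisfies the definition there. The name `prefix_function_bruteforce` is illustrative.

```python
def prefix_function_bruteforce(p):
    """Evaluate the slide's prefix function for q = 2 .. m+1.

    pi[q] is the largest k < q such that P[1..k-1] == P[q-k+1..q-1]
    (1-based, as on the slide). Returns a dict keyed by q.
    """
    m = len(p)
    pi = {}
    for q in range(2, m + 2):
        for k in range(q - 1, 0, -1):      # try the largest k first
            # 0-based slices: P[1..k-1] is p[:k-1], P[q-k+1..q-1] is p[q-k:q-1]
            if p[:k - 1] == p[q - k:q - 1]:
                pi[q] = k
                break                       # k = 1 always matches (empty strings)
    return pi
```

With the assumed pattern, `prefix_function_bruteforce("ababaca")` gives 1, 1, 2, 3, 4, 1, 2 for q = 2 through 8.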
Knuth-Morris-Pratt String Matching
/* compute π */
. . . .
/* do the matching */
q = 1; /* q = where we are in P */
for i = 1 to n do {
  while (q > 1 and P[q] != T[i]) q = π[q];
  if (P[q] = T[i]) {
    if (q = m) { print(i – m + 1); q = π[m+1]; }
    else q = q+1;
  }
}
Time = O(n) (why ?)
Knuth-Morris-Pratt String Matching
/* compute π */
π[1] = 0; π[2] = 1; /* π[1] is never read below */
for q = 3 to m+1 do {
  k = π[q – 1];
  while (k > 1 and P[k] != P[q – 1]) k = π[k];
  if (P[k] = P[q – 1]) then k = k+1;
  π[q] = k;
}
/* do the matching */
. . .
Time = O(m) (why ?)
Total running time of KMP algorithm: O(m+n)
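A runnable Python sketch of both KMP phases, following the slide's convention that the prefix-function value at q is the position in P to try next after a mismatch at position q. Strings are padded with one leading character so the indices stay 1-based like the slides. The names `compute_pi` and `kmp_match` are illustrative, and a non-empty pattern is assumed.

```python
def compute_pi(p):
    """Prefix function for q = 1 .. m+1 (1-based; index 0 is unused)."""
    m = len(p)
    P = " " + p                    # pad so that P[1..m] is the pattern
    pi = [0] * (m + 2)             # pi[1] stays 0 and is never read
    pi[2] = 1
    for q in range(3, m + 2):
        k = pi[q - 1]
        while k > 1 and P[k] != P[q - 1]:
            k = pi[k]              # fall back to a shorter matching prefix
        if P[k] == P[q - 1]:
            k += 1
        pi[q] = k
    return pi


def kmp_match(text, p):
    """Return all 1-based positions where p occurs in text."""
    n, m = len(text), len(p)
    T, P = " " + text, " " + p
    pi = compute_pi(p)
    q = 1                          # q = where we are in P
    found = []
    for i in range(1, n + 1):
        while q > 1 and P[q] != T[i]:
            q = pi[q]              # reuse the part that already matched
        if P[q] == T[i]:
            if q == m:
                found.append(i - m + 1)
                q = pi[m + 1]      # keep going, allowing overlaps
            else:
                q += 1
    return found
```

For example, `kmp_match("abababa", "aba")` returns `[1, 3, 5]`. The text index `i` never moves backward, which is the key to the O(n) matching time.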
Final Review • Basic math • logs, exponents, summations • proof by induction • asymptotic analysis • big-oh, theta, omega • how to estimate running times • need sums • need recurrences
Final Review • Lists, stacks, queues • ADT definition • Array vs. pointer implementation • variations: headers, doubly linked, etc. • Trees: • definitions/terminology (root, parent, child, etc.) • relationship between depth and size of a tree • depth is between O(log N) and O(N)
Final Review • Binary Search Trees • basic implementations of find, insert, delete • worst case performance: O(N) • average case performance: O(log N) (inserts only) • AVL trees • balance factor +1, 0, -1 • know single and double rotations to keep it balanced • all operations are O(log N) worst case time • Splay trees • good amortized performance • single operation may take O(N) • know the zig-zig, zig-zag, etc. • B-trees: know basic idea behind insert/delete
Final Review • Priority Queues • binary heaps: insert/deleteMin, percolate up/down • array implementation • buildheap takes only O(N) !! Used in HeapSort • Binomial queues • merge is fast: O(log N) • insert, deleteMin are based on merge
Final Review • Hashing • hash functions based on the mod function • collision resolution strategies • chaining, linear and quadratic probing, double hashing • load factor of a hash table
Final Review • Sorting • elementary sorting algorithms: bubble sort, selection sort, insertion sort • heapsort O(N log N) • mergesort O(N log N) • quicksort O(N log N) average • fastest in practice, but O(N^2) worst case performance • pivot selection – median of three works well • know which of these are stable and in-place • lower bound on sorting • bucket sort, radix sort • external memory sort
Final Review • Disjoint sets and Union-Find • up-trees and their array-based implementation • know how union-by-size and path compression work • know the running time (not the proof)
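As a reminder of how the pieces fit together, here is a minimal sketch of an array-based up-tree with union-by-size and path compression. The class name `DisjointSets` and the encoding of a root as a negative size are assumptions for illustration, not necessarily the layout used in the course.

```python
class DisjointSets:
    """Up-trees stored in one array.

    parent[x] >= 0 : x's parent in the up-tree
    parent[x] <  0 : x is a root, and -parent[x] is the size of its tree
    """

    def __init__(self, n):
        self.parent = [-1] * n     # n singleton sets, each of size 1

    def find(self, x):
        root = x
        while self.parent[root] >= 0:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while x != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union-by-size: make ra the root of the larger tree.
        if self.parent[ra] > self.parent[rb]:
            ra, rb = rb, ra
        self.parent[ra] += self.parent[rb]   # sizes add (both negative)
        self.parent[rb] = ra
        return True
```

Hanging the smaller tree under the larger one keeps the trees shallow, and compression flattens whatever paths `find` does walk.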
Final Review • Graph algorithms • adjacency matrix vs. adjacency list representation • topological sort in O(n+m) time using a queue • Breadth-First-Search (BFS) for unweighted shortest path • Dijkstra’s shortest path algorithm • DFS • minimum spanning trees: Prim, Kruskal
Final Review • Graph algorithms (cont’d) • Euler vs. Hamiltonian circuits • Know what P, NP and NP-completeness mean
Final Review • Algorithm design techniques • greedy: bin packing • divide and conquer • solving various types of recurrence relations for T(N) • dynamic programming (memoization) • DP-Fibonacci • Ordering matrix multiplication • randomized data structures • treaps • primality testing • string matching • Backtracking and game trees
The Final • Details: • covers chapters 1-10, 12.5, and some extra material • closed book, closed notes except: • you may bring one sheet of notes • time: 1 hour and 50 minutes • Monday, 3/17/2003, 2:30 – 4:20, this room • bring pens/pencils/etc • sleep well the night before
What About Friday ? • I will cover some of the problems on the website • I will take your questions