CSE 326: Data Structures: Advanced Topics

CSE 326: Data Structures: Advanced Topics Lecture 26: Wednesday, March 12th, 2003

Today • Dynamic programming for ordering matrix multiplication • Very similar to Query Optimization in databases • String processing • Final review

Ordering Matrix Multiplication • Need to compute A × B × C × D

Ordering Matrix Multiplication • One solution: (A × B) × (C × D) • Cost: (3 × 2 × 4) + (4 × 2 × 3) + (3 × 4 × 3) = 84

Ordering Matrix Multiplication • Another solution: (A × (B × C)) × D • Cost: (2 × 4 × 2) + (3 × 2 × 2) + (3 × 2 × 3) = 46
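The two costs above can be checked mechanically. The sketch below assumes the shapes implied by the cost terms (A is 3 × 2, B is 2 × 4, C is 4 × 2, D is 2 × 3; the original figure is not available, so these are inferred) and charges p·q·r scalar multiplications for a p × q by q × r product. The helper name `mult_cost` is made up for illustration.

```python
def mult_cost(left, right):
    """Return (scalar multiplications, result shape) for left x right.

    Shapes are (rows, columns) tuples; multiplying p x q by q x r costs p*q*r.
    """
    p, q = left
    q2, r = right
    if q != q2:
        raise ValueError("inner dimensions do not match")
    return p * q * r, (p, r)

# Shapes inferred from the cost terms on the slides (an assumption).
A, B, C, D = (3, 2), (2, 4), (4, 2), (2, 3)

# (A x B) x (C x D)
c_ab, AB = mult_cost(A, B)
c_cd, CD = mult_cost(C, D)
c_top, _ = mult_cost(AB, CD)
cost_first = c_ab + c_cd + c_top

# (A x (B x C)) x D
c_bc, BC = mult_cost(B, C)
c_abc, ABC = mult_cost(A, BC)
c_last, _ = mult_cost(ABC, D)
cost_second = c_bc + c_abc + c_last

print(cost_first, cost_second)  # 84 46
```

The same four matrices cost almost twice as much under the first ordering, which is why the order is worth optimizing.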

Ordering Matrix Multiplication Problem: • Given A1 × A2 × . . . × An, compute the optimal ordering Solution: • Dynamic programming • Compute cost[i][j] • the minimum cost to compute Ai × Ai+1 × . . . × Aj • Proceed iteratively, increasing the gap = j – i

Ordering Matrix Multiplication
/* initialize */
for i = 1 to n do cost[i][i] = 0 /* why ? */
/* dynamic programming */
for gap = 1 to n – 1 do {
    for i = 1 to n – gap do {
        j = i + gap;
        c = ∞;
        for k = i to j – 1 do
            /* how much would it cost to do (Ai × . . . × Ak) × (Ak+1 × . . . × Aj) ? */
            /* note: A[k].columns = A[k+1].rows */
            c = min(c, cost[i][k] + cost[k+1][j] + A[i].rows * A[k].columns * A[j].columns)
        cost[i][j] = c;
    }
}
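The pseudocode above translates fairly directly into Python. This is a minimal sketch, assuming the shapes are passed as a list `dims` in which matrix Ai is dims[i-1] × dims[i] and there is at least one matrix. Both that encoding and the name `matrix_chain_cost` are choices made here, not taken from the slides.

```python
import math

def matrix_chain_cost(dims):
    """Minimum scalar multiplications to compute A1 x A2 x ... x An.

    Matrix Ai has shape dims[i-1] x dims[i], so len(dims) == n + 1 (n >= 1).
    """
    n = len(dims) - 1
    # cost[i][j] = minimum cost to compute Ai x ... x Aj (1-based; index 0 unused).
    # The diagonal stays 0: a single matrix needs no multiplication.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for gap in range(1, n):                    # gap = j - i, from 1 to n-1
        for i in range(1, n - gap + 1):
            j = i + gap
            best = math.inf
            for k in range(i, j):
                # Split as (Ai ... Ak) x (Ak+1 ... Aj); the final product
                # multiplies a dims[i-1] x dims[k] by a dims[k] x dims[j] matrix.
                c = cost[i][k] + cost[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                best = min(best, c)
            cost[i][j] = best
    return cost[1][n]

# Shapes 3x2, 2x4, 4x2, 2x3, as implied by the cost terms in the earlier example.
print(matrix_chain_cost([3, 2, 4, 2, 3]))  # 46
```

This returns only the optimal cost. Recovering the ordering itself would additionally require remembering the best k for each (i, j).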

Ordering Matrix Multiplication • Running time: O(n^3) Important variation: • Database systems do join reordering • A very similar algorithm • Come to CSE 544...

String Matching • The problem • Given a text T[1], T[2], ..., T[n] and a pattern P[1], P[2], ..., P[m] • Find all positions s such that P “occurs” in T at position s: (T[s], T[s+1], ..., T[s+m-1]) = (P[1], ..., P[m]) • Where do we need this ? • text editors (e.g. emacs) • grep • XML processing

String Matching • Example:

Naive String Matching
/* try every position s = i where the pattern still fits */
for i = 1 to n – m + 1 do
    if (T[i], T[i+1], ..., T[i+m-1]) = (P[1], P[2], ..., P[m]) then print i
running time: O(mn)
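As a sketch of the naive method in Python: strings are 0-based there, so the loop below runs over 0-based shifts and reports 1-based positions to match the slide's notation. The function name `naive_match` is made up for illustration.

```python
def naive_match(text, pattern):
    """Return all 1-based positions s where pattern occurs in text."""
    n, m = len(text), len(pattern)
    positions = []
    for s in range(n - m + 1):            # 0-based shifts 0 .. n-m
        if text[s:s + m] == pattern:      # compares up to m characters
            positions.append(s + 1)       # report 1-based, as on the slide
    return positions

print(naive_match("abababa", "aba"))  # [1, 3, 5]
```

Each of the roughly n shifts may compare up to m characters, which is where the O(mn) bound comes from.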

Knuth-Morris-Pratt String Matching • main idea: reuse the work after a failure • precompute on P • reuse at each failure

Knuth-Morris-Pratt String Matching • The Prefix-Function: π[q] = the largest k < q s.t. (P[1], P[2], ..., P[k-1]) = (P[q-k+1], P[q-k+2], ..., P[q-1])
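The definition can be evaluated directly, which is slow but useful for checking a faster implementation. The sketch below tries every k from largest to smallest. It assumes π[1] = 0, since no k ≥ 1 satisfies k < 1. The pattern `ababaca` is an illustrative choice and the name `prefix_function_bruteforce` is hypothetical; neither comes from the slides.

```python
def prefix_function_bruteforce(pattern):
    """pi[q] for q = 1..m+1, straight from the definition (1-based; pi[0] unused).

    pi[q] = the largest k < q with P[1..k-1] == P[q-k+1..q-1].
    """
    m = len(pattern)
    pi = [0] * (m + 2)                        # pi[1] stays 0 (assumed convention)
    for q in range(2, m + 2):
        for k in range(q - 1, 0, -1):         # try the largest k first
            # 1-based P[1..k-1] is pattern[:k-1];
            # 1-based P[q-k+1..q-1] is pattern[q-k:q-1].
            if pattern[:k - 1] == pattern[q - k:q - 1]:
                pi[q] = k
                break
    return pi

print(prefix_function_bruteforce("ababaca")[1:])  # [0, 1, 1, 2, 3, 4, 1, 2]
```

For k = 1 both sides are empty, so π[q] is at least 1 for every q ≥ 2.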

[8] = 2 [7] = 1 [6] = 4 [5] = 3 [3] = [2] = [1] = 1 [4] = 2

Knuth-Morris-Pratt String Matching
/* compute π */
. . . .
/* do the matching */
q = 1; /* q = where we are in P: the next position to compare */
for i = 1 to n do {
    while (q > 1 and P[q] != T[i]) q = π[q];
    if (P[q] = T[i]) {
        if (q = m) { print(i – m + 1); q = π[m+1]; }
        else q = q + 1;
    }
}
Time = O(n) (why ?)

Knuth-Morris-Pratt String Matching
/* compute π */
π[1] = 0;
for q = 2 to m+1 do {
    k = π[q – 1];
    while (k ≥ 1 and P[k] != P[q – 1]) k = π[k];
    π[q] = k + 1;
}
/* do the matching */
. . .
Time = O(m) (why ?)
Total running time of KMP algorithm: O(m+n)
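Putting the two phases together, here is a sketch of KMP in Python that keeps the slides' shifted, 1-based convention, where π[q] is the position in P to compare next after a mismatch at position q. The loop bodies are one reading that agrees with the stated definition of π, not necessarily the lecturer's exact code. The names `compute_prefix` and `kmp_match` are made up for illustration.

```python
def compute_prefix(pattern):
    """Prefix function pi[1..m+1] in the shifted 1-based convention (pi[0] unused)."""
    m = len(pattern)
    P = " " + pattern                     # pad so that P[1..m] is the pattern
    pi = [0] * (m + 2)                    # pi[1] = 0 acts as a sentinel
    for q in range(2, m + 2):
        k = pi[q - 1]
        while k >= 1 and P[k] != P[q - 1]:
            k = pi[k]                     # fall back to a shorter candidate
        pi[q] = k + 1
    return pi

def kmp_match(text, pattern):
    """Return all 1-based positions where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []                         # empty pattern: nothing reported in this sketch
    T, P = " " + text, " " + pattern
    pi = compute_prefix(pattern)
    matches = []
    q = 1                                 # position in P to compare next
    for i in range(1, n + 1):
        while q > 1 and P[q] != T[i]:
            q = pi[q]                     # reuse the work done before the failure
        if P[q] == T[i]:
            if q == m:
                matches.append(i - m + 1)
                q = pi[m + 1]             # continue, allowing overlapping matches
            else:
                q += 1
    return matches

print(kmp_match("abababa", "aba"))  # [1, 3, 5]
```

The text index i never moves backwards, and every fallback step in the `while` loop undoes an earlier increment of q. That is the usual amortized argument behind the O(n) and O(m) bounds.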

Final Review • Basic math • logs, exponents, summations • proof by induction • asymptotic analysis • big-oh, theta, omega • how to estimate running times • need sums • need recurrences

Final Review • Lists, stacks, queues • ADT definition • Array vs. pointer implementation • variations: headers, doubly linked, etc • Trees: • definitions/terminology (root, parent, child, etc) • relationship between depth and size of a tree • depth is between O(log N) and O(N)

Final Review • Binary Search Trees • basic implementations of find, insert, delete • worst case performance: O(N) • average case performance: O(log N) (inserts only) • AVL trees • balance factor +1, 0, -1 • know single and double rotations to keep it balanced • all operations are O(log N) worst case time • Splay trees • good amortized performance • single operation may take O(N) • know the zig-zig, zig-zag, etc • B-trees: know basic idea behind insert/delete

Final Review • Priority Queues • binary heaps: insert/deleteMin, percolate up/down • array implementation • buildheap takes only O(N) !! Used in HeapSort • Binomial queues • merge is fast: O(log N) • insert, deleteMin are based on merge

Final Review • Hashing • hash functions based on the mod function • collision resolution strategies • chaining, linear and quadratic probing, double hashing • load factor of a hash table

Final Review • Sorting • elementary sorting algorithms: bubble sort, selection sort, insertion sort • heapsort O(N log N) • mergesort O(N log N) • quicksort O(N log N) average • fastest in practice, but O(N^2) worst case performance • pivot selection – median of three works well • know which of these are stable and in-place • lower bound on sorting • bucket sort, radix sort • external memory sort

Final Review • Disjoint sets and Union-Find • up-trees and their array-based implementation • know how union-by-size and path compression work • know the running time (not the proof)
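For review, a compact sketch of array-based up-trees with union-by-size and path compression. It assumes the common encoding in which a root stores the negated size of its tree and every other entry stores a parent index. The class and method names are choices made here, not taken from the course materials.

```python
class DisjointSets:
    """Up-trees in one array: up[x] is x's parent, or -size if x is a root."""

    def __init__(self, n):
        self.up = [-1] * n                 # every element starts as a root of size 1

    def find(self, x):
        root = x
        while self.up[root] >= 0:          # walk up to the root
            root = self.up[root]
        while x != root:                   # path compression: repoint the path at the root
            self.up[x], x = root, self.up[x]
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.up[ra] > self.up[rb]:      # sizes are negated, so ra is the smaller tree
            ra, rb = rb, ra
        self.up[ra] += self.up[rb]         # larger root absorbs the smaller tree's size
        self.up[rb] = ra                   # smaller root now points at the larger root
        return True

ds = DisjointSets(6)
ds.union(0, 1)
ds.union(2, 3)
ds.union(1, 3)
print(ds.find(0) == ds.find(2))  # True
```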

Final Review • Graph algorithms • adjacency matrix vs. adjacency list representation • topological sort in O(n+m) time using a queue • Breadth-First-Search (BFS) for unweighted shortest path • Dijkstra’s shortest path algorithm • DFS • minimum spanning trees: Prim, Kruskal
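As a reminder of how the queue-based topological sort reaches O(n+m), here is a minimal sketch using an adjacency list and in-degree counts. It assumes vertices are numbered 0..n-1 and edges are (u, v) pairs meaning u → v. The function name and the `None` return on a cycle are choices made here.

```python
from collections import deque

def topological_sort(num_vertices, edges):
    """Return a topological order of vertices 0..num_vertices-1, or None on a cycle."""
    adj = [[] for _ in range(num_vertices)]
    indegree = [0] * num_vertices
    for u, v in edges:                     # edge u -> v
        adj[u].append(v)
        indegree[v] += 1
    # Start with every vertex that has no incoming edges.
    queue = deque(v for v in range(num_vertices) if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:                   # "remove" u's outgoing edges
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return order if len(order) == num_vertices else None

print(topological_sort(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))  # [0, 1, 2, 3]
```

Every vertex enters the queue at most once and every edge is examined once, which gives the O(n+m) bound.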

Final Review • Graph algorithms (cont’d) • Euler vs. Hamiltonian circuits • Know what P, NP and NP-completeness mean

Final Review • Algorithm design techniques • greedy: bin packing • divide and conquer • solving various types of recurrence relations for T(N) • dynamic programming (memoization) • DP-Fibonacci • Ordering matrix multiplication • randomized data structures • treaps • primality testing • string matching • Backtracking and game trees
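A small sketch of the memoization idea behind DP-Fibonacci: each value is computed once and then looked up, so the exponential recursion collapses to a linear number of calls. It uses Python's standard `functools.lru_cache` and assumes the convention fib(0) = 0, fib(1) = 1.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """n-th Fibonacci number with fib(0) = 0 and fib(1) = 1, memoized."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))  # 832040
```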

The Final • Details: • covers chapters 1-10, 12.5, and some extra material • closed book, closed notes except: • you may bring one sheet of notes • time: 1 hour and 50 minutes • Monday, 3/17/2003, 2:30 – 4:20, this room • bring pens/pencils/etc • sleep well the night before

What About Friday ? • I will cover some of the problems on the website • I will take your questions
