Review for Final Exam

• Non-cumulative; covers material since exam 2
• Data structures covered:
  • Treaps
  • Skip lists
  • Hash tables
  • Disjoint sets
  • Graphs
• For each of these data structures:
  • Basic idea of the data structure and its operations
  • Be able to work out small example problems
  • Prove related theorems
  • Advantages and limitations
  • Asymptotic time performance
  • Comparison
• Review questions are available on the web.

Treaps
• Definition
  • Two values associated with each node
    • Key: makes the treap a BST
    • Priority: keeps the tree in binary min-heap order
  • Priorities are randomly generated
    • This makes the treap equivalent to a BST built from a randomly ordered sequence of keys (why?)
• Main advantages
  • Balanced with high probability (h = O(log n))
  • Compare with splay trees and red-black trees
• Operations
  • Find: by key value, exactly as in a BST
  • Insert: add as a leaf as in a BST, then rotate it up until heap order is satisfied
  • Delete: rotate the node to be deleted down according to heap order until it becomes a leaf, then delete it
  • Supports set union and partition
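The insert and delete rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the course: the names `Node`, `rotate_left`, `rotate_right`, `insert`, `delete`, `inorder`, and `heap_ordered` are hypothetical, and priorities are assumed to be random floats with smaller values closer to the root.

```python
import random


class Node:
    """Treap node: BST order on key, min-heap order on priority."""

    def __init__(self, key, priority):
        self.key = key
        self.priority = priority
        self.left = None
        self.right = None


def rotate_right(node):
    """Lift node.left above node, preserving BST order."""
    left = node.left
    node.left = left.right
    left.right = node
    return left


def rotate_left(node):
    """Lift node.right above node, preserving BST order."""
    right = node.right
    node.right = right.left
    right.left = node
    return right


def insert(root, key, rng=random):
    """Insert as a BST leaf, then rotate up while heap order is violated."""
    if root is None:
        return Node(key, rng.random())  # assumed: random float priority
    if key < root.key:
        root.left = insert(root.left, key, rng)
        if root.left.priority < root.priority:
            root = rotate_right(root)
    elif key > root.key:
        root.right = insert(root.right, key, rng)
        if root.right.priority < root.priority:
            root = rotate_left(root)
    return root  # duplicate keys are ignored


def delete(root, key):
    """Rotate the target down until it is a leaf, then drop it."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        if root.left is None and root.right is None:
            return None
        # Lift the child with the smaller priority to keep heap order.
        lift_left = root.right is None or (
            root.left is not None and root.left.priority < root.right.priority
        )
        if lift_left:
            root = rotate_right(root)
            root.right = delete(root.right, key)
        else:
            root = rotate_left(root)
            root.left = delete(root.left, key)
    return root


def inorder(root):
    """Return the keys in sorted (in-order) sequence."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


def heap_ordered(root):
    """Check that every parent priority is <= its children's priorities."""
    if root is None:
        return True
    for child in (root.left, root.right):
        if child is not None and child.priority < root.priority:
            return False
    return heap_ordered(root.left) and heap_ordered(root.right)
```

Inserting keys in sorted order, the worst case for a plain BST, is a useful small exercise here: the random priorities still reshape the tree through rotations.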

Skip Lists
• What is a skip list
  • Nodes of different sizes (different numbers of skip pointers)
  • Node sizes are distributed according to the associated probability p
  • Nodes of different sizes do not have to follow a rigid pattern
  • Expected fraction of nodes with exactly k pointers: p^(k-1) (1 - p), i.e. about N p^(k-1) (1 - p) of N nodes
  • How to determine the size of the head node: log_{1/p} N
• Why skip lists are needed
  • Expected time performance O(lg N) for find/insert/remove
  • Determining node sizes probabilistically makes insert/remove operations easier
  • Advantages over sorted arrays, sorted lists, BSTs, and balanced BSTs
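The node-size distribution can be illustrated with a short Python sketch. The function names `random_node_size`, `expected_fraction`, and `head_node_size` are hypothetical, and the coin-flip scheme (keep adding a pointer with probability p) is one common way to realize the distribution, assumed here rather than taken from the slides.

```python
import random


def random_node_size(p, max_size, rng=random):
    """Draw a node size: start at 1, add a pointer with probability p each time."""
    size = 1
    while size < max_size and rng.random() < p:
        size += 1
    return size


def expected_fraction(k, p):
    """Expected fraction of nodes with exactly k pointers: p^(k-1) * (1 - p)."""
    return p ** (k - 1) * (1 - p)


def head_node_size(n, p):
    """Smallest k with (1/p)^k >= n, i.e. roughly log base 1/p of n."""
    k = 0
    capacity = 1.0
    while capacity < n:
        capacity /= p
        k += 1
    return max(1, k)
```

With p = 1/2, about half of the nodes get one pointer, a quarter get two, and so on, which is why the average number of pointers per node comes out to 1/(1 - p) = 2.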

Skip list operations
• find
• insert (how to determine the size of the new node)
• Setting pointers in insert and remove operations (backLook node)
• Performance
  • Expected time performance O(lg N) for find/insert/remove (the probability of poor performance is very small when N is large)
  • Expected number of pointers per node: 1/(1 - p)
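A compact Python sketch of these operations follows. It is a hedged illustration, not the course's implementation: `SkipNode`, `SkipList`, and the helper `_look_back` are hypothetical names, and `_look_back` is assumed to play the role of the "backLook" bookkeeping, recording at each level the last node that precedes the search key.

```python
import random


class SkipNode:
    def __init__(self, key, size):
        self.key = key
        self.forward = [None] * size  # one skip pointer per level


class SkipList:
    def __init__(self, max_size=16, p=0.5, seed=None):
        self.p = p
        self.max_size = max_size
        self.head = SkipNode(None, max_size)  # head holds no key
        self.rng = random.Random(seed)

    def _random_size(self):
        """New node size: add a pointer with probability p each time."""
        size = 1
        while size < self.max_size and self.rng.random() < self.p:
            size += 1
        return size

    def _look_back(self, key):
        """For each level, find the last node whose key is smaller than key."""
        update = [self.head] * self.max_size
        node = self.head
        for level in range(self.max_size - 1, -1, -1):
            nxt = node.forward[level]
            while nxt is not None and nxt.key < key:
                node = nxt
                nxt = node.forward[level]
            update[level] = node
        return update

    def find(self, key):
        candidate = self._look_back(key)[0].forward[0]
        return candidate is not None and candidate.key == key

    def insert(self, key):
        update = self._look_back(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            return False  # key already present
        node = SkipNode(key, self._random_size())
        for level in range(len(node.forward)):
            # Splice the new node in after its predecessor at this level.
            node.forward[level] = update[level].forward[level]
            update[level].forward[level] = node
        return True

    def remove(self, key):
        update = self._look_back(key)
        target = update[0].forward[0]
        if target is None or target.key != key:
            return False
        for level in range(len(target.forward)):
            # Bypass the target at every level it participates in.
            update[level].forward[level] = target.forward[level]
        return True

    def keys(self):
        """Walk the bottom level, which links every node in sorted order."""
        result = []
        node = self.head.forward[0]
        while node is not None:
            result.append(node.key)
            node = node.forward[0]
        return result
```

Because a node's size is drawn independently of its neighbours, insert and remove only touch the pointers recorded by the look-back pass; no global restructuring is needed.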

Hashing
• Hash table
  • Trading space for time
  • Table size (primes)
• Hashing functions
  • Properties of a good hashing function
  • Examples of division and multiplication hashing functions
• Operations (insert/remove/find)
• Collision management
  • Separate chaining
  • Open addressing (different probing techniques, clustering problem)
• Expected time performance
  • O(1) for find/insert/delete if the load factor λ is small and the hashing function is good (the worst case is still O(N))
• Limitations
  • Hard to answer order-based queries (successor, min/max, etc.)
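As a small illustration of the division and multiplication methods and of separate chaining, here is a Python sketch for integer keys. The names `division_hash`, `multiplication_hash`, and `ChainedHashTable` are hypothetical, the default table size 11 is just a small prime chosen for the example, and the constant A = (sqrt(5) - 1)/2 is one commonly suggested choice for the multiplication method, assumed here.

```python
import math


def division_hash(key, table_size):
    """Division method: h(k) = k mod m (m is usually chosen prime)."""
    return key % table_size


def multiplication_hash(key, table_size, a=(math.sqrt(5) - 1) / 2):
    """Multiplication method: h(k) = floor(m * frac(k * A)), with 0 < A < 1."""
    return int(table_size * ((key * a) % 1.0))


class ChainedHashTable:
    """Separate chaining: each slot holds a list of (key, value) pairs."""

    def __init__(self, table_size=11):
        self.buckets = [[] for _ in range(table_size)]
        self.count = 0

    def _bucket(self, key):
        return self.buckets[division_hash(key, len(self.buckets))]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self.count += 1

    def find(self, key):
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        return None

    def remove(self, key):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                del bucket[i]
                self.count -= 1
                return True
        return False

    def load_factor(self):
        """Load factor: number of stored keys divided by table size."""
        return self.count / len(self.buckets)
```

Keys 3 and 14 both map to slot 3 under division by 11, so they end up in the same chain; the cost of a find grows with chain length, which is why the load factor matters.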

Disjoint Sets
• Equivalence relation and equivalence class
  • definitions and examples
• Disjoint sets and up-tree representation
  • representative of each set
  • direction of pointers
• Union-find operations
  • basic union and find operations
  • path compression (for find) and union-by-weight heuristics
  • time performance when the two heuristics are used
    • O(m lg* n) for m operations (what does lg* n mean?)
    • effectively O(1) amortized time for each operation, since lg* n is at most 5 for any practical n
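The up-tree representation with both heuristics can be sketched as follows. This is a minimal Python sketch under stated assumptions: the class name `DisjointSets` and the use of two arrays (`parent` and `weight`) are illustrative choices, and elements are assumed to be the integers 0..n-1.

```python
class DisjointSets:
    """Up-trees stored in arrays: parent[x] points toward the root."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.weight = [1] * n         # number of nodes in the tree rooted at x

    def find(self, x):
        """Return the representative (root) of x's set, compressing the path."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, a, b):
        """Union by weight: hang the lighter tree under the heavier root."""
        root_a = self.find(a)
        root_b = self.find(b)
        if root_a == root_b:
            return False  # already in the same set
        if self.weight[root_a] < self.weight[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.weight[root_a] += self.weight[root_b]
        return True
```

Pointers run from child to parent, which is the opposite direction from an ordinary search tree; find only ever needs to walk upward.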

Graphs
• Graph definitions
  • G = (V, E), directed and undirected graphs, DAG
  • path, path length (with/without weights), cycle, simple path
  • connectivity, connected component, connected graph, complete graph, strong and weak connectivity
• Adjacency and representation
  • adjacency matrix and adjacency lists; when to use which
  • time performance with each
• Graph traversal: depth-first (DF) and breadth-first (BF)
• Single-source shortest path
  • Breadth-first search (with unweighted edges)
  • Dijkstra’s algorithm (with weighted edges, non-negative weights)
• Topological order (for DAG)
  • What a topological order is (definitions of predecessor, successor, strict partial order)
  • Algorithm for topological sort
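Two of the algorithms listed above can be sketched over adjacency lists in Python. The function names `dijkstra` and `topological_order` are hypothetical, the graphs are assumed to be dictionaries mapping each vertex to its outgoing edges, and the topological sort shown is the indegree-based (queue) variant, which may differ from the version presented in class.

```python
import heapq
from collections import deque


def dijkstra(graph, source):
    """Shortest distances from source; graph[u] is a list of (v, weight) pairs.

    Assumes all edge weights are non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            new_dist = d + w
            if new_dist < dist.get(v, float("inf")):
                dist[v] = new_dist
                heapq.heappush(heap, (new_dist, v))
    return dist


def topological_order(graph):
    """Topological sort of a DAG; graph[u] is a list of successors of u."""
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] = indegree.get(v, 0) + 1
    # Start with vertices that have no predecessors.
    queue = deque(u for u in indegree if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph.get(u, []):
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle, so no topological order exists")
    return order
```

Both functions visit each edge a constant number of times with adjacency lists; with an adjacency matrix, scanning the neighbours of a vertex would cost O(|V|) regardless of its degree.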
