520 likes | 883 Views
Universal Hashing. When attempting to foil an malicious adversary, randomize the algorithm Universal hashing : pick a hash function randomly when the algorithm begins Guarantees good performance on average, no matter what keys adversary chooses Need a family of hash functions to choose from
E N D
Universal Hashing • When attempting to foil an malicious adversary, randomize the algorithm • Universal hashing: pick a hash function randomly when the algorithm begins • Guarantees good performance on average, no matter what keys adversary chooses • Need a family of hash functions to choose from • Think of quick-sort
Universal Hashing • Let Gbe a (finite) collection of hash functions • …that map a given universe U of keys… • …into the range {0, 1, …, m - 1}. • Gis said to be universal if: • for each pair of distinct keys x, y U,the number of hash functions h Gfor which h(x) = h(y) is |G|/m • In other words: • With a random hash function from G the chance of a collision between x and y is exactly 1/m (x y)
Universal Hashing • Theorem 11.3: • Choose h from a universal family of hash functions • Hash n keys into a table of m slots, nm • Then the expected number of collisions involving a particular key x is less than 1 • Proof: • For each pair of keys y, z, let cyx= 1 if y and z collide, 0 otherwise • E[cyz] = 1/m (by definition) • Let Cx be total number of collisions involving key x • Since nm, we have E[Cx] < 1
A Universal Hash Function • Choose table size m to be prime • Decompose key x into r+1 bytes, so that x = {x0, x1, …, xr} • Only requirement is that max value of byte < m • Let a = {a0, a1, …, ar} denote a sequence of r+1 elements chosen randomly from {0, 1, …, m - 1} • Define corresponding hash function ha G: • With this definition, Ghas mr+1 members
A Universal Hash Function • Gis a universal collection of hash functions (Theorem 11.5) • How to use: • Pick r based on m and the range of keys in U • Pick a hash function by (randomly) picking the a’s • Use that hash function on all keys
Example • Let m = 5, and the size of each string is 2 bits (binary). Note the maximum value of a string is 3 and m = 5 • a = 1,3, chosen at random from 0,1,2,3,4 • Example for x = 4 = 01,00 (note r = 1) • ha(4) = 1 (01) + 3 (00) = 1
Open Addressing • Basic idea (details in Section 12.4): • To insert: if slot is full, try another slot, …, until an open slot is found (probing) • To search, follow same sequence of probes as would be used when inserting the element • If reach element with correct key, return it • If reach a NULL pointer, element is not in table • Good for fixed sets (adding but no deletion) • Table needn’t be much bigger than n
Lecture 11:Binary Search Trees Shang-Hua Teng
Keys Keys Entry Satellite data Data Format
Insertion and Deletion on dynamic sets • Insert(S,x) • A modifying operation that augments the set S with the element pointed by x • Delete • Given a pointer x to an element in the set S, removes x from S • Notice that this operation uses a pointer to an element x, not a key value
Querying on dynamic sets • Search(S,k) • given a set S and a key value k, returns a pointer x to an element in S such that key[x] = k, or NIL if no such element belongs S • Minimum(S) • on a totally ordered set S that returns a pointer to the element S with the smallest key • Maximum(S) • Successor(S,x) • Given an element x whose key is from a totally ordered set S, returns a pointer to the next larger element in S, or NIL if x is the maximum element • Predecessor(S,x)
Edge interior node path Node subtree leaf child Trees root Degree? parent Depth/Level? Height?
Parent Data Left Right root • Tree • root Binary Tree • Node • data • left child • right child • parent (optional)
Trees • Full tree of height h • all leaves present at level h • all interior nodes full • total number of nodes in a full binary tree? • Complete tree of height h
+ * - 5 3 8 4 Applications - Expression Trees To represent infix expressions (5*3)+(8-4)
Applications - Parse Trees Used in compilers to check syntax statement statement else statement if cond then statement if cond then
Binary Search Trees • Binary Search Trees (BSTs) are an important data structure for dynamic sets • In addition to satellite data, elements have: • key: an identifying field inducing a total ordering • left: pointer to a left child (may be NULL) • right: pointer to a right child (may be NULL) • p: pointer to a parent node (NULL for root)
F B H A D K Binary Search Trees • BST property: key[leftSubtree(x)] key[x] key[rightSubtree(x)] • Example:
Pre order visit the node go left go right In order go left visit the node go right Post order go left go right visit the node Level order / breadth first for d = 0 to height visit nodes at level d Traversals for a Binary Tree
A B C D E F G H I Traversal Examples Pre order A B D G H C E F I In order G D H B A E C F I Post order G H D B E I F C A Level order A B C D E F G H I
Traversal Implementation • recursive implementation of preorder • base case? • self reference • visit node • pre-order(left child) • pre-order(right child) • What changes need to be made for in-order, post-order?
Inorder Tree Walk • In order TreeWalk(x) TreeWalk(left[x]); print(x); TreeWalk(right[x]); • Prints elements in sorted (increasing) order
F B H A D K In order Tree Walk • Example: • How long will a tree walk take? • In order walk prints in monotonically increasing order. Why?
+ * - 5 3 8 4 Evaluating an expression tree • Walk the tree in postorder • When visiting a node, use the results of its children to evaluate it.
Operations on BSTs: Search • Given a key and a pointer to a node, returns an element with that key or NULL: TreeSearch(x, k) if (x = NULL or k = key[x]) return x; if (k < key[x]) return TreeSearch(left[x], k); else return TreeSearch(right[x], k);
F B H A D K BST Search: Example • Search for D and C:
Operations of BSTs: Insert • Adds an element x to the tree so that the binary search tree property continues to hold • The basic algorithm • Like the search procedure above • Insert x in place of NULL • Use a “trailing pointer” to keep track of where you came from (like inserting into singly linked list)
F B H A D K BST Insert: Example • Example: Insert C C
BST Search/Insert: Running Time • What is the running time of TreeSearch() or TreeInsert()? • O(h), where h = height of tree • What is the height of a binary search tree? • worst case: h = O(n) when tree is just a linear string of left or right children • We’ll keep all analysis in terms of h for now • Later we’ll see how to maintain h = O(lg n)
Sorting With Binary Search Trees • Informal code for sorting array A of length n: BSTSort(A) for i=1 to n TreeInsert(A[i]); InorderTreeWalk(root); • Argue that this is (n lg n) • What will be the running time in the • Worst case? • Average case? (hint: remind you of anything?)
3 1 8 2 6 5 7 Sorting With BSTs for i=1 to n TreeInsert(A[i]); InorderTreeWalk(root); • Average case analysis • It’s a form of quicksort! 3 1 8 2 6 7 5 1 2 8 6 7 5 2 6 7 5 5 7
Sorting with BSTs • Same partitions are done as with quicksort, but in a different order • In previous example • Everything was compared to 3 once • Then those items < 3 were compared to 1 once • Etc. • Same comparisons as quicksort, different order!
Sorting with BSTs • Since run time is proportional to the number of comparisons, same expected time as quicksort: O(n lg n) • Which do you think is better, quicksort or BSTsort? Why?
Sorting with BSTs • Since run time is proportional to the number of comparisons, same time as quicksort: O(n lg n) • Which do you think is better, quicksort or BSTSort? Why? • Answer: quicksort • Better constants • Sorts in place • Doesn’t need to build data structure
More BST Operations • A priority queue supports • Insert • Minimum • Extract-Min • BSTs are good for more than sorting. For example, can implement a priority queue
BST Operations: Minimum • How can we implement a Minimum() query? • What is the running time? • O(h)
BST Operations: Successor • For deletion, we will need a Successor() operation • What is the successor of node 3? Node 15? Node 13? • What are the general rules for finding the successor of node x? (hint: two cases)
BST Operations: Successor • Two cases: • x has a right subtree: successor is minimum node in right subtree • x has no right subtree: successor is first ancestor of x whose left child is also ancestor of x • Intuition: As long as you move to the left up the tree, you’re visiting smaller nodes. • Predecessor: similar algorithm
F Example: delete Kor H or B B H C A D K BST Operations: Delete • Deletion is a bit tricky • 3 cases: • x has no children: • Remove x • x has one child: • Splice out x • x has two children: • Swap x with successor • Perform case 1 or 2 to delete it
BST Operations: Delete • Why will case 2 always go to case 0 or case 1? • because when x has 2 children, its successor is the minimum in its right subtree • Could we swap x with predecessor instead of successor? • Of course
Next Lecture • Up next: guaranteeing an O(lg n) height tree
Animation • Animated Binary Tree