1 / 27

ITCS 6114

ITCS 6114. Universal Hashing Dynamic Order Statistics. Choosing A Hash Function. Choosing the hash function well is crucial Bad hash function puts all elements in same slot A good hash function: Should distribute keys uniformly into slots Should not depend on patterns in the data

kyrene
Download Presentation

ITCS 6114

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ITCS 6114 Universal Hashing Dynamic Order Statistics David Luebke 16/3/2014

  2. Choosing A Hash Function • Choosing the hash function well is crucial • Bad hash function puts all elements in same slot • A good hash function: • Should distribute keys uniformly into slots • Should not depend on patterns in the data • We discussed three methods: • Division method • Multiplication method • Universal hashing David Luebke 26/3/2014

  3. Universal Hashing • When attempting to foil an malicious adversary, randomize the algorithm • Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!) • Guarantees good performance on average, no matter what keys adversary chooses • Need a family of hash functions to choose from David Luebke 36/3/2014

  4. Universal Hashing • A family of hash functions  is said to be universal if: • With a random hash function from , the chance of a collision between x and y is exactly 1/m (x  y) • We can use this to get good expected performance: • Choose h from a universal family of hash functions • Hash n keys into a table of m slots, nm • Then the expected number of collisions involving a particular key x is less than 1 David Luebke 46/3/2014

  5. A Universal Hash Function • Choose table size m to be prime • Decompose key x into r+1 bytes, so that x = {x0, x1, …, xr} • Only requirement is that max value of byte < m • Let a = {a0, a1, …, ar} denote a sequence of r+1 elements chosen randomly from {0, 1, …, m - 1} • Define corresponding hash function ha : • With this definition,  has mr+1 members David Luebke 56/3/2014

  6. A Universal Hash Function •  is a universal collection of hash functions (Theorem 12.4) • How to use: • Pick r based on m and the range of keys in U • Pick a hash function by (randomly) picking the a’s • Use that hash function on all keys David Luebke 66/3/2014

  7. M8 C5 P2 Q1 A1 F3 D1 H1 Order Statistic Trees • OS Trees augment red-black trees: • Associate a size field with each node in the tree • x->size records the size of subtree rooted at x, including x itself: David Luebke 76/3/2014

  8. M8 C5 P2 Q1 A1 F3 D1 H1 Selection On OS Trees How can we use this property to select the ith element of the set? David Luebke 86/3/2014

  9. OS-Select OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } David Luebke 96/3/2014

  10. M8 C5 P2 Q1 A1 F3 D1 H1 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } David Luebke 106/3/2014

  11. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } David Luebke 116/3/2014

  12. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } i = 5r = 2 David Luebke 126/3/2014

  13. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } i = 5r = 2 i = 3r = 2 David Luebke 136/3/2014

  14. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } i = 5r = 2 i = 3r = 2 i = 1r = 1 David Luebke 146/3/2014

  15. Oops… OS-Select: A Subtlety OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } • What happens at the leaves? • How can we deal elegantly with this? David Luebke 156/3/2014

  16. OS-Select OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } • What will be the running time? David Luebke 166/3/2014

  17. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element What is the rank of this element? David Luebke 176/3/2014

  18. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element Of this one? Why? David Luebke 186/3/2014

  19. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element Of the root? What’s the pattern here? David Luebke 196/3/2014

  20. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element What about the rank of this element? David Luebke 206/3/2014

  21. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element This one? What’s the pattern here? David Luebke 216/3/2014

  22. OS-Rank OS-Rank(T, x) { r = x->left->size + 1; y = x; while (y != T->root) if (y == y->p->right) r = r + y->p->left->size + 1; y = y->p; return r; } • What will be the running time? David Luebke 226/3/2014

  23. OS-Trees: Maintaining Sizes • So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time • Next step: maintain sizes during Insert() and Delete() operations • How would we adjust the size fields during insertion on a plain binary search tree? David Luebke 236/3/2014

  24. OS-Trees: Maintaining Sizes • So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time • Next step: maintain sizes during Insert() and Delete() operations • How would we adjust the size fields during insertion on a plain binary search tree? • A: increment sizes of nodes traversed during search David Luebke 246/3/2014

  25. OS-Trees: Maintaining Sizes • So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time • Next step: maintain sizes during Insert() and Delete() operations • How would we adjust the size fields during insertion on a plain binary search tree? • A: increment sizes of nodes traversed during search • Why won’t this work on red-black trees? David Luebke 256/3/2014

  26. Maintaining Size Through Rotation • Salient point: rotation invalidates only x and y • Can recalculate their sizes in constant time • Why? y19 x19 rightRotate(y) x11 y12 7 6 leftRotate(x) 6 4 4 7 David Luebke 266/3/2014

  27. Augmenting Data Structures: Methodology • Choose underlying data structure • E.g., red-black trees • Determine additional information to maintain • E.g., subtree sizes • Verify that information can be maintained for operations that modify the structure • E.g., Insert(), Delete() (don’t forget rotations!) • Develop new operations • E.g., OS-Rank(), OS-Select() David Luebke 276/3/2014

More Related