1 / 27

ITCS 6114

ITCS 6114. Universal Hashing Dynamic Order Statistics. Choosing A Hash Function. Choosing the hash function well is crucial Bad hash function puts all elements in same slot A good hash function: Should distribute keys uniformly into slots Should not depend on patterns in the data

lenk
Download Presentation

ITCS 6114

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ITCS 6114 Universal Hashing Dynamic Order Statistics David Luebke 11/5/2020

  2. Choosing A Hash Function • Choosing the hash function well is crucial • Bad hash function puts all elements in same slot • A good hash function: • Should distribute keys uniformly into slots • Should not depend on patterns in the data • We discussed three methods: • Division method • Multiplication method • Universal hashing David Luebke 21/5/2020

  3. Universal Hashing • When attempting to foil an malicious adversary, randomize the algorithm • Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!) • Guarantees good performance on average, no matter what keys adversary chooses • Need a family of hash functions to choose from David Luebke 31/5/2020

  4. Universal Hashing • A family of hash functions  is said to be universal if: • With a random hash function from , the chance of a collision between x and y is exactly 1/m (x  y) • We can use this to get good expected performance: • Choose h from a universal family of hash functions • Hash n keys into a table of m slots, nm • Then the expected number of collisions involving a particular key x is less than 1 David Luebke 41/5/2020

  5. A Universal Hash Function • Choose table size m to be prime • Decompose key x into r+1 bytes, so that x = {x0, x1, …, xr} • Only requirement is that max value of byte < m • Let a = {a0, a1, …, ar} denote a sequence of r+1 elements chosen randomly from {0, 1, …, m - 1} • Define corresponding hash function ha : • With this definition,  has mr+1 members David Luebke 51/5/2020

  6. A Universal Hash Function •  is a universal collection of hash functions (Theorem 12.4) • How to use: • Pick r based on m and the range of keys in U • Pick a hash function by (randomly) picking the a’s • Use that hash function on all keys David Luebke 61/5/2020

  7. M8 C5 P2 Q1 A1 F3 D1 H1 Order Statistic Trees • OS Trees augment red-black trees: • Associate a size field with each node in the tree • x->size records the size of subtree rooted at x, including x itself: David Luebke 71/5/2020

  8. M8 C5 P2 Q1 A1 F3 D1 H1 Selection On OS Trees How can we use this property to select the ith element of the set? David Luebke 81/5/2020

  9. OS-Select OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } David Luebke 91/5/2020

  10. M8 C5 P2 Q1 A1 F3 D1 H1 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } David Luebke 101/5/2020

  11. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } David Luebke 111/5/2020

  12. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } i = 5r = 2 David Luebke 121/5/2020

  13. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } i = 5r = 2 i = 3r = 2 David Luebke 131/5/2020

  14. M8 C5 P2 Q1 A1 F3 D1 H1 i = 5r = 6 OS-Select Example • Example: show OS-Select(root, 5): OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } i = 5r = 2 i = 3r = 2 i = 1r = 1 David Luebke 141/5/2020

  15. Oops… OS-Select: A Subtlety OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } • What happens at the leaves? • How can we deal elegantly with this? David Luebke 151/5/2020

  16. OS-Select OS-Select(x, i) { r = x->left->size + 1; if (i == r) return x; else if (i < r) return OS-Select(x->left, i); else return OS-Select(x->right, i-r); } • What will be the running time? David Luebke 161/5/2020

  17. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element What is the rank of this element? David Luebke 171/5/2020

  18. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element Of this one? Why? David Luebke 181/5/2020

  19. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element Of the root? What’s the pattern here? David Luebke 191/5/2020

  20. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element What about the rank of this element? David Luebke 201/5/2020

  21. M8 C5 P2 Q1 A1 F3 D1 H1 Determining The Rank Of An Element This one? What’s the pattern here? David Luebke 211/5/2020

  22. OS-Rank OS-Rank(T, x) { r = x->left->size + 1; y = x; while (y != T->root) if (y == y->p->right) r = r + y->p->left->size + 1; y = y->p; return r; } • What will be the running time? David Luebke 221/5/2020

  23. OS-Trees: Maintaining Sizes • So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time • Next step: maintain sizes during Insert() and Delete() operations • How would we adjust the size fields during insertion on a plain binary search tree? David Luebke 231/5/2020

  24. OS-Trees: Maintaining Sizes • So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time • Next step: maintain sizes during Insert() and Delete() operations • How would we adjust the size fields during insertion on a plain binary search tree? • A: increment sizes of nodes traversed during search David Luebke 241/5/2020

  25. OS-Trees: Maintaining Sizes • So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time • Next step: maintain sizes during Insert() and Delete() operations • How would we adjust the size fields during insertion on a plain binary search tree? • A: increment sizes of nodes traversed during search • Why won’t this work on red-black trees? David Luebke 251/5/2020

  26. Maintaining Size Through Rotation • Salient point: rotation invalidates only x and y • Can recalculate their sizes in constant time • Why? y19 x19 rightRotate(y) x11 y12 7 6 leftRotate(x) 6 4 4 7 David Luebke 261/5/2020

  27. Augmenting Data Structures: Methodology • Choose underlying data structure • E.g., red-black trees • Determine additional information to maintain • E.g., subtree sizes • Verify that information can be maintained for operations that modify the structure • E.g., Insert(), Delete() (don’t forget rotations!) • Develop new operations • E.g., OS-Rank(), OS-Select() David Luebke 271/5/2020

More Related