Hash tables • Definition: A data structure that uses a hash function to map each key to the index of an array element. [Figure: keys k1, k2, k3, k4, k5 mapped to array elements]
Some properties of a hash table • Size of the hash table (an example will be shown). • Hash function: maps keys to indices of array elements. (To be continued…) • Multiplication hash • Division hash • Input to build a hash table: an array of keys to store in the hash table • int[] input = {1,2,3,4,5,6,7,8} [Figure: the keys stored as four chains, 1 2 /, 3 4 /, 5 6 /, 7 8 /, where / marks the end of a chain]
Example • Hash table size is 10. [Figure: chains 20 110 /, 103 13 53 /, 10 69 /]
Division Hash • h1(k) = k mod m • Returns the index into the array. • k is the key. • m is the size of the hash table. • Good values of m: the prime number closest to, and smaller than, the size of the input. See Table 1. • In Java, the mod operator is %. [Table 1]
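The division hash can be sketched in a few lines of Java. This is only an illustration: the class name DivisionHashSketch and the sample keys are assumptions, and Math.floorMod is used in place of % so that a negative key still yields a valid index (for non-negative keys the two give the same result).

```java
// Minimal sketch of the division hash h1(k) = k mod m.
// The class name and the sample keys are illustrative, not part of the lab handout.
class DivisionHashSketch {

    // Math.floorMod keeps the result in [0, m) even when k is negative,
    // whereas Java's % operator can return a negative remainder.
    static int h1(int k, int m) {
        return Math.floorMod(k, m);
    }

    public static void main(String[] args) {
        int m = 10; // table size used in the example slide
        int[] keys = {20, 110, 103, 13, 53};
        for (int k : keys) {
            System.out.println(k + " -> index " + h1(k, m));
        }
    }
}
```

With m = 10, the keys 20 and 110 land at index 0, while 103, 13 and 53 land at index 3.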
Multiplication hash • h2(k) = floor(m (kA mod 1)), where kA mod 1 is the fractional part of kA • m is the size of the hash table. • Good values of m: the prime number closest to, and smaller than, the size of the input. See Table 1. • k is the key. • A = 0.61803 (from (sqrt(5) - 1)/2) • Hint: use the decimal value of A in your program; it may reduce bugs.
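A possible Java rendering of the multiplication hash, following the hint to use the decimal constant, is sketched below. The class name MultiplicationHashSketch is an assumption, and the fractional part is computed by subtracting the floor of kA, which is one of several ways to express "kA mod 1".

```java
// Minimal sketch of the multiplication hash h2(k) = floor(m * (k*A mod 1)).
// The class name is illustrative; A is the decimal constant given on the slide.
class MultiplicationHashSketch {

    static final double A = 0.61803;

    static int h2(int k, int m) {
        double product = k * A;
        // Fractional part of k*A, always in [0, 1).
        double fraction = product - Math.floor(product);
        return (int) Math.floor(m * fraction);
    }

    public static void main(String[] args) {
        int m = 10;
        int[] keys = {20, 110, 103, 13, 53};
        for (int k : keys) {
            System.out.println(k + " -> index " + h2(k, m));
        }
    }
}
```

For m = 10 this maps 20 to 3, 110 to 9, 103 to 6, 13 to 0 and 53 to 7, so the five keys that collided under the division hash with m = 10 are spread over five different indices here.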
Collisions • When a key hashes to a location that is already occupied, a collision happens and the new key is stored in the linked list at that location. • Number of collisions at a location = number of elements at that location - 1 • Examples: the chain 20 110 / has 2 - 1 = 1 collision; the chain 103 13 53 / has 3 - 1 = 2 collisions.
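Chaining with linked lists might look like the following sketch. The class ChainedTableSketch and its method names are hypothetical, the division hash is used purely for illustration, and an empty location is reported as 0 collisions (the slide's formula is only stated for occupied locations, so that clamp is an assumption).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Minimal sketch of a hash table that resolves collisions by chaining.
// All names here are illustrative, not taken from the lab handout.
class ChainedTableSketch {

    private final List<LinkedList<Integer>> slots = new ArrayList<>();

    ChainedTableSketch(int m) {
        for (int i = 0; i < m; i++) {
            slots.add(new LinkedList<>()); // one (initially empty) chain per location
        }
    }

    // Appends the key to the chain at its hashed location (division hash assumed).
    void insert(int key) {
        int index = Math.floorMod(key, slots.size());
        slots.get(index).add(key);
    }

    // Collisions at a location = elements at that location - 1, clamped to 0 when empty.
    int collisionsAt(int index) {
        return Math.max(0, slots.get(index).size() - 1);
    }

    public static void main(String[] args) {
        ChainedTableSketch table = new ChainedTableSketch(10);
        for (int key : new int[] {20, 110, 103, 13, 53}) {
            table.insert(key);
        }
        System.out.println("collisions at 0: " + table.collisionsAt(0));
        System.out.println("collisions at 3: " + table.collisionsAt(3));
    }
}
```

Inserting 20, 110, 103, 13 and 53 into a table of size 10 reproduces the two chains from the slide: one collision at index 0 and two at index 3.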
Collision Metrics • maxCollisions: the maximum number of collisions over all locations in a hash table • minCollisions: the minimum number of collisions over all locations in a hash table • totalCollisions: the total number of collisions over all locations in a hash table • Examples on the next slide
maxCollisions = 2 • minCollisions = 1 • (** Note that minCollisions will be at least 1 if any location has a collision, even if other locations have 0 collisions. If there are no collisions at all, return 0.) • totalCollisions = 4 • Chains: 20 110 / (1 collision), 103 13 53 / (2 collisions), 105 15 / (1 collision)
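The three metrics, including the special rule for minCollisions, can be computed in one pass over the chain lengths. The sketch below assumes the table is summarised as an int array of chain lengths, one entry per location; the class and method names are hypothetical.

```java
// Minimal sketch of the three collision metrics.
// Input: chainLengths[i] = number of keys stored at location i (an assumed representation).
// Output: {maxCollisions, minCollisions, totalCollisions}.
class CollisionMetricsSketch {

    static int[] metrics(int[] chainLengths) {
        int max = 0;
        int min = 0; // stays 0 only if no location has a collision
        int total = 0;
        for (int length : chainLengths) {
            int collisions = Math.max(0, length - 1);
            if (collisions == 0) {
                continue; // locations without collisions do not affect any metric
            }
            total += collisions;
            max = Math.max(max, collisions);
            // Minimum is taken only over locations that do have collisions.
            min = (min == 0) ? collisions : Math.min(min, collisions);
        }
        return new int[] {max, min, total};
    }

    public static void main(String[] args) {
        // Slide example in a table of size 10: chains at indices 0, 3 and 5.
        int[] chainLengths = {2, 0, 0, 3, 0, 2, 0, 0, 0, 0};
        int[] m = metrics(chainLengths);
        System.out.println("max=" + m[0] + " min=" + m[1] + " total=" + m[2]);
    }
}
```

For the slide's example this prints max=2 min=1 total=4, matching the values above.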
Discussion • Why metrics? • The collision metrics tell us which hash function is better. • Why three metrics? Why not just measure totalCollisions? • Let’s see an example.
Which hash table is better? • Hash table 1: totalCollisions = 4, with chains 20 110 /, 103 13 /, 103 13 /, 103 13 / • Hash table 2: totalCollisions = 4, with chains 20 110 103 13 103 /, 13 /, 103 /, 13 /
We want not only fewer collisions but also collisions that are distributed evenly across the hash table. That is why hash table 1 is better than hash table 2. • This lab is to implement two hash functions, division and multiplication, and use the collision metrics to demonstrate which hash is better.
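To see why totalCollisions alone is not enough, the two tables can be compared numerically. The sketch below assumes the figure shows four chains of two keys for hash table 1, and one chain of five keys plus three single keys for hash table 2; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Minimal sketch comparing two tables that have the same totalCollisions.
// The chain lengths are read off the figure and are an assumption about its layout.
class TableComparisonSketch {

    static int totalCollisions(int[] chainLengths) {
        return Arrays.stream(chainLengths).map(len -> Math.max(0, len - 1)).sum();
    }

    static int maxCollisions(int[] chainLengths) {
        return Arrays.stream(chainLengths).map(len -> Math.max(0, len - 1)).max().orElse(0);
    }

    public static void main(String[] args) {
        int[] table1 = {2, 2, 2, 2}; // collisions spread over four locations
        int[] table2 = {5, 1, 1, 1}; // all collisions piled into one location

        System.out.println("table 1: total=" + totalCollisions(table1)
                + " max=" + maxCollisions(table1));
        System.out.println("table 2: total=" + totalCollisions(table2)
                + " max=" + maxCollisions(table2));
    }
}
```

Both tables report total=4, but table 1 has max=1 while table 2 has max=4. The longer chain in table 2 means a lookup in that location may have to walk five keys, which is what the extra metrics expose.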