260 likes | 701 Views
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java. Chapter 5: Hashing . Hash Tables. Lydia Sinapova, Simpson College. H as hing. What is Hashing? Direct Access Tables Hash Tables. Hashing - basic idea.
E N D
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Chapter 5: Hashing Hash Tables Lydia Sinapova, Simpson College
Hashing • What is Hashing? • Direct Access Tables • Hash Tables
Hashing - basic idea • A mapping between the search keys and indices - efficient searching into an array. • Each element is found with one operation only.
Hashing example Example: • 1000 students, • identification number between 0 and 999, • use an array of 1000 elements. • SSN of each student a 9-digit number. • much more elements than the number of the students, - a great waste of space.
The Approach • Directly referencing records in a table usingarithmetic operations on keysto map them onto table addresses. • Hash function: function that transforms thesearch key into a table address.
Direct-address tables • The most elementary form of hashing. • Assumption– direct one-to-one correspondence between the keys and numbers 0, 1, …, m-1., m – not very large. • Array A[m]. Each position (slot) in the array contains a pointer to a record, or NIL.
Direct-Address Tables • The most elementary form of hashing • Assumption –direct one-to-one correspondence between the keys and numbers 0, 1, …, m-1., m – not very large. • Array A[m].Each position (slot) in the array contains either a reference to a record, or NULL. • Cost –the size of the array we need is determined the largest key. Not very useful if there are only a few keys
Hash Functions Transform the keys into numbers within a predetermined interval to be used as indices in an array (table, hash table) to store the records
Hash Functions – Numerical Keys Keys – numbers If M is the size of the array, then h(key) = key % M. This will map all the keys into numbers within the interval [0 .. (M-1)].
Hash Functions – Character Keys Keys – strings of characters Treat the binary representation of a key as a number, and then apply the hash function
How keys are treated as numbers • If each character is represented withmbits, • then the string can be treated asbase-mnumber.
Example A K E Y : 00001 01011 00101 11001 = 1 . 323 + 11 . 322 + 5 . 321 + 25 . 320 = 44271 Each letter is represented by itsposition in the alphabet. E.G,Kis the 11-th letter, and its representation is01011( 11 in decimal)
Long Keys If the keys are very long, an overflow may occur. A solutionto this is to apply theHorner’s method in computing the hash function.
Horner’s Method anxn + an-1.xn-1 + an-2.xn-2 + … + a1x1 + a0x0 = x(x(…x(x (an.x +an-1) + an-2 ) + …. ) + a1) + a0 4x5 + 2x4 + 3x3 + x2 + 7x1 + 9x0 = x( x( x( x ( 4.x +2) + 3) + 1 ) + 7) + 9 The polynomial can be computed by alternating the multiplication and addition operations
Example V E R Y L O N G K E Y 10110 00101 10010 11001 01100 01111 01110 00111 01011 00101 11001 V E R Y L O N G K E Y 22 5 18 25 12 15 14 7 11 5 25 22 . 3210 + 5 . 329 + 18 . 328 + 25 . 327+ 12 . 326 + 15 . 325 + 14 . 324 + 7 . 323 + 11 . 322 + 5 . 321 + 25 . 320
Example (continued) V E R Y L O N G K E Y 22 . 3210 + 5 . 329 + 18 . 328 + 25 . 327 + 12 . 326 + 15 . 325 + 14 . 324 + 7 . 323 + 11 . 322 + 5 . 321 + 25 . 320 (((((((((22.32 + 5)32 + 18)32 + 25)32 + 12)32 + 15)32 + 14)32 + 7)32 +11)32 +5)32 + 25 compute the hash function by applying the mod operation at each step, thus avoiding overflowing. h0 = (22.32 + 5) % M h1 = (32.h0 + 18) % M h2 = (32.h1 +25) % M . . . .
Code int hash32 (char[] name, int tbl_size) { key_length = name.length; int h = 0; for (int i=0; i < key_length ; i++) h = (32 * h + name[i]) % tbl_size; return h; }
Hash Tables • Index - integer, generated by a hash function between 0 and M-1 • Initially - blank slots. • sentinel value, or a special field in each slot.
Hash Tables • Insert- hash function to generate an address • Searchfor a key in the table - the same hash function is used.
Size of the Table Table size M- different from the number of recordsN Load factor:N/M M must be primeto ensure even distribution of keys