Hashing
[Figure: a search key goes into a black box, which selects one element of a table numbered 0, 1, 2, 3, ..., n-3, n-2, n-1.]
Black Box: The Hash Function
A hash function must:
• Be relatively easy and quick to calculate.
• Evenly distribute the keys throughout the table.
• Be easily repeatable for Finds and Deletes.
• Deal with collisions (collision resolution).
Hashing Methods
• Selecting digits: for example, selecting the 2nd, 6th, and 9th digits of an SSN to arrive at an array element number.
• Folding: adding the digits of a number, such as an SSN.
• Modulo arithmetic: taking the number modulo the table size, h(x) = x % 262
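The three methods above can be sketched in a few lines of Python. This is only an illustration, not code from the slides: the function names, the made-up SSN, and the decision to ignore dashes are all assumptions.

```python
def select_digits(ssn: str, positions=(2, 6, 9)) -> int:
    """Digit selection: keep only the digits at the given 1-based positions."""
    digits = [c for c in ssn if c.isdigit()]   # ignore dashes (assumption)
    return int("".join(digits[p - 1] for p in positions))


def fold(ssn: str) -> int:
    """Folding: add the individual digits together."""
    return sum(int(c) for c in ssn if c.isdigit())


def mod_hash(x: int, table_size: int) -> int:
    """Modulo arithmetic: the remainder always lies in 0..table_size-1."""
    return x % table_size


sample = "123-45-6789"            # made-up SSN for illustration
print(select_digits(sample))      # 2nd, 6th, 9th digits -> 269
print(fold(sample))               # 1+2+...+9 -> 45
print(mod_hash(1000, 262))        # 1000 - 3*262 -> 214
```

Digit selection and folding can produce values larger than the table, so in practice they would likely be combined with the modulo step.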
Converting Strings to Integers
• Convert each character to a number.
  • Assign arbitrary values to characters.
  • Use the character's ASCII value.
• Use some technique to arrive at an element number.
  • Digit selection
  • Folding
  • Binary concatenation
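Binary Concatenation
Each letter is replaced by its position in the alphabet (A = 1 ... Z = 26). Every value fits in 5 bits, so each position is weighted by a power of 32.
W  E  I  N  E  R
23 5  9  14 5  18
Total = 23 * 32^5 + 5 * 32^4 + 9 * 32^3 + 14 * 32^2 + 5 * 32^1 + 18 * 32^0
Total = 777,304,242
Element = Total % Table Size
Element = 777,304,242 % 401
Element = 228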
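The WEINER calculation can be reproduced with a short Python sketch. The function names are hypothetical, and the code assumes the key contains only the letters A to Z, mapped to 1 through 26 as in the slide.

```python
def letter_value(ch: str) -> int:
    """Map A..Z (case-insensitive) to 1..26. Assumes ch is a letter."""
    return ord(ch.upper()) - ord("A") + 1


def binary_concatenation(word: str) -> int:
    """Treat each letter as a 5-bit group and concatenate the groups."""
    total = 0
    for ch in word:
        # Shifting left by 5 bits is the same as multiplying by 32.
        total = (total << 5) | letter_value(ch)
    return total


total = binary_concatenation("WEINER")
print(total)          # 777304242
print(total % 401)    # 228
```

Shifting instead of multiplying is why the method is called concatenation: the 5-bit codes of the letters end up side by side in one integer.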
Collision Resolution
• Open Addressing
  • Linear Probing
  • Quadratic Probing
  • Double Hashing
  • Increasing the size of the table
• Restructuring the Hash Table
  • Buckets
  • Separate Chaining
Linear Probing
• If the element calculated by the hash function is already occupied, add 1 and try again, then try adding 2, and so on, wrapping around at the end of the table.
• Linear probing causes clusters of full array elements and does not distribute items evenly (primary clustering).
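A minimal Python sketch of a linear-probing insert follows. It assumes integer keys, a simple key % size hash, and None as the marker for an empty element; none of these details come from the slides.

```python
def linear_probe_insert(table: list, key: int) -> int:
    """Insert key into table (None marks an empty element); return the index used."""
    size = len(table)
    home = key % size
    for i in range(size):                 # offsets 0, 1, 2, ...
        index = (home + i) % size         # wrap around at the end of the table
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("table is full")


table = [None] * 7
for key in (10, 17, 24, 3):               # all four keys hash to element 3
    linear_probe_insert(table, key)
print(table)   # [None, None, None, 10, 17, 24, 3]
```

The four colliding keys end up in one contiguous run, which is the primary clustering described above.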
Quadratic Probing
• If the element indicated is occupied, try adding 1^2, then 2^2, then 3^2, and so on.
• Does not cause primary clustering, but does result in secondary clustering.
• Secondary clustering is not a big problem.
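The probe order can be illustrated with a small Python sketch. The function name and the key % table_size hash are assumptions made for the example.

```python
def quadratic_probe_sequence(key: int, table_size: int, attempts: int) -> list:
    """Return the first `attempts` elements visited: home, home+1^2, home+2^2, ..."""
    home = key % table_size
    return [(home + i * i) % table_size for i in range(attempts)]


print(quadratic_probe_sequence(3, 11, 5))    # [3, 4, 7, 1, 8]
print(quadratic_probe_sequence(14, 11, 5))   # 14 % 11 is also 3 -> [3, 4, 7, 1, 8]
```

Two keys with the same home element follow exactly the same probe sequence, which is what secondary clustering refers to.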
Double Hashing
• Probe as in linear probing, but calculate the amount to be added at each step by executing a second hash of the key.
• Using a hash function of h1(key) = key % <table size>, a second hash might be h2(key) = 7 - (key % 7).
• Double hashing drastically reduces clustering.
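Using the two hash functions from the slide, the probe sequence might be sketched as follows in Python. The table size of 11 and the sample keys are assumptions chosen for illustration.

```python
def h1(key: int, table_size: int) -> int:
    """First hash: the home element."""
    return key % table_size


def h2(key: int) -> int:
    """Second hash: the step size, always in 1..7 and never 0."""
    return 7 - (key % 7)


def double_hash_sequence(key: int, table_size: int, attempts: int) -> list:
    """Return the first `attempts` elements visited for this key."""
    start, step = h1(key, table_size), h2(key)
    return [(start + i * step) % table_size for i in range(attempts)]


print(double_hash_sequence(58, 11, 4))   # start 3, step 5 -> [3, 8, 2, 7]
print(double_hash_sequence(14, 11, 4))   # start 3, step 7 -> [3, 10, 6, 2]
```

Both keys start at element 3, but their different step sizes send them along different paths. A prime table size helps here, since every step size then eventually visits every element.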
Using Buckets
• Each element of the table array is itself an array, so multiple keys can share the same element number.
• The problem is how big to make these element arrays.
• Too small and they fill up quickly, only delaying the inevitable collisions.
• Too big and they waste space and processing time.
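The bucket-size trade-off can be seen in a small Python sketch. Python lists grow on demand, so the fixed bucket size is simulated with an explicit capacity check; BUCKET_CAPACITY and the function names are hypothetical.

```python
BUCKET_CAPACITY = 3   # assumed fixed size of each element array


def make_table(size: int) -> list:
    """Create a table whose elements are (initially empty) buckets."""
    return [[] for _ in range(size)]


def bucket_insert(table: list, key: int) -> bool:
    """Add key to its bucket; return False if the bucket is already full."""
    bucket = table[key % len(table)]
    if len(bucket) >= BUCKET_CAPACITY:
        return False                  # overflow: the collision was only delayed
    bucket.append(key)
    return True


table = make_table(5)
results = [bucket_insert(table, k) for k in (7, 12, 17, 22)]   # all hash to 2
print(results)     # [True, True, True, False]
print(table[2])    # [7, 12, 17]
```

The fourth key overflows its bucket even though the other four buckets are still empty.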
Separate Chaining
• Each element of the table array is a linked list.
• Each linked list grows as needed, so no space is reserved for unused slots.
• You do have to traverse the linked list on a Find or a Delete, which means you're moving from O(1) toward O(n) in the length of the chain.
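A minimal separate-chaining sketch in Python is shown below. The class and method names are hypothetical, keys are assumed to be integers, and duplicate keys are not checked for.

```python
class Node:
    """One link in a chain."""
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class ChainedTable:
    def __init__(self, size: int):
        self.heads = [None] * size            # one (initially empty) chain per element

    def insert(self, key: int) -> None:
        index = key % len(self.heads)
        # Push onto the front of the chain, which takes constant time.
        self.heads[index] = Node(key, self.heads[index])

    def find(self, key: int) -> bool:
        node = self.heads[key % len(self.heads)]
        while node is not None:               # walk only this element's chain
            if node.key == key:
                return True
            node = node.next
        return False


table = ChainedTable(5)
for key in (7, 12, 17, 4):                    # 7, 12, 17 share element 2
    table.insert(key)
print(table.find(12))   # True
print(table.find(22))   # False
```

A Find only walks the chain for one element, so the cost depends on that chain's length rather than on the total number of keys, as long as the keys are spread evenly.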
HashTable Design

HashTableOA
- Object[ ]
- boolean[ ]
+ createHashTableOA(int, double)
+ insert(String):void
+ find(String):Object
+ delete(String):boolean
- hash(String):int
- nextPrime(int):int

HashTableSC
- LinkedList[ ]
+ createHashTableSC(int, double)
+ insert(String):void
+ find(String):Object
+ delete(String):boolean
- hash(String):int
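The open-addressing design lists a next-prime helper that takes and returns an int, but does not say how it is used. One plausible reading is that it picks a prime table size when the table is created or grown. That reading, along with the trial-division approach below, is an assumption.

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for table-sized numbers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def next_prime(n: int) -> int:
    """Return the smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


print(next_prime(400))       # 401
print(next_prime(2 * 401))   # roughly double the table -> 809
```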