Podcast Ch21b • Title: Collision Resolution • Description: Linear probing; chaining with lists; table expansion • Participants: Barry Kurtz (instructor); John Helfert and Tobie Williams (students) • Textbook: Data Structures for Java; William H. Ford and William R. Topp
Designing Hash Tables • When two or more data items hash to the same table index, they cannot occupy the same position in the table. • We must either place one of the items at another position in the table (linear probing) or redesign the table to store a sequence of colliding keys at each index (chaining with separate lists).
Linear Probing • The hash table is an array of elements with an associated hash function. To add an item: • Initially, tag each entry in the table as "empty". • Apply the hash function to the key and take the remainder after dividing the value by the table size to obtain a table index. If the entry is empty, insert the item. • Otherwise, start at the next index and scan successive indices, wrapping around to the start of the table after probing the last table entry. The insertion occurs at the first open location.
Linear Probing (continued) • If the search returns to the original hash location without finding an open slot, the table is full and the linear probing algorithm throws an exception. • Example hash function: tableIndex = x % 11
Linear Probing (continued)
// compute hash index of item for a table of size n
int index = (item.hashCode() & Integer.MAX_VALUE) % n;
int origIndex = index;   // save the original hash index
// cycle through the table looking for an empty slot,
// a match, or a table-full condition
do {
    // test whether the table slot is empty or the key
    // matches the data field of the table entry
    if table[index] is empty
        insert item in table at table[index] and return
    else if table[index] matches item
        return
    // begin a probe starting at the next table location
    index = (index + 1) % n;
} while (index != origIndex);
// we have gone around the table without finding a match
// or an open slot
throw new BufferOverflowException();
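The pseudocode above can be turned into compilable Java along the following lines. This is only a sketch: the class name LinearProbeTable, the use of null to mark an "empty" slot, and the choice to return the slot index are assumptions made for illustration, not the textbook's implementation.

```java
import java.nio.BufferOverflowException;

// Hypothetical class name; a minimal linear-probing table for illustration.
class LinearProbeTable {
    private final Object[] table;   // assumption: null marks an empty slot

    LinearProbeTable(int n) {
        table = new Object[n];
    }

    // Returns the index where item ends up (newly inserted or already present).
    int add(Object item) {
        int n = table.length;
        // mask off the sign bit so the index is never negative
        int index = (item.hashCode() & Integer.MAX_VALUE) % n;
        int origIndex = index;      // save the original hash index
        do {
            if (table[index] == null) {
                table[index] = item;            // first open slot: insert
                return index;
            } else if (table[index].equals(item)) {
                return index;                   // item is already in the table
            }
            index = (index + 1) % n;            // probe the next slot, wrapping around
        } while (index != origIndex);
        // scanned the whole table without finding a match or an open slot
        throw new BufferOverflowException();
    }

    public static void main(String[] args) {
        LinearProbeTable t = new LinearProbeTable(11);
        System.out.println(t.add(5));    // 5 % 11 = 5, slot is empty
        System.out.println(t.add(38));   // 38 % 11 = 5 is taken, so the item lands in slot 6
    }
}
```

An Integer hashes to its own value in Java, so with a table of size 11 this behaves like the hash function tableIndex = x % 11.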
Linear Probing (concluded) • If the size of the table is large relative to the number of items, linear probing works well, because a good hash function generates indices that are evenly distributed over the table range, and collisions will be minimal. • As the ratio of the number of items to the table size approaches 1, the algorithm degrades to a sequential search.
Use the integer hash function hf(x) = x and the % 11 operator to store the integer values from array intArr in a hash table of size 11 using the open probe method. Assume that the elements are added to the table in the same order they are defined in the array.
int[] intArr = {5, 19, 43, 38, 63, 96, 45, 65};
(a) Display the elements in the following table.
Index: 0 1 2 3 4 5 6 7 8 9 10
Value: 96 45 65 - - 5 38 - 19 63 43
(b) Which element(s) require the largest number of probes to locate in the table? 96 and 65 (4 probes each)
(c) Which element(s) can be accessed with a single probe? 5, 19, 43, and 45
(d) What is the average number of probes for linear probing? 16 / 8 = 2
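The probe counts in this exercise can be checked with a short program. The sketch below is not from the podcast; the class and method names (ProbeCount, probesPerKey) are hypothetical, and it assumes distinct, non-negative keys and fewer keys than table slots.

```java
import java.util.Arrays;

// Hypothetical helper for checking the exercise by hand.
class ProbeCount {
    // Inserts each key using hf(x) = x % tableSize with linear probing and
    // returns the number of probes each key needed, in insertion order.
    // Assumes distinct, non-negative keys and keys.length < tableSize.
    static int[] probesPerKey(int[] keys, int tableSize) {
        Integer[] table = new Integer[tableSize];   // null marks an empty slot
        int[] probes = new int[keys.length];
        for (int k = 0; k < keys.length; k++) {
            int index = keys[k] % tableSize;
            int count = 1;                          // the first look counts as one probe
            while (table[index] != null) {
                index = (index + 1) % tableSize;    // wrap around at the end
                count++;
            }
            table[index] = keys[k];
            probes[k] = count;
        }
        return probes;
    }

    public static void main(String[] args) {
        int[] intArr = {5, 19, 43, 38, 63, 96, 45, 65};
        int[] probes = probesPerKey(intArr, 11);
        System.out.println(Arrays.toString(probes));    // prints [1, 1, 1, 2, 2, 4, 1, 4]
        System.out.println(Arrays.stream(probes).sum()); // prints 16
    }
}
```

The total of 16 probes over 8 keys gives the average of 2 stated in part (d).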
The open probe method suffers from the phenomenon known as (a) fatal collisions (b) sparse distribution (c) clustering (d) broken chains
Chaining with Separate Lists • Chaining with separate lists defines the hash table as an indexed sequence of linked lists. Each list, called a bucket, holds a set of items that hash to the same table location.
Chaining with Separate Lists (continued) • A bucket is a singly linked list. Each entry of the array is a reference to the first node in the sequence of items that hash to that table index. A node has the familiar structure with two fields, one for the value and one for the reference to the next node.
Chaining with Separate Lists (continued) • To add object item, use the hash function to identify the index of the appropriate bucket in the array (table). • If table[i] is null, add item as the first entry in the list. • Otherwise begin with the first node, entry = table[i], and compare item with entry.nodeValue. If there is no match, continue the scan with node entry.next, and so forth. If item is not in the list, add it to the front of the list.
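The add steps above might look as follows in Java. This is a minimal sketch under stated assumptions: the class name ChainedHashTable and the helper bucketToString are hypothetical, while the node fields nodeValue and next follow the names used on the slide.

```java
// Hypothetical class name; a minimal chained hash table for illustration.
class ChainedHashTable<T> {
    // singly linked node: one field for the value, one for the next node
    private static class Entry<E> {
        E nodeValue;
        Entry<E> next;

        Entry(E nodeValue, Entry<E> next) {
            this.nodeValue = nodeValue;
            this.next = next;
        }
    }

    private final Entry<T>[] table;   // table[i] refers to the first node of bucket i

    @SuppressWarnings("unchecked")
    ChainedHashTable(int n) {
        table = (Entry<T>[]) new Entry[n];
    }

    // Returns true if item was added, false if it was already in its bucket.
    boolean add(T item) {
        int i = (item.hashCode() & Integer.MAX_VALUE) % table.length;
        // scan the bucket for a match
        for (Entry<T> entry = table[i]; entry != null; entry = entry.next) {
            if (entry.nodeValue.equals(item)) {
                return false;
            }
        }
        // not found (or bucket empty): add to the front of the list
        table[i] = new Entry<>(item, table[i]);
        return true;
    }

    // Helper (assumption, for display only): bucket i as "a -> b -> c".
    String bucketToString(int i) {
        StringBuilder sb = new StringBuilder();
        for (Entry<T> entry = table[i]; entry != null; entry = entry.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(entry.nodeValue);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChainedHashTable<Integer> t = new ChainedHashTable<>(11);
        t.add(89);   // 89 % 11 = 1
        t.add(45);   // 45 % 11 = 1, goes to the front of the same bucket
        System.out.println(t.bucketToString(1));   // prints 45 -> 89
    }
}
```

Because new items go to the front, each bucket lists its items in reverse order of insertion.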
Chaining with Separate Lists (continued) • Consider the following sequence of eight elements {54, 77, 94, 89, 14, 45, 35, 76} with the identity hash function and tableSize = 11. The figure displays the lists. Each entry in the table includes the number of probes to add the element.
Use the integer hash function hf(x) = x and table size 11. Using chaining with separate lists, show the location in the hash table for each integer value in the following array.
int[] intArr = {5, 19, 43, 38, 63, 96, 45, 65};
Table indices: 0 1 2 3 4 5 6 7 8 9 10
Keys 203, 426, and 561 hash to 5. Keys 987 and 316 hash to 3. Keys 736 and 97 hash to 2. Key 124 hashes to 0. Assume that insertions are done in the order {987, 203, 736, 316, 426, 561, 97, 124}. Show the position of each key if chaining with m = 7 is used to resolve collisions.
Chaining with Separate Lists • Chaining with separate lists is generally faster than linear probing since chaining only searches items that hash to the same table location. • With linear probing, the number of table entries is limited to the table size, whereas the linked lists used in chaining grow as necessary. • To delete an element, just erase it from the associated list.
Rehashing • As the number of entries in the hash table increases, search performance deteriorates. Rehashing increases the hash table size when the number of entries in the table reaches a specified percentage of its size.
With rehashing, the table size is increased from size m to n. How are the elements copied from the original table to the new table? Elements are scanned in the original table and then rehashed with the new table size into the new table.
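The copy step described in this answer can be sketched in Java as follows. The sketch is illustrative only: it assumes a chained table of Integer keys represented as a List of LinkedList buckets, and the names Rehash, newTable, and rehash are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper class; assumes a chained table of Integer keys.
class Rehash {
    // Builds a table with the given number of empty buckets.
    static List<LinkedList<Integer>> newTable(int size) {
        List<LinkedList<Integer>> t = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            t.add(new LinkedList<>());
        }
        return t;
    }

    // Scans every bucket of the old table and rehashes each item
    // with the new table size into a new table.
    static List<LinkedList<Integer>> rehash(List<LinkedList<Integer>> oldTable, int newSize) {
        List<LinkedList<Integer>> newT = newTable(newSize);
        for (LinkedList<Integer> bucket : oldTable) {
            for (Integer item : bucket) {
                // the index must be recomputed: it depends on the table size
                int i = (item.hashCode() & Integer.MAX_VALUE) % newSize;
                newT.get(i).addFirst(item);
            }
        }
        return newT;
    }

    public static void main(String[] args) {
        // old table of size m = 5: keys 7 and 12 collide in bucket 2
        List<LinkedList<Integer>> old = newTable(5);
        old.get(7 % 5).addFirst(7);
        old.get(12 % 5).addFirst(12);
        // new table of size n = 11: the keys now land in different buckets
        List<LinkedList<Integer>> bigger = rehash(old, 11);
        System.out.println(bigger.get(7));   // prints [7]
        System.out.println(bigger.get(1));   // prints [12]
    }
}
```

Items cannot simply be copied to the same index in the new table, because an item's index is its hash value modulo the table size, and that size has changed.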