11. Hash Tables Heejin Park College of Information and Communications Hanyang University
Contents • Direct-address tables • Hash tables • Hash functions
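Direct-address tables • Generate a table T with |U| slots, one per key in the universe. • Store the element with key k in slot k. • Assume that no two elements have the same key. [Figure: direct-address table T; each slot corresponds to a key in the universe of keys U, and the slots for the actual keys K hold the key and its satellite data.]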
Direct-address tables • Running time: O(1) for each operation • DIRECT-ADDRESS-SEARCH(T, k): return T[k] • DIRECT-ADDRESS-INSERT(T, x): T[key[x]] ← x • DIRECT-ADDRESS-DELETE(T, x): T[key[x]] ← NIL
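The three operations above can be sketched in Python as follows. This is a minimal illustration, assuming integer keys drawn from U = {0, 1, ..., |U| - 1}; the `Element` and `DirectAddressTable` names are hypothetical and not part of the slides.

```python
class Element:
    """An element with an integer key and arbitrary satellite data."""

    def __init__(self, key, data=None):
        self.key = key
        self.data = data


class DirectAddressTable:
    """Direct-address table over the universe {0, 1, ..., universe_size - 1}."""

    def __init__(self, universe_size):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * universe_size

    def search(self, k):
        # DIRECT-ADDRESS-SEARCH: return T[k]
        return self.slots[k]

    def insert(self, x):
        # DIRECT-ADDRESS-INSERT: T[key[x]] <- x
        self.slots[x.key] = x

    def delete(self, x):
        # DIRECT-ADDRESS-DELETE: T[key[x]] <- NIL
        self.slots[x.key] = None


table = DirectAddressTable(10)
for key in (2, 3, 5, 8):
    table.insert(Element(key, f"data-{key}"))

found = table.search(5)    # the element stored in slot 5
missing = table.search(4)  # None, because no element has key 4
table.delete(found)        # slot 5 is NIL again
```

Each method is a single list index operation, which is where the O(1) bound comes from; the price is a list with one slot for every key in U.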
Hash tables • Space consumption of direct addressing • Θ(|U|) • If the ratio |K|/|U| is small, most of the space allocated for T is wasted. • Is it possible to reduce the space requirement to Θ(|K|) while keeping the running time at O(1)?
Hash tables • Hashing • The element with key k is stored in slot h(k) instead of slot k. • h is called a hash function. • A hash function computes the slot from the key k. • h : U → {0, 1, ..., m - 1}. • We say that an element with key k hashes to slot h(k). • We also say that h(k) is the hash value of key k.
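As a small illustration of a function h : U → {0, 1, ..., m - 1}, the sketch below uses h(k) = k mod m. The slides do not fix a particular hash function at this point, so this choice and the name `make_division_hash` are assumptions made only for the example.

```python
def make_division_hash(m):
    """Return a function h mapping integer keys to slots 0 .. m - 1.

    h(k) = k mod m is an assumed, illustrative choice of hash function.
    """
    def h(k):
        return k % m
    return h


h = make_division_hash(7)

# Hash value (slot) of each key in a small sample of keys.
slots = {k: h(k) for k in (10, 22, 31, 4, 15)}
```

Here keys 10 and 31 both hash to slot 3, and keys 22 and 15 both hash to slot 1, so even this tiny key set needs some way to handle keys that share a slot.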
Hash tables [Figure: a hash function h maps the actual keys k1, ..., k5 in K ⊆ U (universe of keys) to slots 0, ..., m-1 of the table; keys k2 and k5 map to the same slot, h(k2) = h(k5).]
Hash tables • Collision: two keys may hash to the same slot. • Reducing collisions • Make h appear to be random, so that collisions are avoided or at least minimized. • Avoiding collisions altogether is impossible because |U| > m.
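The impossibility claim is a pigeonhole argument: any m + 1 distinct keys must contain two that share a slot, whatever h is. A minimal sketch, assuming the illustrative hash function h(k) = k mod m and a hypothetical helper `first_collision`:

```python
def first_collision(keys, h):
    """Return the first pair of distinct keys that hash to the same slot, or None."""
    seen = {}  # slot -> first key that hashed there
    for k in keys:
        slot = h(k)
        if slot in seen and seen[slot] != k:
            return seen[slot], k
        seen.setdefault(slot, k)
    return None


m = 8


def h(k):
    # Hypothetical hash function used only for this example.
    return k % m


# m distinct keys can still be collision-free ...
no_collision = first_collision(range(m), h)

# ... but m + 1 distinct keys cannot fit into m slots without sharing one.
pair = first_collision(range(m + 1), h)
```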
Hash tables • Collision resolution by chaining • Put the elements hashed to the same slot in a linked list. [Figure: keys k1, ..., k8 in K ⊆ U (universe of keys) stored in a chained hash table; keys that hash to the same slot share one linked list.]
Hash tables • CHAINED-HASH-INSERT(T, x): insert x at the head of list T[h(key[x])] • CHAINED-HASH-DELETE(T, x): delete x from the list T[h(key[x])] • CHAINED-HASH-SEARCH(T, k): search for an element with key k in list T[h(k)]
Hash tables • The worst-case running time for insertion is O(1). • Assuming that the element x being inserted is not already present in the table. • Deletion of an element x can be done in O(1) time. • Assuming that the lists are doubly linked. • For searching, the worst-case running time is proportional to the length of the list.
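The chained operations and their running times can be sketched as below. This is one possible implementation, not the slides' own: it assumes integer keys, the illustrative hash function h(k) = k mod m, and hypothetical class names `Node` and `ChainedHashTable`. The lists are doubly linked so that deletion needs no scan.

```python
class Node:
    """Doubly linked list node holding an element's key and satellite data."""

    def __init__(self, key, data=None):
        self.key = key
        self.data = data
        self.prev = None
        self.next = None


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.heads = [None] * m  # heads[j] is the first node of list T[j]

    def _h(self, k):
        # Assumed hash function; any h : U -> {0, ..., m - 1} would do.
        return k % self.m

    def insert(self, x):
        # CHAINED-HASH-INSERT: put x at the head of T[h(key[x])] in O(1).
        j = self._h(x.key)
        x.prev = None
        x.next = self.heads[j]
        if x.next is not None:
            x.next.prev = x
        self.heads[j] = x

    def delete(self, x):
        # CHAINED-HASH-DELETE: unlink node x in O(1) using its prev pointer.
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.heads[self._h(x.key)] = x.next
        if x.next is not None:
            x.next.prev = x.prev
        x.prev = x.next = None

    def search(self, k):
        # CHAINED-HASH-SEARCH: scan T[h(k)]; time is proportional to its length.
        node = self.heads[self._h(k)]
        while node is not None and node.key != k:
            node = node.next
        return node


table = ChainedHashTable(4)
nodes = {k: Node(k, f"data-{k}") for k in (1, 5, 9, 2)}
for node in nodes.values():
    table.insert(node)

table.delete(nodes[5])  # keys 1, 5 and 9 all hash to slot 1
```

Note that `delete` takes the node itself rather than a key. If only the key were known, a search would be needed first, and the cost would no longer be O(1) in the worst case.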
Hash tables • Searching time for hashing with chaining • Given a hash table T with m slots that stores n elements, we define the load factor α for T. • The average number of elements stored in a chain. • α = n/m.
Hash tables • Worst-case • All n keys hash to the same slot, creating a list of length n. • Searching time : Θ(n) plus the time to compute the hash function. • No better than a linked list for all the elements.
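The worst case is easy to provoke once the hash function is known. The sketch below assumes the illustrative hash function h(k) = k mod m (not prescribed by the slides) and feeds it keys that are all multiples of m; `chain_lengths` is a hypothetical helper.

```python
def chain_lengths(keys, m, h):
    """Return the list length n_j of every slot j after hashing all keys."""
    lengths = [0] * m
    for k in keys:
        lengths[h(k)] += 1
    return lengths


m = 10


def h(k):
    # Assumed hash function for this example.
    return k % m


# Every multiple of m hashes to slot 0, so all n keys end up in one list.
bad_keys = [m * i for i in range(50)]
lengths = chain_lengths(bad_keys, m, h)
```

With all 50 keys in a single chain, a search degenerates to a linear scan of a 50-element linked list.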
Hash tables • Average case • Depends on how well the hash function h distributes the set of keys to be stored among the m slots. • Simple uniform hashing • An element is equally likely to hash into each slot. • An element hashes independently of where any other element has hashed to.
Hash tables • Let n_j denote the length of the list T[j] for j = 0, 1, ..., m - 1. • n = n_0 + n_1 + … + n_{m-1} • Average value of n_j • E[n_j] = α = n/m. • Assuming that the hash value h(k) can be computed in O(1) time, the time required to search for an element with key k depends linearly on the length n_{h(k)} of the list T[h(k)].
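The identity E[n_j] = α can be checked empirically. The sketch below models simple uniform hashing by letting each of the n keys pick a slot uniformly and independently at random; the function name, the parameter values, and the fixed seed are assumptions made for the illustration.

```python
import random


def simulate_mean_chain_length(n, m, slot, trials, seed=0):
    """Estimate E[n_slot] when n keys are hashed uniformly into m slots."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Each key picks a slot uniformly and independently of the others.
        total += sum(1 for _ in range(n) if rng.randrange(m) == slot)
    return total / trials


n, m = 200, 50
alpha = n / m  # load factor
estimate = simulate_mean_chain_length(n, m, slot=0, trials=2000)
```

The estimate should land close to α = 4 for any choice of slot, since each key falls into a given slot with probability 1/m.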
Hash tables • Consider two cases of search. • When search is unsuccessful. • No element in the table has key k. • When search is successful. • An element with key k is found.
Hash tables • With chaining and simple uniform hashing: • An unsuccessful search takes Θ(1 + α) expected time. • A successful search takes Θ(1 + α) expected time.
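Both bounds can be illustrated by counting how many list elements a search examines. The sketch below again models simple uniform hashing with random slot choices; the function name, parameters, and seed are assumptions. For comparison, the standard analysis gives an expected α elements examined for an unsuccessful search and 1 + α/2 - α/(2n) for a successful one, both of which are Θ(1 + α) once the O(1) hash computation is added.

```python
import random


def average_search_costs(n, m, trials, seed=0):
    """Return (unsuccessful, successful) average numbers of elements examined."""
    rng = random.Random(seed)
    unsuccessful = 0.0
    successful = 0.0
    for _ in range(trials):
        lengths = [0] * m
        for _ in range(n):
            lengths[rng.randrange(m)] += 1

        # Unsuccessful search: the key hashes to a uniform slot and the
        # whole list in that slot is scanned.
        unsuccessful += lengths[rng.randrange(m)]

        # Successful search: averaged over all n stored elements. In a list
        # of length L the elements sit at positions 1..L, so the positions
        # sum to L * (L + 1) / 2.
        successful += sum(L * (L + 1) / 2 for L in lengths) / n
    return unsuccessful / trials, successful / trials


n, m = 100, 25
alpha = n / m
unsuccessful, successful = average_search_costs(n, m, trials=4000)

expected_unsuccessful = alpha
expected_successful = 1 + alpha / 2 - alpha / (2 * n)
```

As long as n = O(m), the load factor α is O(1), so both kinds of search take O(1) time on average.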