
Design and Analysis of Algorithms Hash Tables

Design and Analysis of Algorithms: Hash Tables. Haidong Xue, Fall 2013, at GSU. Dictionary operations INSERT, DELETE, and SEARCH: all O(1) very likely; in the worst case SEARCH is $\Theta(n)$, while INSERT and DELETE remain O(1).


Presentation Transcript


  1. Design and Analysis of Algorithms: Hash Tables. Haidong Xue, Fall 2013, at GSU

  2. Dictionary operations

                 Very likely    Worst case
       INSERT    O(1)           O(1)
       DELETE    O(1)           O(1)
       SEARCH    $\Theta(1)$    $\Theta(n)$

     “A hash table is an effective data structure for implementing dictionaries” (textbook, page 253)

  3. Direct-address tables. Direct addressing: use keys as addresses.
     [Figure: keys 2, 3, 6, 1, 7, 5 stored in a direct-address table with slots 1 to 10.]
     • SEARCH(S, 6): O(1)
     • INSERT(S, 4): O(1)
     • DELETE(S, 7): O(1)
     What’s the problem here? Storage requirement = $|U|$, where $U$ is the universe of keys. What if the keys range over [1, 30000]?
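The direct-address idea on slide 3 can be sketched in a few lines of Python. This is only an illustration: the class name `DirectAddressTable`, the `universe_size` parameter, and the stored `item-…` strings are assumptions, not part of the slides.

```python
class DirectAddressTable:
    """Direct addressing: the key itself is the index into the table."""

    def __init__(self, universe_size):
        # One slot per possible key, so storage grows with the size of
        # the key universe, not with the number of stored elements.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        self.slots[key] = value      # O(1)

    def search(self, key):
        return self.slots[key]       # O(1); None means "not present"

    def delete(self, key):
        self.slots[key] = None       # O(1)


# Keys from the slide; index 0 is simply left unused here.
table = DirectAddressTable(universe_size=11)
for key in (2, 3, 6, 1, 7, 5):
    table.insert(key, f"item-{key}")
```

With keys in [1, 30000] the same table would need 30001 slots even if only six keys were stored, which is the storage problem the slide points at.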

  4. Hash tables
     • Can we have O(1) INSERT, DELETE, and SEARCH with less storage? Yes!
     Hash function: h(x) = x mod 3. Hash table with slots 0, 1, 2; keys 2, 3, 6, 1, 7, 5.
     • h(2) = 2 mod 3 = 2
     • h(3) = 3 mod 3 = 0
     • h(6) = 6 mod 3 = 0 (collision!)
     • h(1) = 1 mod 3 = 1
     • h(7) = 7 mod 3 = 1
     • h(5) = 5 mod 3 = 2
     Collision: multiple elements in one slot.

  5. Hash tables. A common method is to put colliding elements into a linked list, i.e. chaining.
     Hash table (h(x) = x mod 3):
       slot 0: 3, 6
       slot 1: 1, 7
       slot 2: 2, 5
     What is the upper bound on the list length? What is the average length?
     • SEARCH(S, 6): h(6) = 6 mod 3 = 0; search in list 0: O(1) + 2 (2 is the length of the linked list)
     • INSERT(S, 4): h(4) = 4 mod 3 = 1; insert into list 1: O(1) + O(1) = O(1)
     • DELETE(S, 7): h(7) = 7 mod 3 = 1; delete from list 1: O(1) + O(1) = O(1)
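A minimal sketch of chaining with h(x) = x mod 3, using the keys from the slide. The class and method names are assumptions for illustration, and Python lists stand in for linked lists, so `delete` here scans the chain instead of unlinking a node in O(1) as the slide assumes.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of keys."""

    def __init__(self, m=3):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        return key % self.m          # hash function from the slide

    def insert(self, key):
        self.slots[self._h(key)].append(key)     # O(1) amortized

    def search(self, key):
        # Cost is O(1) for hashing plus the length of the chain.
        return key in self.slots[self._h(key)]

    def delete(self, key):
        # Scans the chain; a doubly linked list with a pointer to the
        # element would make this O(1).
        self.slots[self._h(key)].remove(key)


table = ChainedHashTable(m=3)
for key in (2, 3, 6, 1, 7, 5):
    table.insert(key)
print(table.slots)   # [[3, 6], [1, 7], [2, 5]]
```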

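  6. Analysis of hash tables
     [Figure: hash table with slots 0, 1, 2, 3, 4, …, m-1, each slot pointing to a chain.]
     Load factor: $\alpha = n/m$, where n is the number of stored elements and m is the number of slots.
     Uniform hashing: “each key is equally likely to hash to any of the m slots”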

  7. Analysis of hash tables. With the assumption of uniform hashing:
     • Theorem 11.1, unsuccessful search: $\Theta(1 + \alpha)$
     • Theorem 11.2, successful search: $\Theta(1 + \alpha)$
     Since $\alpha = n/m$, $T(n) = \Theta(1 + n/m)$. If $n = O(m)$, then $T(n) = \Theta(1 + O(m)/m) = O(1)$.
     How to get uniform hashing?

  8. Hash functions. How to get uniform hashing?
     Uniform hashing: “each key is equally likely to hash to any of the m slots”
     To achieve this goal, many hashing methods have been proposed:
     • Division hashing
     • Multiplication hashing
     • Universal hashing

  9. Hash functions – division hashing
     • h(k) = k mod m, where k is the value of the key and m is the number of slots
     • E.g.:
       • Final grades of all my students with a hash table of 10 slots
       • Items in grocery stores with a hash table of 10 slots
         • 99 cents, large soda
         • $1.99, ground beef
         • $6.99, lamb
     What’s the problem here? What if we still use 10 slots?

  10. Hash functions – division hashing
     • h(k) = k mod m
     • Choose m as a prime number
       • 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, …
     • It is sometimes not very convenient to implement
     E.g.:
     • 99 mod 7 = 1
     • 199 mod 7 = 3
     • 699 mod 7 = 6
     What’s the problem here?
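The grocery-price example can be checked directly. The sketch below treats each price as an integer number of cents (99, 199, 699, as in the slide) and compares m = 10 with the prime m = 7; the function name `division_hash` is an assumption for illustration.

```python
def division_hash(k, m):
    """Division hashing: h(k) = k mod m."""
    return k % m


prices_in_cents = [99, 199, 699]

with_10_slots = [division_hash(k, 10) for k in prices_in_cents]
with_7_slots = [division_hash(k, 7) for k in prices_in_cents]

print(with_10_slots)  # [9, 9, 9]  every price ends in 9, so all collide
print(with_7_slots)   # [1, 3, 6]  a prime m spreads these keys out
```

With m = 10 the hash only looks at the last decimal digit, so keys that share that digit always land in the same slot.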

  11. Hash functions – multiplication hashing
     • h(k) = floor(m (kA mod 1)), where m is the number of slots and A is a constant in (0, 1)
     • E.g.: A = 0.123, m = 10
       • 99 × 0.123 = 12.177, so h(99) = floor(10 × 0.177) = 1
       • 199 × 0.123 = 24.477, so h(199) = floor(10 × 0.477) = 4
       • 699 × 0.123 = 85.977, so h(699) = floor(10 × 0.977) = 9
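The same arithmetic as a small Python sketch, using the slide's A = 0.123 and m = 10. The function name is an assumption, and floating-point arithmetic is used for simplicity; a production implementation would more likely use integer arithmetic.

```python
import math


def multiplication_hash(k, m, A):
    """Multiplication hashing: h(k) = floor(m * (k*A mod 1))."""
    fractional_part = (k * A) % 1.0   # keep only the fractional part of k*A
    return math.floor(m * fractional_part)


slots = [multiplication_hash(k, m=10, A=0.123) for k in (99, 199, 699)]
print(slots)  # [1, 4, 9]
```

Unlike division hashing with m = 10, the three prices now land in three different slots even though m is not prime.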

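  12. Hash functions – universal hashing
     • $\mathcal{H}$ is a set of hash functions
     • At the beginning of each execution, randomly choose a hash function from $\mathcal{H}$
     • Universal: $\Pr_{h \in \mathcal{H}}[h(k) = h(l)] \le 1/m$, where $k$ and $l$ are distinct keys and $m$ is the number of slots
     Theorem 11.3 (chaining, with $h$ chosen at random from a universal $\mathcal{H}$; $n_{h(k)}$ is the length of the list that $k$ hashes to, and $\alpha = n/m$ is the load factor):
     • If $k$ is not in the table, $E[n_{h(k)}] \le \alpha$
     • If $k$ is in the table, $E[n_{h(k)}] \le 1 + \alpha$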
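The slide does not name a concrete family of hash functions. One common textbook choice, assumed here purely for illustration, is h(k) = ((a·k + b) mod p) mod m, with p a prime larger than every key and a, b chosen at random. The helper name `make_universal_hash` and the choice p = 2^31 − 1 are also assumptions.

```python
import random

# A Mersenne prime; assumed to be larger than any key we hash.
P = 2**31 - 1


def make_universal_hash(m, p=P, rng=random):
    """Pick one hash function at random from the family
    h_{a,b}(k) = ((a*k + b) mod p) mod m, with 1 <= a < p and 0 <= b < p."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)

    def h(k):
        return ((a * k + b) % p) % m

    return h


# Choose the function once, at the start of an execution, then reuse it.
h = make_universal_hash(m=10)
print([h(k) for k in (99, 199, 699)])   # varies from run to run
```

Because the function is chosen at random, no fixed set of keys (such as prices that all end in 99) can be bad for every run.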

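  13. Another method to deal with collisions: open addressing
     • No linked lists
     • Hash functions include the probe number: $h(k, i)$
     • Linear probing: $h(k, i) = (h'(k) + i) \bmod m$
     • Quadratic probing: $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \bmod m$
     • Double hashing: $h(k, i) = (h_1(k) + i\, h_2(k)) \bmod m$
     • When $h(k, i)$ does not work (the slot is occupied), use $h(k, i+1)$
     With load factor $\alpha = n/m < 1$ and uniform hashing:
     • Expected number of probes for an unsuccessful search is at most $1/(1 - \alpha)$
     • Expected number of probes for a successful search is at most $\frac{1}{\alpha} \ln \frac{1}{1 - \alpha}$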

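  14. Another method to deal with collisions: open addressing
     Open addressing with h(k, i) = ((k mod 3) + i) mod 10; keys 2, 3, 6, 1 are inserted into a table with slots 0 to 9.
     • h(2, 0) = ((2 mod 3) + 0) mod 10 = 2
     • h(3, 0) = ((3 mod 3) + 0) mod 10 = 0
     • h(6, 0) = ((6 mod 3) + 0) mod 10 = 0 (occupied by 3)
     • h(6, 1) = ((6 mod 3) + 1) mod 10 = 1
     • h(1, 0) = ((1 mod 3) + 0) mod 10 = 1 (occupied by 6)
     • h(1, 1) = ((1 mod 3) + 1) mod 10 = 2 (occupied by 2)
     • h(1, 2) = ((1 mod 3) + 2) mod 10 = 3
     Resulting table: slot 0 holds 3, slot 1 holds 6, slot 2 holds 2, slot 3 holds 1.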
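The probe sequence on slide 14 can be reproduced with a short linear-probing sketch. The function names `probe`, `insert`, and `search` are assumptions for illustration, and deletion is left out because it needs extra care (tombstones) under open addressing.

```python
def probe(k, i, m=10):
    """Probe function from the slide: h(k, i) = ((k mod 3) + i) mod m."""
    return ((k % 3) + i) % m


def insert(table, k):
    """Try h(k, 0), h(k, 1), ... until an empty slot is found."""
    m = len(table)
    for i in range(m):
        slot = probe(k, i, m)
        if table[slot] is None:
            table[slot] = k
            return slot
    raise RuntimeError("hash table is full")


def search(table, k):
    """Follow the same probe sequence; an empty slot means k is absent."""
    m = len(table)
    for i in range(m):
        slot = probe(k, i, m)
        if table[slot] is None:
            return None
        if table[slot] == k:
            return slot
    return None


table = [None] * 10
placed = [insert(table, k) for k in (2, 3, 6, 1)]
print(placed)  # [2, 0, 1, 3]
```

Key 1 needs three probes because slots 1 and 2 are already taken by keys 6 and 2, which matches the last three lines of the slide.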
