Data Structures and Algorithms Hashing First Year

Data Structures and Algorithms: Hashing (First Year). M. B. Fayek, CUFE, 2009. Topics: What is Hashing? Problems in Hashing. Collision Resolution Strategies.


Presentation Transcript


  1. Data Structures and Algorithms: Hashing, First Year. M. B. Fayek, CUFE, 2009

  2. Hashing • What is Hashing? • Problems in hashing • Collision Resolution Strategies

  3. 1. What is Hashing? • Hashing is a quick and efficient searching technique. • So far, the efficiency of a search depended on the number of comparisons. • In hashing, the keys themselves point directly to records by applying a hashing function. • All possible key values are mapped into slots in the hash table. • The same hashing function is used for searching as well as for storing.

  4. 1. What is Hashing? • The hash table is sequential and contiguous. • Each slot is called a bucket. • Buckets may hold more than one key.

  5. 1. What is Hashing? • Hashing methods: • Direct and Subtraction • Modulo-division (or division remainder) using list size (prime, why?) • Digit extraction • Midsquare • Folding (fold shift, fold boundary) • Pseudo-random (seed)
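The slide only names these hashing methods, so the following is a minimal Python sketch of four of them, written under stated assumptions rather than taken from the lecture. All function names, parameters and sample keys are hypothetical, and keys are assumed to be non-negative integers.

```python
def modulo_division(key: int, list_size: int) -> int:
    """Division remainder: the address is the key modulo the list size."""
    return key % list_size


def digit_extraction(key: int, positions: tuple) -> int:
    """Build the address from selected digit positions (0 = leftmost digit)."""
    digits = str(key)
    return int("".join(digits[p] for p in positions))


def midsquare(key: int, num_digits: int) -> int:
    """Square the key and take num_digits digits from the middle of the square."""
    squared = str(key * key)
    start = (len(squared) - num_digits) // 2
    return int(squared[start:start + num_digits])


def fold_shift(key: int, part_len: int, list_size: int) -> int:
    """Split the key into parts of part_len digits and add the parts."""
    digits = str(key)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    return sum(parts) % list_size


def fold_boundary(key: int, part_len: int, list_size: int) -> int:
    """Like fold_shift, but the first and last parts are reversed before adding."""
    digits = str(key)
    parts = [digits[i:i + part_len] for i in range(0, len(digits), part_len)]
    parts[0] = parts[0][::-1]
    parts[-1] = parts[-1][::-1]
    return sum(int(p) for p in parts) % list_size


print(modulo_division(121267, 307))         # 2
print(digit_extraction(379452, (0, 2, 4)))  # 395
print(midsquare(9452, 4))                   # 3403
print(fold_shift(123456789, 3, 1000))       # 368
print(fold_boundary(123456789, 3, 1000))    # 764
```

On the "prime, why?" question: one common argument is that keys sharing a factor with the list size pile up in a few buckets. For example, the keys 0, 10, 20, ..., 90 all land in bucket 0 when the list size is 10, but spread over 10 different buckets when the list size is 11.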

  7. Problems in Hashing • A collision occurs whenever a hash function maps two distinct keys to the same bucket. • The hashing function must generate bucket addresses quickly and efficiently, with a minimum of collisions. • As the domain of keys is usually larger than the number of buckets, collisions are very likely to happen no matter how efficient the hashing function is.

  9. 3. Collision Resolution Strategies • Definitions: • Load factor = number of elements in list / list size • Clustering (primary, secondary)
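As a small illustration, the load factor can be computed as below, taking it in the usual sense of the number of stored elements divided by the list size. The function name and the use of None for an empty slot are assumptions made for this sketch.

```python
def load_factor(table: list) -> float:
    """Fraction of slots in use; None marks an empty slot (an assumption here)."""
    num_elements = sum(slot is not None for slot in table)
    return num_elements / len(table)


table = [None, 15, None, 3, 42, None, None, 7]
print(load_factor(table))  # 0.5 (4 elements in 8 slots)
```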

  10. 3. Collision Resolution Strategies • Open Addressing: (using prime area) • Probing (Linear, quadratic) • Double Hashing • Pseudo-random • Key offset • Linked Lists (Separate Chaining) • (Bucket Hashing) • Re-hashing

  11. 3. Collision Resolution Strategies • Open Addressing: • Probing: • Linear Probing: Search at constant intervals from the collision (typically 1) • Quadratic Probing: Search at quadratically increasing intervals, i.e. collision function f(i) = i^2; on collision, search the 1st, 4th, 9th, … location after the collision address
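The two probing schemes can be sketched as follows. This is an illustrative sketch, not the lecture's own code: it assumes modulo-division as the hashing function, None as the empty-slot marker, and hypothetical function names.

```python
def probe_sequence(home: int, list_size: int, method: str = "linear"):
    """Yield candidate addresses starting from the home address."""
    for i in range(list_size):
        # Linear probing steps by i; quadratic probing steps by i squared.
        step = i if method == "linear" else i * i
        yield (home + step) % list_size


def insert(table: list, key: int, method: str = "linear") -> int:
    """Store key in the first free slot along its probe sequence."""
    home = key % len(table)
    for address in probe_sequence(home, len(table), method):
        if table[address] is None:
            table[address] = key
            return address
    raise OverflowError("no free slot found along the probe sequence")


table = [None] * 7
print(insert(table, 10))               # 3 (home address, no collision)
print(insert(table, 17))               # 4 (collides at 3, next slot is free)
print(insert(table, 24, "quadratic"))  # 0 (3 and 4 are taken, 3 + 4 wraps to 0)
```

Note that a quadratic probe sequence does not necessarily visit every slot, so an insertion can fail even when the table still has free slots.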

  12. Linear Probing

  14. 3. Collision Resolution Strategies • Open Addressing • Double Hashing: Apply a second hashing function and probe at multiples of its value away from the collision address: hash2(x), 2*hash2(x), 3*hash2(x), . . .
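A minimal sketch of double hashing is given below. The slide does not define hash2, so the choice hash2(x) = R - (x mod R), with R a prime smaller than the list size, is an assumption borrowed from common textbook practice; it is used here because it never evaluates to zero. The function names and sample keys are likewise hypothetical.

```python
def hash1(key: int, list_size: int) -> int:
    """Primary hashing function (modulo-division, assumed for this sketch)."""
    return key % list_size


def hash2(key: int, r: int = 5) -> int:
    """Secondary hashing function (an assumed choice); always in the range 1..r."""
    return r - (key % r)


def insert_double_hash(table: list, key: int) -> int:
    """Probe at home, home + hash2, home + 2*hash2, ... until a slot is free."""
    size = len(table)
    home = hash1(key, size)
    step = hash2(key)
    for i in range(size):
        address = (home + i * step) % size
        if table[address] is None:
            table[address] = key
            return address
    raise OverflowError("no free slot found along the probe sequence")


table = [None] * 7
print(insert_double_hash(table, 10))  # 3 (home address)
print(insert_double_hash(table, 17))  # 6 (collides at 3, step is 3)
print(insert_double_hash(table, 24))  # 4 (collides at 3, step is 1)
```

Keys 17 and 24 share the same home address but follow different probe steps, which is how double hashing avoids the clustering seen when every colliding key follows the same sequence.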

  16. 3. Collision Resolution Strategies • Linked lists (Separate Chaining): • Separate chaining (may be modified by keeping the chain sorted!) • Modified Hash Table (by eliminating the first probe, hence the hash table becomes an array of records instead of an array of pointers to records)
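Separate chaining with sorted chains might look like the sketch below. For brevity, Python lists stand in for the linked lists, and the class name, method names and modulo-division hashing function are assumptions made for illustration.

```python
import bisect


class ChainedHashTable:
    """Hash table where each bucket holds a sorted chain of keys."""

    def __init__(self, list_size: int):
        # One chain per bucket; a Python list stands in for a linked list.
        self.buckets = [[] for _ in range(list_size)]

    def _address(self, key: int) -> int:
        return key % len(self.buckets)

    def insert(self, key: int) -> None:
        # insort keeps the chain sorted, as suggested on the slide.
        bisect.insort(self.buckets[self._address(key)], key)

    def search(self, key: int) -> bool:
        # Only the chain at the key's home bucket needs to be examined.
        return key in self.buckets[self._address(key)]


table = ChainedHashTable(5)
for key in (12, 22, 7, 4):
    table.insert(key)
print(table.buckets[2])  # [7, 12, 22] (three keys share bucket 2)
print(table.search(22))  # True
print(table.search(17))  # False (bucket 2 is searched, 17 is absent)
```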

  17. Linked List (Separate Chaining)

  19. 3. Collision Resolution Strategies • Rehashing: • When the table becomes too full, operations start taking too long • Solution: Build another hash table of about double the size, with an associated hashing function, then scan down the entire original hash table and insert each key into the new table [Figure labels: successful search, unsuccessful search]

  20. 3. Collision Resolution Strategies • Rehashing: • When is the table too full? • Rehash when the table is half full • Rehash when an insertion fails • Rehash when the table reaches a certain load factor (best option)
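The rehashing policy based on a load-factor threshold can be sketched as follows. The sketch assumes linear probing, modulo-division hashing, a threshold of 0.5, and a new size equal to the first prime at or above double the old size; none of these specifics come from the slides, and all names are hypothetical.

```python
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


def next_prime(n: int) -> int:
    """Smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


def insert_linear(table: list, key: int) -> int:
    """Insert with linear probing; None marks an empty slot."""
    size = len(table)
    for i in range(size):
        address = (key % size + i) % size
        if table[address] is None:
            table[address] = key
            return address
    raise OverflowError("table is full")


def rehash(table: list) -> list:
    """Build a table of about double the size and reinsert every key."""
    new_table = [None] * next_prime(2 * len(table))
    for key in table:  # scan down the entire original table
        if key is not None:
            insert_linear(new_table, key)
    return new_table


def insert_with_rehash(table: list, key: int, max_load: float = 0.5) -> list:
    """Insert the key, then rehash if the load factor exceeds max_load."""
    insert_linear(table, key)
    count = sum(slot is not None for slot in table)
    if count / len(table) > max_load:
        table = rehash(table)
    return table


table = [None] * 5
for key in (1, 6, 11):
    table = insert_with_rehash(table, key)
print(len(table))  # 11 (the third insertion pushed the load factor to 0.6)
print(table)       # [11, 1, None, None, None, None, 6, None, None, None, None]
```

In the small table all three keys collide at address 1; after rehashing, each key sits at its own home address because the hashing function changed with the list size.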

  21. End of Hashing

  22. Probing • Definition: Each calculation of an address and test for success is known as probing

  23. Key offset collision resolution • Offset = key / list size (integer quotient) • Address = (Offset + old address) % list size
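The two key-offset formulas translate directly into code. The sketch below assumes that the division is integer division and that keys are non-negative integers; the function name and sample values are hypothetical.

```python
def key_offset_probe(key: int, old_address: int, list_size: int) -> int:
    """Next address to try after a collision at old_address."""
    offset = key // list_size  # integer quotient of key and list size
    return (offset + old_address) % list_size


list_size = 307
key = 166702
home = key % list_size                              # 1
first = key_offset_probe(key, home, list_size)      # (543 + 1) % 307
second = key_offset_probe(key, first, list_size)    # (543 + 237) % 307
print(home, first, second)  # 1 237 166
```

Because the offset depends on the key, two keys that collide at the same address usually move on to different addresses. One caveat: if the offset is a multiple of the list size, the formula returns the old address again, so a real implementation would need a fallback step for that case.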
