CS 361 – Chapter 6

CS 361 – Chapter 6. Topics: desired operations in the Dictionary ADT, hash tables, hash code computation, and handling collisions (chaining; open addressing with linear probing, quadratic probing, and double hashing).

vincentbeck

Presentation Transcript


  1. CS 361 – Chapter 6
     • Desired operations in Dictionary ADT
     • Hash tables
     • Hash code computation
     • Handling collisions
       • Chaining
       • Open addressing
         • Linear probing
         • Quadratic probing
         • Double hashing

  2. Dictionary ADT
     • Each item in a collection is assigned a key value. We look up the item by means of its key.
     • This sounds like an array, except that the key can be any value convenient for us, rather than being restricted to the indices 0, 1, 2, …
     • Desired operations
       • search (key)
       • insert (item, key)
       • remove (key)
     • Searching and removing can fail if the key is not found in the dictionary.
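
As a rough illustration of the three operations, Python's built-in `dict` can stand in for a dictionary ADT. The variable names and the sample keys and values (student IDs and names) are invented for this sketch and do not come from the slides.

```python
# Illustrative only: Python's built-in dict used as a dictionary ADT.
# The student IDs and names are invented sample data.
roster = {}

# insert(item, key)
roster["s1024"] = "Ada"
roster["s2048"] = "Grace"

# search(key): succeeds when the key is present
found = roster.get("s1024")

# search(key): fails when the key is absent; get() signals that with None
missing = roster.get("s9999")

# remove(key): pop() with a default avoids an exception on a missing key
removed = roster.pop("s2048", None)
removed_again = roster.pop("s2048", None)  # the key is already gone
```

Returning `None` on failure is one possible convention; raising an exception would be an equally reasonable way to report a missing key.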

  3. Implementation
     • Simple approach: an ArrayList of (element, key) pairs, called a “log file” data structure.
     • How would we implement the operations?
       • Inserting: O(1)
       • Finding / removing: O(n)
     • We would hope there’s a lot of inserting, to make this data structure worthwhile!
     • More efficient approach: hash table
       • Array of “buckets”
       • Hash function to assign each element to a bucket
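
The "log file" approach can be sketched as follows. This is a minimal illustration that uses a Python list in place of an ArrayList; the class name `LogFile`, the method names, and the sample data are assumptions made for the example.

```python
class LogFile:
    """Unsorted list of (key, element) pairs (hypothetical name for this sketch)."""

    def __init__(self):
        self._pairs = []

    def insert(self, element, key):
        # Appending at the end takes O(1) amortized time.
        self._pairs.append((key, element))

    def search(self, key):
        # Linear scan: O(n) in the worst case.
        for k, element in self._pairs:
            if k == key:
                return element
        return None  # key not found

    def remove(self, key):
        # Linear scan plus deletion: O(n).
        for i, (k, element) in enumerate(self._pairs):
            if k == key:
                del self._pairs[i]
                return element
        return None  # key not found


log = LogFile()
log.insert("apple pie", "dessert")
log.insert("lentil soup", "starter")
```

Every search or removal may have to walk the whole list, which is why this structure only pays off when insertions dominate.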

  4. Goal of hash table
     • A faster array
     • We hope that most of the time, insert / search / delete can be done in constant time
     • Two issues
       • Requires more space
       • In the worst case, operations are generally O(n). We don’t want this to happen often!

  5. Implementation
     • Hash code: In a collection of objects, it’s desirable to assign each object a unique number.
       • Mathematically determined from its key.
       • There are good and bad ways to compute hash codes. We’d like these codes to be unique.
     • Compression: Since the hash code may be a big number, scale it down by performing a “mod” operation.
       • The result is the array index at which to insert / find / remove.
     • Collision: Sometimes a 3rd step is needed, in case 2 items map to the same bucket.
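
The compression step, and why a collision step may still be needed afterwards, can be sketched like this. The table size of 11 and the two hash codes are arbitrary values chosen for illustration; `compress` is a hypothetical helper name.

```python
NUM_BUCKETS = 11  # assumed table size for this sketch


def compress(hash_code: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Scale a (possibly large) hash code down to a valid array index."""
    return hash_code % num_buckets


code_a = 1234567
code_b = 1234578  # differs from code_a by exactly NUM_BUCKETS

index_a = compress(code_a)
index_b = compress(code_b)

# Two distinct hash codes can still land in the same bucket after "mod".
collision = index_a == index_b
```

Even perfectly unique hash codes do not prevent collisions, because compression maps many codes onto each bucket.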

  6. Hashing example
     • Many objects have composite values, as in a string, a list, or several attributes per object.
     • Give the components numerical values (e.g. ASCII codes) and combine $(a_0, a_1, a_2, \dots, a_{n-1})$ into a hash code.
     • We could add them all up:
         hashCode = 0
         for i = 0 to n-1
             hashCode += a[i]
     • When would this be a good / bad hash function?
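
A small sketch of the additive approach, assuming string keys and using each character's code point as its numerical value. The function name `additive_hash` and the sample words are chosen for illustration. It shows one situation where summing does poorly: keys that are rearrangements of the same characters always collide.

```python
def additive_hash(key: str) -> int:
    """Sum the character codes of the key."""
    hash_code = 0
    for ch in key:
        hash_code += ord(ch)
    return hash_code


# Anagrams contain the same characters, so their sums are identical.
h_stop = additive_hash("stop")
h_pots = additive_hash("pots")
same_code = h_stop == h_pots
```

The sum ignores the position of each component, so it tends to work acceptably only when keys rarely share the same multiset of components.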

  7. Example 2
     • To make hash codes more likely to be unique, we can use a polynomial approach:
         $\text{hash code} = a_0 c^0 + a_1 c^1 + a_2 c^2 + \dots + a_{n-1} c^{n-1}$
       where c is some constant, e.g. 7.
     • To avoid computing powers of c, we can rewrite the formula:
         $a_0 + c(a_1 + c(a_2 + c(a_3 + \dots + c\,a_{n-1})\dots))$
         hashCode = a[n-1]
         for i = n-2 down to 0
             hashCode = c * hashCode + a[i]
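
The nested form can be sketched and checked against the direct polynomial as follows. String keys, character code points as the values a[i], and the function names are assumptions for this example; c = 7 follows the slide.

```python
def polynomial_hash(key: str, c: int = 7) -> int:
    """Evaluate a[0] + a[1]*c + ... + a[n-1]*c**(n-1) using the nested form."""
    a = [ord(ch) for ch in key]
    if not a:
        return 0  # convention chosen here for the empty key
    hash_code = a[-1]
    for i in range(len(a) - 2, -1, -1):  # i = n-2 down to 0
        hash_code = c * hash_code + a[i]
    return hash_code


def polynomial_hash_direct(key: str, c: int = 7) -> int:
    """Same polynomial, computed term by term with explicit powers."""
    return sum(ord(ch) * c**i for i, ch in enumerate(key))


h_stop = polynomial_hash("stop")
h_pots = polynomial_hash("pots")
```

Unlike the plain sum, the polynomial weights each position differently, so the anagrams "stop" and "pots" receive different codes here.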

  8. Collisions
     • Insert, but the cell is already taken!
     • Search, but a different element lives here!
     • Various ways to handle collisions, such as:
       • Chaining: maintain a list at each bucket. HashSet does this.
       • Open addressing: look for another “open” cell.
     • Practice with Q 4-7 on page 215.
     • A hash table should be larger than the number of elements anticipated.
       • We can set a specific “load factor”, e.g. 0.75. If the ratio of elements to table size exceeds this factor, allocate a bigger hash table.
     • Design issues can be resolved by experimenting with your collection of data.
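
Chaining combined with a load factor might look roughly like the sketch below. Several choices are assumptions made for this example rather than anything stated on the slide: Python lists serve as the chains, the table doubles when it grows, inserting an existing key overwrites its element, and Python's built-in `hash` provides the hash code.

```python
class ChainedHashTable:
    """Hash table with chaining (hypothetical class for this sketch)."""

    def __init__(self, capacity=4, max_load=0.75):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0
        self._max_load = max_load

    def _bucket(self, key):
        # Hash code, then compression with "mod".
        return self._buckets[hash(key) % len(self._buckets)]

    def insert(self, element, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, element)  # assumed policy: overwrite
                return
        bucket.append((key, element))
        self._size += 1
        # Grow once the ratio of elements to buckets exceeds the load factor.
        if self._size / len(self._buckets) > self._max_load:
            self._resize(2 * len(self._buckets))

    def search(self, key):
        for k, element in self._bucket(key):
            if k == key:
                return element
        return None  # key not found

    def remove(self, key):
        bucket = self._bucket(key)
        for i, (k, element) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self._size -= 1
                return element
        return None  # key not found

    def _resize(self, new_capacity):
        # Every pair must be re-hashed because the bucket count changed.
        old_pairs = [pair for bucket in self._buckets for pair in bucket]
        self._buckets = [[] for _ in range(new_capacity)]
        for key, element in old_pairs:
            self._bucket(key).append((key, element))

    def capacity(self):
        return len(self._buckets)


table = ChainedHashTable(capacity=4)
for k in range(10):
    table.insert(f"item{k}", k)
```

Starting from 4 buckets, the ten insertions above trigger two doublings, which keeps the chains short at the cost of extra space.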

  9. Techniques
     Key value k, hash value h(k), table size N (each index below is taken mod N):
     • Chaining
       • Store a linked list at a[h(k)]
     • Linear probing
       • Try h(k)+1, h(k)+2, h(k)+3, h(k)+4, …
     • Quadratic probing
       • Try h(k)+1, h(k)+4, h(k)+9, h(k)+16, …
     • Double hashing
       • Also uses a 2nd hash function h’(k)
       • Try h(k)+h’(k), h(k)+2h’(k), h(k)+3h’(k), …
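
The three open-addressing probe sequences can be compared with a short sketch. The table size N = 11, the primary hash h(k) = k mod N, and the second hash h'(k) = 7 - (k mod 7) are assumptions chosen for illustration; the second hash is picked so that it never returns 0, since a step of 0 would probe the same cell forever.

```python
N = 11  # assumed table size


def h(k):
    return k % N  # assumed primary hash


def h2(k):
    return 7 - (k % 7)  # assumed second hash; always in 1..7, never 0


def linear_probe(k, i):
    return (h(k) + i) % N


def quadratic_probe(k, i):
    return (h(k) + i * i) % N


def double_hash_probe(k, i):
    return (h(k) + i * h2(k)) % N


k = 14  # h(14) = 3
linear_seq = [linear_probe(k, i) for i in range(1, 5)]
quadratic_seq = [quadratic_probe(k, i) for i in range(1, 5)]
double_seq = [double_hash_probe(k, i) for i in range(1, 5)]


def insert_linear(table, key):
    """Place key in the first open cell found by linear probing."""
    for i in range(len(table)):
        slot = (key % len(table) + i) % len(table)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")


cells = [None] * N
# 3, 14 and 25 all hash to index 3, so the later keys get pushed along.
slots = [insert_linear(cells, key) for key in (3, 14, 25)]
```

For k = 14 the linear sequence visits 4, 5, 6, 7, the quadratic sequence visits 4, 7, 1, 8, and double hashing visits 10, 6, 2, 9, which shows how the latter two spread colliding keys further apart.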
