
Hashing


darena


  1. Hashing • Main ch. 11.2-11.5 • Background: • We want to store a collection of data, and to add to, delete from, and search in that collection. • What is the average-case complexity of add, delete, and search if: • The collection is stored as an unsorted array? • The collection is stored as a sorted array? • The collection is stored in a binary search tree?

  2. We Want to Do Better • Hashing has good average-case behavior • Suppose • You want to keep track of students via their student IDs • If student IDs range from 0 to 99 this is easy: use the ID itself as the index into an array of size 100 • What if Social Security-style numbers (SS#s) are used? • Only a tiny fraction of the possible nine-digit values is ever in use; this type of data is known as sparse data • The SS# becomes a key for obtaining the data

  3. Suppose • We want to keep track of only a small number of students in an array of size 10, and suppose we use SS#s as keys • We need a function that maps key values (SS#s) to array indices (integers between 0 and 9) • Such a function is called a hash function. • An example hash function could be: hash(ssn) = ssn % 10
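The example hash function from this slide can be sketched in Java as follows. This is a minimal illustration only: the class name `ModHash` and the sample keys are made up, and the keys are assumed to be non-negative nine-digit values that fit in an `int`.

```java
// Minimal sketch: map a nine-digit key to an index in a table of size 10.
class ModHash {
    static final int TABLE_SIZE = 10; // array size used on the slide

    // hash(ssn) = ssn % 10, i.e. the last decimal digit of the key
    static int hash(int ssn) {
        return ssn % TABLE_SIZE;
    }

    public static void main(String[] args) {
        int[] keys = {123456789, 987654320, 555443333}; // made-up sample keys
        for (int key : keys) {
            System.out.println(key + " -> index " + hash(key));
        }
    }
}
```

With these sample keys the indices are 9, 0, and 3, so each record lands in a different slot.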

  4. Choosing a Hashing Function • For the previous example, we could have used: • hash(ssn) = the first digit of the ssn • We want a hash function that distributes the keys uniformly throughout the array. This is called uniform hashing. • If you use a division hash function (remainder of division), it is best to have a table size that is a prime number of the form 4k+3. • See Main, p. 552 for other kinds of hash functions
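To make the table-size advice concrete, here is a small sketch that finds the smallest prime of the form 4k+3 at or above a requested size. The class and method names are hypothetical, and trial division is used only because the sizes involved are small.

```java
// Sketch: pick a table size that is a prime of the form 4k + 3.
class TableSize {
    // Trial-division primality test; adequate for small table sizes.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Smallest prime p >= minSize with p % 4 == 3 (that is, p = 4k + 3).
    static int nextPrime4kPlus3(int minSize) {
        int p = Math.max(minSize, 3);
        while (!(p % 4 == 3 && isPrime(p))) {
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        for (int wanted : new int[] {10, 100, 1000}) {
            System.out.println("at least " + wanted + " -> " + nextPrime4kPlus3(wanted));
        }
    }
}
```

For a requested size of 10 this yields 11, and for 100 it yields 103 (101 is prime but has the form 4k+1).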

  5. What could go wrong? • If possible, store an object whose key value is key in array[hash(key)]. • This is not always possible: you may want to add an object whose key value hashes to an index that’s already in use. • This is called a collision. • What is the big-O time complexity of a hash table lookup (search) if there are no collisions?

  6. Handling Collisions • Linear probing • Place the object in the next open spot, wrapping around to the start of the array if necessary • How would you find an object in a hash table that uses linear probing? • How would you delete an object from a hash table that uses linear probing? • See the example in Main p. 550-551
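One possible answer to the two questions above is sketched below. It is not the example from Main. It assumes non-negative `int` keys, and it marks removed slots with a special "deleted" value (often called a tombstone) so that a search does not stop early at a slot that used to be occupied. All names are illustrative.

```java
import java.util.Arrays;

// Sketch of open addressing with linear probing for non-negative int keys.
class LinearProbingTable {
    private static final int EMPTY = -1;   // slot has never been used
    private static final int DELETED = -2; // slot was used, then vacated
    private final int[] slots;

    LinearProbingTable(int capacity) {
        slots = new int[capacity];
        Arrays.fill(slots, EMPTY);
    }

    private int hash(int key) {
        return key % slots.length;
    }

    // Search: start at hash(key) and walk forward until the key or a
    // never-used slot is found. Returns the index, or -1 if absent.
    int indexOf(int key) {
        int i = hash(key);
        for (int n = 0; n < slots.length; n++) {
            if (slots[i] == EMPTY) return -1;
            if (slots[i] == key) return i;
            i = (i + 1) % slots.length; // wrap around at the end
        }
        return -1;
    }

    boolean contains(int key) {
        return indexOf(key) >= 0;
    }

    // Add: place the key in the next open (empty or deleted) spot.
    // Returns false if the key is already present or the table is full.
    boolean add(int key) {
        if (contains(key)) return false;
        int i = hash(key);
        for (int n = 0; n < slots.length; n++) {
            if (slots[i] == EMPTY || slots[i] == DELETED) {
                slots[i] = key;
                return true;
            }
            i = (i + 1) % slots.length;
        }
        return false;
    }

    // Delete: mark the slot as DELETED rather than EMPTY, so later
    // searches keep probing past it.
    boolean remove(int key) {
        int i = indexOf(key);
        if (i < 0) return false;
        slots[i] = DELETED;
        return true;
    }

    public static void main(String[] args) {
        LinearProbingTable table = new LinearProbingTable(10);
        table.add(12); // hashes to 2
        table.add(22); // collides at 2, placed at 3
        table.add(32); // collides at 2 and 3, placed at 4
        table.remove(22);
        System.out.println("32 still found: " + table.contains(32));
    }
}
```

If `remove` reset the slot to `EMPTY` instead, the search for 32 would stop at index 3 and wrongly report that the key is missing.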

  7. Linear Probing • Easy to implement • Performance isn’t all that great once the table starts to fill up • As the hash table gets fuller, larger and larger consecutive stretches of the array will be filled, so new keys take longer and longer to find an open spot. This is called clustering.

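  8. Double Hashing • If there is a collision, hash the key again, using a second hash function. The second hash value gives the step size used to probe for an open spot, so keys that collide at the same index usually follow different probe sequences. • Double hashing is sometimes also called rehashing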
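The idea can be sketched as follows. The slide does not give a second hash function, so the pair used here, `key % capacity` for the starting index and `1 + (key % (capacity - 2))` for the step, is an assumption (a common textbook choice). The capacity of 13 is likewise chosen only for illustration, so that 13 and 11 are both prime.

```java
// Sketch of open addressing with double hashing for non-negative int keys.
class DoubleHashingTable {
    private final Integer[] slots; // null means the slot is empty

    DoubleHashingTable(int capacity) {
        slots = new Integer[capacity];
    }

    // First hash: where probing starts.
    int hash1(int key) {
        return key % slots.length;
    }

    // Second hash: the step between probes; never zero.
    int hash2(int key) {
        return 1 + (key % (slots.length - 2));
    }

    // Returns false if the key is already present or no open slot is found.
    boolean add(int key) {
        int i = hash1(key);
        int step = hash2(key);
        for (int n = 0; n < slots.length; n++) {
            if (slots[i] == null) {
                slots[i] = key;
                return true;
            }
            if (slots[i] == key) return false;
            i = (i + step) % slots.length;
        }
        return false;
    }

    // Follows the same probe sequence as add; returns -1 if absent.
    int indexOf(int key) {
        int i = hash1(key);
        int step = hash2(key);
        for (int n = 0; n < slots.length; n++) {
            if (slots[i] == null) return -1;
            if (slots[i] == key) return i;
            i = (i + step) % slots.length;
        }
        return -1;
    }

    public static void main(String[] args) {
        DoubleHashingTable table = new DoubleHashingTable(13);
        for (int key : new int[] {14, 27, 40}) { // all three start at index 1
            table.add(key);
            System.out.println(key + " stored at index " + table.indexOf(key));
        }
    }
}
```

All three keys start at index 1, but their step sizes differ (6 for 27, 8 for 40), so they end up at indices 1, 7, and 9 instead of forming a cluster at 1, 2, and 3 as they would under linear probing.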

  9. Chained Hashing • Each element in the array can hold a list of elements. • Hash the key and put the object in the list in array[hash(key)] • See the demo at: http://www2.ics.hawaii.edu/~richardy/project/hash/applet.html
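A minimal sketch of chained hashing is shown below, assuming non-negative `int` keys and using `java.util.LinkedList` for the per-index lists. The class and method names are illustrative and are not taken from Main or from the demo applet.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch of chained hashing: every array position holds a list of keys.
class ChainedTable {
    private final List<LinkedList<Integer>> buckets = new ArrayList<>();

    ChainedTable(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    private int hash(int key) {
        return key % buckets.size();
    }

    // Hash the key and append it to the list at that index (no duplicates).
    void add(int key) {
        LinkedList<Integer> chain = buckets.get(hash(key));
        if (!chain.contains(key)) {
            chain.add(key);
        }
    }

    // Only the one list at hash(key) has to be searched.
    boolean contains(int key) {
        return buckets.get(hash(key)).contains(key);
    }

    // Integer.valueOf forces remove(Object) rather than remove(int index).
    boolean remove(int key) {
        return buckets.get(hash(key)).remove(Integer.valueOf(key));
    }

    int chainLength(int index) {
        return buckets.get(index).size();
    }

    public static void main(String[] args) {
        ChainedTable table = new ChainedTable(10);
        table.add(12);
        table.add(22); // same index as 12; both live in the same list
        table.add(35);
        System.out.println("chain at index 2 holds " + table.chainLength(2) + " keys");
    }
}
```

Because colliding keys share a list, deletion needs no special marker, and the table can hold more keys than it has array positions.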

  10. A Hash Function for Names
      private int hashFunction( String name ) {
          int hashValue = 0;
          char[] cName = name.toCharArray();  // the characters of the name
          for (int j = 0; j < cName.length; j++) {
              hashValue += cName[j];          // add up the character codes
          }
          return hashValue % size;            // map the sum to a table index
      }
      Note that size is previously defined as the size of the hash table.
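The sketch below wraps the same character-sum idea in a runnable class to show one of its consequences: names made of the same letters in a different order always hash to the same index. The table size of 11 and the sample names are assumptions chosen only for this illustration.

```java
// Sketch: the character-sum hash cannot tell anagrams apart.
class NameHash {
    static final int SIZE = 11; // assumed table size, for illustration only

    static int hashFunction(String name) {
        int hashValue = 0;
        for (char c : name.toCharArray()) {
            hashValue += c; // letter order does not affect the sum
        }
        return hashValue % SIZE;
    }

    public static void main(String[] args) {
        String[] names = {"listen", "silent", "enlist"}; // made-up sample names
        for (String name : names) {
            System.out.println(name + " -> index " + hashFunction(name));
        }
    }
}
```

All three sample names print the same index, so they would collide in any table that uses this function.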

  11. Time Analysis • The load factor of a hash table is defined as follows: $\alpha = \dfrac{\text{number of elements in the table}}{\text{size of the table's array}}$

  12. Searching with Linear Probing • In a non-full hash table with no removals, and using uniform hashing, the average number of table elements examined in a successful search is approximately: $\dfrac{1}{2}\left(1 + \dfrac{1}{1-\alpha}\right)$, where $\alpha$ is the load factor

  13. Searching with Double Hashing • In a non-full hash table with no removals, and using uniform hashing, the average number of table elements examined in a successful search is approximately: $\dfrac{-\ln(1-\alpha)}{\alpha}$, where $\alpha$ is the load factor

  14. Searching with Chained Hashing • In a non-full hash table, using uniform hashing, the average number of table elements examined in a successful search is approximately: $1 + \dfrac{\alpha}{2}$, where $\alpha$ is the load factor

  15. Average Number of Elements Examined During a Search (Main p. 561)
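The table from this slide is not reproduced in the transcript. As a stand-in, the sketch below tabulates the usual textbook approximations for a successful search: (1/2)(1 + 1/(1 - α)) for linear probing, -ln(1 - α)/α for double hashing, and 1 + α/2 for chained hashing, where α is the load factor. The choice of these three formulas and of the load factors printed is an assumption, so the output may not match the table in Main row for row.

```java
// Sketch: approximate average number of elements examined in a
// successful search, as a function of the load factor alpha.
class SearchCost {
    static double linearProbing(double alpha) {
        return 0.5 * (1 + 1 / (1 - alpha));
    }

    static double doubleHashing(double alpha) {
        return -Math.log(1 - alpha) / alpha; // Math.log is the natural log
    }

    static double chained(double alpha) {
        return 1 + alpha / 2;
    }

    public static void main(String[] args) {
        double[] loadFactors = {0.5, 0.6, 0.7, 0.8, 0.9}; // illustrative values
        System.out.println("alpha  linear  double  chained");
        for (double alpha : loadFactors) {
            System.out.printf("%5.1f  %6.2f  %6.2f  %7.2f%n",
                    alpha, linearProbing(alpha), doubleHashing(alpha), chained(alpha));
        }
    }
}
```

The linear-probing figure grows fastest as the table fills: at a load factor of 0.9 it is 5.5, against roughly 2.56 for double hashing and 1.45 for chained hashing.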
