Complex Hashing & Chaining
std::hash Functors • C++11 STL includes hash functors • Instantiate one and use it as a function • One-line version: construct a temporary functor and call it immediately
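The two usage styles named on this slide can be sketched as follows. The function names `hashWithNamedFunctor` and `hashInOneLine` are hypothetical wrappers added for illustration; only `std::hash` itself comes from the standard library.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Instantiate the functor, then call the object like a function.
std::size_t hashWithNamedFunctor(const std::string& s) {
    std::hash<std::string> hasher;
    return hasher(s);
}

// One-line version: build a temporary functor and call it immediately.
std::size_t hashInOneLine(const std::string& s) {
    return std::hash<std::string>()(s);
}
```

Both forms return the same value for the same input within one run of a program; the actual numbers are implementation-defined.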
Other Types • How do we hash: • Point? • Employee? • BitmapImage?
Other Types • Cover as many bits as possible • Combine all values that vary • "John Smith" K100203 vs "John Smith" K923424 • Try to make the lowest bits most random • 2013/05/28: prefer (year << 20) + (month << 10) + day over (day << 20) + (month << 10) + year, because the day varies most and belongs in the low bits • Parenthesize the shifts: in C++, + binds tighter than <<
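A minimal sketch of the date example, assuming a hypothetical `Date` struct with unsigned fields. It puts the fastest-varying field (the day) in the lowest bits, so two consecutive days land in different buckets even when the table size is small.

```cpp
#include <cstddef>

// Hypothetical date type, for illustration only.
struct Date {
    unsigned year;
    unsigned month;
    unsigned day;
};

// Day goes in the low bits because it changes most often.
// Parentheses are required: without them, + would bind before <<.
std::size_t hashDate(const Date& d) {
    return (static_cast<std::size_t>(d.year) << 20)
         + (static_cast<std::size_t>(d.month) << 10)
         + d.day;
}
```

With the fields in the opposite order, dates in the same year would all share the same low bits and pile into the same bucket after the modulo step.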
Bitwise XOR • Bitwise XOR: ^ • Combines binary values, preserves entropy • 0101 ^ 1111 = 1010 • 0101 ^ 0000 = 0101 • 0101 ^ 1011 = 1110
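The three XOR results can be checked at compile time. This is a small sketch; `xorBits` is a hypothetical helper, and the 4-bit patterns are written in hex (0101 = 0x5, 1111 = 0xF, 1011 = 0xB) because binary literals are not available in C++11.

```cpp
// Hypothetical helper so the results can be verified with static_assert.
constexpr unsigned xorBits(unsigned a, unsigned b) {
    return a ^ b;
}

static_assert(xorBits(0x5, 0xF) == 0xA, "0101 ^ 1111 = 1010");
static_assert(xorBits(0x5, 0x0) == 0x5, "0101 ^ 0000 = 0101");
static_assert(xorBits(0x5, 0xB) == 0xE, "0101 ^ 1011 = 1110");

// XOR is reversible: applying the same value twice restores the input,
// which is one way to see that no information is lost when combining.
static_assert(xorBits(xorBits(0x5, 0xB), 0xB) == 0x5, "XOR undoes itself");
```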
Other Types • Use existing hash functions • Combine the results with bitwise XOR
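One way this could look for the Employee case, assuming a hypothetical `Employee` struct with a name and an ID string (the fields are not specified in the slides). Each field is hashed with the existing `std::hash<std::string>` and the results are combined with XOR.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical record type, for illustration only.
struct Employee {
    std::string name;
    std::string id;
};

// Hash every field that varies, then combine with bitwise XOR.
std::size_t hashEmployee(const Employee& e) {
    std::hash<std::string> hashString;
    return hashString(e.name) ^ hashString(e.id);
}
```

Because the ID takes part in the hash, "John Smith" K100203 and "John Smith" K923424 are very unlikely to collide, even though the names match.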
Other Types • Use bit shifts to spread out values if needed
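A sketch of why the shift can matter, using a hypothetical `Point` with two `int` fields. Plain XOR is symmetric, so (1, 2) and (2, 1) always collide; shifting one of the hashes before combining breaks that symmetry. The shift amount of 1 is an arbitrary choice for illustration.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical 2D point, for illustration only.
struct Point {
    int x;
    int y;
};

// XOR alone: swapping x and y gives the same hash, and x == y gives 0.
std::size_t hashPointXorOnly(const Point& p) {
    std::hash<int> hashInt;
    return hashInt(p.x) ^ hashInt(p.y);
}

// Shift one hash first so the two fields occupy different bit positions.
std::size_t hashPointShifted(const Point& p) {
    std::hash<int> hashInt;
    return hashInt(p.x) ^ (hashInt(p.y) << 1);
}
```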
Hashing Danger • Person p1: "John Smith" • Say the hash code for John Smith is 17… • p1.firstName = "Bob" • hash(p1) just changed: a lookup won't find p1!
Hashing Danger • NEVER modify something being used as a hashed value in a hash table!!! • Remove, modify, reinsert • Or use immutable values for hashing
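The remove, modify, reinsert pattern can be sketched with `std::unordered_set`. The `Person` struct, `PersonHash` functor, and `renameFirst` helper are hypothetical names introduced here. Note that `std::unordered_set` already exposes its elements as const, which is the container's own guard against the problem described above.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical person type, for illustration only.
struct Person {
    std::string firstName;
    std::string lastName;
};

bool operator==(const Person& a, const Person& b) {
    return a.firstName == b.firstName && a.lastName == b.lastName;
}

// Hash both names, combined with XOR and a shift.
struct PersonHash {
    std::size_t operator()(const Person& p) const {
        std::hash<std::string> hashString;
        return hashString(p.firstName) ^ (hashString(p.lastName) << 1);
    }
};

using PersonSet = std::unordered_set<Person, PersonHash>;

// Remove, modify, reinsert: the stored element is never changed in place.
bool renameFirst(PersonSet& people, const Person& current,
                 const std::string& newFirstName) {
    auto it = people.find(current);
    if (it == people.end()) return false;  // not in the table
    Person updated = *it;                  // copy out
    people.erase(it);                      // remove under the old hash
    updated.firstName = newFirstName;      // modify the copy
    people.insert(updated);                // reinsert under the new hash
    return true;
}
```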
Probing Review • Probing Issues: • Clusters • Extra work proportional to 1/(1 - λ), where λ is the load factor
Chaining • Chaining (Closed Addressing): Each bucket can hold multiple values • Implementation • Linked List • Holds a few/zero items efficiently • Time efficiency not a big concern
IntHashSet • Storage = array of std::list
IntHashSet • Contains: • Find right linked list • Search it
IntHashSet • Insert: • If not there • Find right list and add value
IntHashSet • Remove: • Find right list • Look for item in list • If found, remove
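The Contains, Insert, and Remove steps above can be sketched as a small class. This is not the original course implementation: the member names, the default bucket count of 16, and the use of `std::vector` for the bucket array are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <list>
#include <vector>

class IntHashSet {
public:
    // Storage = array of std::list; at least one bucket is always allocated.
    explicit IntHashSet(std::size_t bucketCount = 16)
        : buckets_(bucketCount == 0 ? 1 : bucketCount), size_(0) {}

    // Contains: find the right linked list, then search it.
    bool contains(int value) const {
        const std::list<int>& chain = buckets_[indexFor(value)];
        return std::find(chain.begin(), chain.end(), value) != chain.end();
    }

    // Insert: if not already there, find the right list and add the value.
    bool insert(int value) {
        if (contains(value)) return false;
        buckets_[indexFor(value)].push_back(value);
        ++size_;
        return true;
    }

    // Remove: find the right list, look for the item, remove it if found.
    bool remove(int value) {
        std::list<int>& chain = buckets_[indexFor(value)];
        std::list<int>::iterator it =
            std::find(chain.begin(), chain.end(), value);
        if (it == chain.end()) return false;
        chain.erase(it);
        --size_;
        return true;
    }

    std::size_t size() const { return size_; }

private:
    // Map a value to its bucket index.
    std::size_t indexFor(int value) const {
        return std::hash<int>()(value) % buckets_.size();
    }

    std::vector<std::list<int>> buckets_;
    std::size_t size_;
};
```

This version never resizes, so its load factor grows with every insert.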
Efficiency • Avg time proportional to load factor λ • O(λ) = O(n/k) for n items in k buckets • If k is constant, technically O(n) • Massive constant divisor • If k grows proportionally with n, O(1)
Real World • Hash table grows when load factor gets too large • Cost of all ops O(1) • Insert is amortized O(1) • Cache use is often the determining factor
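A rough sketch of the growth step for a chained table of ints. The `Buckets` alias, the `growIfNeeded` name, the doubling policy, and the default maximum load factor of 1.0 are all assumptions chosen for illustration, not details from the slides.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <vector>

using Buckets = std::vector<std::list<int>>;

// If items / buckets exceeds maxLoad, double the bucket count and rehash.
// Every item must be re-placed because its index depends on the table size.
void growIfNeeded(Buckets& buckets, std::size_t itemCount,
                  double maxLoad = 1.0) {
    if (buckets.empty()) {
        buckets.resize(1);
        return;
    }
    double load = static_cast<double>(itemCount) / buckets.size();
    if (load <= maxLoad) return;

    Buckets bigger(buckets.size() * 2);
    for (const std::list<int>& chain : buckets) {
        for (int value : chain) {
            bigger[std::hash<int>()(value) % bigger.size()].push_back(value);
        }
    }
    buckets.swap(bigger);
}
```

A single rehash costs O(n), but doubling makes rehashes rare enough that the cost averages out to O(1) per insert, which is the amortized bound mentioned above.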
But • No natural ordering
Ordered - O(1) • Space vs Time trade offs • Hybrid/Duplicative representations
HashMap • Map • Key/Value pairs: John Smith → 521-1234 • HashMap • Identity determined by key • Only hash the key • Value stored with key in table
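The key/value idea can be sketched with the standard `std::unordered_map`. The `PhoneBook` alias and the `setNumber` and `lookup` helpers are hypothetical names added for illustration.

```cpp
#include <string>
#include <unordered_map>

// Key = name, value = phone number. Only the key is hashed.
using PhoneBook = std::unordered_map<std::string, std::string>;

// Identity is determined by the key, so setting a number for an
// existing name replaces the stored value instead of adding an entry.
void setNumber(PhoneBook& book, const std::string& name,
               const std::string& number) {
    book[name] = number;
}

// Returns the stored number, or an empty string if the name is absent.
std::string lookup(const PhoneBook& book, const std::string& name) {
    PhoneBook::const_iterator it = book.find(name);
    return it == book.end() ? std::string() : it->second;
}
```

Storing "John Smith" with "521-1234" and then looking up "John Smith" hashes only the name; the number rides along with it in the table.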