
Look-up problem

Look-up problem: given an IP address, did we see the IP address before? Hashing + chaining: use the IP address as an index; a hash function maps the IP address to an index, and each index holds a linked list. How to choose a hash function? It depends on the distribution of the data ($x \in [0,1]$).

cecile

Presentation Transcript


  1. Look-up problem. Given an IP address: did we see this IP address before?

  2. Hashing + chaining. Use the IP address as an index: a hash function maps the IP address to an index in a table, and each table entry holds a linked list of the addresses that hash to it.
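
The look-up with chaining can be sketched as follows. This is a minimal illustration, not the presenter's code: the class name `SeenIPs`, the table size 257, and the use of Python lists in place of linked lists are all assumptions made for the example.

```python
import ipaddress

class SeenIPs:
    """Hash table with chaining: each bucket holds the IPs that hash to it."""

    def __init__(self, n=257):
        self.n = n                              # number of table entries (assumed)
        self.buckets = [[] for _ in range(n)]   # Python lists stand in for linked lists

    def _index(self, ip):
        # Interpret the dotted address as a 32-bit integer, then reduce mod n.
        return int(ipaddress.IPv4Address(ip)) % self.n

    def seen_before(self, ip):
        """Return True if ip was recorded earlier; record it either way."""
        chain = self.buckets[self._index(ip)]
        if ip in chain:
            return True
        chain.append(ip)
        return False

table = SeenIPs()
print(table.seen_before("192.168.0.1"))  # False
print(table.seen_before("192.168.0.1"))  # True
```

A look-up only scans the chain of one bucket, so its cost depends on how evenly the hash function spreads the addresses.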

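  3. How to choose a hash function? It depends on the distribution of the data. For $x \in [0,1]$: $x \mapsto \lfloor x \cdot n \rfloor$. Otherwise, interpret $x$ as a number: $x \mapsto x \bmod n$. For IP addresses: $n = 256$ is bad (only the last byte is used); $n = 257$ good?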
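
A small experiment shows why the choice of $n$ matters for IP addresses. The sample of addresses below is hypothetical (hosts ending in `.1` on 100 different subnets), chosen only to illustrate a skewed distribution; it is a sketch, not a claim that $n = 257$ is good for every data set.

```python
import ipaddress

def bucket(ip, n):
    # Interpret the address as a 32-bit number and reduce mod n.
    return int(ipaddress.IPv4Address(ip)) % n

# Hypothetical sample: 100 addresses that differ only in the third byte.
ips = [f"10.0.{subnet}.1" for subnet in range(100)]

used_256 = {bucket(ip, 256) for ip in ips}
used_257 = {bucket(ip, 257) for ip in ips}
print(len(used_256))  # 1: mod 256 keeps only the last byte
print(len(used_257))  # 100: every byte influences the result
```

Since $256 \equiv -1 \pmod{257}$, reducing mod 257 mixes all four bytes, whereas reducing mod 256 discards the first three.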

  4. Universal hash functions. Choose a hash function "randomly". $n$ = number of entries in the hash table; $U$ = the universe; $h: U \to \{0,\dots,n-1\}$ is a hash function.

  5. Universal hash functions. Choose a hash function "randomly". $n$ = number of entries in the hash table; $U$ = the universe; $h: U \to \{0,\dots,n-1\}$ is a hash function. A set of hash functions $H$ is universal if for all $x, y \in U$ with $x \neq y$ and a random $h \in H$: $P(h(x) = h(y)) \le 1/n$.

  6. Universal hash functions. A set of hash functions $H$ is universal if for all $x, y \in U$ with $x \neq y$ and a random $h \in H$: $P(h(x) = h(y)) \le 1/n$. For IP addresses: choose $a_1, a_2, a_3, a_4 \in \{0,1,\dots,256\}$ and map $(x_1,x_2,x_3,x_4) \mapsto (a_1x_1 + a_2x_2 + a_3x_3 + a_4x_4) \bmod 257$.
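
The family for IP addresses can be tried out directly. The sketch below draws the coefficients at random and estimates the collision probability for one fixed pair of addresses; the helper name `random_ip_hash`, the two sample addresses, and the trial count are assumptions made for the illustration.

```python
import random

P = 257  # prime modulus from the slide; also the table size n here

def random_ip_hash(rng=random):
    """Draw h from the family by picking a1..a4 uniformly from {0,...,256}."""
    a = [rng.randrange(P) for _ in range(4)]

    def h(ip):
        x = [int(part) for part in ip.split(".")]   # the four bytes x1..x4
        return sum(ai * xi for ai, xi in zip(a, x)) % P

    return h

# Empirically estimate P(h(x) = h(y)) for one fixed pair x != y.
rng = random.Random(0)
x, y = "192.168.0.1", "10.0.0.7"
trials = 20000
collisions = 0
for _ in range(trials):
    h = random_ip_hash(rng)
    collisions += h(x) == h(y)
print(collisions / trials, "vs bound", 1 / P)
```

The randomness is over the choice of $h$, not over the data: the bound holds for any fixed pair of distinct addresses.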

  7. Perfect hashing. Goal: worst-case $O(1)$ search, space used $O(m)$, for a static set of $m$ elements.

  8. Perfect hashing. Goal: worst-case $O(1)$ search, space used $O(m)$, for a static set of $m$ elements. With $n = m^2$, i.e., space used $\Theta(m^2)$, and $H$ = family of universal hash functions: there exists a hash function $h \in H$ with no collision.

  9. Perfect hashing. Goal: worst-case $O(1)$ search, space used $O(m)$. Take $n = m$ and $H$ = family of universal hash functions. Let $x_1,\dots,x_n$ be the number of elements that map to $1,2,\dots,n$. There exists $h \in H$ such that $\sum_i x_i^2 = O(m)$.

  10. h H such that xi2 = O(m) Perfect hashing Goal: worst-case O(1) search space used O(m) n = m H = family of universal hash functions x1,...,xn the number of elements that map to 1,2,...,n secondary hash table of size xi2

  11. Bloom filter. Goal: store an $m$-element subset of IP addresses. [Figure: an IP address is fed to several hash functions, each selecting one position in $n$ bits of storage; all bits are initially 0.]

  12. Bloom filter - insert

      INSERT(x)
        for i from 1 to k do
          A(h_i(x)) ← 1

      [Figure: the IP address is fed to the hash functions; each selected position in the n bits of storage is set to 1.]

  13. Bloom filter – member

      MEMBER(x)
        for i from 1 to k do
          if A(h_i(x)) = 0 then return FALSE
        return TRUE

      [Figure: the IP address is fed to the hash functions; all selected positions in the n bits of storage contain 1.]

  14. Bloom filter – member

      MEMBER(x)
        for i from 1 to k do
          if A(h_i(x)) = 0 then return FALSE
        return TRUE

      MEMBER sometimes gives a false positive answer. Error parameter: false positive probability.
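
INSERT and MEMBER translate almost line for line into code. In this sketch the hash functions $h_i$ are an assumption: each is SHA-256 of the item salted with the index $i$, reduced mod $n$. The bit array uses one byte per bit for readability rather than compactness.

```python
import hashlib

class BloomFilter:
    def __init__(self, n, k):
        self.n = n              # number of bits of storage
        self.k = k              # number of hash functions
        self.A = bytearray(n)   # all bits start at 0 (one byte per bit, for clarity)

    def _h(self, i, x):
        # Hypothetical h_i: SHA-256 of the item salted with i, reduced mod n.
        digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.n

    def insert(self, x):
        for i in range(1, self.k + 1):
            self.A[self._h(i, x)] = 1

    def member(self, x):
        for i in range(1, self.k + 1):
            if self.A[self._h(i, x)] == 0:
                return False    # definitely never inserted
        return True             # inserted, or a false positive

bf = BloomFilter(n=1000, k=7)
bf.insert("192.168.0.1")
print(bf.member("192.168.0.1"))  # True
```

A FALSE answer is always correct; only TRUE answers can be wrong, which is why the error is measured as a false positive probability.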

  15. Bloom filter – analysis. Error parameter: false positive probability. $m$ = number of items to be stored; $n$ = number of bits of storage; $k$ = number of hash functions.

  16. Bloom filter – analysis. Error parameter: false positive probability. $m$ = number of items to be stored; $n$ = number of bits of storage; $k$ = number of hash functions. $p$ = fraction of the bits still 0 (not filled); $p \approx e^{-km/n}$.

  17. Bloom filter – analysis. Error parameter: false positive probability. $m$ = number of items to be stored; $n$ = number of bits of storage; $k$ = number of hash functions. $p$ = fraction of the bits still 0 (not filled); $p \approx e^{-km/n}$. False positive probability $\approx (1-p)^k$.

  18. Bloom filter – analysis. Error parameter: false positive probability. $m$ = number of items to be stored; $n$ = number of bits of storage; $k$ = number of hash functions. Optimal $k \approx 0.7\, n/m$; false positive rate $\approx 0.6185^{n/m}$.
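
The analysis can be checked numerically. With $m$ items and $n$ bits, a given bit is still 0 with probability about $e^{-km/n}$, a false positive needs all $k$ probed bits to be 1, and the estimate is minimized near $k = \ln 2 \cdot n/m$, where it equals roughly $0.6185^{n/m}$. The sizes used below are hypothetical.

```python
import math

def bloom_stats(m, n, k):
    """m items, n bits, k hash functions -> (fraction of zero bits, false-positive estimate)."""
    p = math.exp(-k * m / n)    # probability that a given bit is still 0
    return p, (1 - p) ** k      # all k probed bits must be 1

def optimal_k(m, n):
    # ln 2 times the number of bits per item, i.e. about 0.7 * n / m
    return math.log(2) * n / m

m, n = 1000, 10000              # hypothetical sizes: 10 bits per stored item
k = round(optimal_k(m, n))
p, fp = bloom_stats(m, n, k)
print(k, round(p, 3), round(fp, 4))
print(round(0.6185 ** (n / m), 4))
```

At the optimum about half of the bits are set, so each extra bit per item multiplies the false positive rate by roughly 0.6185.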
