Bloom Filters • Burton Bloom (1970) • Favorite Data Structure • Underutilized • Many Applications
Successful vs. unsuccessful search • Quick fail method • Checking file without accessing it
Basic idea • Lo……ng Bit String 00010101010001010111010101010010000 • n hash functions
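The basic idea can be sketched in a few lines of Python. This is a minimal illustration, not the construction from the slides: the class name `BloomFilter`, the method names, and the use of salted SHA-256 to obtain the n hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: an m-bit string plus n hash functions."""

    def __init__(self, m, n):
        self.m = m              # number of bits in the filter
        self.n = n              # number of hash functions
        self.bits = [0] * m     # the long bit string, initially all zeros

    def _positions(self, key):
        # Assumption: the n hash values come from SHA-256 salted with the
        # index of the hash function; the slides do not specify the hashes.
        for i in range(self.n):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        # Set every bit selected by the n hash functions.
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # Any unset bit proves the key was never added (the quick fail).
        # All bits set means "probably present"; false drops are possible.
        return all(self.bits[pos] for pos in self._positions(key))
```

With `bf = BloomFilter(24, 3)`, calling `bf.add("apple")` sets at most three bits, after which `bf.might_contain("apple")` is always true. A false answer from `might_contain` is definitive, which is what allows a file to be checked without accessing it.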
24-bit Example • H1(x) = • H2(x) = • H3(x) = • Bit string: 000000000000000000000000 (bit positions 0, 8, 16 and 23 marked)
What are the effects of size of filter and number of hash functions?
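m: number of bits in filter • k: number of records in file • α: fraction of the total population of records that is in the file • For a single hash of a single record, a given bit is set with probability $P_{\text{set}} = 1/m$ and left unset with probability $P_{\text{unset}} = 1 - 1/m$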
For n transformations (hash functions): $P_{n.\text{unset}} = (1 - 1/m)^{n}$ • For k records: $P_{nk.\text{unset}} = (1 - 1/m)^{nk}$ • $P_{nk.\text{set}} = 1 - P_{nk.\text{unset}} = 1 - (1 - 1/m)^{nk}$
$P_{\text{allset}} = (P_{nk.\text{set}})^{n} = [1 - (1 - 1/m)^{nk}]^{n}$ • $P_{\text{false.drop}} = (1 - \alpha) P_{\text{allset}}$
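To explore how filter size and the number of hash functions affect the false drop rate, the formulas above can be evaluated directly. The sketch below uses the slides' symbols (m bits, n hash functions, k records, α); the function name `false_drop_probability` and the sample values in the loop are assumptions chosen for illustration.

```python
def false_drop_probability(m, n, k, alpha=0.0):
    """Evaluate P_false.drop = (1 - alpha) * [1 - (1 - 1/m)^(n*k)]^n."""
    # Probability that a given bit is set after hashing k records n times each.
    p_bit_set = 1.0 - (1.0 - 1.0 / m) ** (n * k)
    # Probability that all n bits probed for a key are set.
    p_all_set = p_bit_set ** n
    # Only keys that are not in the file can produce a false drop.
    return (1.0 - alpha) * p_all_set


if __name__ == "__main__":
    # Hypothetical sizes: 1000 records stored in a 10000-bit filter.
    for n in range(1, 13):
        print(n, false_drop_probability(m=10_000, n=n, k=1_000))
```

Sweeping n for a fixed m and k shows the rate falling and then rising again; under this model the minimum lies near n ≈ (m/k) ln 2.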
Table II (Ramakrishna, 1989) • $h_{c,d}(x) = ((cx + d) \bmod p) \bmod m$, and $H_1 = \{ h_{c,d}(\cdot) \mid 0 < c < p,\ 0 \le d < p \}$ • $0 \le$ key values $\le p - 1$ • $0 \le$ hash values $\le m - 1$
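A member of the class H1 can be drawn by picking c and d at random within the stated ranges. The sketch below assumes p is a prime larger than every key, as is usual for this kind of hash class (the slide only gives the key range); the helper name `make_hash` and the sample values of p and m are illustrative.

```python
import random


def make_hash(p, m, rng=random):
    """Draw one h_{c,d} from H1: x -> ((c*x + d) mod p) mod m."""
    c = rng.randrange(1, p)   # 0 < c < p
    d = rng.randrange(0, p)   # 0 <= d < p

    def h(x):
        # Keys are expected to be integers in the range 0..p-1.
        return ((c * x + d) % p) % m

    return h


if __name__ == "__main__":
    p = 2**31 - 1             # assumed prime modulus (a Mersenne prime)
    m = 24                    # filter size in bits
    hashes = [make_hash(p, m) for _ in range(3)]
    print([h(12345) for h in hashes])
```

Drawing each of the filter's hash functions independently from the class is one simple way to obtain the n transformations.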
How could Bloom Filters be used to eliminate duplicates? • How could Bloom Filters be used with signature hashing?
Additions? • Deletions? • Counting Bloom Filters
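Additions to a plain Bloom filter only set more bits, but deletions are unsafe because one bit may be shared by several records. A counting Bloom filter replaces each bit with a small counter so that a record can be removed again. The sketch below is one possible version; the class name, the method names, and the salted SHA-256 hashing are assumptions, not details from the slides.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter variant with one counter per position instead of one bit."""

    def __init__(self, m, n):
        self.m = m                # number of counters
        self.n = n                # number of hash functions
        self.counts = [0] * m

    def _positions(self, key):
        # Assumption: salted SHA-256 stands in for the n hash functions.
        for i in range(self.n):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.counts[pos] += 1

    def remove(self, key):
        # Only keys that were actually added should be removed; removing a
        # false positive would corrupt the counters of other keys.
        if not self.might_contain(key):
            raise KeyError(key)
        for pos in self._positions(key):
            self.counts[pos] -= 1

    def might_contain(self, key):
        # A zero counter proves the key is absent.
        return all(self.counts[pos] > 0 for pos in self._positions(key))
```

The price of supporting deletions is space: each position needs several bits rather than one.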
More Applications • Spell Checking • Distributed Databases • Web Page Caching • Peer-to-peer Networks • Increase Bandwidth in Cellular Networks
See A. Broder and M. Mitzenmacher, “Network Applications of Bloom Filters: A Survey,” in Fortieth Annual Allerton Conference on Communication, Control, and Computing, 2002.