
The Variable-Increment Counting Bloom Filter

Ori Rottenstreich. Joint work with Yossi Kanizo and Isaac Keslassy, Technion, Israel.



Presentation Transcript


  1. The Variable-Increment Counting Bloom Filter • Ori Rottenstreich • Joint work with Yossi Kanizo and Isaac Keslassy • Technion, Israel

  2. Problem Definition • Support membership queries of the form: is flow y in the set S (special flows)? The answer is Yes or No. • Requirements for data structure: • Space efficient • Fast (Insertion, Query)

  3. Naïve Solutions • O(n): Searching in a list • O(log(n)): Searching in a sorted list • O(1)? • Tradeoff: We allow False Positives with low probability • Two possible errors • False Positives: the element is not in S but the answer is Yes • False Negatives: the element is in S but the answer is No

  4. Bloom Filters (Bloom, 1970) • Initialization: Array of m zero bits. • Insertion: Each of the n elements is hashed k times, the corresponding bits are set. • Query: Hashing the element, checking that all k bits are set. • False positive rate (probability) of approximately (1 - e^(-kn/m))^k. • No false negatives.
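The three operations on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the speakers' implementation: the class name `BloomFilter`, the parameter names `m` (number of bits) and `k` (number of hash functions), and the use of salted SHA-256 digests as hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: m bits, k hash functions (assumed names)."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m  # initialization: array of zero bits

    def _positions(self, item):
        # Assumption: k hash functions are simulated by salting SHA-256 with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item):
        # Insertion: set every bit the element hashes to.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item):
        # Query: "maybe present" only if all hashed bits are set.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=64, k=3)
bf.insert("flow-x")
print(bf.query("flow-x"))  # True: inserted elements are never missed
```

A query can return True for an element that was never inserted (a false positive), but it never returns False for an inserted element, because bits are only ever set.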

  5. Counting Bloom Filters (CBFs) • Bloom filters do not support deletions of elements. Simply resetting bits might cause false negatives. • The solution: Counting Bloom filters, which store an array of counters instead of bits. • Insertion: Incrementing counters by one. • Deletion: Decrementing counters by one. • Query: Checking that counters are positive. • The same false positive probability. • Require too much memory, e.g. 57 bits per element for a false positive rate of 10^-3 (about 14.4 four-bit counters per element).
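A counting Bloom filter differs from the plain filter only in what each entry stores. The sketch below is a hedged illustration of the insert, delete and query rules listed on this slide; the class name `CountingBloomFilter`, the helper `positions`, and the salted SHA-256 hashing are assumptions, and counter width (and therefore overflow) is not modelled.

```python
import hashlib


def positions(item, k, m):
    """Hypothetical helper: k entry indices for item, from salted SHA-256."""
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % m


class CountingBloomFilter:
    """Minimal CBF sketch: an array of counters instead of bits."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.counters = [0] * m

    def insert(self, item):
        for pos in positions(item, self.k, self.m):
            self.counters[pos] += 1  # increment by one

    def delete(self, item):
        # Only valid for elements that were previously inserted.
        for pos in positions(item, self.k, self.m):
            self.counters[pos] -= 1  # decrement by one

    def query(self, item):
        # Only the positiveness of the counters is checked.
        return all(self.counters[pos] > 0
                   for pos in positions(item, self.k, self.m))


cbf = CountingBloomFilter(m=64, k=3)
cbf.insert("flow-x")
cbf.insert("flow-y")
cbf.delete("flow-x")
print(cbf.query("flow-y"))  # True: deleting x does not cause a false negative for y
```

Because a shared entry holds a count of 2 rather than a single bit, removing one element leaves the other element's evidence intact.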

  6. (Counting) Bloom Filters are Widely Used • Packet Classification • Intrusion Detection • Routing • Accounting • Beyond networking: Spell Checking, DNA Classification • Can be found in • Google's web browser Chrome • Google's database system BigTable • Facebook's distributed storage system Cassandra • Mellanox's IB Switch System

  7. Outline • Introduction to Bloom Filters • The Variable-Increment Counting Bloom Filter • Intuition for Variable Increments • The Bh-CBF Scheme • The VI-CBF Scheme • Experimental Results • Summary

  8. Intuition for Variable Increments • Upon query, we should consider the exact values of the counters and not just their positiveness. • Idea: Use variable increments to encode the element identity.

  9. Architecture • Each hash entry contains a pair of counters: • c1, fixed increments → number of elements in entry (as in CBF) • c2, variable increments → weighted sum of elements • weights from a pre-determined set D • We use two sets of hash functions: • The first set uses hash functions with range {1, ..., m}, i.e. it points to the set of m entries. • The second set uses hash functions with range {1, ..., |D|}, i.e. it points to the set D.

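  10. Insertion • Insertion of an element x: at each entry that x hashes to, the two counters are updated as follows. • c1 is incremented by one • c2 is incremented by a variable increment from the set D, selected by the second set of hash functions • Example 1 (figure): inserting x and z adds the variable increments +8, +4, +13 and +4 to the c2 counters of their entries.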

  11. Query • Query for flow y (with variable increments 4 and 8 at two of its hashed entries) • We ask whether • 17 can be a sum of 2 elements from the set including 4 • 30 can be a sum of 3 elements from the set including 8 • No: y is not in S • How should we pick the set of variable increments? • We should use Bh sequences!
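The per-entry question on this slide ("can c2 be a sum of c1 elements from the set, including the queried increment?") can be checked by brute force. The sketch below is only an illustration: the function name `entry_may_contain` is hypothetical, and the increment set `D = (1, 4, 8, 13)` is an assumption chosen because it contains the increments 4, 8 and 13 visible in the slides; the set actually used in the talk is not preserved in this transcript.

```python
from itertools import combinations_with_replacement

# Assumed increment set (hypothetical, see the note above).
D = (1, 4, 8, 13)


def entry_may_contain(c1, c2, v, increments=D):
    """True if c2 can be a sum of c1 values from `increments`, one of them v."""
    if c1 == 0 or v > c2:
        return False
    # Set the queried increment aside, then try every multiset of the
    # remaining c1 - 1 increments.
    return any(sum(rest) == c2 - v
               for rest in combinations_with_replacement(increments, c1 - 1))


print(entry_may_contain(2, 17, 4))  # True: 17 = 4 + 13
print(entry_may_contain(3, 30, 8))  # False: 22 is not a sum of two values from D
```

Under this assumed set, the second entry rules the element out, so a single inconsistent entry is enough to answer No. A real implementation would avoid the brute-force search, which is exactly where the structure of the increment set matters.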

  12. Bh Sequences • Definition 1: • Let D = {v_1, v_2, ..., v_l} be a sequence of positive integers. • Then, D is a Bh sequence iff all the sums v_i1 + v_i2 + ... + v_ih with 1 ≤ i1 ≤ i2 ≤ ... ≤ ih ≤ l are distinct. • Example 2: • All the sums of h elements of D = [...] are distinct: [...] • Therefore, D is a Bh sequence. • Bh sequences are widely used in error-correcting codes.
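Definition 1 translates directly into a brute-force check: enumerate every multiset of h elements and test that no two sums coincide. The function name `is_bh_sequence` and the two example sets are illustrative choices, not taken from the slides.

```python
from itertools import combinations_with_replacement


def is_bh_sequence(values, h):
    """True if all sums of h elements (repetition allowed) are distinct."""
    sums = [sum(combo)
            for combo in combinations_with_replacement(sorted(values), h)]
    return len(sums) == len(set(sums))


print(is_bh_sequence([1, 2, 5, 7], 2))  # True: the ten pairwise sums all differ
print(is_bh_sequence([1, 2, 3], 2))     # False: 1 + 3 == 2 + 2
```

The check is exponential in h and is meant only to make the definition concrete. Its practical consequence for the filter is that a sum of h increments identifies exactly which increments were added.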

  13. The Bh-CBF Scheme: Query • Example 3: D = [...] is a Bh sequence • Since [...], the Bh-CBF can determine that x is not in S

  14. The Bh-CBF Scheme: Query • Example 3: D = [...] is a Bh sequence • Here, [...] and then necessarily [...] • Since [...], the Bh-CBF can determine that y is not in S

  15. The Bh-CBF Scheme: Query • Example 3: D = [...] is a Bh sequence • Since [...], the Bh-CBF cannot exclude that z is in S

  16. Outline • Introduction to Bloom Filters • The Variable-Increment Counting Bloom Filter • Intuition for Variable Increments • The Bh-CBF Scheme • The VI-CBF Scheme • Experimental Results • Summary

  17. The VI-CBF Scheme Principles • Two counters in each hash entry → more space is used. • Can we only keep the variable-increment counter? • In the VI-CBF (Variable-Increment Counting Bloom Filter), each hash entry only contains the variable-increment counter. • The counter is updated like the variable-increment counter in the Bh-CBF.

  18. The VI-CBF Scheme Principles • Problem: We do not know the number of elements in each hash entry. • Example 4 (with the same Bh sequence D): • 30 cannot be a sum of 3 elements from the set including 8 • However, 30 can be a sum of 5 elements from the set including 8

  19. The VI-CBF Scheme Principles • In the VI-CBF, the set of variable increments is not necessarily a Bh sequence • Example 5 (figure): x and y are inserted with variable increments such as +7, +5 and +4 • Based on either of two counter values at its hashed entries, the VI-CBF can deduce that z is not in S

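  20. A Simple Option for D: D_L = [L, 2L-1] • For an integer L ≥ 1, we define the set D_L of size L as D_L = {L, L+1, ..., 2L-1} • Intuition: a counter value of 0 is a sum of zero elements; a value in [1, L-1] is not possible; a value in [L, 2L-1] is a sum of one element; a value of 2L or more is a sum of two or more elements • Lemma 1: • Let x be an element whose i-th hash function hashes into an entry of value c, with variable increment v. If c ≠ v and c - v < L, then x is not in S.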
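The effect of choosing D_L = {L, ..., 2L-1} can be shown with a small VI-CBF sketch that keeps one counter per entry. After removing the queried increment v from a counter value c, the remainder must be either 0 (the element is alone) or at least L (at least one other element is present); any other remainder rules the element out. The names `vi_entry_may_contain` and `VICBF`, and the derivation of both the entry index and the increment from one salted SHA-256 digest, are assumptions made for this example rather than the authors' design.

```python
import hashlib


def vi_entry_may_contain(c, v, L):
    """Entry test for D_L = {L, ..., 2L-1}: remainder c - v must be 0 or >= L."""
    return c == v or c - v >= L


class VICBF:
    """Minimal VI-CBF sketch with a single variable-increment counter per entry."""

    def __init__(self, m, k, L):
        self.m, self.k, self.L = m, k, L
        self.counters = [0] * m

    def _slots(self, item):
        # Assumption: one digest supplies both the entry index and the increment.
        for i in range(self.k):
            d = hashlib.sha256(f"{i}:{item}".encode()).digest()
            pos = int.from_bytes(d[:8], "big") % self.m
            inc = self.L + int.from_bytes(d[8:16], "big") % self.L  # in [L, 2L-1]
            yield pos, inc

    def insert(self, item):
        for pos, inc in self._slots(item):
            self.counters[pos] += inc

    def delete(self, item):
        # Only valid for elements that were previously inserted.
        for pos, inc in self._slots(item):
            self.counters[pos] -= inc

    def query(self, item):
        return all(vi_entry_may_contain(self.counters[pos], inc, self.L)
                   for pos, inc in self._slots(item))


print(vi_entry_may_contain(9, 4, 4))  # True: 9 - 4 = 5 could be another element
print(vi_entry_may_contain(9, 7, 4))  # False: 9 - 7 = 2 is below L = 4
```

A plain CBF would accept both of these entries, since the counter is positive in each case; the extra rejections are where the lower false positive rate comes from.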

  21. VI-CBF Outperforms CBF • Theorem 1: • While keeping the same bit-per-element ratio, VI-CBF satisfies the following properties when compared to CBF: • (i) VI-CBF obtains a lower false positive rate than CBF. • (ii) [...] • (iii) VI-CBF obtains a lower counter overflow probability bound than the classical bound of CBF. • Cost: Limited implementation overhead.

  22. Outline • Introduction to Bloom Filters • The Variable-Increment Counting Bloom Filter • Intuition for Variable Increments • The Bh-CBF Scheme • The VI-CBF Scheme • Experimental Results • Summary

  23. Experimental Results • Internet trace (equinix-chicago) with real hash functions. • For the Bh-CBF, [...] (with [...]). • For the VI-CBF, [...] and [...].

  24. Concluding Remarks • Encoding the element identity using Variable Increments • Considering the exact values of the counters upon query • Can extend many variants of the counting Bloom filter • First time Bh sequences are presented in networking applications

  25. Thank You
