This presentation explores how networked applications can trade selected false positives for false negatives using retouched Bloom filters. It presents two techniques, randomized bit clearing and selective clearing, for removing troublesome false positives. A case study on route tracing for network measurement is included. The findings indicate that the trade-off between false positives and false negatives is, at worst, neutral, while giving applications more flexibility.
Retouched Bloom Filters: Allowing Networked Applications to Trade Off Selected False Positives Against False Negatives • Benoit Donnet • Joint work with Bruno Baynat and Timur Friedman • CoNEXT 2006, Lisbon
Context • Bloom filters were introduced in 1970 ([Bloom]) • Set membership problem • Trade-off between space and computing complexity • Lossy summary technique • Historical usage • Spell checking ([McIlroy]) • Databases ([Bratbergsengen]) • Networking usage • See ([Broder & Mitzenmacher])
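As a reminder of the set membership mechanism behind these slides, here is a minimal Python sketch of a standard Bloom filter: an m-bit vector and k hash functions. The class name `BloomFilter` and the salted SHA-256 hashing are assumptions made for illustration; the slides do not specify an implementation.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: an m-bit vector and k hash functions."""

    def __init__(self, m: int, k: int):
        self.m = m              # size of the bit vector
        self.k = k              # number of hash functions
        self.bits = [0] * m

    def _positions(self, key: str):
        # Assumption: derive k positions by salting SHA-256 with the hash index.
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big"
            ) % self.m
            for i in range(self.k)
        ]

    def add(self, key: str) -> None:
        # Inserting a key sets its k bits to 1.
        for p in self._positions(key):
            self.bits[p] = 1

    def __contains__(self, key: str) -> bool:
        # A key is reported present only if all k of its bits are set.
        # False positives are possible; false negatives are not.
        return all(self.bits[p] for p in self._positions(key))
```

A membership test such as `"10.0.0.1" in bf` returns `True` for every inserted key, and may also return `True` for a key that was never inserted when other keys happen to have set all of its bits. That second case is the false positive the rest of the talk is concerned with.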
Contributions • Removing false positives at the expense of generating false negatives • Retouched Bloom filters • Randomized bit clearing • Selective clearing • Case study
Motivations • Some false positives might be more troublesome than others • Network measurement • Peer-to-peer, overlay • Resource routing • Network packet processing
Motivations (3) • An application can tolerate a low level of false negatives • Can we trade off the most troublesome false positives for some randomly generated false negatives? • Retouched Bloom filters
Randomized bit clearing • What if we randomly reset bits in the vector? • Regardless of whether they contribute to false positives • Randomized bit clearing • Resetting s bits in the vector to 0 • Eliminates the same proportion of false positives as the proportion of false negatives generated
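The clearing step itself can be sketched in a few lines of Python. This is a minimal illustration, assuming the s positions are drawn uniformly without replacement from the whole vector, so some of them may already be 0. The function name `randomized_bit_clearing` and the list-of-ints representation are hypothetical, not taken from the talk.

```python
import random


def randomized_bit_clearing(bits, s, rng=None):
    """Return a copy of `bits` with s randomly chosen positions reset to 0.

    Assumption: positions are sampled uniformly from the whole vector,
    so a chosen position may already hold a 0.
    """
    rng = rng or random.Random()
    m = len(bits)
    retouched = list(bits)                      # leave the input untouched
    for i in rng.sample(range(m), min(s, m)):   # s distinct positions
        retouched[i] = 0
    return retouched
```

Because positions are chosen without regard to which keys map to them, a cleared bit removes every false positive that relied on it and turns every inserted key that relied on it into a false negative. This is the intuition behind the slide's claim that the two proportions are equal.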
Selective Clearing • Focus on troublesome false positives • Four algorithms • Random Selection • Minimum FN Selection • Maximum FP Selection • Ratio Selection
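The simplest of the four algorithms, Random Selection, can be sketched as follows. The sketch assumes that, for each troublesome false positive, one of its k bit positions is picked at random and reset. The helper names (`positions`, `is_positive`, `random_selection`) and the salted SHA-256 hashing are illustrative assumptions. The other three algorithms are not shown; judging by their names, they differ in how the bit to reset is chosen.

```python
import hashlib
import random


def positions(key, m, k):
    """k bit positions for `key`; salted SHA-256 is an illustrative choice."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % m
        for i in range(k)
    ]


def is_positive(bits, key, k):
    """True if the filter reports `key` as present (all k bits set)."""
    return all(bits[p] for p in positions(key, len(bits), k))


def random_selection(bits, troublesome, k, rng=None):
    """Reset one randomly chosen bit for each troublesome false positive."""
    rng = rng or random.Random()
    retouched = list(bits)
    for key in troublesome:
        # Skip keys already made negative by an earlier reset.
        if is_positive(retouched, key, k):
            retouched[rng.choice(positions(key, len(bits), k))] = 0
    return retouched
```

After the call, none of the keys in `troublesome` tests positive, at the cost of at most one cleared bit per key. Each cleared bit may turn some inserted keys into false negatives, which is the trade-off the more selective algorithms aim to limit.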
Case study • Route tracing with a red stop set (RSS) of penultimate nodes • RSS implementation • List • Bloom filter • RBF • Skitter data from Jan. 2006 • Ten monitors • 10,000 destinations
Conclusion • Retouched Bloom filters • Flexibility • The trade-off between false positives and false negatives is, at worst, neutral • Selective clearing • Case study