Talk at the Inaugural International Conference on Distributed Event-Based Systems (DEBS 2007)
The Arbitrary Boolean Publish/Subscribe Model: Making the Case
Sven Bittner and Annika Hinze, 22 June 2006

Motivation
• Assumption: Applications require Boolean semantics.
• Question: Why do pub/sub systems typically support only conjunctive subscriptions and advertisements?
• Answer: General ones can be converted to DNF.

Motivation: Conversion
• Consequences of conversion:
  + Individual subscriptions and advertisements are less complex
  – Exponential number of subscriptions and advertisements
• Is there an equivalent of conversion?
  • Queries, e.g., in DBMSs, are converted to DNF
  • Really equivalent?

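The exponential growth named above can be made concrete with a small sketch. The subscription shape used here, an AND over n two-way OR clauses, and the names `dnf_of_cnf` and `example` are illustrative assumptions and do not come from the talk.

```python
from itertools import product


def dnf_of_cnf(clauses):
    """Expand an AND of OR-clauses into DNF.

    Each resulting conjunction picks exactly one predicate from every clause,
    so the number of conjunctions is the product of the clause sizes.
    """
    return [frozenset(choice) for choice in product(*clauses)]


def example(n):
    """Hypothetical subscription: (a1 OR b1) AND (a2 OR b2) AND ... AND (an OR bn)."""
    return [(f"a{i}", f"b{i}") for i in range(1, n + 1)]


for n in (1, 2, 3, 10):
    conjunctions = dnf_of_cnf(example(n))
    print(n, "clauses ->", len(conjunctions), "conjunctive subscriptions")
```

A single Boolean subscription with 10 such clauses would have to be registered as 1024 conjunctive subscriptions after conversion, each containing 10 predicates.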
Interaction Semantics View (captures semantics from the user perspective)
• Publish/subscribe system
  • System input: subscriptions, event messages, advertisements
  • System output: notifications
• Database management system
  • System input: queries, data update/insert/delete, schema/access privileges
  • System output: query results

Data Storage View (captures data storage from the system perspective)
• Transient data: event messages and notifications (pub/sub system); queries and query results (DBMS)
  • Both transient, both in canonical form
• Stored data: subscriptions and advertisements in the subscription/advertisement base (pub/sub system); data update/insert/delete and schema/access privileges (DBMS)
  • Both stored "as is"

Motivation: Implications
1. Event messages are already in canonical form as attribute-value pairs (as queries are after conversion in a DBMS)
2. Subscriptions should be stored as is (as data is in a DBMS)
3. Conversion increases problem size
Pub/sub systems should not convert subscriptions.

Structure of Talk
• Motivation
• BoP System
  • Filtering
  • Event Routing
  • Advertisements
• Experiments
• Summary & Future Work

BoP System (Boolean Pub/Sub)
• Event messages as attribute-value pairs
• Support of general Boolean subscriptions
  • Operators "and", "or", and "not"
  • Any number of predicates per attribute
• Support of general Boolean advertisements
  • Definition as subscriptions

Filtering in BoP: Subscriptions
• Subscription tree (similar to syntax tree)
  • Inner nodes: operators
  • Leaf nodes: predicate identifiers
• Negation removal
  • Shifting down into leaf nodes
• Encoding
  • Straightforward storage of operators and predicate identifiers
[Figure: example subscription tree. The root AND combines the predicate title like "Harry Potter", an OR node, and the predicate endingWithin < 1 day. The OR node has two AND children over the predicates price < 10.0, condition = NEW, price < 15.0, and condition = USED.]

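Negation removal as described above can be sketched with De Morgan's laws: a "not" is pushed downward, swapping "and" and "or" on the way, until it ends up as a flag on a leaf predicate. The class names (`Pred`, `And`, `Or`, `Not`) and the `negated` flag are assumptions made for this illustration; the talk does not show BoP's actual encoding.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Pred:
    """Leaf node: one predicate, optionally negated (hypothetical layout)."""
    attr: str
    op: str
    value: object
    negated: bool = False


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Not:
    child: "Node"


Node = Union[Pred, And, Or, Not]


def remove_negations(node, negate=False):
    """Return an equivalent tree in which negation appears only in leaves."""
    if isinstance(node, Not):
        # Drop the NOT node and flip the pending negation.
        return remove_negations(node.child, not negate)
    if isinstance(node, Pred):
        return Pred(node.attr, node.op, node.value, node.negated != negate)
    # De Morgan: a pending negation swaps AND and OR.
    swapped = Or if isinstance(node, And) else And
    target = swapped if negate else type(node)
    return target(tuple(remove_negations(c, negate) for c in node.children))


# title like "Harry Potter" AND NOT (condition = USED OR price < 10.0)
tree = And((
    Pred("title", "like", "Harry Potter"),
    Not(Or((Pred("condition", "=", "USED"), Pred("price", "<", 10.0)))),
))
print(remove_negations(tree))
```

After the rewrite the inner nodes are only "and" and "or", so a tree can be stored as operators plus (possibly negated) predicate identifiers.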
Filtering in BoP: Algorithm
• Extension of counting approach, 3 steps [BH05]
  • Predicate matching
    • One-dimensional predicate indexes
  • Candidate subscription matching
    • Counting of fulfilled predicates
    • Comparison to minimal number of required predicates
  • Real subscription matching
    • Evaluation of tree structure
• Optimizations (cf. paper)

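The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the BoP implementation: trees are nested tuples, a per-attribute list stands in for the one-dimensional predicate indexes, each predicate identifier occurs at most once per subscription, and the minimal number of required predicates is taken as the sum over the children of an "and" node and the minimum over the children of an "or" node. The pairing of price and condition predicates in the sample subscription is also an assumption.

```python
import operator
from collections import defaultdict

OPS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}

# Trees are nested tuples: ("and", child, ...), ("or", child, ...), ("pred", predicate_id).


def min_required(tree):
    """Smallest number of fulfilled predicates that could satisfy the tree."""
    if tree[0] == "pred":
        return 1
    counts = [min_required(child) for child in tree[1:]]
    return sum(counts) if tree[0] == "and" else min(counts)


def leaves(tree):
    """Set of predicate identifiers used in the tree."""
    if tree[0] == "pred":
        return {tree[1]}
    return set().union(*(leaves(child) for child in tree[1:]))


def evaluate(tree, fulfilled):
    """Evaluate the tree given the set of fulfilled predicate identifiers."""
    if tree[0] == "pred":
        return tree[1] in fulfilled
    results = (evaluate(child, fulfilled) for child in tree[1:])
    return all(results) if tree[0] == "and" else any(results)


class Filter:
    def __init__(self, predicates, subscriptions):
        self.predicates = predicates        # predicate id -> (attribute, operator, value)
        self.subscriptions = subscriptions  # subscription id -> tree
        self.by_attr = defaultdict(list)    # stand-in for one-dimensional indexes
        for pid, (attr, _, _) in predicates.items():
            self.by_attr[attr].append(pid)
        self.pred_to_subs = defaultdict(set)
        self.required = {}
        for sid, tree in subscriptions.items():
            self.required[sid] = min_required(tree)
            for pid in leaves(tree):
                self.pred_to_subs[pid].add(sid)

    def match(self, event):
        # Step 1: predicate matching.
        fulfilled = set()
        for attr, value in event.items():
            for pid in self.by_attr.get(attr, ()):
                _, op, ref = self.predicates[pid]
                if OPS[op](value, ref):
                    fulfilled.add(pid)
        # Step 2: candidate subscription matching by counting fulfilled predicates.
        counters = defaultdict(int)
        for pid in fulfilled:
            for sid in self.pred_to_subs[pid]:
                counters[sid] += 1
        candidates = [s for s, n in counters.items() if n >= self.required[s]]
        # Step 3: real subscription matching by evaluating the tree.
        return sorted(s for s in candidates if evaluate(self.subscriptions[s], fulfilled))


predicates = {
    1: ("title", "=", "Harry Potter"),
    2: ("price", "<", 10.0), 3: ("condition", "=", "NEW"),
    4: ("price", "<", 15.0), 5: ("condition", "=", "USED"),
}
subscriptions = {
    "s1": ("and", ("pred", 1),
           ("or", ("and", ("pred", 2), ("pred", 3)), ("and", ("pred", 4), ("pred", 5)))),
}
engine = Filter(predicates, subscriptions)
print(engine.match({"title": "Harry Potter", "price": 12.0, "condition": "USED"}))
print(engine.match({"title": "Harry Potter", "price": 12.0, "condition": "NEW"}))
```

The second event fulfils three predicates and therefore passes the counting step, but fails the tree evaluation. This is why the counting result only yields candidates and the third step is still needed.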
Routing in BoP
• Usage of subscription forwarding or advertisement forwarding
• Event routing optimization: Subscription pruning

Subscription Pruning: Idea
• Broadening of subscriptions by removing branches of subscription tree [BH06a]

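A minimal sketch of why removing a branch broadens a subscription, assuming a tree without negated inner nodes and assuming that only children of "and" nodes are removed. The tuple encoding and the `prune` helper are hypothetical; the pruning operations of [BH06a] may be defined differently.

```python
from itertools import combinations


def evaluate(tree, fulfilled):
    """Evaluate a nested-tuple tree given the set of fulfilled predicate ids."""
    if tree[0] == "pred":
        return tree[1] in fulfilled
    results = (evaluate(child, fulfilled) for child in tree[1:])
    return all(results) if tree[0] == "and" else any(results)


def prune(tree, path):
    """Remove the branch reached by following the child indexes in `path`.

    Only a child of an AND node is removed, which drops a constraint and can
    therefore only broaden the subscription (assumption of this sketch).
    """
    if not path or tree[0] == "pred":
        raise ValueError("path must lead to a child of an inner node")
    index, rest = path[0], path[1:]
    children = list(tree[1:])
    if rest:
        children[index] = prune(children[index], rest)
    else:
        if tree[0] != "and" or len(children) < 2:
            raise ValueError("only a child of an AND node with a sibling can be removed")
        del children[index]
    # An inner node left with a single child collapses into that child.
    return children[0] if len(children) == 1 else (tree[0], *children)


original = ("and", ("pred", "title"),
            ("or", ("and", ("pred", "price<10"), ("pred", "new")),
                   ("and", ("pred", "price<15"), ("pred", "used"))))
pruned = prune(original, (1, 0, 0))  # drop "price<10" from the first inner AND
print(pruned)

# Every combination of fulfilled predicates that matches the original
# subscription also matches the pruned one.
ids = ["title", "price<10", "new", "price<15", "used"]
subsets = [set(c) for r in range(len(ids) + 1) for c in combinations(ids, r)]
print(all(evaluate(pruned, s) for s in subsets if evaluate(original, s)))
```

The pruned subscription may match additional events, so a broker using it for routing can forward events that the original subscription rejects; exact filtering against the original subscription is then still needed elsewhere.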
Subscription Pruning: Selection
• Ordering of pruning operations [BH06b]
  • Accuracy
    • Estimated change in selectivity of subscription
  • Efficiency
    • Intertwined with filter algorithm (minimal predicate number)
  • Size
    • Reduction in size of tree
  • Accuracy and popularity
    • Selectivity and usage of predicates
    • Goal: keep common parts of subscriptions

Advertisements in BoP
• Overlap calculation
  • Based on disjoint predicates
  • Three algorithm steps
    • Disjoint predicate matching
    • Candidate overlapping matching
    • Real overlapping matching
• Subscription routing optimization
  • Advertisement pruning [BH06c]
  • Order based on estimated increase in overlap (minimize additional subscription forwarding)

Experiments
• Setup
  • Online auction scenario
  • Distributions of messages based on analysis of eBay
  • Characteristic classes of subscriptions and advertisements
    • Classes determine tree structure
    • Random population of operands in predicates
• Results: real system analysis for five brokers (pruning independent of network)

Experiments: Local Filtering
• Both linear, with advantageous cache usage for small subscription numbers (increasing counters)
• Predicate distributions: u – uniform, z – Zipf, n – normal, rz – reversed Zipf, rn – reversed normal

Experiments: Distributed Filtering
• Analysis of
  • Boolean filtering algorithm and subscription pruning
  vs.
  • Counting filtering algorithm and subscription covering
• Covering proportion = proportion of removed non-local routing entries
  • The higher, the more covering
  • Created by varying attribute domain sizes

Experiments: Distributed Filtering
• Boolean approach more time efficient for non-extreme covering proportions (left on abscissa)

Experiments: Distributed Filtering
• Boolean approach more space efficient for non-extreme covering proportions (left on abscissa)

Summary
• Motivation
  • Data storage view: subscriptions conform to data, and should not be converted
• BoP
  • Support of general Boolean subscriptions and advertisements
  • Solutions for filtering, event routing optimization, overlap calculation, subscription routing optimization

Conclusion
• Experiments show that Boolean subscriptions are worthwhile with respect to
  • Central filtering (more time and space efficient)
  • Distributed filtering for non-extreme covering proportions (more time and space efficient)
• Similar for advertisements (cf. paper)

Future Work
• Integration of low-level filtering algorithm optimizations
• Analysis of further application areas
• Integration of routing and filtering algorithm
  • Optimization on more fine-grained basis

Thank you for your attention! Contact: Annika Hinze a.hinze@cs.waikato.ac.nz
References
[BH05] S. Bittner and A. Hinze. On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems. In Proceedings of the 4th International Workshop on Distributed Event-Based Systems (DEBS '05), pp. 451–457, Columbus, USA, 2005.
[BH06a] S. Bittner and A. Hinze. Pruning Subscriptions in Distributed Publish/Subscribe Systems. In Proceedings of the Twenty-Ninth Australasian Computer Science Conference (ACSC 2006), pp. 197–206, Hobart, Australia, 2006.
[BH06b] S. Bittner and A. Hinze. Dimension-Based Subscription Pruning for Publish/Subscribe Systems. In Proceedings of the 5th International Workshop on Distributed Event-Based Systems (DEBS '06), Lisbon, Portugal, 2006.
[BH06c] S. Bittner and A. Hinze. Optimizing Pub/Sub Systems by Advertisement Pruning. In Proceedings of the 8th International Symposium on Distributed Objects and Applications (DOA 2006), pp. 1503–1521, Montpellier, France, 2006.