Talk at the 4th International Workshop on Distributed Event-Based Systems at the Conference ICDCS 2005: On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems. Sven Bittner and Annika Hinze, 10 June 2005.
Structure
• Motivation
• Canonical Transformation
• Non-Canonical Filtering
• Experiments
• Summary and Future Work
Motivation: Current Assumptions
• Expressive filtering
  • All subscriptions might be transformed to DNFs (or are purely conjunctive)
• Efficient filtering
  • Utilisation of indexes
  • Filtering on conjunctions (DNFs) is most efficient
  • Main memory solutions are most efficient
• Scalable filtering
  • Filtering is performed on designated servers
  • Large main memories are available
Motivation: Our Point of View
• Main memory algorithms are only as scalable as the resources provided
  • Efficiency is only one quality measure
  • Matching algorithms should consider their memory usage (scalability)
• Claim
  • Filtering on arbitrary Boolean subscriptions is
    • More expressive (i.e., richer subscription language)
    • More scalable (i.e., requires less memory)
    • Only slightly less efficient (i.e., slower matching times)
Transformations: Example
[Figure: example transformation of a subscription]
Transformations: Implications
• Efficiency
  • Faster filtering algorithms applicable
  • Filtering on more subscriptions, common sub-expressions
• Scalability
  • Storage of Boolean formulae not required
  • More subscriptions to store
Which influences outweigh the others?
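The "more subscriptions to store" implication can be made concrete with a minimal sketch of a DNF conversion. The tree representation (a predicate name, or a tuple of operator and children) and the function name `to_dnf` are assumptions made for illustration, not the authors' implementation. The sample subscription has the shape used in the experiments section: an AND over three ORs of two predicates each.

```python
from itertools import product

# Assumed representation: a subscription is a predicate name (str)
# or a tuple (op, children) with op in {"AND", "OR"}.

def to_dnf(node):
    """Return the DNF as a list of conjunctions (frozensets of predicates)."""
    if isinstance(node, str):
        return [frozenset([node])]
    op, children = node
    child_dnfs = [to_dnf(c) for c in children]
    if op == "OR":
        # A disjunction collects the conjunctions of all its children.
        return [conj for dnf in child_dnfs for conj in dnf]
    # AND: pick one conjunction from every child and merge them.
    return [frozenset().union(*combo) for combo in product(*child_dnfs)]

sub = ("AND", [("OR", ["p1", "p2"]),
               ("OR", ["p3", "p4"]),
               ("OR", ["p5", "p6"])])
dnf = to_dnf(sub)
print(len(dnf))                      # 8 conjunctive subscriptions
print(sorted(len(c) for c in dnf))   # each one holds 3 predicates
```

One Boolean subscription over 6 predicates becomes 8 conjunctive subscriptions with 3 predicates each, so 24 predicate references are stored instead of 6. The sketch performs no simplification (absorption, duplicate removal), which a real transformation might add.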
Transformations: Origin - DBMS
• Utilised for query execution
  • Transform to canonical expression (e.g. DNF)
  • Simplify each element in disjunction separately
  • Create access plans and execute cheapest
• Useful, since efficient data access is crucial
• Several advantages
  • Only few queries at one time → no memory problems
  • Data storage is known in advance → data access might be optimised
Transformations: Why in ENS?
• ENSs show converse problem definition
  • Large subscription numbers (queries)
  • Events not known in advance (data)
• Subscriptions are not optimised (in current approaches)
  • Memory usage even higher
  • Computations for more subscriptions
Is a transformation useful in ENSs?
Non-Canonical Filtering: Trees
• (almost) as shown:
• Internal representation
  • Predicate identifiers in leaves (indexes for predicates)
  • Space efficient encoding (in future)
  • Actually encoded on byte level, i.e.,
    • 1 byte each: No. of children, operator
    • 2 bytes: width of children
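The byte-level layout can be sketched as follows. The slide only states the field sizes, so several details here are assumptions: the operator codes, a dedicated leaf marker, a 4-byte predicate identifier, little-endian byte order, and reading "width" as the encoded size of a child in bytes, stored directly in front of that child.

```python
import struct

# Hypothetical operator codes; the slides do not specify them.
OP_LEAF, OP_AND, OP_OR = 0, 1, 2

def encode(node):
    """Encode an int (predicate id) or a tuple (op, children) into bytes."""
    if isinstance(node, int):
        # Leaf: 0 children, LEAF operator, assumed 4-byte predicate id.
        return struct.pack("<BBI", 0, OP_LEAF, node)
    op, children = node
    out = bytearray(struct.pack("<BB", len(children), op))  # 1 byte each
    for child in children:
        body = encode(child)
        out += struct.pack("<H", len(body))  # 2-byte width of this child
        out += body
    return bytes(out)

def evaluate(buf, matched, pos=0):
    """Evaluate the tree encoded at buf[pos:] against matched predicate ids."""
    n_children, op = struct.unpack_from("<BB", buf, pos)
    if op == OP_LEAF:
        (pred_id,) = struct.unpack_from("<I", buf, pos + 2)
        return pred_id in matched
    pos += 2
    for _ in range(n_children):
        (width,) = struct.unpack_from("<H", buf, pos)
        result = evaluate(buf, matched, pos + 2)
        # Short-circuit: AND stops on the first False, OR on the first True.
        if result != (op == OP_AND):
            return result
        pos += 2 + width  # jump over the child using its stored width
    return op == OP_AND

tree = (OP_AND, [(OP_OR, [1, 2]), (OP_OR, [3, 4]), (OP_OR, [5, 6])])
buf = encode(tree)
print(len(buf))                    # 62 bytes under the assumptions above
print(evaluate(buf, {1, 4, 5}))    # True
print(evaluate(buf, {1, 2, 3}))    # False
```

Storing the width of each child lets the evaluator move from one child to the next without decoding the subtree in between, which is one plausible reason for keeping that field.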
Non-Canonical Filtering: 2 Steps
• Predicate matching
  • Determine matching predicates
• Subscription matching
  • Determine candidate subscriptions (min 1 match)
  • Evaluate their Boolean combinations
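A minimal sketch of the two steps, under stated assumptions: the predicates, subscriptions and events are invented sample data, and a linear scan over all predicates stands in for the predicate indexes, which the slides do not describe.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

# Hypothetical predicates: id -> (attribute, comparison, constant).
predicates = {
    1: ("price", "<", 100),
    2: ("price", ">", 500),
    3: ("category", "=", "book"),
    4: ("category", "=", "music"),
}

# Subscriptions as Boolean trees: a predicate id or (op, children).
subscriptions = {
    "s1": ("AND", [("OR", [1, 2]), 3]),
    "s2": ("OR", [2, 4]),
}

def leaves(tree):
    """Yield every predicate id used in a subscription tree."""
    if isinstance(tree, int):
        yield tree
    else:
        for child in tree[1]:
            yield from leaves(child)

# Association table: predicate id -> subscriptions containing it.
pred_to_subs = {}
for sid, tree in subscriptions.items():
    for pid in leaves(tree):
        pred_to_subs.setdefault(pid, set()).add(sid)

def match_predicates(event):
    """Step 1: determine matching predicates (scan instead of indexes)."""
    return {pid for pid, (attr, op, const) in predicates.items()
            if attr in event and OPS[op](event[attr], const)}

def holds(tree, matched):
    """Evaluate a Boolean tree against the set of matched predicate ids."""
    if isinstance(tree, int):
        return tree in matched
    op, children = tree
    combine = all if op == "AND" else any
    return combine(holds(c, matched) for c in children)

def match(event):
    matched = match_predicates(event)
    # Step 2a: candidates have at least one matching predicate.
    candidates = set().union(*(pred_to_subs.get(p, set()) for p in matched))
    # Step 2b: evaluate the Boolean combination for candidates only.
    return sorted(s for s in candidates if holds(subscriptions[s], matched))

print(match({"price": 50, "category": "book"}))    # ['s1']
print(match({"price": 50, "category": "music"}))   # ['s2']
```

In the second event both subscriptions are candidates, because `s1` contains the matching predicate 1, but only `s2` survives the Boolean evaluation.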
Experiments: Initial Evaluation
[Figure: example subscription tree, an AND node over three OR nodes with leaves p1 to p6]
• Comparison of Step 2 of matching approaches
  • Step 1 utilises same indexes
  • Canonical counting algorithm (count no. of predicates)
    • Original – compare for all subscriptions
    • Variant – compare for candidate subscriptions only
  • Our non-canonical approach
• Subscription characterisation
  • Number of predicates P (=6)
  • DNF consists of disjunctive elements (8)
  • Each element contains predicates (3)
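The difference between the two counting variants can be sketched as below. This is a simplified illustration under assumptions, not the implementation that was measured: conjunctive subscriptions are plain sets of predicate ids, the sample data is invented, and hit counters live in a dictionary.

```python
# Conjunctive subscriptions (e.g. the elements of a DNF) as predicate-id sets.
conjunctions = {
    "c1": {1, 3, 5},
    "c2": {1, 3, 6},
    "c3": {2, 4, 6},
}
# Number of predicates each conjunction needs to have fulfilled.
required = {cid: len(preds) for cid, preds in conjunctions.items()}

# Association table: predicate id -> conjunctions containing it.
pred_to_conj = {}
for cid, preds in conjunctions.items():
    for pid in preds:
        pred_to_conj.setdefault(pid, []).append(cid)

def count_hits(matched):
    """Count, per conjunction, how many of its predicates matched."""
    hits = {}
    for pid in matched:
        for cid in pred_to_conj.get(pid, []):
            hits[cid] = hits.get(cid, 0) + 1
    return hits

def counting_original(matched):
    hits = count_hits(matched)
    # Compare the counters of every registered subscription.
    return sorted(c for c in conjunctions if hits.get(c, 0) == required[c])

def counting_variant(matched):
    hits = count_hits(matched)
    # Compare counters only for candidates (at least one hit).
    return sorted(c for c, n in hits.items() if n == required[c])

matched = {1, 3, 5, 6}
print(counting_original(matched))   # ['c1', 'c2']
print(counting_variant(matched))    # ['c1', 'c2']
```

Both variants return the same result. The original walks over all stored subscriptions for every event, whereas the variant touches only those with at least one matching predicate.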
Experiments: Setting
[Table: experimental settings for the parameters P and M]
Experiments: Results - Scalability
[Figure: two panels, P=6; M=5,000 and P=10; M=5,000]
• Counting algorithm
  • Sharp bends mark the point where the available main memory is exhausted
  • The fewer subscriptions are created, the better the scalability
• Non-canonical approach
  • Available main memory sufficient
  • Scalability independent of transformations
Experiments: Results - Efficiency
[Figure: two panels, P=6; M=5,000 and P=10; M=5,000]
• Counting algorithm
  • Original approach shows linearly increasing matching times
  • Variant becomes more efficient for large subscription numbers
• Non-canonical approach
  • Filtering more efficient than variant of counting algorithm
  • Difference becomes more pronounced when DNFs become larger
Experiments: Results - Summary
• Transformations to DNFs drastically reduce the scalability of filter algorithms
  • Memory requirements for transformed conjunctive subscriptions outweigh the storage space for Boolean ones
• Filtering on several conjunctive subscriptions instead of arbitrary Boolean ones decreases efficiency
  • Impact of more conjunctive (simpler) subscriptions on filtering performance outweighs the higher matching costs of Boolean ones
Future Work
• Theoretical evaluation of memory requirements
  • Characterisation of subscriptions
  • Statements like “when to use which approach”
• Further practical experiments
  • Prove correctness of theoretical evaluation
  • Analyse more sophisticated settings
Thank you for your attention! Contact: Sven Bittner, Annika Hinze {s.bittner, a.hinze}@cs.waikato.ac.nz