Sven Bittner and Annika Hinze, 18 January 2006

Talk at the 29th Australasian Computer Science Conference (ACSC2006): Pruning Subscriptions in Distributed Publish/Subscribe Systems. Sven Bittner and Annika Hinze, 18 January 2006.


Presentation Transcript


  1. Talk at the 29th Australasian Computer Science Conference (ACSC2006): Pruning Subscriptions in Distributed Publish/Subscribe Systems. Sven Bittner and Annika Hinze, 18 January 2006

  2. Motivation: Publish/Subscribe
     • Subscribers register subscriptions
     • Publishers send event messages
     • System informs using notifications
     [Diagram labels: User; Subscription; pub(item, price, timeLeft, …); pub(item,...); "Notify about items of interest"; eBay; TradeMe; Distributed Pub/Sub System; Filtering]
     Annika Hinze – Pruning Subscriptions in Distributed Publish/Subscribe Systems

  3. Motivation: Subscription Example
     • A subscriber is interested in books whose title contains the phrase “Harry Potter”.
     • According to the condition of the copy of the book (new, used), she wants to pay at most NZ$10.0 or NZ$15.0.
     • To avoid unnecessary notifications, the subscriber will be notified not earlier than one day before the auction ends.
     Subscription tree: title like “Harry Potter” AND ((price < 10.0 AND condition = NEW) OR (price < 15.0 AND condition = USED)) AND endingWithin < 1 day
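
The subscription tree above can be written down as a small Boolean expression tree and evaluated against an event message. This is only a minimal sketch: the class names `Pred` and `Node`, the function `matches`, and the event field names (`title`, `price`, `condition`, `endingWithinDays`) are assumptions made for illustration and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Pred:
    """Leaf of the subscription tree: one predicate over an event (hypothetical type)."""
    label: str
    test: Callable[[dict], bool]


@dataclass(frozen=True)
class Node:
    """Inner node of the subscription tree: "AND" or "OR" over its children."""
    op: str
    children: tuple


def matches(node, event):
    """Return True if the event satisfies the (sub)tree rooted at node."""
    if isinstance(node, Pred):
        return node.test(event)
    results = (matches(child, event) for child in node.children)
    return all(results) if node.op == "AND" else any(results)


# The example subscription; the field names are assumed for illustration.
subscription = Node("AND", (
    Pred('title like "Harry Potter"', lambda e: "Harry Potter" in e["title"]),
    Node("OR", (
        Node("AND", (
            Pred("price < 10.0", lambda e: e["price"] < 10.0),
            Pred("condition = NEW", lambda e: e["condition"] == "NEW"),
        )),
        Node("AND", (
            Pred("price < 15.0", lambda e: e["price"] < 15.0),
            Pred("condition = USED", lambda e: e["condition"] == "USED"),
        )),
    )),
    Pred("endingWithin < 1 day", lambda e: e["endingWithinDays"] < 1.0),
))

# A used copy for NZ$12.50 whose auction ends in half a day.
event = {
    "title": "Harry Potter and the Goblet of Fire",
    "price": 12.5,
    "condition": "USED",
    "endingWithinDays": 0.5,
}
print(matches(subscription, event))
```

The same copy offered as NEW at NZ$12.50 would not match, because the NEW branch requires a price below 10.0.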

  4. Motivation: Problem Sizes
     • Online auctions
       - Subscriptions: > 10^6 (no. of users)
       - Events: > 20 / sec (new items and bids)
       - Notifications: not time-critical, but events must be processed continuously
     • Facility management
       - Subscriptions: > 50,000 (today’s systems)
       - Events: > 1,000 / sec (from sensors, switches)
       - Notifications: delay < 0.1 sec
     → Time and space efficiency required

  5. Structure • Motivation • Subscription Pruning • Selectivity Estimation • Evaluation of Subscription Pruning • Summary and Outlook

  6. Structure • Motivation • Subscription Pruning • Selectivity Estimation • Evaluation of Subscription Pruning • Summary and Outlook

  7. Current Optimizations
     • Work on conjunctive subscriptions only (−)
       → Restricted subscription language
       → Not applicable for general-purpose systems
     • Strong assumptions (−)
     • Similarities/relationships among subscriptions
     • Evaluations too simplistic for high-level applications
       → We cannot generalize evaluation results
     Navigation bar: Motivation | Subscription Pruning | Selectivity Estimation | Evaluation | Outlook

  8. Subscription Generalization
     • Routing optimization for arbitrary Boolean subscriptions (+)
     • Optimizes subscriptions independently of each other (+)
       → Optimization potential regardless of individual and collective subscription structure
       → Favourable routing optimization for general-purpose pub/sub systems

  9. Generalization by Pruning
     • Goals of pruning
       - Remove parts of subscription tree of non-local subscribers
       - Create more general subscription
       → Fewer predicates to filter on (+)
       → Less complex subscriptions (+)
       → More events to filter (−)
       → More time and space efficient filtering

  10. Application of Pruning (1)
     • Forwarding of subscriptions for selective routing
     [Diagram: un-optimized routing, showing a subscriber and four routing tables]

  11. Application of Pruning (2)
     • Forwarding of subscriptions for selective routing
     • Less complex subscriptions
     • More time and space efficient filtering
     [Diagram: a subscriber and four routing tables]

  12. Application of Pruning (3)
     • Forwarding of subscriptions for selective routing
     • But more general subscriptions
     • More forwarded event messages
     • More event messages to filter
     [Diagram: a subscriber and four routing tables]

  13. Example of Pruning (1)
     • Valid pruning - Remove child of conjunction
     Original: title like “Harry Potter” AND ((price < 10.0 AND condition = NEW) OR (price < 15.0 AND condition = USED)) AND endingWithin < 1 day
     Pruned (after removing unary operators): title like “Harry Potter” AND (condition = NEW OR (price < 15.0 AND condition = USED)) AND endingWithin < 1 day

  14. Example of Pruning (2)
     • Invalid pruning - Remove child of disjunction
     • No filtering of used books anymore!
     Original: title like “Harry Potter” AND ((price < 10.0 AND condition = NEW) OR (price < 15.0 AND condition = USED)) AND endingWithin < 1 day
     [Figure: the tree after removing one child of the OR node, followed by the step "Remove unary / summarize consecutive operators". Leaf labels recoverable for the pruned tree: title like “Harry Potter”; endingWithin < 1 day; condition = NEW; price < 15.0]
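
The difference between the two prunings can be checked mechanically: a valid pruning must still match every event that the original subscription matches. The sketch below is illustrative only. The tuple-based tree encoding, the helper names (`pred`, `matches`, `is_generalization`) and the event field names are assumptions, and the invalid variant simply drops the used-book branch as one possible reading of the slide. Testing on a finite sample of events illustrates the property but does not prove it.

```python
import itertools
import operator

# Comparison operators used by the predicates (names chosen for this sketch).
OPS = {"<": operator.lt, "=": operator.eq, "contains": lambda a, b: b in a}


def pred(attr, op, value):
    """Build a leaf predicate node."""
    return ("pred", attr, op, value)


def matches(node, event):
    """Evaluate a subscription tree made of ("AND"|"OR", [children]) and pred nodes."""
    if node[0] == "pred":
        _, attr, op, value = node
        return OPS[op](event[attr], value)
    results = [matches(child, event) for child in node[1]]
    return all(results) if node[0] == "AND" else any(results)


title = pred("title", "contains", "Harry Potter")
ending = pred("endingWithinDays", "<", 1.0)
new_branch = ("AND", [pred("price", "<", 10.0), pred("condition", "=", "NEW")])
used_branch = ("AND", [pred("price", "<", 15.0), pred("condition", "=", "USED")])

original = ("AND", [title, ("OR", [new_branch, used_branch]), ending])

# Valid: remove "price < 10.0" (child of a conjunction), then drop the unary AND.
valid = ("AND", [title, ("OR", [pred("condition", "=", "NEW"), used_branch]), ending])

# Invalid: remove a child of the disjunction (here, by assumption, the used-book
# branch), then merge the remaining AND into its parent.
invalid = ("AND", [title, new_branch[1][0], new_branch[1][1], ending])


def is_generalization(pruned, original, events):
    """True if every sampled event matched by original is also matched by pruned."""
    return all(matches(pruned, e) for e in events if matches(original, e))


# A small grid of sample events.
events = [
    {"title": t, "price": p, "condition": c, "endingWithinDays": d}
    for t, p, c, d in itertools.product(
        ["Harry Potter and the Goblet of Fire", "Another Book"],
        [5.0, 12.0, 20.0],
        ["NEW", "USED"],
        [0.5, 3.0],
    )
]

print(is_generalization(valid, original, events))
print(is_generalization(invalid, original, events))
```

On this sample the first pruning keeps every original match, while the second loses the used copy priced at 12.0, which is why removing a child of a disjunction is not a generalization.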

  15. Challenges of Pruning
     • Questions
       1. Which subscription should be pruned first?
       2. Which part of a subscription should be pruned?
     • Answer: the subscription (1.) supporting a pruning (2.) that minimally influences the network traffic
       → Utilize selectivities of subscriptions to determine effects of pruning on network load

  16. Structure • Motivation • Subscription Pruning • Selectivity Estimation • Evaluation of Subscription Pruning • Summary and Outlook

  17. Selectivity of Subscriptions
     • Calculation of selectivity for
       - Original subscriptions - Counting
       - Predicates - Counting/Approximation
       - Pruned subscriptions - No suitable method
       → Estimate selectivities for subscriptions

  18. Selectivity Estimation: Idea
     • Using three easily computable estimates
       - Minimal, sel_min: worst case - smallest possible selectivity value for all distributions of events
       - Average, sel_avg: average case - assuming uniform distribution of all possible event messages and independent predicates in subscriptions
       - Maximal, sel_max: best case - largest possible selectivity value for all distributions of events

  19. Selectivity Estimation: Example
     • Selectivity of predicates via counting
     • Selectivity of subscriptions via estimation
     [Figure: the example subscription tree (title like “Harry Potter” AND ((price < 10.0 AND condition = NEW) OR (price < 15.0 AND condition = USED)) AND endingWithin < 1 day) annotated with selectivities. Estimates (sel_min, sel_avg, sel_max): root AND (0.0, 7.7e-4, 0.01); OR (0.7, 0.77, 1.0); the two inner AND nodes (0.13, 0.19, 0.2) and (0.7, 0.72, 0.8). Predicate selectivities shown: 0.01, 0.1, 0.2, 0.93, 0.9, 0.8]
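
The slides do not spell out how the three estimates are combined at AND and OR nodes. The following sketch assumes the standard probability bounds for conjunctions and disjunctions (Fréchet bounds) for sel_min and sel_max, and independence of the children for sel_avg. Under that assumption the triples shown on the slide are reproduced up to rounding. The function names and the assignment of the six predicate selectivities to individual leaves are assumptions made for this illustration.

```python
from math import prod


def leaf(selectivity):
    """A predicate whose selectivity is known by counting: all three estimates coincide."""
    return (selectivity, selectivity, selectivity)


def est_and(*children):
    """Combine (sel_min, sel_avg, sel_max) triples of the children of a conjunction."""
    n = len(children)
    sel_min = max(0.0, sum(c[0] for c in children) - (n - 1))  # lower bound
    sel_avg = prod(c[1] for c in children)                     # independent children
    sel_max = min(c[2] for c in children)                      # upper bound
    return (sel_min, sel_avg, sel_max)


def est_or(*children):
    """Combine (sel_min, sel_avg, sel_max) triples of the children of a disjunction."""
    sel_min = max(c[0] for c in children)                      # lower bound
    sel_avg = 1.0 - prod(1.0 - c[1] for c in children)         # independent children
    sel_max = min(1.0, sum(c[2] for c in children))            # upper bound
    return (sel_min, sel_avg, sel_max)


# Assumed mapping of the slide's predicate selectivities to the leaves.
new_branch = est_and(leaf(0.2), leaf(0.93))    # price < 10.0 AND condition = NEW
used_branch = est_and(leaf(0.9), leaf(0.8))    # price < 15.0 AND condition = USED
either = est_or(new_branch, used_branch)
root = est_and(leaf(0.01), either, leaf(0.1))  # title, OR node, endingWithin

for name, triple in [("new", new_branch), ("used", used_branch),
                     ("or", either), ("root", root)]:
    print(name, tuple(round(v, 6) for v in triple))
```

Rounded to two significant digits, the four triples agree with (0.13, 0.19, 0.2), (0.7, 0.72, 0.8), (0.7, 0.77, 1.0) and (0.0, 7.7e-4, 0.01) from the slide.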

  20. Selectivity Degradation
     • Absolute degradation when pruning s_x to s_y
     • Describes expected influence on network load
     • Maximal difference between three components:
       max( sel_min(s_y) - sel_min(s_x), sel_avg(s_y) - sel_avg(s_x), sel_max(s_y) - sel_max(s_x) )
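
As a small sketch of how this measure could drive the choice of pruning, the code below computes the degradation for two candidate prunings and picks the one with the smallest value. The triple for the original subscription is taken from the example slide. The candidate names and their triples are hypothetical values chosen for illustration, and selecting the minimum with `min` is a simplification, not the authors' implementation.

```python
def degradation(sel_x, sel_y):
    """Absolute degradation when pruning s_x to s_y.

    Both arguments are (sel_min, sel_avg, sel_max) triples; the result is the
    largest component-wise increase in selectivity.
    """
    return max(y - x for x, y in zip(sel_x, sel_y))


# Estimate for the unpruned example subscription (from the example slide).
original = (0.0, 7.7e-4, 0.01)

# Hypothetical estimates for two candidate prunings of that subscription.
candidates = {
    "remove price < 10.0": (0.0, 9.8e-4, 0.01),
    "remove title predicate": (0.0, 7.7e-2, 0.1),
}

scores = {name: degradation(original, est) for name, est in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.6f}")
print("prune first:", best)
```

With these numbers, removing the price predicate is preferred, since removing the highly selective title predicate would raise the estimated selectivity far more.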

  21. Structure • Motivation • Subscription Pruning • Selectivity Estimation • Evaluation of Subscription Pruning • Summary and Outlook

  22. Experiments: Goal
     • Evaluate general setting
       - Real-world subscriptions
       - Real-world attribute domains
     • Initial set of experiments
       - Evaluation of memory usage and real selectivity
       - Real selectivity shows expected network load

  23. Experiments: Setup
     • E-commerce setting (online book auctions)
       - Ten attributes, e.g., author, format and price
     • Events
       - Analysis of real-world distributions
       - Average for 1,000,000 messages
     • Subscriptions
       - Three typical classes involving conjunctions and disjunctions
       - 10,000 registered subscriptions

  24. Experiments: Results (1)
     • Setting involving all three subscription classes
     [Figure: plot with labels "Expected increase in network load", "Memory usage" and "Cut-off point"]

  25. Experiments: Results (2)
     • At cut-off point (Column 4)
       - Slight increase in selectivity (Column 2)
       - Strong reduction in memory usage (Column 3)

     Subscription class   Increase in selectivity   Relief in memory   Cut-off point at percent of prunings
     Class 1              0.009                     0.667              0.750
     Class 2              0.012                     0.833              0.875
     Class 3              0.016                     0.368              0.525
     Class 1–3            0.026                     0.663              0.771

  26. Structure • Motivation • Subscription Pruning • Selectivity Estimation • Evaluation of Subscription Pruning • Summary and Outlook

  27. Summary (1)
     • Motivation
       - Publish/subscribe (pub/sub) systems
       - Routing and optimizations in pub/sub
     • Subscription pruning
       - Drawbacks of current optimizations
       - Prune/remove parts of subscription trees
       - Pruning has to result in more general subscription

  28. Summary (2)
     • Selectivity estimation
       - Three easily computable values
       - Degradation measure → predicted influence of prunings
     • Practical analysis
       - Evaluation of real-world scenario
       - Setting with all subscriptions
       - Space usage decreased by 66% of maximal reduction
       - Only slight increase in network load

  29. Future Work
     • Integrate pruning in pub/sub prototype
     • Extended experiments
       - Measure network load, throughput and memory usage
       - Other real-world scenarios

  30. Thank you for your attention! Contact: Annika Hinze a.hinze@cs.waikato.ac.nz
