This research paper discusses using incremental filter aggregation to reduce congestion in content-based routing networks. It compares the approach with the traditional covering algorithm and contrasts it with aggregation in IP networks. The study explores how selectively pruning subscription trees avoids subscription bursts while limiting false positives, and reports reduced routing table size and processing time.
MIDDLEWARE SYSTEMS RESEARCH GROUP
Congestion Avoidance with Incremental Filter Aggregation in Content-Based Routing Networks
Mingwen Chen (1), Songlin Hu (1), Vinod Muthusamy (2), Hans-Arno Jacobsen (3)
(1) Chinese Academy of Sciences, (2) IBM T.J. Watson Research Center, (3) University of Toronto
July 2, 2015, ICDCS • http://padres.msrg.org
Content-based routing
[Figure: broker network with brokers A to G, publisher P1, and subscribers S1, S2, S3]
Subscription covering
[Figure: broker network with brokers A to G, publisher P1, and subscribers S1, S2, S3]
• Fewer subscription messages
• Smaller routing tables
• Faster matching
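The covering relation behind this slide can be illustrated with a minimal sketch. It assumes, purely for illustration, that a subscription is a conjunction of closed numeric ranges over named attributes; the slides do not specify the subscription language, and the names `covers`, `matches`, and the sample attributes are hypothetical.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]          # closed range (low, high)
Subscription = Dict[str, Interval]      # attribute name -> allowed range


def matches(sub: Subscription, pub: Dict[str, float]) -> bool:
    """True if the publication satisfies every range predicate of the subscription."""
    return all(a in pub and lo <= pub[a] <= hi for a, (lo, hi) in sub.items())


def covers(s1: Subscription, s2: Subscription) -> bool:
    """True if every publication that matches s2 also matches s1."""
    for attr, (lo1, hi1) in s1.items():
        if attr not in s2:
            return False                # s2 leaves attr unconstrained, s1 does not
        lo2, hi2 = s2[attr]
        if lo2 < lo1 or hi2 > hi1:
            return False                # s2 accepts values outside s1's range
    return True


# Hypothetical subscriptions: s1 is broader, so it covers s2.
s1 = {"price": (0.0, 100.0)}
s2 = {"price": (10.0, 50.0), "volume": (0.0, 1000.0)}
print(covers(s1, s2), covers(s2, s1))
```

Under this assumption a broker that has already forwarded `s1` does not need to forward `s2`, which is where the savings in messages, table size, and matching time come from.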
Problem with unsubscription
[Figure: broker network with brokers A to G, publisher P1, and subscribers S1, S2, S3]
• Bursty traffic
• Network congestion
• High broker load
• Compare with aggregation in IP networks
  • Relatively static hierarchical addresses
Tradeoff: covering vs. filtering
[Figure: broker network with brokers A to E, publisher P1, and subscribers S1, S2, S3]
• Suppose we retain the covering subscription even after the unsubscribe
  • Avoids bursty subscription traffic
  • May lead to false positives
    • Amount depends on similarity between subscriptions
• Can we selectively retain portions of the subscription tree?
Agenda
• Incremental filter aggregation algorithm
  • Record statistics of publications
  • Compute effective similarity among subscriptions
  • Selectively prune portions of subscription tree
• Quantitative evaluation
  • Compare with traditional covering algorithm
  • Incremental filter aggregation reduces congestion
  • Also reduces routing table size and processing time
Example: benefits of selectively pruning subscription tree
[Figure: broker network with brokers A to E, publisher P1, and subscribers S1, S2, S3]
• It is possible to eliminate the subscriptions triggered by the unsubscription of S1 and still avoid false positives
• To trade off subscription bursts against false positives, we need to consider the similarity among subscriptions
Subscription similarity
• S: unsubscription
• S_c: covered (triggered) subs
• R: other (intersecting) subs
Ø = |S ∩ (S_c ∪ R)| / |S|
Selective pruning
[Figure: subscription tree with subscriber S, publishers P, numbered brokers, and similarity values Ø(B1), Ø(B2), Ø(B4)]
• Remove subscription if
  • Similarity (Ø) is low
  • Number of triggered subs is low
• How to compute similarity?
  • Computing mergers of covered subs is expensive
  • Need to compute at every hop?
Subscription similarity
• P(S): publications that match the unsubscription
• P(S_c): publications that match covered (triggered) subs
• P(R): publications that match other (intersecting) subs
Ø = |P*(S)| / |P(S)| = |P(S) ∩ (P(S_c) ∪ P(R))| / |P(S)|
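As a worked illustration of the publication-based similarity, the following minimal sketch evaluates the ratio on sets of publication identifiers: the publications matching the unsubscribed subscription, those matching its covered (triggered) subscriptions, and those matching other intersecting subscriptions. The function name `effective_similarity` and the sample identifiers are hypothetical and not taken from the paper.

```python
from fractions import Fraction
from typing import Set


def effective_similarity(p_unsub: Set[int], p_covered: Set[int], p_other: Set[int]) -> Fraction:
    """Fraction of the unsubscription's publications still wanted by some other subscription."""
    if not p_unsub:
        raise ValueError("no publications recorded for the unsubscribed subscription")
    still_wanted = p_unsub & (p_covered | p_other)
    return Fraction(len(still_wanted), len(p_unsub))


# Hypothetical publication ids observed at a broker.
p_unsub = {1, 2, 3, 4, 5, 6}    # match the unsubscribed subscription
p_covered = {1, 2, 3}           # match subscriptions it covered
p_other = {3, 4, 9}             # match other intersecting subscriptions

phi = effective_similarity(p_unsub, p_covered, p_other)
print(phi)  # 4 of the 6 publications are still wanted, i.e. 2/3
```

A value near 1 means that keeping the covering subscription would cause few false positives; a value near 0 means most forwarded publications would be unwanted.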
Recording publication statistics
[Figure: broker network with brokers A to G, publisher P1, and subscribers S1, S2, S3; brokers annotated with (D, C) values: (1, 3), (0, 2), (1, 1), (0, 1), (0, 0)]
• Publications annotated with
  • Distance from SHB to nearest broker with another matching sub
  • Count of matching subscriptions
• Distance and count init to 0
• When publication matches multiple subscriptions
  • Increment count
  • Reset distance
• Otherwise
  • Increment distance
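The annotation rules on this slide can be sketched as a small per-hop update. This is a literal reading of the bullets, not the authors' implementation: it assumes the publication visits brokers in order, that each broker knows how many subscriptions the publication matches locally, and that the count grows by one per broker with multiple matches (the slide does not say whether it grows by one or by the number of matches). The function name `annotate` is hypothetical.

```python
from typing import Iterable, Tuple


def annotate(matches_per_hop: Iterable[int]) -> Tuple[int, int]:
    """Return (distance, count) after the publication has traversed the given hops.

    matches_per_hop[i] is the number of subscriptions the publication
    matches at the i-th broker on its path (an assumed input).
    """
    distance, count = 0, 0              # both start at 0, as on the slide
    for n_matches in matches_per_hop:
        if n_matches > 1:               # publication matches multiple subscriptions
            count += 1                  # increment count (assumed step of one)
            distance = 0                # reset distance
        else:
            distance += 1               # otherwise increment distance
    return distance, count


# Hypothetical path: a second matching subscription appears only at the second broker.
print(annotate([1, 2, 1, 1]))
```

With this input the distance ends at 2 because two brokers were traversed after the last broker with multiple matches, and the count ends at 1.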
Example: publication statistics
[Figure: broker network with brokers A to G, publisher P1, and subscribers S1, S2, S3]
Interpretation of statistics
[Figure: broker network with brokers A to G, publisher P1, and subscribers S1, S2, S3]
• Portions of the sub's tree ...
  • ... >= D hops away can be kept without false positives
  • ... < D hops away will incur false positives if kept
• For a given D, a broker can calculate the false positive rate
  • D = 1: 2/6 false positives
  • D = 3: 1/6 false positives
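One way to read the interpretation above: a publication annotated with distance D is a false positive at any broker fewer than D hops from the subscriber's broker, if the subscription is kept there. The sketch below computes the resulting false positive rate at a given hop count. The per-publication distances in the slide's figure are not recoverable here, so the sample list is hypothetical, chosen only so that the rates at 1 and 3 hops are 2/6 and 1/6 under this reading. The name `false_positive_rate` is also hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def false_positive_rate(distances: Sequence[int], hops: int) -> Fraction:
    """Share of recorded publications that would be false positives
    at a broker `hops` away, if the subscription were kept there."""
    if not distances:
        raise ValueError("no publication statistics recorded")
    # A publication is unwanted at this broker when the broker is closer
    # than the publication's recorded distance D (hops < D).
    unwanted = sum(1 for d in distances if d > hops)
    return Fraction(unwanted, len(distances))


# Hypothetical recorded distances for six publications.
distances = [0, 0, 1, 1, 2, 4]
for h in (0, 1, 3, 4):
    print(h, false_positive_rate(distances, h))
```

The rate can only fall or stay level as the hop count grows, since fewer recorded distances exceed a larger hop count.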
When to perform unsubscription
[Figure: subscriptions S1, S2, S3]
• Unsubscription of covered subscription S2 does not trigger a subscription burst
  • Always perform unsubscription immediately
• Unsubscription of covering subscription S1 can trigger a subscription burst
  • Compute similarity between subscriptions to determine if the unsub of S1 should be forwarded another hop
Selective subscription tree pruning
[Figure: broker network with brokers A to G, publisher P1, and subscribers S1, S2, S3]
• Include the following along with the unsub message, for each (publisher, distance) pair
  • Publications and false positives
• Count total pubs that match sub
  • Add the publication counts in the list
• Count false positives if sub is not removed
  • If the list entry's host broker is further than D hops, all pubs are false positives
• If similarity threshold = 4/6
  • Prune up to B
• If similarity threshold = 5/6
  • Prune up to D
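The hop-by-hop decision can be sketched as follows, assuming a single publisher and a list of (distance, publication count) entries carried with the unsubscription. The sketch treats similarity at a broker as one minus the false positive rate and stops forwarding the unsubscription at the first broker where similarity reaches the threshold. The names `similarity_at` and `hops_to_prune`, the sample entries, and the mapping from hop counts to broker labels are assumptions; the slide's own topology is not reproduced.

```python
from fractions import Fraction
from typing import Dict, Optional


def similarity_at(entries: Dict[int, int], hops: int) -> Fraction:
    """1 - false positive rate at a broker `hops` away from the subscriber's broker."""
    total = sum(entries.values())           # total pubs that match the sub
    if total == 0:
        raise ValueError("no publication statistics recorded")
    # Entries whose recorded distance exceeds the current hop count are
    # counted entirely as false positives if the sub is kept here.
    false_positives = sum(c for d, c in entries.items() if d > hops)
    return 1 - Fraction(false_positives, total)


def hops_to_prune(entries: Dict[int, int], threshold: Fraction, max_hops: int) -> Optional[int]:
    """Smallest hop count at which the subscription can be retained,
    or None if the threshold is never reached within max_hops."""
    for hops in range(max_hops + 1):
        if similarity_at(entries, hops) >= threshold:
            return hops
    return None


# Hypothetical entries: distance -> number of publications with that distance.
entries = {0: 2, 1: 2, 2: 1, 4: 1}
print(hops_to_prune(entries, Fraction(4, 6), max_hops=5))
print(hops_to_prune(entries, Fraction(5, 6), max_hops=5))
```

A higher threshold pushes the pruning further along the path, which matches the slide's pattern of pruning further for 5/6 than for 4/6.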
Result of selective pruning
[Figure: subscription tree with subscriber S and publishers P]
• Certain subtrees of the original subscription tree are preserved
• Can compute similarity in a distributed manner
  • With little overhead (up to 24 MB in experiments)
• False positive rate decreases (similarity increases) downstream
  • See paper for proof
  • Justifies preserving contiguous subtrees
Evaluations
Experimental setup
• Implemented in PADRES pub/sub system
  • 49 brokers
  • 80,000 subs distributed across edges
  • 20 unsubs
• Algorithms
  • Active covering
  • Lazy covering
  • Subscription packing
  • Incremental unsubscription
• Metrics
  • Broker input queue length
  • Broker routing table size
  • Publication processing time
Why bother with covering?
[Figure: average message processing time in milliseconds]
• Covering dominates the subscription processing pipeline
• Covering is worthwhile
Publications experience severe delays (large and long-lived)
[Figure annotations: avg delay: 1000 s; avg delay: 0.4 s]
• Queue lengths follow similar trends (see paper)
Increased covering causes the excessive queue lengths and publication delays
[Figure panels: 700 covered subs; 1400 covered subs; 2100 covered subs]
Unexpected incidental benefit: incremental algorithm also reduces routing table size
• Scenario: mobile subscribers
• Workload from Siena paper [1]
  • 200k subs
  • Some subs cover ~2000 subs
[1] A. Carzaniga and A. L. Wolf, "Forwarding in a content-based network," in Proceedings of ACM SIGCOMM, 2003.
Conclusions
• Common filter aggregation techniques expose a vulnerability in content-based routing networks
  • Removal of a covering subscription can trigger a burst of subscriptions and cause congestion
• Proposed solution selectively maintains portions of a subscription tree
  • Avoid triggered congestion and control false positives
  • Compute "effective" similarity among subscriptions for a given workload
• Evaluations show congestion is virtually eliminated with the incremental aggregation algorithm
  • Also, the proposed algorithm reduces routing table size and publication processing time
• Future work
  • Incrementally prune subscription tree over time
  • Consider more dynamic environments with mobile publishers and subscribers, and changing workloads
  • More precisely control the tradeoff between subscription traffic congestion and false positive publication traffic
Q&A
Congestion Avoidance with Incremental Filter Aggregation in Content-Based Routing Networks
http://padres.msrg.toronto.edu