Publisher Placement Algorithms in Content-based Publish/Subscribe
Alex King Yeung Cheung and Hans-Arno Jacobsen
University of Toronto
June 24th, 2010, ICDCS 2010
Middleware Systems Research Group
Problem
• Publishers can join anywhere in the broker overlay, typically at the closest broker
• Impact:
  • High delivery delay
  • High system utilization: matching load, bandwidth, subscription storage
Motivation
• High system utilization leads to overloads
  • High response times
  • Reliability issues
• Critical for enterprise-grade publish/subscribe systems:
  • GooPS: Google's internal publish/subscribe middleware
  • SuperMontage: Tibco's pub/sub distribution network for Nasdaq's quote and order processing system
  • GDSN (Global Data Synchronization Network): global pub/sub network that allows retailers and suppliers to exchange supply chain data
Goal
• Adaptively move publishers to the area of matching subscribers
• Algorithms should be:
  • Dynamic
  • Transparent
  • Scalable
  • Robust
Terminology
[Figure: a chain of brokers B1 through B5 with publisher P attached at one end; relative to a reference broker, brokers toward the publisher are upstream and brokers in the direction of publication flow are downstream]
Publisher Placement Algorithms
• POP (Publisher Optimistic Placement)
  • Fully distributed design
  • Retrieves trace information per traced publication
  • Uses one metric: number of publication deliveries downstream
• GRAPE (Greedy Relocation Algorithm for Publishers of Events)
  • Computations are centralized at each publisher's broker, which makes implementing and debugging easier
  • Retrieves trace information per trace session
  • Customizable to minimize delivery delay, broker load, or a specified combination of both
  • Uses two metrics: average delivery delay and total system message rate
• Goal: move publishers to where the subscribers are, based on past publication traffic
Choice of Minimizing Delivery Delay or Load
[Figure: GRAPE lets the operator weight its objective anywhere between 100% load minimization and 100% delay minimization. Example: two broker sites host subscriptions such as [class,=,`STOCK'], [symbol,=,`GOOG'], [volume,>,1000000] and [class,=,`STOCK'], [symbol,=,`GOOG'], [volume,>,0], both matching the publication [class,`STOCK'], [symbol,`GOOG'], [volume,9900000] but at different message rates (1 msg/s vs 4 msg/s)]
GRAPE’s 3 Phases • Phase 1 • Discover location of publication deliveries by tracing live publication messages • Retrieve trace and broker performance information • Phase 2 • Pinpoint the broker that minimizes the average delivery delay or system load in a centralized manner • Phase 3 • Migrate the publisher to the broker decided in Phase 2 • Transparently with minimal routing table update and message overhead
Phase 1 – Illustration
[Figure: publications received at a broker, each carrying a message ID (e.g., B34-M213) and the trace session ID (e.g., B34-M212), with the number of matching local subscribers per publication. GRAPE's per-publisher data structure records the trace session ID, the total number of deliveries made to local subscribers, and a bit vector with one bit per traced publication, set if that publication was delivered locally]
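A minimal sketch of what such a per-publisher trace record could look like; the slides only name the fields conceptually, so the class and member names below are illustrative, not the PADRES API:

```java
import java.util.BitSet;

// Hypothetical sketch of GRAPE's per-publisher trace record kept at each broker.
class TraceRecord {
    final String traceSessionId;   // message ID of the first publication in the session
    int localDeliveries = 0;       // total deliveries made to local subscribers
    final BitSet deliveredBits;    // bit i set if the i-th traced publication matched locally
    int tracedCount = 0;           // publications traced so far in this session

    TraceRecord(String traceSessionId, int gThreshold) {
        this.traceSessionId = traceSessionId;
        this.deliveredBits = new BitSet(gThreshold);
    }

    // Record one traced publication and how many local subscribers it matched.
    void record(int matchingLocalSubscribers) {
        if (matchingLocalSubscribers > 0) {
            deliveredBits.set(tracedCount);
            localDeliveries += matchingLocalSubscribers;
        }
        tracedCount++;
    }
}
```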
Phase 1 – Trace Data and Broker Performance Retrieval
• Once Gthreshold publications are traced, the trace session ends
[Figure: trace replies propagate upstream toward the publisher's broker; leaf brokers reply first (e.g., "Reply B8", "Reply B7") and each intermediate broker appends its own data before forwarding (e.g., "Reply B8, B7, B6", then "Reply B8, B7, B6, B5")]
Contents of Trace Reply in Phase 1
• Broker ID
• Neighbor ID(s)
• Bit vector (for estimating the total system message rate)
• Total number of local deliveries (for estimating the end-to-end delivery delay)
• Input queuing delay
• Average matching delay
• Output queuing delays to neighbor(s) and binding(s)
• In terms of message overhead, GRAPE adds one reply message per trace session
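The reply payload maps naturally onto a plain value class. A sketch under the assumption that delays are measured in milliseconds; all names are illustrative:

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Hypothetical value class for the per-broker trace reply listed above.
class TraceReply {
    String brokerId;
    List<String> neighborIds;
    BitSet bitVector;                 // which traced publications matched locally
    int localDeliveries;              // total deliveries to local subscribers
    double inputQueuingDelayMs;
    double avgMatchingDelayMs;
    Map<String, Double> outputQueuingDelaysMs; // per neighbor / binding
}
```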
Phase 2 – Broker Selection
• Simulate placing the publisher at every downstream broker and estimate the average end-to-end delivery delay, using:
  • Local delivery counts
  • Processing delay at each broker (queuing and matching delays)
  • Publisher ping times to each broker
• Simulate placing the publisher at every downstream broker and estimate the total system message rate, using:
  • Bit vectors
Phase 2 – Estimating Average End-to-End Delivery Delay
[Figure: example estimate for one candidate placement, with a publisher ping time of 10 ms and per-broker input queuing, matching, and output queuing delays. Each subscriber group contributes its path's summed per-hop delays times its delivery count, e.g., the 9 subscribers at B7: (30+20+50) + (20+5+40) + (30+10+70) = 275 ms per delivery, 2,475 ms in total. Average end-to-end delivery delay: 10 + (150 + 340 + 2,475 + 1,425) ÷ 17 ≈ 268 ms]
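The per-candidate computation is small enough to sketch in a few lines of Java; the method name and millisecond units are my own, and the per-path delays are assumed to be pre-aggregated from the Phase 1 trace replies:

```java
// Hypothetical sketch: estimate average end-to-end delivery delay for one
// candidate placement. pathDelaysMs[i] is the summed per-hop processing delay
// (input queuing + matching + output queuing) from the candidate broker to the
// i-th subscriber group; deliveries[i] is that group's delivery count.
static double estimateAvgDelayMs(double pingMs, double[] pathDelaysMs, int[] deliveries) {
    double weightedSum = 0;
    int totalDeliveries = 0;
    for (int i = 0; i < pathDelaysMs.length; i++) {
        weightedSum += pathDelaysMs[i] * deliveries[i];
        totalDeliveries += deliveries[i];
    }
    // Publisher-to-broker ping time is paid once on every delivery.
    return pingMs + weightedSum / totalDeliveries;
}
```

With the figure's numbers (path delays of 150, 170, 275, and 285 ms, delivery counts of 1, 2, 9, and 5, and a 10 ms ping), this returns roughly 268 ms.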
Phase 2 – Estimating Total Broker Message Rate
• Bit vectors are necessary to capture publication deliveries to local subscribers in content-based pub/sub systems
• The message rate through a broker is calculated by using the bitwise OR operator to aggregate the bit vectors of all downstream brokers
[Figure: each broker's bit vector marks which traced publications it delivered locally; OR-ing the vectors of B6, B7, and B8 yields the set of publications that must flow through the upstream broker]
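A minimal sketch of the OR aggregation with java.util.BitSet; converting the count into a rate assumes the trace covers a known time window, which is my assumption rather than something the slide states:

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: a publication flows through a broker if that broker or
// any downstream broker delivered it, so the broker's traffic is the OR-union
// of its own bit vector with those of all downstream brokers.
static double estimateMessageRate(BitSet own, List<BitSet> downstream,
                                  double traceWindowSeconds) {
    BitSet union = (BitSet) own.clone();
    for (BitSet bv : downstream) {
        union.or(bv);
    }
    return union.cardinality() / traceWindowSeconds; // msgs per second (assumed window)
}
```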
Phase 2 – Minimizing Delivery Delay with Weight P%
• Get publisher-to-broker ping times
• Calculate the average delivery delay if the publisher were positioned at each of the downstream brokers
• Normalize, sort, and drop candidates whose normalized average delivery delay is greater than 100 - P
• Calculate the total broker message rate if the publisher were positioned at each of the remaining candidate brokers
• Select the candidate that yields the lowest total system message rate
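A compact sketch of this two-stage selection, under the assumption that delays are normalized against the worst candidate on a 0-100 scale (the slides don't spell out the normalization); the Candidate record and method names are illustrative. The load-minimizing variant later in the deck is the mirror image, with the two metrics swapped:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical candidate broker with both Phase 2 metrics already estimated.
record Candidate(String brokerId, double avgDelayMs, double totalMsgRate) {}

// Keep only candidates whose normalized delay is at most (100 - P), then pick
// the survivor with the lowest total system message rate.
static Candidate selectBroker(List<Candidate> candidates, double weightP) {
    double worst = candidates.stream().mapToDouble(Candidate::avgDelayMs).max().orElseThrow();
    List<Candidate> survivors = new ArrayList<>();
    for (Candidate c : candidates) {
        double normalized = 100.0 * c.avgDelayMs / worst;  // assumed normalization
        if (normalized <= 100.0 - weightP) {
            survivors.add(c);
        }
    }
    if (survivors.isEmpty()) {  // P so high that pure delay minimization wins
        survivors.add(candidates.stream()
                .min(Comparator.comparingDouble(Candidate::avgDelayMs)).orElseThrow());
    }
    return survivors.stream()
            .min(Comparator.comparingDouble(Candidate::totalMsgRate)).orElseThrow();
}
```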
POP’s 3 Phases • Phase 1 • Discover location of publication deliveries by probabilistically tracing live publication messages • Phase 2 • Pinpoint the broker closest to the set of matching subscribers using trace data from phase 1 in a decentralized fashion • Phase 3 • Migrate the publisher to the broker decided in Phase 2 • Transparently with minimal routing table update and message overhead
Phase 1 – Publication Tracing
• Multiple publication traces are aggregated by the exponential moving average: Si = α·Snew + (1 - α)·Si-1
[Figure: publisher profile table at the publisher's broker; downstream brokers reply with their delivery counts (e.g., "Reply 9", "Reply 5") and intermediate brokers aggregate replies (e.g., "Reply 15") as they flow upstream toward P]
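A one-method Java rendering of that update; alpha is a smoothing constant in [0, 1] whose value the slides don't specify:

```java
// Exponential moving average used to aggregate successive trace samples.
// Higher alpha weights the newest sample more heavily.
static double aggregate(double sNew, double sPrev, double alpha) {
    return alpha * sNew + (1 - alpha) * sPrev;
}
```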
Phase 2 – Broker Selection
[Figure: the selection message (AdvId: P, DestId: null, Broker List: B1, B5, B6) travels from the publisher's broker toward the concentration of matching subscribers, recording the brokers it visits; B6's aggregate downstream delivery count of 10 ends the walk there]
Experiment Setup
• Experiments on both PlanetLab and a cluster testbed
• PlanetLab:
  • 63 brokers, 1 broker per box
  • 20 publishers with publication rates of 10 – 40 msg/min
  • 80 subscribers per publisher, 1600 subscribers in total
  • Pthreshold of 50, Gthreshold of 50
• Cluster testbed:
  • 127 brokers, up to 7 brokers per box
  • 30 publishers with publication rates of 30 – 300 msg/min
  • 200 subscribers per publisher, 6000 subscribers in total
  • Pthreshold of 100, Gthreshold of 100
Experiment Setup – Workloads
• 2 workloads
• Random scenario:
  • 5% of subscribers are high-rated; sink all traffic from their publisher
  • 25% are medium-rated; sink ~50% of traffic
  • 70% are low-rated; sink ~10% of traffic
  • Subscribers are randomly placed on N brokers
• Enterprise scenario:
  • 5% are high-rated; sink all traffic from their publisher
  • 95% are low-rated; sink ~10% of traffic
  • All high-rated subscribers are clustered onto one broker, and all low-rated subscribers onto the remaining N - 1 brokers
Average Input Utilization Ratio vs. Subscriber Distribution
[Graph: load reduction by up to 68%]
Average Delivery Delay vs. Subscriber Distribution
[Graph: delivery delay reduction by up to 68%]
Average Message Overhead Ratio vs. Subscriber Distribution
[Graph]
Conclusions
• POP and GRAPE move publishers to areas of matching subscribers to:
  • Reduce load in the system to increase scalability, and/or
  • Reduce the average delivery delay of publication messages to improve performance
• POP is suitable for pub/sub systems that strive for simplicity, such as GooPS
• GRAPE is suitable for systems that strive to minimize one metric in the extreme, such as system load in sensor networks or delivery delay in SuperMontage, or that want the flexibility to trade off performance against resource usage
Related Approaches
• Filter-based publish/subscribe:
  • Re-organize the broker overlay to minimize delivery delay and system load
  • R. Baldoni et al. The Computer Journal, 2007
  • Migliavacca et al. DEBS 2007
• Multicast-based publish/subscribe:
  • Assign similar subscriptions to one or more clusters of servers
  • Suitable for static workloads
  • May produce false-positive publication deliveries
  • Architecture is fundamentally different from filter-based approaches
  • Riabov et al. ICDCS 2002 and 2003
  • Voulgaris et al. IPTPS 2006
  • Baldoni et al. DEBS 2007
Average Broker Message Rate vs. Subscriber Distribution
[Graph]
Average Output Utilization Ratio vs. Subscriber Distribution
[Graph]
Results Summary
• Under the random workload:
  • No significant performance differences between POP and GRAPE
  • Prioritization metric and weight have almost no impact on GRAPE's performance
• Increasing the number of publication samples in POP:
  • Increases the response time
  • Increases the amount of message overhead
  • Increases the average broker message rate
• GRAPE reduces the input utilization ratio by up to 68%, the average message rate by 84%, the average delivery delay by 68%, and the message overhead relative to POP by 91%
Phase 1 – Logging Publication History
• Each broker records, per publisher, the publications delivered to local subscribers
• Each trace session is identified by the message ID of the first publication of that session
• The trace session ID is carried in the header of each subsequent publication message
• Gthreshold publications are traced per trace session
POP – Intro
• Publisher Optimistic Placement
• Goal: move publishers to the area with the highest publication delivery rate or concentration of matching subscribers
POP’s Methodology Overview • 3 phase algorithm: • Phase 1: Discover location of publication deliveries by probabilistically tracing live publication messages • Ongoing, efficiently with minimal network, computational, and storage overhead • Phase 2:Pinpoint the broker closest to the set of matching subscribers using trace data from phase 1 in a decentralized fashion • Phase 3: Migrate the publisher to the broker decided in phase 2 • Transparently with minimal routing table update and message overhead
Phase 2 – Decentralized Broker Selection Algorithm
• Phase 2 starts when Pthreshold publications are traced
• Goal: pinpoint the broker that is closest to the highest concentration of matching subscribers
  • Using trace information from only a subset of brokers
• The Next Best Broker condition, as sketched below:
  • The next best neighboring broker is the one whose number of downstream subscribers is greater than the sum of all other neighbors' downstream subscribers plus the local broker's subscribers
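A direct translation of that condition, assuming each broker knows the aggregated downstream delivery count per neighbor from the Phase 1 replies; names are illustrative:

```java
import java.util.Map;

// Hypothetical sketch of the Next Best Broker condition. downstreamCounts maps
// each neighbor to the number of matching subscribers reachable through it;
// localCount is the local broker's own matching subscribers. Returns the
// neighbor to forward the selection message to, or null if the current broker
// is the best placement (no single neighbor dominates all the rest).
static String nextBestBroker(Map<String, Integer> downstreamCounts, int localCount) {
    int total = downstreamCounts.values().stream().mapToInt(Integer::intValue).sum();
    for (Map.Entry<String, Integer> e : downstreamCounts.entrySet()) {
        int others = total - e.getValue() + localCount;
        if (e.getValue() > others) {  // this neighbor's subtree outweighs everything else
            return e.getKey();
        }
    }
    return null;  // stop here: no single direction dominates
}
```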
Phase 3 – Example
[Figure: migration walkthrough from B1 to B6. Each broker on the path updates the last hop of P's advertisement to point toward B6, removes subscriptions whose last hop is the next broker on that path, and forwards all matching subscriptions onward (e.g., B1 forwards matching subscriptions to B5), until the migration completes at B6]
• Open issue: how to tell when all subscriptions are processed by B6 before P can publish again?
Phase 2 – Minimizing Load with Weight P%
• Calculate the total broker message rate if the publisher were positioned at each of the downstream brokers
• Normalize, sort, and drop candidates whose normalized total message rate is greater than 100 - P
• Get publisher-to-broker ping times for the remaining candidates
• Calculate the average delivery delay if the publisher were positioned at each of the remaining downstream brokers
• Select the candidate that yields the lowest average delivery delay