MIDDLEWARE SYSTEMS RESEARCH GROUP
Data-centric Networking Through Adaptive Content-based Routing
Hans-Arno Jacobsen, Bell University Laboratory Chair, Middleware Systems Research Group, University of Toronto
http://www.padres.msrg.utoronto.ca
University of Oslo, February 2009
Querying the Future
Amazon to Chapters to You ....
Monday, October 10th in Cyberspace
Thursday, November 15th, in Toronto
Your book “...” is available at .... $10 off
Business Process Example: Loan Application Processing
[Figure: loan workflow with the steps Credit check, Credit check 2, Check score, Check score 2, Approve, Reject, Send to officer, and Store in DB, connected by conditional ("else") branches]
Large-scale Business Processes
[Figure: a large process map with many interlinked activities, labelled among others Vendor, Sale, Manufactory, Warehouse, Finance, Marketing, Order, Payment, and FedEx Delivery]
What is the Common Denominator?
• Many applications are driven by asynchronous state transitions.
• Something happens, … an appropriate reaction is expected and required.
• Asynchronous state transitions represent events.
• A process is triggered, a request submitted, …
• Many applications require event management and processing capabilities to run effectively.
In Terms of the Examples
These applications are driven by events:
• Information matching the query is found and indexed
• Person walks by a bookstore
• Loan request is submitted online
Abstractly speaking, events are disseminated and filtered against queries.
What Event Processing Support is Required?
• De-coupling and loose coupling
• Fine-grained event filtering
• In-network event processing
• Composite event detection
• Event correlation
Many Applications are Event-based
• Workflows, business processes and job scheduling
• Supply chain and logistics
• Service oriented architectures
• RFID and sensor networks
[Figure: example events and actions for each area, e.g. Job A done, Trigger, In flight, Delivered, Fault, Order, Callback, Invoke, Transform, Loan, Light, Temperature, Razor SKU]
Agenda
• What are the right abstractions?
• My point of view
• The PADRES project
• Some details & results
What Abstractions Do Not Work?
• Databases
  • Great for managing historic data
  • But what about future data?
• Data streams
  • Great for managing structured streams of tuples
  • But what about unstructured, multi-typed, sporadic events from many sources?
• Rule-based expert systems
  • Great for inference and reasoning
  • But what about managing large numbers of fine-grained filters in distributed environments?
Take this cum grano salis
What Abstractions Enable Event Processing?
• The aforementioned points can best be addressed by
  • The content-based publish/subscribe model
  • Realized by content-based message routing
• Events are conveyed as publications.
• Event listening, filtering and correlating are based on content-based subscriptions managed by the pub/sub system.
Publish/Subscribe 101
• Not all publish/subscribe is equal
• Publish/Subscribe models and evolution
  • Channel-based: OMG CORBA Event Service, …
  • Topic-based: WS Notifications, OMG Data Distribution Service, …
  • Type-based: OMG Data Distribution Service (partially), …
  • Content-based: The PADRES ESB (see below), …
  • State-based: Subject Spaces
Content-based Publish/Subscribe
[Figure: publishers (stock markets TSX, NASDAQ, NYSE) send publications to the broker(s); subscribers register subscriptions and receive notifications]
Publications: AMGN=58, IBM=84, ORCL=12, JNJ=58, HON=24, INTC=19, MSFT=27
Subscriptions: IBM > 85, ORCL < 10, JNJ > 60
The Content-based Pub/Sub Model
• Language and data model
  • Boolean functions over predicates
  • Subscriptions are conjunctions of predicates
  • Publications are sets of attribute-value pairs
• Matching semantics
  • A subscription matches if all its predicates match
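The matching semantics above can be sketched in a few lines of Python. This is an illustrative sketch, not PADRES code: the `OPS` table and the `matches` helper are hypothetical names, and the sample predicates reuse the stock-quote style that appears elsewhere in the talk.

```python
import operator

# Hypothetical operator table; the slides only show "=", ">" and "<".
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def matches(subscription, publication):
    """Return True if every predicate of the subscription holds.

    subscription: list of (attribute, operator, value) predicates (a conjunction)
    publication:  dict of attribute-value pairs
    """
    for attr, op, value in subscription:
        if attr not in publication:
            return False  # a missing attribute cannot satisfy a predicate
        if not OPS[op](publication[attr], value):
            return False
    return True


sub = [("class", "=", "stock"), ("symbol", "=", "YHOO"), ("price", ">", 20.0)]
pub_yhoo = {"class": "stock", "symbol": "YHOO", "price": 25.0}
pub_msft = {"class": "stock", "symbol": "MSFT", "price": 55.0}

print(matches(sub, pub_yhoo))  # all three predicates hold
print(matches(sub, pub_msft))  # the symbol predicate fails
```

A real matching engine would index predicates rather than scan each subscription; the linear scan is only meant to make the conjunction semantics explicit.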
That’s Like Database Querying!!
[Figure: on the database side a query is evaluated over data tuples (about the past); on the pub/sub side a publication is evaluated against subscriptions (about the future); both sides are labelled "sets of tuples"]
• Query and subscription are very similar.
• Data tuples and publication are very similar.
• However, the two problem statements are inverse.
Content-based Message Routing
[Figure: advertisements A1, A2, subscriptions S1, S2, and publications P1, P2 routed through the broker overlay]
Predicate lists shown in the figure:
• [class,=,stock],[symbol,=,YHOO]
• [class,=,stock],[symbol,=,YHOO],[price,>,20.0]
• [class,=,stock],[price,>,40.0]
• [class,=,stock],[symbol,=,MSFT],[price,>,50.0]
Publications shown in the figure:
• [class, stock],[symbol, YHOO],[price, 25.0]
• [class, stock],[symbol, YHOO],[price, 45.0]
• [class, stock],[symbol, MSFT],[price, 55.0]
Event-Based Content Routing: Flexible, Decoupled, Declarative, Responsive
Publication Space
[Figure: two panels with axes height and weight, titled "Sub intersecting Adv" and "Pub matching Sub"]
• Sub intersecting Adv: Adv: [height > 70],[weight > 25]; Sub: [height > 75],[weight > 20]
• Pub matching Sub: Sub: [height > 75],[weight > 20]; Pub: [height, 90],[weight, 32]
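The two relations on this slide can be checked numerically. The sketch below assumes, for illustration only, that each predicate is an open interval on a numeric attribute; `intersects` and `pub_matches` are hypothetical helper names, not PADRES API.

```python
import math


def overlaps(a, b):
    """Two open intervals (lo, hi) overlap if the larger lower bound
    lies below the smaller upper bound."""
    return max(a[0], b[0]) < min(a[1], b[1])


def intersects(sub, adv):
    # Assumption: an attribute absent from the advertisement is never
    # published, so a subscription constraining it cannot be satisfied.
    return all(attr in adv and overlaps(rng, adv[attr])
               for attr, rng in sub.items())


def pub_matches(sub, pub):
    """A publication is a point; it matches if it lies inside every interval."""
    return all(attr in pub and rng[0] < pub[attr] < rng[1]
               for attr, rng in sub.items())


# Values taken from the slide; "> x" becomes the open interval (x, +inf).
adv = {"height": (70, math.inf), "weight": (25, math.inf)}
sub = {"height": (75, math.inf), "weight": (20, math.inf)}
pub = {"height": 90, "weight": 32}

print(intersects(sub, adv))   # the regions share the area height > 75, weight > 25
print(pub_matches(sub, pub))  # the point (90, 32) lies inside the subscription
```

Intersection is what lets a broker decide whether a subscription should be forwarded toward an advertiser at all; matching is applied later, per publication.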
ToPSS - The Toronto Publish/Subscribe System Family [2000 – present]
• Matching algorithms
• Language expressiveness vs. efficient matching
• Routing protocols
• Network architectures & scalability
• Higher level abstractions
• Workflow execution
• Monitoring
Family members: A-ToPSS (approximate), ToPSS (matching), CS-ToPSS (composite subs), S-ToPSS (semantic), L-ToPSS (location-based), Rb-ToPSS (rule-based), X-ToPSS (XML matching), persistent-ToPSS (subject spaces), M-ToPSS (mobile), P2P-ToPSS (peer-to-peer), LB-ToPSS (load balancing), Federated-ToPSS (federation of ToPSS brokers), Ad hoc-ToPSS (ad hoc networking), Historic-ToPSS (historic data), FT-ToPSS (fault tolerance), JS-ToPSS (job scheduling), BPEL-ToPSS (BPEL execution)
PADRES Data-centric Event Bus
Acknowledgements
• First generation of students, when I looked away
  • Peng Alex David aRno Eli Serge
• PADRES is Publish/subscribe Applied to Distributed Resource Scheduling
• PAdres is Distributed REsource Scheduling
• http://www.padres.msrg.utoronto.ca
• http://padres.msrg.utoronto.ca
PADRES Architecture
PADRES Event Bus
• Consists of pub/sub message brokers
  • Content-based publish/subscribe interface
  • Content-based message routing
  • Store-and-forward message queuing
• Comprised of a federation of brokers deployed as overlay
• Offers a slim client library for applications
• Soon available under an open (source) license model and as Apache Poloka incubation project
PADRES Event Broker
[Figure: publishers (P) and subscribers (S) attached to an overlay of brokers (B); inside a broker, publications pass from the input queue through the matching engine and routing table to per-destination output queues]
Routing table (subscription → dest): temperature > 37 → dest2; temperature > 40 → dest3
Publications: temperature = 36, temperature = 38, temperature = 42
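The broker figure's routing table (temperature > 37 to dest2, temperature > 40 to dest3) and its three sample publications can be replayed with a small sketch. The names `routing_table`, `output_queues`, and `route` are hypothetical and only stand in for the matching engine and output queues drawn in the figure.

```python
import operator
from collections import defaultdict, deque

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}

# (predicate, destination) pairs as listed in the figure.
routing_table = [
    (("temperature", ">", 37), "dest2"),
    (("temperature", ">", 40), "dest3"),
]

# One FIFO output queue per destination (store-and-forward).
output_queues = defaultdict(deque)


def route(publication):
    """Enqueue the publication once per matching destination and
    return the list of destinations it was forwarded to."""
    destinations = []
    for (attr, op, value), dest in routing_table:
        hit = attr in publication and OPS[op](publication[attr], value)
        if hit and dest not in destinations:
            output_queues[dest].append(publication)
            destinations.append(dest)
    return destinations


for temperature in (36, 38, 42):
    print(temperature, route({"temperature": temperature}))
```

With these inputs the publication at 36 is dropped, 38 goes to dest2 only, and 42 goes to both destinations.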
Event Broker Architecture
[Figure: PADRES broker pipeline for pub/sub messages (Adv, Sub, Pub): Input Queue, Pre Processor, Matcher with SRT and PRT, Post Processor, Forwarder, Queue Handlers, Output Queues]
Innovative PADRES Features
• Historic Access
• Management
• Composite Events
• Security
• Robustness
• Load Balancing
Limitations of Acyclic Overlays
• Sensitive to
  • Congestion
  • Imbalanced workloads
  • Broker failures
  • Overlay changes
[Figure: acyclic broker overlay; legend: Broker, Publisher (P), Subscriber]
General Overlay Network
• Robust
• Flexible
• Self-healing
• Adaptive
[Figure: general broker overlay with several publishers (P); legend: Publisher, Subscriber, Congested Link]
Challenges with General Overlays
• Subscriptions route in loops
• Brokers receive duplicate subscriptions
• Multiple copies of a message may be created
• Same problem for publications
[Figure: overlay of brokers 1 to 6 with advertisements Adv 1 and Adv 2, subscriptions S, and a link marked X]
Number of Redundant Messages
Content-based Routing in General Overlays
• Maintain the same interface to pub/sub clients
• Develop content-based routing protocols for
  • Advertisement
  • Subscription
  • Publication
Advertisement Routing
• Each advertisement forms a spanning advertisement tree
  • Duplicate advertisements are discarded by brokers
• Each advertisement is assigned a unique tree identifier (TID)
  • e.g., A: [class,=,stock]……[TID,=,adv_msg_id]
• SRT (Subscription Routing Table)
  • A set of [advertisement, last hop] pairs
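The way flooding plus duplicate discarding yields one spanning tree per advertisement can be sketched as follows. The six-broker `overlay` is a hypothetical topology (the slide figures are not recoverable), breadth-first order stands in for asynchronous message delivery, and `flood_advertisement` is an illustrative name rather than a PADRES function.

```python
from collections import deque

# Hypothetical cyclic overlay: broker id -> neighbouring brokers.
overlay = {
    1: [2, 3],
    2: [1, 4],
    3: [1, 4, 5],
    4: [2, 3, 6],
    5: [3, 6],
    6: [4, 5],
}

# SRT per broker: TID -> last hop the advertisement arrived from.
srt = {broker: {} for broker in overlay}


def flood_advertisement(origin, tid):
    """Flood an advertisement from the broker `origin`.

    A broker keeps only the first copy it sees (recording the last hop)
    and discards later duplicates, so the recorded last hops form a
    spanning tree rooted at `origin`.
    """
    srt[origin][tid] = "client"  # the advertiser is attached to origin
    pending = deque([origin])
    while pending:
        broker = pending.popleft()
        for neighbour in overlay[broker]:
            if tid in srt[neighbour]:
                continue  # duplicate advertisement: discard
            srt[neighbour][tid] = broker
            pending.append(neighbour)


flood_advertisement(origin=1, tid="Adv1")
for broker in sorted(srt):
    print(broker, srt[broker])
```

Every broker ends up with exactly one last hop for the TID, which is what later allows subscriptions to travel back toward the advertiser without looping.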
Subscription Routing I
• Each subscription is augmented with a TID-predicate with a variable
  • e.g., S: [class,=,stock] … [TID,=,$X]
• The variable is bound to the TID of matching advertisements
• PRT (Publication Routing Table)
  • A set of [subscription, { (TID, last hop of subscription), … } ] pairs
Subscription Routing II
S: [class,=,stock],[name,=,*],[price,>,50],[TID,=,$Z]
At Broker 1:
• Adv1: [class,=,stock],[name,=,IBM],[price,>,60],[TID,=,Adv1]
• Adv2: [class,=,stock],[name,=,HP],[price,>,50],[TID,=,Adv2]
• S matching Adv1: [class,=,stock],[name,=,*],[price,>,50],[TID,=,Adv1]
• S matching Adv2: [class,=,stock],[name,=,*],[price,>,50],[TID,=,Adv2]
[Figure: overlay of brokers 1 to 6 with Adv 1, Adv 2, subscriptions S, and a link marked X]
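The binding step at Broker 1 can be reproduced with a short sketch. The intersection test covers only the "=" and ">" operators and the "*" wildcard used on this slide, `bind_tids` is a hypothetical name, and `adv3` is an invented non-matching advertisement added purely to show a case that is filtered out.

```python
def predicates_intersect(sub_pred, adv_pred):
    """Decide whether two (operator, value) predicates on the same
    attribute can both be satisfied by some value."""
    (s_op, s_val), (a_op, a_val) = sub_pred, adv_pred
    if s_val == "*" or a_val == "*":
        return True  # wildcard is compatible with anything
    if s_op == "=" and a_op == "=":
        return s_val == a_val
    if s_op == ">" and a_op == ">":
        return True  # two lower-bounded ranges always overlap
    if s_op == "=" and a_op == ">":
        return s_val > a_val
    if s_op == ">" and a_op == "=":
        return a_val > s_val
    raise ValueError("operator not covered by this sketch")


def bind_tids(sub, advertisements):
    """Return one copy of the subscription per intersecting
    advertisement, with the TID variable bound to that TID."""
    bound = []
    for adv in advertisements:
        compatible = all(
            attr in adv and predicates_intersect(pred, adv[attr])
            for attr, pred in sub.items() if attr != "TID"
        )
        if compatible:
            bound.append({**sub, "TID": adv["TID"]})
    return bound


sub = {"class": ("=", "stock"), "name": ("=", "*"),
       "price": (">", 50), "TID": ("=", "$Z")}
adv1 = {"class": ("=", "stock"), "name": ("=", "IBM"),
        "price": (">", 60), "TID": ("=", "Adv1")}
adv2 = {"class": ("=", "stock"), "name": ("=", "HP"),
        "price": (">", 50), "TID": ("=", "Adv2")}
adv3 = {"class": ("=", "bond"), "name": ("=", "XYZ"),   # hypothetical
        "price": (">", 10), "TID": ("=", "Adv3")}

for copy in bind_tids(sub, [adv1, adv2, adv3]):
    print(copy["TID"])
```

The two bound copies correspond to the "S matching Adv1" and "S matching Adv2" entries above; each is then routed along its own advertisement tree.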
Publication Routing
• Each publication is assigned the TID of its inducing advertisement
  • e.g., P: [class, stock]……[TID, adv_msg_id]
• Publication routing protocols:
  • Fixed TID routing: a publication is routed to subscribers along its advertisement tree.
  • Dynamic publication routing: a publication may be routed to subscribers across branches of different advertisement trees.
Fixed TID Routing
Property 1: No broker receives duplicate publication messages.
[Figure: overlay of brokers 1 to 6 with Adv 1, Adv 2, a publication P, a subscriber Sub, and a link marked X]
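Property 1 can be illustrated with a toy example. The sketch assumes a hypothetical six-broker overlay in which two subscribers, S1 and S2, have subscribed along the tree of advertisement Adv1, so each broker's PRT holds next hops keyed by TID; brokers are integers and clients are strings. None of these names come from PADRES.

```python
from collections import Counter

# Hypothetical PRT: broker -> {TID: next hops toward matching subscribers}.
# The next hops for one TID follow a single advertisement tree.
prt = {
    1: {"Adv1": [2, 3]},
    2: {"Adv1": [4]},
    3: {"Adv1": [5]},
    4: {"Adv1": [6]},
    5: {"Adv1": ["S1"]},
    6: {"Adv1": ["S2"]},
}


def route_publication(pub, broker, received, delivered):
    """Forward a publication along the tree named by its fixed TID."""
    received[broker] += 1
    for hop in prt.get(broker, {}).get(pub["TID"], []):
        if isinstance(hop, int):      # next hop is another broker
            route_publication(pub, hop, received, delivered)
        else:                         # next hop is a subscriber client
            delivered.append(hop)


received, delivered = Counter(), []
pub = {"class": "stock", "TID": "Adv1"}
route_publication(pub, 1, received, delivered)

print(dict(received))     # every broker on the tree sees the publication once
print(sorted(delivered))  # both subscribers are notified
```

Because the next hops for a single TID form a tree, there is only one path to each broker, which is the intuition behind the no-duplicates property.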
Dynamic Publication Routing
• Publication’s TID can be changed in transit.
• “Best” path algorithms
• Property 2: Changing a publication P’s TID while in transit will not change the set of subscribers notified of P.
[Figure: overlay of brokers 1 to 6 with Adv 1, Adv 2, a publication P, a subscriber Sub, and a link marked X]
Faster Matching with TIDs
• Subscriptions are augmented with TIDs only once at the first broker.
• Other brokers can route the subscription based on the TID alone.
• Similar argument applies to publication routing.
Advantages
• Simple and powerful concept
• Retain the publish/subscribe client interface
• Speed up subscription and publication propagation
• Generate duplicated messages only at advertisement level
• Build multiple subscription routing paths for publications
• Route publications dynamically
Composite Subscription
• Composite subscriptions (CS) are used for event correlation, in-network filtering, and the detection of composite events (complex events).
• A composite event is the constellation of events being detected by the composite subscription.
• Applications: business process management, business activity monitoring
CS={ {S1 OR S2} AND {S3 OR S4} AND S5 }
The S are atomic subscriptions, i.e., they are satisfied by a single, multi-attribute event.
[Figure: expression tree with AND at the root over two OR nodes (S1, S2 and S3, S4) and S5]
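The example CS can be evaluated as a small expression tree. This is a minimal sketch under a strong simplification: it only checks which atomic subscriptions have been matched and ignores timing, ordering, and any correlation between event attributes. The tuple encoding and the `satisfied` helper are hypothetical.

```python
# CS = { {S1 OR S2} AND {S3 OR S4} AND S5 } as a nested tuple:
# ("AND" | "OR", child, child, ...), with strings as atomic subscriptions.
CS = ("AND", ("OR", "S1", "S2"), ("OR", "S3", "S4"), "S5")


def satisfied(expr, matched):
    """Return True if the composite expression holds, given the set of
    atomic subscriptions that have been matched by some event."""
    if isinstance(expr, str):
        return expr in matched
    op, *children = expr
    results = [satisfied(child, matched) for child in children]
    return all(results) if op == "AND" else any(results)


print(satisfied(CS, {"S1", "S4", "S5"}))  # one branch of each OR, plus S5
print(satisfied(CS, {"S1", "S2", "S5"}))  # neither S3 nor S4 has matched
```

The same tree shape is what makes in-network detection possible: sub-expressions such as {S1 OR S2} can be evaluated at different brokers and their results combined closer to the subscriber.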
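Topology-based CS Routing
CS={ {S1 AND S2} AND S3 }
[Figure: overlay of brokers 1 to 9 with advertisements Adv 1, Adv 2, Adv 3; the composite subscription CS is decomposed into CS’ and the atomic subscriptions S1, S2, S3 along the overlay]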
Adaptive CS Routing
• CSs may be split according to potential publication traffic, bandwidth, latency etc.
[Figure: panels (a) and (b), each showing brokers 1 to 3 with Adv 1, Adv 2 and CS={S1 AND S2}]
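Adaptive CS Routing
CS={{S1 AND S2} AND S3}
[Figure: overlay of brokers 1 to 9 with advertisements Adv 1, Adv 2, Adv 3; the composite subscription CS is decomposed into CS’ and the atomic subscriptions S1, S2, S3 at adaptively chosen brokers]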
Evaluation
• 32 overlay brokers, 20 publishers, 30 subscribers, initially
• 20 machines vs. PlanetLab
• Workload
  • http://research.msrg.utoronto.ca/Padres/DataSets
  • Yahoo! Finance stock quote traces
Dense Topology
On PlanetLab
Increased Publication Rate
With Broker Failures
Composite Event Detection
Conclusions
• The right abstraction for event processing is content-based publish/subscribe.
• Event processing & publish/subscribe are interesting research areas.
• ToPSS and PADRES explore many aspects of these areas.
• http://padres.msrg.utoronto.ca
Acknowledgements
Graduate students, visitors, and PDFs currently working on PADRES: Alex Cheung, Chen Chen, Amer Farroukh, Patrick Lee, Guoli Li, Bala Maniymaran, Vinod Muthusamy, Reza Sherafat, Naweed Tajuddin, Chunyang Ye, Young Yoon
• Partners from CA
  • Serge Mankovskii & Kirk Wilson
• Partners from IBM
  • Phil Coultard & Allen Chan
• Partners from Bell
  • Bell Systems & Technology
Plus many PADRES alumni