Explore a thesis defense on composite applications, mashups, and cloud computing in distributed business process management. It includes a case study of a Chinese electronics manufacturer and a comparison of process execution architectures, and covers distributed process execution, service discovery, event-based coordination, and transactional properties in distributed systems.
Flexible Distributed Business Process Management
Thesis Defense, September 23, 2011
Vinod Muthusamy, University of Toronto
• Composite applications
• Mashups
• Service-oriented architectures
• Cloud computing
Distributed business processes
• Case study (Chinese electronics manufacturer):
• Global processes that compose departmental ones
• Department-level processes with 26 to 47 activities
• Thousands of concurrent instances
• Hundreds of collaborating partners
• Geographically distributed
• Administrative boundaries
[Diagram: departmental processes for marketing, order handling, manufacturing, warehousing, finance, and payment, with activities such as sale prediction, contract signing, order checking, stock checking, dispatch, packaging, pick-up, and delivery]
Process execution architectures
Centralized:
• One execution engine
• May not scale
• Central point of failure
Clustered:
• Replicated engines
• Centralized control and data
• High bandwidth and latency
• Still may not scale
• Administrative limitations
Process execution architectures (2)
Distributed:
• Distributed set of execution engines
• Flexible deployment (performance, security, etc.)
• Process fragments can be modified
• Lower bandwidth and latency
• Fine-grained use of resources
• No single point of failure
• In-network processing
Our relevant publications: ACM TWEB 2010, CASCON 2009, IEEE ICDCS 2009, ACM DEBS 2009, CASCON 2008, ACM DEBS 2008, CASCON 2008, ACM MobiCom 2005
Distributed process execution stack
• Service discovery
• Service composition
• SLA modelling
• Distributed execution
• Activity placement
• Transactional movement
• Filter aggregation
Content-based publish/subscribe
• Publisher (e.g., a credit check service):
• 1. Advertise: class = Loan, Service RTT = 2s
• 3. Publish: class = Loan, status = approved, amount = $500K
• Subscriber:
• 2. Subscribe: class = Loan, Service RTT > 1s
• Content-based routing: each broker's matching engine and routing table map subscriptions to next-hop destinations (e.g., Class = Loan → d2, Service RTT > 1s → d3) via input and output queues
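The advertise/subscribe/publish interaction above hinges on attribute-level matching at each broker. The following is a minimal sketch of that matching in Python; the function names and operator set are illustrative assumptions, not the actual PADRES API.

```python
# Minimal sketch of content-based matching; names and operators are
# illustrative assumptions, not the PADRES broker's actual API.
OPS = {
    "eq": lambda a, b: a == b,
    "gt": lambda a, b: a > b,
}

def matches(subscription, publication):
    """A publication matches if every predicate in the subscription holds."""
    for attr, (op, bound) in subscription.items():
        if attr not in publication or not OPS[op](publication[attr], bound):
            return False
    return True

# Subscription: class = Loan, Service RTT > 1s
sub = {"class": ("eq", "Loan"), "Service RTT": ("gt", 1.0)}
# Publication: class = Loan, status = approved, Service RTT = 2s, amount = $500K
pub = {"class": "Loan", "status": "approved", "Service RTT": 2.0, "amount": 500_000}
print(matches(sub, pub))  # True
```

A broker's routing table pairs each such subscription with the output queue (d2, d3 in the slide) leading toward the subscriber, so matching doubles as a routing decision.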
Agenda
• Transactional movement: low overhead, without interrupting the process
• Distributed process execution: process decomposition, event-based coordination
• Event-based service discovery: four discovery models, exploit similarity
A movement operation relocates a client (including its state)
[Diagram: client A moves from its client container on broker B1 to a client container on broker B7 via a MOVE (to Broker 7) operation]
• Movement may fail, e.g., because the target broker rejects it
• During movement there may be multiple copies of a client, but only one should be "running" at any time
• The client container helps coordinate the movement
ACID-like transactional properties for various layers
Client layer:
• Atomicity: a moving client must be exclusively either at its source or target broker
• Consistency: there must be at most one running instance of each client
• Isolation: only the initial or final states of a moving client should be observable
Notifications layer:
• Atomicity: notifications are delivered exactly once, to either the source or target client
• Consistency: a moving client (whether the move succeeds or not) should receive the same set of notifications as one that did not move
• Isolation: the set of notifications delivered to stationary clients from a moving publisher should be independent of whether the publisher successfully moves
Routing layer:
• Atomicity: all or none of the routing table updates required for an operation (adv, sub, etc.) should be applied
• Consistency: each broker should have the minimal set of routing table entries for a given set of advs and subs
• Isolation: a movement should only modify the routing table entries associated with the moving client
• Durability is omitted but can be achieved by persisting all state to stable storage
• Refer to the ICDCS 2009 paper for formal definitions
Client and coordinator states at the source and target
• Coordinators are based on the three-phase commit protocol
• The failure and timeout transitions are omitted for brevity
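As a rough illustration of how such a coordinator can be modeled, here is a toy three-phase-commit-style state machine in Python. The state and event names are assumptions made for illustration, not the thesis's exact protocol states, and failure/timeout transitions are again omitted.

```python
# Toy 3PC-style coordinator for a client move; state and event names are
# illustrative assumptions. Failure/timeout transitions are omitted,
# mirroring the slide.
TRANSITIONS = {
    ("init", "move_requested"): "can_commit",  # ask source and target to prepare
    ("can_commit", "all_yes"): "pre_commit",   # both brokers voted yes
    ("can_commit", "any_no"): "aborted",       # e.g., the target rejects the move
    ("pre_commit", "all_acked"): "committed",  # point of no return reached
}

def run(events, state="init"):
    """Drive the coordinator through a sequence of events."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

print(run(["move_requested", "all_yes", "all_acked"]))  # committed
print(run(["move_requested", "any_no"]))                # aborted
```

The commit state here corresponds to the point at which the source client becomes clean and the target client is started, which is what the reachable-state-graph argument on the next slide verifies.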
The global reachable state graph can be used to prove some of the transactional properties
• E.g., in the commit state, the source client is clean and the target client is started
• Refer to the ICDCS 2009 paper for proofs
Movement can trigger a burst of covered messages
[Diagram: broker overlay (B1, B2, Bi, Bj, Bl) with publisher P, subscribers S, and advertisements A and A′, illustrating covered subscriptions during a client move]
The reconfiguration protocol is much faster than the covering protocol
• Movement of "root" subscriptions is more expensive in the covering protocol
The reconfiguration protocol scales with the number of moving clients
• The reconfiguration protocol achieves better movement latency despite more total messages because it is less bursty
Summary of transactional movement
• ACID-like transactional properties provide well-defined guarantees for the movement of clients
• Properties are modularized to simplify reasoning and implementation
• Client-layer movement and routing-layer hop-by-hop reconfiguration protocols were developed
• Evaluations show the proposed protocol is more efficient and stable with respect to various parameters
• End-to-end movement using covering negatively affects performance
Agenda
• Transactional movement: low overhead, without interrupting the process
• Distributed process execution: process decomposition, event-based coordination
• Event-based service discovery: four discovery models, exploit similarity
Coordinating process control flow – atomic subscription
When activity2 completes, it publishes:
class = ACTIVITY_STATUS, process = 'Process1', activityName = 'activity2', instanceId = 'g001', status = 'SUCCESS'
A successor activity subscribes with:
class = ACTIVITY_STATUS, process = 'Process1', activityName = 'activity2', instanceId = *, status = 'SUCCESS'
[Diagram: Process 1 with activity1, flow1, and activity2 through activity8]
Coordinating process control flow – composite subscription
activity6 subscribes with the composite subscription:
class = ACTIVITY_STATUS, process = 'Process1', activityName = 'activity5', instanceId = $X, status = 'SUCCESS'
AND
class = LINK_STATUS, process = 'Process1', activityName = 'activity2', instanceId = $X, status = *
[Diagram: Process 1 with activity1, flow1, and activity2 through activity8]
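The $X variable requires the broker to correlate the two conjuncts by process instance. The following is a small Python sketch of that correlation; the function and key names are illustrative assumptions, not the PADRES composite-subscription engine.

```python
# Sketch of composite-subscription evaluation: activity6 fires only when
# both conjuncts arrive with the same instanceId (the $X binding).
# Names are illustrative assumptions, not the actual engine.
from collections import defaultdict

pending = defaultdict(set)  # instanceId -> set of satisfied conjuncts

def on_event(event):
    iid = event["instanceId"]
    if (event["class"] == "ACTIVITY_STATUS"
            and event["activityName"] == "activity5"
            and event["status"] == "SUCCESS"):
        pending[iid].add("activity5_done")
    elif (event["class"] == "LINK_STATUS"
            and event["activityName"] == "activity2"):
        pending[iid].add("link_from_activity2")  # status = * matches any status
    if pending[iid] == {"activity5_done", "link_from_activity2"}:
        return f"start activity6 for instance {iid}"
    return None

on_event({"class": "LINK_STATUS", "activityName": "activity2",
          "instanceId": "g001", "status": "POSITIVE"})
result = on_event({"class": "ACTIVITY_STATUS", "activityName": "activity5",
                   "instanceId": "g001", "status": "SUCCESS"})
print(result)  # start activity6 for instance g001
```

Because the correlation state is keyed by instanceId, thousands of concurrent process instances can share the same composite subscription without interfering.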
BPEL transformation example
Sub1: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity1'], [IID,isPresent,any], [status,eq,'SUCCESS']
Pub1: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'flow1'], [IID,'g001'], [status,'STARTED']
Sub2: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'flow1'], [IID,isPresent,any], [status,eq,'STARTED']
Pub2: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'activity2'], [IID,'g001'], [status,'SUCCESS']
Pub3: [class,LINK_STATUS], [process,'Process5'], [activityName,'activity2'], [IID,'g001'], [status,'POSITIVE']
Sub4: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity2'], [IID,eq,$X], [status,eq,'SUCCESS'] && [class,eq,LINK_STATUS], [process,eq,'Process5'], [activityName,eq,'activity2'], [IID,eq,$X], [status,isPresent,any]
Pub4: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'activity6'], [IID,'g001'], [status,'SUCCESS']
Pub5: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'activity7'], [IID,'g001'], [status,'SUCCESS']
Sub5: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity4'], [IID,isPresent,any], [status,eq,'SUCCESS'] && [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity7'], [IID,isPresent,any], [status,eq,'SUCCESS']
[Diagram: the process graph with activity1, flow1, and activity2 through activity8 annotated with these subscriptions and publications]
See the ACM Trans. Web 2010 paper for the full BPEL mapping
Distributed process execution
[Diagram: a BPEL process (Receive, Assign, Flow, Invoke, Wait, Reply, End) decomposed into agents deployed over the PADRES ESB; a Web Service client interacts through a WS gateway, with pub/sub used inside the ESB and HTTP/SOAP at the edges to invoke the external Web Service]
Summary of distributed process execution engine
• Many large processes are inherently distributed: multiple partners, many administrative domains, geographically dispersed
• The distributed architecture offers better scalability for large and highly concurrent processes
• Provides resource allocation flexibility
Agenda
• Transactional movement: low overhead, without interrupting the process
• Distributed process execution: process decomposition, event-based coordination
• Event-based service discovery: four discovery models, exploit similarity
Supported models (resource type × discovery type)
• Static resource + one-time discovery → static model (e.g., find a weather service)
• Static resource + continuous discovery → static continuous model (e.g., monitor real estate)
• Dynamic resource + one-time discovery → dynamic model (e.g., find micro-generation power)
• Dynamic resource + continuous discovery → dynamic continuous model (e.g., monitor grid resources)
• Dynamic resources are supported through event-based updates
Architecture
• Built on distributed content-based publish/subscribe (brokers B1–B5)
• Resource providers act as publishers:
• Advertise all attributes, e.g., system = linux, memory <= 2000, disk <= 320
• Publish updates of dynamic attributes, e.g., memory = 1500, disk = 80
• Discovery clients act as subscribers:
• Subscribe for resources, e.g., system = linux, disk >= 200
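A condensed sketch of this provider/client interaction on a single broker, with attribute handling simplified to two operators; all names here are illustrative assumptions.

```python
# Sketch of the discovery flow: a provider advertises its attributes,
# publishes updates for the dynamic ones, and a client's subscription is
# matched against the latest values. Names are illustrative assumptions.
resource = {"system": "linux", "memory": 2000, "disk": 320}  # advertised state

def update(res, **dynamic):
    """Provider publishes fresh values for its dynamic attributes."""
    res.update(dynamic)

def check(actual, op, bound):
    # Only the two operators used in the slide's examples: eq and >=.
    return actual == bound if op == "eq" else actual >= bound

def discover(res, constraints):
    """Client subscription: every constraint must hold on current values."""
    return all(check(res[k], op, bound) for k, (op, bound) in constraints.items())

update(resource, memory=1500, disk=80)  # dynamic attribute updates arrive
query = {"system": ("eq", "linux"), "disk": (">=", 200)}
print(discover(resource, query))  # False: disk has dropped to 80
```

In the distributed setting, the advertisement fixes where the resource state lives, while publications keep only the dynamic attributes fresh, which is what makes continuous discovery cheap.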
Static model
• Advertisement: system = linux, memory = 2, disk = 320
• Subscription: memory > 1
• Discovery is performed locally by any single broker
Dynamic continuous model
• Advertisement: system = linux, memory <= 2, disk < 320
• Publication: memory = 1, disk = 200
• Subscription: memory > 1
• Traditional pub/sub routing of messages; the discovery subscription is routed to and stored at the matching resource host brokers
Similarity forwarding
• Adv: system = linux, memory <= 2, disk < 320; Pub: memory = 1, disk = 200
• Sub1: memory > 1 (covering); Sub2: memory > 2 (covered)
• To retrieve old results: send the covered sub to the covering sub's discovery host broker
• To intercept new results: store the covered sub at the first broker with a covering sub
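Similarity forwarding rests on the covering relation between subscriptions: here Sub1 (memory > 1) covers Sub2 (memory > 2), because every publication matching Sub2 also matches Sub1. A sketch of the check, restricted to '>' predicates for illustration:

```python
# Sketch of the subscription-covering check used by similarity forwarding.
# Restricted to ">" predicates; names are illustrative assumptions.
def covers(general, specific):
    """True if every publication matching `specific` also matches `general`."""
    for attr, (op, bound) in general.items():
        if attr not in specific:
            return False  # `specific` is unconstrained on this attribute
        s_op, s_bound = specific[attr]
        if op == ">" and s_op == ">" and s_bound >= bound:
            continue  # a tighter lower bound implies the looser one
        return False
    return True

sub1 = {"memory": (">", 1)}
sub2 = {"memory": (">", 2)}
print(covers(sub1, sub2))  # True: sub2 can reuse sub1's broker and results
print(covers(sub2, sub1))  # False
```

This is the same covering machinery that pub/sub brokers already use to prune routing tables, which is why the optimization comes almost for free.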
Discovery time
[Figure: average discovery time (s) vs. discovery similarity (%, 10–100) for the normal and similarity algorithms, (B) and (U) variants; left panel: balanced spatial distribution of discoveries, right panel: clustered spatial distribution of discoveries]
• The similarity forwarding optimization is faster
• Increased discovery similarity: the normal algorithm suffers (more matching resources are found); the optimized algorithm benefits (reuses results)
• Spatial clustering of resources: the normal algorithm benefits (smaller subscription propagation tree, more "multicast"); the optimized algorithm benefits slightly (results are often retrieved from the discovery host broker)
• Spatial clustering of discoveries: the normal algorithm suffers (congestion of messages near discovery host brokers); the optimized algorithm suffers only slightly (matching of cached results is relatively cheap)
Summary of service discovery
• The distributed event-based resource discovery framework offers:
• Parallel discovery of static resources
• Efficient dissemination of dynamic resource attributes
• Real-time discovery of new resources
• The similarity optimization reuses the results of overlapping discoveries:
• Exploits publish/subscribe covering techniques
• Benefits from more skewed spatial and interest distributions
• The distributed architecture achieves faster discovery at the expense of increased network traffic
Distributed process execution stack
• Service discovery
• Service composition
• SLA modelling
• Distributed execution
• Activity placement
• Transactional movement
• Filter aggregation
Summary
• Dynamic publish/subscribe clients can move and change interests quickly and cheaply:
• Fast movement primitive with transactional properties
• Congestion avoidance with incremental filter aggregation
• Distributed process execution offers more flexible deployment options:
• Distributed architecture with event-based coordination
• Dynamic utility-aware process redeployment
• Service management can operate on a distributed set of service registries:
• Distributed service discovery, including a continuous discovery model
• Distributed service composition based on publish/subscribe matching
Future research directions
• Consider the impact of contention among shared resources:
• A new service impacts existing applications
• A new application impacts existing applications
• Isolate application performance
• Develop a hybrid process execution architecture:
• Allow for both replication and distribution
• Support process, activity, and instance granularities
• Tightly integrate service discovery and composition:
• Support continuous composition based on continuous discovery results
• Allow automatic service binding at runtime based on continuous discovery results