Explore a thesis defense on composite applications, mashups, and cloud computing in distributed business process management. It includes a case study of a Chinese electronics manufacturer and a comparison of process execution architectures, and covers distributed process execution, service discovery, event-based coordination, and transactional properties in distributed systems.
Flexible Distributed Business Process Management
Thesis Defense, September 23, 2011
Vinod Muthusamy, University of Toronto
• Composite applications
• Mashups
• Service-oriented architectures
• Cloud computing
Distributed business processes
• Case study (Chinese electronics manufacturer):
• Global processes that compose departmental ones
• Department-level processes with 26 to 47 activities
• Thousands of concurrent instances
• Hundreds of collaborating partners
• Geographically distributed
• Administrative boundaries
[Diagram: departmental processes for marketing, order handling, manufacturing, warehousing, finance, and payment, with activities such as sale prediction, contract signing, order checking, stock checking, dispatch, packaging, pick-up, and delivery]
Process execution architectures
Centralized:
• One execution engine
• May not scale
• Central point of failure
Clustered:
• Replicated engines
• Centralized control and data
• High bandwidth and latency
• Still may not scale
• Administrative limitations
Process execution architectures (2)
Distributed:
• Distributed set of execution engines
• Flexible deployment (performance, security, etc.)
• Process fragments can be modified
• Lower bandwidth and latency
• Fine-grained use of resources
• No single point of failure
• In-network processing
Our relevant publications: ACM TWEB 2010, CASCON 2009, IEEE ICDCS 2009, ACM DEBS 2009, CASCON 2008, ACM DEBS 2008, CASCON 2008, ACM MobiCom 2005
Distributed process execution stack
• Service discovery
• Service composition
• SLA modelling
• Distributed execution
• Activity placement
• Transactional movement
• Filter aggregation
Content-based publish/subscribe
• Publisher (e.g., a credit check service):
• 1. Advertise: class = Loan, Service RTT = 2s
• 3. Publish: class = Loan, status = approved, amount = $500K
• Subscriber:
• 2. Subscribe: class = Loan, Service RTT > 1s
• Content-based routing: each broker's matching engine and routing table map subscriptions to next-hop destinations (e.g., Class = Loan → d2, Service RTT > 1s → d3) via input and output queues
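The advertise/subscribe/publish interaction above hinges on attribute-level matching at each broker. The following is a minimal sketch of that matching in Python; the function names and operator set are illustrative assumptions, not the actual PADRES API.

```python
# Minimal sketch of content-based matching; names and operators are
# illustrative assumptions, not the PADRES broker's actual API.
OPS = {
    "eq": lambda a, b: a == b,
    "gt": lambda a, b: a > b,
}

def matches(subscription, publication):
    """A publication matches if every predicate in the subscription holds."""
    for attr, (op, bound) in subscription.items():
        if attr not in publication or not OPS[op](publication[attr], bound):
            return False
    return True

# Subscription: class = Loan, Service RTT > 1s
sub = {"class": ("eq", "Loan"), "Service RTT": ("gt", 1.0)}
# Publication: class = Loan, status = approved, Service RTT = 2s, amount = $500K
pub = {"class": "Loan", "status": "approved", "Service RTT": 2.0, "amount": 500_000}
print(matches(sub, pub))  # True
```

A broker's routing table pairs each such subscription with the output queue (d2, d3 in the slide) leading toward the subscriber, so matching doubles as a routing decision.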
Agenda
• Transactional movement: low overhead, without interrupting the process
• Distributed process execution: process decomposition, event-based coordination
• Event-based service discovery: four discovery models, exploit similarity
A movement operation relocates a client (including its state)
[Diagram: client A moves from its client container on broker B1 to a client container on broker B7 via a MOVE (to Broker 7) operation]
• Movement may fail, e.g., because the target broker rejects it
• During movement there may be multiple copies of a client, but only one should be "running" at any time
• The client container helps coordinate the movement
ACID-like transactional properties for various layers
Client layer:
• Atomicity: a moving client must be exclusively either at its source or target broker
• Consistency: there must be at most one running instance of each client
• Isolation: only the initial or final states of a moving client should be observable
Notifications layer:
• Atomicity: notifications are delivered exactly once, to either the source or target client
• Consistency: a moving client (whether the move succeeds or not) should receive the same set of notifications as one that did not move
• Isolation: the set of notifications delivered to stationary clients from a moving publisher should be independent of whether the publisher successfully moves
Routing layer:
• Atomicity: all or none of the routing table updates required for an operation (adv, sub, etc.) should be applied
• Consistency: each broker should have the minimal set of routing table entries for a given set of advs and subs
• Isolation: a movement should only modify the routing table entries associated with the moving client
• Durability is omitted but can be achieved by persisting all state to stable storage
• Refer to the ICDCS 2009 paper for formal definitions
Client and coordinator states at the source and target
• Coordinators are based on the three-phase commit protocol
• The failure and timeout transitions are omitted for brevity
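As a rough illustration of how such a coordinator can be modeled, here is a toy three-phase-commit-style state machine in Python. The state and event names are assumptions made for illustration, not the thesis's exact protocol states, and failure/timeout transitions are again omitted.

```python
# Toy 3PC-style coordinator for a client move; state and event names are
# illustrative assumptions. Failure/timeout transitions are omitted,
# mirroring the slide.
TRANSITIONS = {
    ("init", "move_requested"): "can_commit",  # ask source and target to prepare
    ("can_commit", "all_yes"): "pre_commit",   # both brokers voted yes
    ("can_commit", "any_no"): "aborted",       # e.g., the target rejects the move
    ("pre_commit", "all_acked"): "committed",  # point of no return reached
}

def run(events, state="init"):
    """Drive the coordinator through a sequence of events."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

print(run(["move_requested", "all_yes", "all_acked"]))  # committed
print(run(["move_requested", "any_no"]))                # aborted
```

The commit state here corresponds to the point at which the source client becomes clean and the target client is started, which is what the reachable-state-graph argument on the next slide verifies.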
The global reachable state graph can be used to prove some of the transactional properties
• E.g., in the commit state, the source client is clean and the target client is started
• Refer to the ICDCS 2009 paper for proofs
Movement can trigger a burst of covered messages
[Diagram: broker overlay (B1, B2, Bi, Bj, Bl) with publisher P, subscribers S, and advertisements A and A′, illustrating covered subscriptions during a client move]
The reconfiguration protocol is much faster than the covering protocol
• Movement of "root" subscriptions is more expensive in the covering protocol
The reconfiguration protocol scales with the number of moving clients
• The reconfiguration protocol achieves better movement latency despite more total messages because it is less bursty
Summary of transactional movement
• ACID-like transactional properties provide well-defined guarantees for the movement of clients
• Properties are modularized to simplify reasoning and implementation
• Client-layer movement and routing-layer hop-by-hop reconfiguration protocols were developed
• Evaluations show the proposed protocol is more efficient and stable with respect to various parameters
• End-to-end movement using covering negatively affects performance
Agenda
• Transactional movement: low overhead, without interrupting the process
• Distributed process execution: process decomposition, event-based coordination
• Event-based service discovery: four discovery models, exploit similarity
Coordinating process control flow – atomic subscription
When activity2 completes, it publishes:
class = ACTIVITY_STATUS, process = 'Process1', activityName = 'activity2', instanceId = 'g001', status = 'SUCCESS'
A successor activity subscribes with:
class = ACTIVITY_STATUS, process = 'Process1', activityName = 'activity2', instanceId = *, status = 'SUCCESS'
[Diagram: Process 1 with activity1, flow1, and activity2 through activity8]
Coordinating process control flow – composite subscription
activity6 subscribes with the composite subscription:
class = ACTIVITY_STATUS, process = 'Process1', activityName = 'activity5', instanceId = $X, status = 'SUCCESS'
AND
class = LINK_STATUS, process = 'Process1', activityName = 'activity2', instanceId = $X, status = *
[Diagram: Process 1 with activity1, flow1, and activity2 through activity8]
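The $X variable requires the broker to correlate the two conjuncts by process instance. The following is a small Python sketch of that correlation; the function and key names are illustrative assumptions, not the PADRES composite-subscription engine.

```python
# Sketch of composite-subscription evaluation: activity6 fires only when
# both conjuncts arrive with the same instanceId (the $X binding).
# Names are illustrative assumptions, not the actual engine.
from collections import defaultdict

pending = defaultdict(set)  # instanceId -> set of satisfied conjuncts

def on_event(event):
    iid = event["instanceId"]
    if (event["class"] == "ACTIVITY_STATUS"
            and event["activityName"] == "activity5"
            and event["status"] == "SUCCESS"):
        pending[iid].add("activity5_done")
    elif (event["class"] == "LINK_STATUS"
            and event["activityName"] == "activity2"):
        pending[iid].add("link_from_activity2")  # status = * matches any status
    if pending[iid] == {"activity5_done", "link_from_activity2"}:
        return f"start activity6 for instance {iid}"
    return None

on_event({"class": "LINK_STATUS", "activityName": "activity2",
          "instanceId": "g001", "status": "POSITIVE"})
result = on_event({"class": "ACTIVITY_STATUS", "activityName": "activity5",
                   "instanceId": "g001", "status": "SUCCESS"})
print(result)  # start activity6 for instance g001
```

Because the correlation state is keyed by instanceId, thousands of concurrent process instances can share the same composite subscription without interfering.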
BPEL transformation example
Sub1: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity1'], [IID,isPresent,any], [status,eq,'SUCCESS']
Pub1: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'flow1'], [IID,'g001'], [status,'STARTED']
Sub2: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'flow1'], [IID,isPresent,any], [status,eq,'STARTED']
Pub2: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'activity2'], [IID,'g001'], [status,'SUCCESS']
Pub3: [class,LINK_STATUS], [process,'Process5'], [activityName,'activity2'], [IID,'g001'], [status,'POSITIVE']
Sub4: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity2'], [IID,eq,$X], [status,eq,'SUCCESS'] && [class,eq,LINK_STATUS], [process,eq,'Process5'], [activityName,eq,'activity2'], [IID,eq,$X], [status,isPresent,any]
Pub4: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'activity6'], [IID,'g001'], [status,'SUCCESS']
Pub5: [class,ACTIVITY_STATUS], [process,'Process5'], [activityName,'activity7'], [IID,'g001'], [status,'SUCCESS']
Sub5: [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity4'], [IID,isPresent,any], [status,eq,'SUCCESS'] && [class,eq,ACTIVITY_STATUS], [process,eq,'Process5'], [activityName,eq,'activity7'], [IID,isPresent,any], [status,eq,'SUCCESS']
[Diagram: the process graph with activity1, flow1, and activity2 through activity8 annotated with these subscriptions and publications]
See the ACM Trans. Web 2010 paper for the full BPEL mapping
Distributed process execution
[Diagram: a BPEL process (Receive, Assign, Flow, Invoke, Wait, Reply, End) decomposed into agents deployed over the PADRES ESB; a Web Service client interacts through a WS gateway, with pub/sub used inside the ESB and HTTP/SOAP at the edges to invoke the external Web Service]
Summary of distributed process execution engine
• Many large processes are inherently distributed: multiple partners, many administrative domains, geographically dispersed
• The distributed architecture offers better scalability for large and highly concurrent processes
• Provides resource allocation flexibility
Agenda
• Transactional movement: low overhead, without interrupting the process
• Distributed process execution: process decomposition, event-based coordination
• Event-based service discovery: four discovery models, exploit similarity
Supported models (resource type × discovery type)
• Static resource + one-time discovery → static model (e.g., find a weather service)
• Static resource + continuous discovery → static continuous model (e.g., monitor real estate)
• Dynamic resource + one-time discovery → dynamic model (e.g., find micro-generation power)
• Dynamic resource + continuous discovery → dynamic continuous model (e.g., monitor grid resources)
• Dynamic resources are supported through event-based updates
Architecture
• Built on distributed content-based publish/subscribe (brokers B1–B5)
• Resource providers act as publishers:
• Advertise all attributes, e.g., system = linux, memory <= 2000, disk <= 320
• Publish updates of dynamic attributes, e.g., memory = 1500, disk = 80
• Discovery clients act as subscribers:
• Subscribe for resources, e.g., system = linux, disk >= 200
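A condensed sketch of this provider/client interaction on a single broker, with attribute handling simplified to two operators; all names here are illustrative assumptions.

```python
# Sketch of the discovery flow: a provider advertises its attributes,
# publishes updates for the dynamic ones, and a client's subscription is
# matched against the latest values. Names are illustrative assumptions.
resource = {"system": "linux", "memory": 2000, "disk": 320}  # advertised state

def update(res, **dynamic):
    """Provider publishes fresh values for its dynamic attributes."""
    res.update(dynamic)

def check(actual, op, bound):
    # Only the two operators used in the slide's examples: eq and >=.
    return actual == bound if op == "eq" else actual >= bound

def discover(res, constraints):
    """Client subscription: every constraint must hold on current values."""
    return all(check(res[k], op, bound) for k, (op, bound) in constraints.items())

update(resource, memory=1500, disk=80)  # dynamic attribute updates arrive
query = {"system": ("eq", "linux"), "disk": (">=", 200)}
print(discover(resource, query))  # False: disk has dropped to 80
```

In the distributed setting, the advertisement fixes where the resource state lives, while publications keep only the dynamic attributes fresh, which is what makes continuous discovery cheap.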
Static model
• Advertisement: system = linux, memory = 2, disk = 320
• Subscription: memory > 1
• Discovery is performed locally by any single broker
Dynamic continuous model
• Advertisement: system = linux, memory <= 2, disk < 320
• Publication: memory = 1, disk = 200
• Subscription: memory > 1
• Traditional pub/sub routing of messages; the discovery subscription is routed to and stored at the matching resource host brokers
Similarity forwarding
• Adv: system = linux, memory <= 2, disk < 320; Pub: memory = 1, disk = 200
• Sub1: memory > 1 (covering); Sub2: memory > 2 (covered)
• To retrieve old results: send the covered sub to the covering sub's discovery host broker
• To intercept new results: store the covered sub at the first broker with a covering sub
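Similarity forwarding rests on the covering relation between subscriptions: here Sub1 (memory > 1) covers Sub2 (memory > 2), because every publication matching Sub2 also matches Sub1. A sketch of the check, restricted to '>' predicates for illustration:

```python
# Sketch of the subscription-covering check used by similarity forwarding.
# Restricted to ">" predicates; names are illustrative assumptions.
def covers(general, specific):
    """True if every publication matching `specific` also matches `general`."""
    for attr, (op, bound) in general.items():
        if attr not in specific:
            return False  # `specific` is unconstrained on this attribute
        s_op, s_bound = specific[attr]
        if op == ">" and s_op == ">" and s_bound >= bound:
            continue  # a tighter lower bound implies the looser one
        return False
    return True

sub1 = {"memory": (">", 1)}
sub2 = {"memory": (">", 2)}
print(covers(sub1, sub2))  # True: sub2 can reuse sub1's broker and results
print(covers(sub2, sub1))  # False
```

This is the same covering machinery that pub/sub brokers already use to prune routing tables, which is why the optimization comes almost for free.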
Discovery time
[Figure: average discovery time (s) vs. discovery similarity (%, 10–100) for the normal and similarity algorithms, (B) and (U) variants; left panel: balanced spatial distribution of discoveries, right panel: clustered spatial distribution of discoveries]
• The similarity forwarding optimization is faster
• Increased discovery similarity: the normal algorithm suffers (more matching resources are found); the optimized algorithm benefits (reuses results)
• Spatial clustering of resources: the normal algorithm benefits (smaller subscription propagation tree, more "multicast"); the optimized algorithm benefits slightly (results are often retrieved from the discovery host broker)
• Spatial clustering of discoveries: the normal algorithm suffers (congestion of messages near discovery host brokers); the optimized algorithm suffers only slightly (matching of cached results is relatively cheap)
Summary of service discovery
• The distributed event-based resource discovery framework offers:
• Parallel discovery of static resources
• Efficient dissemination of dynamic resource attributes
• Real-time discovery of new resources
• The similarity optimization reuses the results of overlapping discoveries:
• Exploits publish/subscribe covering techniques
• Benefits from more skewed spatial and interest distributions
• The distributed architecture achieves faster discovery at the expense of increased network traffic
Distributed process execution stack
• Service discovery
• Service composition
• SLA modelling
• Distributed execution
• Activity placement
• Transactional movement
• Filter aggregation
Summary
• Dynamic publish/subscribe clients can move and change interests quickly and cheaply:
• Fast movement primitive with transactional properties
• Congestion avoidance with incremental filter aggregation
• Distributed process execution offers more flexible deployment options:
• Distributed architecture with event-based coordination
• Dynamic utility-aware process redeployment
• Service management can operate on a distributed set of service registries:
• Distributed service discovery, including a continuous discovery model
• Distributed service composition based on publish/subscribe matching
Future research directions
• Consider the impact of contention among shared resources:
• A new service impacts existing applications
• A new application impacts existing applications
• Isolate application performance
• Develop a hybrid process execution architecture:
• Allow for both replication and distribution
• Support process, activity, and instance granularities
• Tightly integrate service discovery and composition:
• Support continuous composition based on continuous discovery results
• Allow automatic service binding at runtime based on continuous discovery results