A Workflow Engine with Multi-Level Parallelism Supports. Qifeng Huang and Yan Huang, School of Computer Science, Cardiff University, 2005.9. Agenda: Background; SWFL Workflow Architecture; SWFL Description Language; SWFL Workflow Engine; Multi-level Parallelisms in SWFL.
A Workflow Engine with Multi-Level Parallelism Supports Qifeng Huang and Yan Huang School of Computer Science Cardiff University 2005.9
Agenda • Background • SWFL Workflow Architecture • SWFL Description Language • SWFL Workflow Engine • Multi-level Parallelisms in SWFL
Background: Service and Service Composition • A service encapsulates various resources and makes them available over the network via standard interfaces and protocols • Web/grid services are emerging as important paradigms for distributed computing • Service composition/workflow: complex applications can be created by composing simple services
Background: GSiB • Current efforts such as BPEL mainly focus on business processes • Demand for scientific workflows is increasing as parallel computing applications, especially grid computing applications, expand • GSiB aims to provide a general workflow framework for both business and scientific areas, especially the latter • The convergence of grid services and web services makes this feasible
GSiB Workflow Architecture • SWFL: an XML-based, graph-oriented service workflow description language • Engine: distributed enactment environment with multi-level parallelism support • VSCE: Visual Service Composition Environment [Figure: architecture diagram with labels VSCE, Service Workflow Language (SWFL), Workflow Engine]
SWFL: Basic Elements • FlowModel (name, isParallel, …), containing: • Types* • Message* (name, part*, …) • Variables* (name, type) • Activity*: definitions of all involved activities (normal/native services, assign, if, switch, for, while, do while and catchEnd activities) • FlowModel* (name, isParallel, …) • ControlLink* (Source/Port, Target/Port) • DataLink* (Source/Part, Target/Part)
SWFL: Graph-Oriented • In GSiB, a workflow application can be described either as a validated XML (SWFL) document or as a directed graph • A node (an activity in SWFL) can be a standard service operation, a compound structure, or an on-machine program • An edge (a data or control link in SWFL) describes the data and control dependencies among the activities involved
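The graph view described above can be sketched as a small in-memory data structure. This is only an illustration under stated assumptions: the class names, field names and the `predecessors` helper are hypothetical and are not taken from the GSiB implementation; only the element names (flow model, activity, control link, data link) come from the slides.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Activity:
    name: str
    kind: str = "normal"  # e.g. "normal", "native", "if", "for", "while"


@dataclass
class ControlLink:
    source: str
    target: str
    port: Optional[str] = None  # e.g. the "IF" port of an if activity


@dataclass
class DataLink:
    source: str
    target: str


@dataclass
class FlowModel:
    """Hypothetical container mirroring the SWFL flow model elements."""
    name: str
    is_parallel: bool = False
    activities: dict = field(default_factory=dict)
    control_links: list = field(default_factory=list)
    data_links: list = field(default_factory=list)

    def add_activity(self, activity):
        self.activities[activity.name] = activity

    def predecessors(self, name):
        """Names of activities that `name` depends on via data or control links."""
        links = self.control_links + self.data_links
        return sorted({link.source for link in links if link.target == name})


# A tiny flow: ActivityA feeds data to ifControl, which controls ActivityB.
flow = FlowModel(name="sample")
for activity in (Activity("ifControl", "if"), Activity("ActivityA"), Activity("ActivityB")):
    flow.add_activity(activity)
flow.data_links.append(DataLink("ActivityA", "ifControl"))
flow.control_links.append(ControlLink("ifControl", "ActivityB", port="IF"))
```

Keeping data links and control links in separate lists mirrors the distinction the slides draw between the two kinds of edge, while `predecessors` merges them when only the dependency relation matters.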
SWFL: An Example
[Figure: workflow graph with nodes Data Source, IF(a/b) with condition a>b, Activity A, Activity B, Activity C, Data Sink]
……
<swfl:flow name="sample" requireParallel="false">
  <wsdl:input message="flowInput"/>
  <wsdl:output message="flowOutput"/>
  <swfl:activity>
    <swfl:if name="ifControl">…</swfl:if>
  </swfl:activity>
  <swfl:activity>
    <swfl:normal name="ActivityA">
      <swfl:performedBy>… </swfl:performedBy>
    </swfl:normal>
  </swfl:activity>
  ……
  <swfl:controlLink>
    <swfl:source name="ifControl" port="IF"/>
    <swfl:target name="task2"/>
  </swfl:controlLink>
  ……
  <swfl:dataLink target="ifControl">
    <swfl:source name="ActivityA">
      <swfl:map>…</swfl:map>
    </swfl:source>
  </swfl:dataLink>
  ……
</swfl:flow>
……
SWFL vs. BPEL • Both can be used to build workflows which involve peer-to-peer interactions between web services • BPEL is mainly for business processes while SWFL is mainly for scientific areas • BPEL uses a script-oriented approach, while SWFL follows a graph-oriented approach
SWFL: Why Graph-Oriented? • Easy to use, especially with the user-friendly VSCE: a flow looks like a flow chart or UML model • Flexible and dynamic service scheduling and execution • Scheduling is decided entirely by the engine • The engine can make full use of dynamic runtime features, so different strategies can be used for the same flow • Straightforward support for multi-level parallelism
VSCE: Making Complicated Things Easy [Figure: VSCE workflow drawing pane]
VSCE: What is more… • User-friendly, integrated visual tool for building, executing and controlling workflows • End users do not need to know much about workflows • Design (draw) a flow by drag-and-drop • Configure and initiate the execution • Retrieve results and track runtime status
A Grid Architecture Based on Workflow Engines (1)
[Figure: one Workflow Engine with several Job Processors invoking Services; workflow descriptions labelled SWFL and BPEL, the latter illustrated by the fragment below]
<invoke name="registerAuctionResults" partnerLink="auctionRegistrationService" portType="as:auctionRegistrationPT" operation="process" inputVariable="auctionData">
  <correlations>
    <correlation set="auctionIdentification"/>
  </correlations>
</invoke>
<receive name="receiveAuctionRegistrationInformation" partnerLink="auctionRegistrationService" portType="as:auctionRegistrationAnswerPT" operation="answer" variable="auctionAnswerData">
  <correlations>
    <correlation set="auctionIdentification"/>
  </correlations>
</receive>
A Grid Architecture Based on Workflow Engines (2) [Figure: several Workflow Engines, each with its own Job Processors invoking Services; workflow descriptions labelled SWFL and BPEL, with the same BPEL invoke/receive fragment as in part (1)]
GSiB Workflow Processing [Figure: processing pipeline with labels SWFL/MPFL Document, XML2Graph (1), Graph2XML, Graph2Java (2), Java Programs, Execution (3), Enactment Environment, Result]
GSiB Instance: Graph Objects • XML2Graph and Graph2Java tools • Graph objects • Two kinds: data graphs and control graphs • Straightforward format for VSCE • The scheduling strategy is decided at runtime
Engine: Architecture [Figure: architecture diagram with labels VSCE, Engine, Gateway, Job Processor, Scheduler, Storage, UDDI]
Engine: Components • Gateway: a web service that provides the entry point for submitting jobs and retrieving results and runtime status; three job formats are accepted • Job Processor: a computing resource composed of a pool of worker threads • Scheduler: provides a dynamic service execution strategy at runtime • Storage: provides space, as well as an API, for objects, results and status information
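How a gateway, a thread-pool job processor and a storage component might fit together can be sketched as follows. This is a minimal sketch, not the GSiB engine: the `Gateway` class, its method names and the in-memory dictionary standing in for storage are all assumptions made for illustration, and the scheduler is omitted.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


class Gateway:
    """Hypothetical entry point: submit jobs, query status, fetch results."""

    def __init__(self, workers=4):
        # Job processor: a pool of worker threads.
        self._pool = ThreadPoolExecutor(max_workers=workers)
        # Storage: job id -> Future holding status and result.
        self._storage = {}
        self._ids = itertools.count(1)

    def submit(self, job, *args):
        job_id = next(self._ids)
        self._storage[job_id] = self._pool.submit(job, *args)
        return job_id

    def status(self, job_id):
        future = self._storage[job_id]
        if future.done():
            return "done"
        return "running" if future.running() else "pending"

    def result(self, job_id, timeout=None):
        # Blocks until the job has finished (or the timeout expires).
        return self._storage[job_id].result(timeout=timeout)

    def shutdown(self):
        self._pool.shutdown(wait=True)


gateway = Gateway(workers=2)
job_id = gateway.submit(sum, [1, 2, 3])
answer = gateway.result(job_id)
final_status = gateway.status(job_id)
gateway.shutdown()
```

In a real deployment the gateway would be exposed as a web service and the storage would persist objects and status; here both are reduced to in-process objects so the control flow stays visible.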
Engine: Multi-Level Parallelisms • Service-level • Flow-Level • Message-Passing • Parallelism in BPEL: explicitly described in the script
Service-Level Parallelism • An activity is ready when all its input data are available and all activities on which it has control dependencies are complete • Several activities may be ready at the same time; they can be executed in parallel • Greedy algorithm: execute an activity as soon as it is ready; this may waste storage and computing resources and is not always optimal • Question: how should services be scheduled?
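The greedy strategy on this slide can be sketched with a thread pool: every activity whose predecessors have completed is submitted immediately. This is an illustrative sketch only; the function name `run_greedy` and the representation of activities as Python callables are assumptions, and data and control dependencies are merged into a single predecessor set for brevity.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_greedy(activities, deps, workers=4):
    """Run each activity as soon as all of its predecessors are complete.

    activities: name -> callable taking a dict of predecessor results
    deps:       name -> set of predecessor names (data and control merged)
    Returns (results, completion_order).
    """
    results, order = {}, []
    remaining = {name: set(deps.get(name, ())) for name in activities}
    running = {}  # Future -> activity name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while remaining or running:
            ready = [name for name, pending in remaining.items() if not pending]
            if not ready and not running:
                raise ValueError("cyclic or unsatisfiable dependencies")
            for name in ready:
                del remaining[name]
                inputs = {p: results[p] for p in deps.get(name, ())}
                running[pool.submit(activities[name], inputs)] = name
            # Block until at least one running activity finishes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                order.append(name)
                for pending in remaining.values():
                    pending.discard(name)
    return results, order


# Diamond-shaped flow: B and C both depend on A and may run in parallel.
activities = {
    "A": lambda inputs: 2,
    "B": lambda inputs: inputs["A"] + 1,
    "C": lambda inputs: inputs["A"] * 10,
    "D": lambda inputs: inputs["B"] + inputs["C"],
}
deps = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
results, order = run_greedy(activities, deps)
```

The sketch makes the slide's caveat concrete: nothing limits how many ready activities are launched or how long intermediate results are held, which is where a smarter scheduler could do better than the greedy policy.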
Flow-Level Parallelism: An Example [Figure: a workflow graph with activities A to F is partitioned into two sub-flows, labelled Process1 and Process2]
Flow-Level Parallelism (2) • Decentralized orchestration of services: divide a workflow into several sub-flows, to be run by several job processors in parallel • Two kinds: independent connected graphs; partitions of a single connected graph • Benefits: quick response; high throughput; scalability • Additional complexities: flow partitioning; coordination of distributed execution
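The first kind of flow-level parallelism, independent connected graphs, amounts to finding the connected components of the workflow graph when link direction is ignored. The sketch below shows that step only; the function name and the example links are hypothetical, and partitioning a single connected graph (the second kind) would need an additional partitioning strategy that is not shown.

```python
from collections import defaultdict, deque


def independent_subflows(activities, links):
    """Group activities into sub-flows that share no data or control link.

    activities: iterable of activity names
    links:      iterable of (source, target) pairs
    """
    neighbours = defaultdict(set)
    for source, target in links:
        neighbours[source].add(target)
        neighbours[target].add(source)  # direction is irrelevant here

    seen, subflows = set(), []
    for start in activities:
        if start in seen:
            continue
        component, queue = [], deque([start])
        seen.add(start)
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            component.append(node)
            for nxt in sorted(neighbours[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        subflows.append(sorted(component))
    return subflows


# Hypothetical links: two chains that never exchange data or control.
activities = ["A", "B", "C", "D", "E", "F"]
links = [("A", "B"), ("B", "D"), ("C", "E"), ("E", "F")]
subflows = independent_subflows(activities, links)
```

Each resulting sub-flow could then be handed to a separate job processor, since no coordination between them is required.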
Message-Passing Parallelism: Background and MPFL • Parallelism in SWFL suits applications whose parallelism can be expressed in a workflow graph • Most scientific applications exhibit more sophisticated forms of parallelism, such as message passing, which is in common use • MPFL: extends the SWFL flow model to support applications with message passing
Message-Passing Parallelism: An Example [Figure: a workflow application with activities A, B, C and D is run as Process 0 to Process N-1; the MPFL model consists of Flow Model 1 for process 0 and Flow Model 2 for processes with rank larger than 0]
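The idea of giving process 0 one flow model and all other ranks another can be sketched with threads and queues standing in for processes and messages. This is a simulation under stated assumptions, not MPFL or MPI: the function names, the scatter/gather pattern and the payload values are invented for illustration.

```python
import queue
import threading


def flow_model_1(rank, size, inboxes):
    """Flow for process 0: send work to every other rank, then gather replies."""
    for dest in range(1, size):
        inboxes[dest].put(dest * 10)
    return sorted(inboxes[0].get() for _ in range(size - 1))


def flow_model_2(rank, size, inboxes):
    """Flow for ranks > 0: receive a value, process it, send it back to rank 0."""
    value = inboxes[rank].get()
    inboxes[0].put(value + rank)
    return None


def select_flow_model(rank):
    # Rank-based selection, as in the slide: one model for rank 0, one for the rest.
    return flow_model_1 if rank == 0 else flow_model_2


def run(size):
    inboxes = [queue.Queue() for _ in range(size)]  # one mailbox per rank
    outputs = [None] * size

    def worker(rank):
        outputs[rank] = select_flow_model(rank)(rank, size, inboxes)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(size)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outputs


outputs = run(4)
```

With four simulated processes, rank 0 collects one reply from each of ranks 1 to 3, while the other ranks return nothing of their own.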
Message-Passing Parallelism • Multi-layer heterogeneous communication domains are supported • An instance usually runs on a cluster, where it can achieve the same kind of parallelism as a standard MPI program • Engine: an incremental extension of the SWFL engine; still a work in progress [Figure: a Workflow Engine with Job Processors invoking Services, including a Cluster service; workflow description labelled MPFL]
Conclusions • The workflow framework in GSiB is grid-oriented and suitable for both business and scientific applications composed of web/grid services • Graph-based SWFL provides considerable flexibility for both end users and the engine implementation • VSCE provides a visual tool for building and executing workflow applications • The SWFL engine provides an automatic, self-organizing enactment environment for processing workflow applications • Better performance is achieved with the support of multi-level parallelism in the SWFL engine