
GLOBUS PLUG-IN FOR WINGS WORKFLOW ENGINE

GLOBUS PLUG-IN FOR WINGS WORKFLOW ENGINE. Elizabeth Martí, ITACA, Universidad Politécnica de Valencia, emarti@itaca.upv.es. Introduction. Take advantage of two concepts: workflow and Grid. Workflow provides the automation of processes.


Presentation Transcript


  1. GLOBUS PLUG-IN FOR WINGS WORKFLOW ENGINE. Elizabeth Martí, ITACA, Universidad Politécnica de Valencia, emarti@itaca.upv.es

  2. Introduction
     • Take advantage of two concepts: workflow and Grid.
     • Workflow provides the automation of processes.
     • Grid enables the development of high-performance computing systems from heterogeneous, geographically distributed resources that span multiple administrative domains.
     • A Grid workflow can be defined as the composition of grid application services that execute on heterogeneous and distributed resources in a well-defined order to accomplish a specific goal (Rajkumar Buyya).

  3. Motivation
     • Many different workflow initiatives have appeared:
       • Askalon, Karajan, Kepler, K-WfGrid, Taverna, Triana, etc.
     • They lack some important characteristics:
       • Multi-grid capability.
       • Easy extensibility to new middleware.
       • etc.
     • WINGS provides new features focusing on high-level definition, multi-grid and extensibility capabilities.
     • The most significant features of WINGS are:
       • Expressiveness to capture the specificities of grid computing.
       • Flow control structures.
       • Support for simple, lightweight operations.
       • Ability to deal with different grid middlewares and versions.

  4. WINGS Concepts
     • A workflow is modeled with four concepts:
       • Data sources: communication points for exchanging data among the different executions of the workflow.
       • Activities: abstractions of tasks to be run on the Grid; they describe the functionality of the tasks. They are defined by:
         • The input and output parameters (simple or structured types).
         • The list of deployments, which provides the specifics for each grid middleware.
       • Executions: specific instances of an activity. The engine selects among the deployments defined for each activity according to where the activity is going to run.
       • Operations: simple executions performed by the workflow runtime to pre- or post-process the information available in the data sources so that the next tasks can use it.
         • Examples: arithmetic and reduction operations, string search operations, field extraction operations, split or merge file operations, etc.
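The four concepts can be pictured as a small data model. The sketch below is only an illustration: the class and field names (`DataSource`, `Deployment`, `Activity`, `Execution`, `select_deployment`) and the example paths are assumptions, since the slides do not show the WINGS XML schema or its internal classes. Operations are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataSource:
    """Communication point holding data exchanged between workflow steps."""
    name: str
    values: List[str] = field(default_factory=list)


@dataclass
class Deployment:
    """Middleware-specific details of an activity (hypothetical fields)."""
    middleware: str   # e.g. "globus" or "fura"
    executable: str   # location of the executable for that middleware


@dataclass
class Activity:
    """Abstract task: its parameters plus one deployment per middleware."""
    name: str
    inputs: List[str]
    outputs: List[str]
    deployments: Dict[str, Deployment] = field(default_factory=dict)

    def add_deployment(self, deployment: Deployment) -> None:
        self.deployments[deployment.middleware] = deployment


@dataclass
class Execution:
    """Specific instance of an activity bound to a target middleware."""
    activity: Activity
    target_middleware: str

    def select_deployment(self) -> Deployment:
        # Pick the deployment that matches where the task is going to run.
        try:
            return self.activity.deployments[self.target_middleware]
        except KeyError:
            raise LookupError(
                f"activity {self.activity.name!r} has no deployment "
                f"for {self.target_middleware!r}"
            ) from None


# Hypothetical activity with one deployment per middleware.
rigid = Activity("rigid_coregistration", inputs=["studies"], outputs=["aligned"])
rigid.add_deployment(Deployment("globus", "/opt/coreg/rigid"))
rigid.add_deployment(Deployment("fura", "rigid.exe"))
chosen = Execution(rigid, "globus").select_deployment()
```

Keeping the deployments in a dictionary keyed by middleware name makes the "select according to where it is going to run" step a single lookup.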

  5. WINGS Engine
     • It uses a pure data flow language in which a workflow is a sequence of:
       • DS – Execution or Operation – DS (DS: data source)
     • This simplifies the description and understanding of the workflow, and also increases expressiveness.
     • The engine provides the functionality defined in the XML file, creating an environment to launch concurrent jobs.
     • A key issue in a multi-grid environment is moving files among the resources of consecutive tasks, so the runtime (RT) tries to:
       • Reduce the number of data transfers.
       • Deal with different physical file storage systems.

  6. WINGS Architecture (diagram of the WINGS Core Engine and its component groups)
     • Middleware engines: Fura, GT2, etc.
     • Operations: Arithmetic, Split File, etc.
     • Information systems: Fura RM, MDS, etc.
     • Transfer systems: Fura IXOS, GridFTP, etc.

  7. Execution Scheme
     • Core Engine:
       • Performs the logic and control operations: prepares and selects the tasks ready to be launched and the data to use in each execution.
     • Plug-ins:
       • Perform the actual file transfers and all the operations needed to complete the execution.
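The core engine's job of selecting tasks that are ready to be launched can be sketched as follows, assuming that a step (an execution or an operation) is ready once every data source it reads has been filled. The `Step` type, the function names, and the data source names are hypothetical; the slides do not describe the actual scheduling rule.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Step:
    """An execution or operation placed between data sources."""
    name: str
    reads: Tuple[str, ...]
    writes: Tuple[str, ...]


def ready_steps(steps: Iterable[Step], filled: Set[str], done: Set[str]) -> List[Step]:
    """Return the steps not yet run whose input data sources are all filled."""
    return [
        s for s in steps
        if s.name not in done and all(r in filled for r in s.reads)
    ]


def launch_batches(steps: List[Step], initial: Iterable[str]) -> List[List[str]]:
    """Group steps into batches; steps in one batch could run concurrently."""
    filled, done = set(initial), set()
    batches: List[List[str]] = []
    while True:
        batch = ready_steps(steps, filled, done)
        if not batch:
            break
        batches.append([s.name for s in batch])
        for s in batch:
            done.add(s.name)
            filled.update(s.writes)  # outputs become available to later steps
    return batches


# Hypothetical three-step chain: DS - step - DS - step - DS - step - DS.
chain = [
    Step("rigid", ("ds_input",), ("ds_rigid",)),
    Step("elastic", ("ds_rigid",), ("ds_elastic",)),
    Step("transpose", ("ds_elastic",), ("ds_output",)),
]
batches = launch_batches(chain, ["ds_input"])
```

In this model the plug-ins would be called once per step in a batch; the sketch only computes the order and performs no transfers.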

  8. Middleware Plug-ins
     • Functionality is extended simply by implementing a plug-in and adding it to the system.
     • In the first version of the workflow engine, the Fura middleware plug-in was developed.
     • Now a Globus Toolkit plug-in has been implemented to enable multi-grid tests.
     • Globus was selected because of the large number of current infrastructures that use it as the underlying grid middleware (EGEE, EELA, etc.).
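Extending the engine by "implementing a plug-in and adding it to the system" suggests a common interface plus a registry. The sketch below is one possible shape, not the WINGS API: `MiddlewarePlugin`, `PluginRegistry`, and `DryRunPlugin` are hypothetical names, and the dry-run plug-in only records calls instead of contacting any grid.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MiddlewarePlugin(ABC):
    """Hypothetical contract that a middleware plug-in would fulfil."""

    name: str

    @abstractmethod
    def transfer(self, source: str, destination: str) -> None:
        """Move one file between two locations."""

    @abstractmethod
    def submit(self, executable: str, arguments: List[str]) -> str:
        """Launch a task and return an identifier for it."""


class PluginRegistry:
    """Maps a middleware name to the plug-in that handles it."""

    def __init__(self) -> None:
        self._plugins: Dict[str, MiddlewarePlugin] = {}

    def register(self, plugin: MiddlewarePlugin) -> None:
        self._plugins[plugin.name] = plugin

    def get(self, middleware: str) -> MiddlewarePlugin:
        if middleware not in self._plugins:
            raise LookupError(f"no plug-in registered for {middleware!r}")
        return self._plugins[middleware]


class DryRunPlugin(MiddlewarePlugin):
    """Stand-in plug-in that records calls instead of contacting a grid."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: List[str] = []

    def transfer(self, source: str, destination: str) -> None:
        self.log.append(f"transfer {source} -> {destination}")

    def submit(self, executable: str, arguments: List[str]) -> str:
        self.log.append(f"submit {executable} {' '.join(arguments)}".rstrip())
        return f"{self.name}-job-{len(self.log)}"


registry = PluginRegistry()
registry.register(DryRunPlugin("fura"))
registry.register(DryRunPlugin("globus"))
job_id = registry.get("globus").submit("/bin/hostname", [])
```

With this layout, supporting another middleware means writing one more subclass and registering it; the core engine code that calls `registry.get(...)` stays unchanged.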

  9. Globus Plug-in
     • Step 1: Prepare the activity.
       • The workflow model is defined in the XML file.
       • Create a valid proxy (proxy store).
       • Define the execution environment of the task (Globus, Fura, …).
       • Create a working directory on the execution host (GridFTP).
       • Create an execution directory (GridFTP).
       • Copy the executable to the execution host (third-party copy with UrlCopy).
       • If necessary, copy auxiliary data used by the executable (libraries, jar files, …).
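The staging part of Step 1 can be sketched as a plan of (source, destination) URL pairs. In a third-party copy both ends are remote `gsiftp://` URLs, so the data does not pass through the machine that requests the transfer. The host names, the paths, and the `exec` subdirectory below are assumptions made for illustration; the function only builds the plan and performs no proxy creation or transfer.

```python
from posixpath import basename, join
from typing import List, Sequence, Tuple


def gsiftp_url(host: str, path: str) -> str:
    """Build a GridFTP URL for an absolute path on a host."""
    return f"gsiftp://{host}/{path.lstrip('/')}"


def staging_plan(
    source_host: str,
    execution_host: str,
    working_dir: str,
    executable: str,
    auxiliary: Sequence[str] = (),
) -> List[Tuple[str, str]]:
    """List the copies needed to place the executable and its auxiliary
    files in an execution directory under the working directory."""
    exec_dir = join(working_dir, "exec")  # hypothetical directory layout
    plan = []
    for path in (executable, *auxiliary):
        plan.append((
            gsiftp_url(source_host, path),
            gsiftp_url(execution_host, join(exec_dir, basename(path))),
        ))
    return plan


# Hypothetical hosts and paths.
plan = staging_plan(
    "repo.example.org",
    "node.example.org",
    "/tmp/wings/run1",
    "/apps/coreg/rigid",
    auxiliary=["/apps/coreg/libcoreg.so"],
)
```

Each pair in `plan` would then be handed to whatever transfer client the plug-in uses; the slides name UrlCopy for this role.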

  10. Globus Plug-in
      • Step 2: Prepare the initial data.
        • Obtain the information about the input data (XML file) and store it (input parameters matrix).
        • Obtain the number of microtasks (combinations of inputs).
        • Create an input directory.
        • Copy the input data to the input directory (UrlCopy).
        • Create an output directory.
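The slide describes microtasks as combinations of inputs. One plausible reading, assumed here, is that the input parameters matrix holds a list of values per parameter and that each microtask is one element of their Cartesian product. The parameter names and file names are hypothetical.

```python
from itertools import product
from math import prod
from typing import Dict, List


def microtask_count(matrix: Dict[str, List[str]]) -> int:
    """Number of input combinations: the product of the list lengths."""
    return prod(len(values) for values in matrix.values())


def microtasks(matrix: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Expand the matrix into one parameter assignment per microtask."""
    names = list(matrix)
    return [
        dict(zip(names, combo))
        for combo in product(*(matrix[name] for name in names))
    ]


# Hypothetical input parameters matrix.
inputs = {
    "reference": ["base.img"],
    "study": ["s1.img", "s2.img", "s3.img"],
    "mode": ["rigid", "elastic"],
}
tasks = microtasks(inputs)
print(microtask_count(inputs))  # 6
```

Computing the count separately lets the engine know how many jobs to expect before it expands the full list.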

  11. Globus Plug-in
      • Step 3: Execute the task.
        • Define the RSL file for the task.
          • Executable, arguments, working directory, etc.
        • Create a GRAM job for each RSL file.
        • Launch the job (batch mode).
        • Microtasks are executed in parallel.
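A pre-WS GRAM job description is an RSL string of `(attribute=value)` relations joined by `&`. The sketch below builds such a string from the `executable`, `arguments`, and `directory` attributes and then hands one description per microtask to a submit function in parallel. The `submit` callable is a placeholder for whatever GRAM client the plug-in uses, and all paths are hypothetical; nothing here contacts a gatekeeper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def _quoted(value: str) -> str:
    """Wrap a value in double quotes; embedded quotes are out of scope here."""
    if '"' in value:
        raise ValueError("embedded double quotes are not handled in this sketch")
    return f'"{value}"'


def build_rsl(executable: str, arguments: Sequence[str], directory: str) -> str:
    """Build an RSL job description for one microtask."""
    parts = [f"(executable={_quoted(executable)})"]
    if arguments:
        parts.append("(arguments=" + " ".join(_quoted(a) for a in arguments) + ")")
    parts.append(f"(directory={_quoted(directory)})")
    return "&" + "".join(parts)


def launch_all(
    rsl_strings: Sequence[str],
    submit: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Submit every job description concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(submit, rsl_strings))


# Hypothetical microtasks: one RSL description per study.
jobs = [
    build_rsl("/tmp/wings/exec/rigid", ["base.img", study], "/tmp/wings/run1")
    for study in ("study1.img", "study2.img")
]
handles = launch_all(jobs, submit=lambda rsl: f"submitted:{len(rsl)}")
```

Because batch-mode submission returns before the job finishes, a real `submit` would return a job contact to poll later rather than the job's result.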

  12. Globus Plug-in
      • Step 4: Get the output data.
        • Get the output data from the output directory.
          • Wildcards are used to filter files.
        • Create a replica of the results in a specified location.
          • The path is specified in the data source definition.
        • Clean up intermediate data.
          • A function was implemented to delete directories recursively.
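Both Step 4 helpers can be sketched without a grid connection. The first filters a directory listing with shell-style wildcards. The second computes the order in which entries must be removed when the server can only delete files and empty directories, which is the usual reason for writing a recursive delete by hand. The in-memory `tree` stands in for remote listings, and all names and paths are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Sequence


def select_outputs(names: Sequence[str], pattern: str) -> List[str]:
    """Keep the file names that match a wildcard pattern such as '*.img'."""
    return [name for name in names if fnmatchcase(name, pattern)]


def deletion_order(tree: Dict[str, List[str]], root: str) -> List[str]:
    """Return paths in post-order so that every directory is emptied
    before it is removed. Keys of `tree` are directories; any child
    that is not a key is treated as a file."""
    order: List[str] = []
    for child in tree.get(root, []):
        if child in tree:
            order.extend(deletion_order(tree, child))
        else:
            order.append(child)
    order.append(root)
    return order


# Hypothetical listing of the output directory.
listing = ["result_01.img", "result_02.img", "job.log"]
images = select_outputs(listing, "result_*.img")

# Hypothetical intermediate directories to clean up.
tree = {
    "/tmp/wings/run1": ["/tmp/wings/run1/input", "/tmp/wings/run1/output"],
    "/tmp/wings/run1/input": ["/tmp/wings/run1/input/study1.img"],
    "/tmp/wings/run1/output": [],
}
order = deletion_order(tree, "/tmp/wings/run1")
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive regardless of the platform running the engine.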

  13. Use Case
      • A biomedical application that runs a medical image co-registration process (rigid and elastic).
      • The co-registration processes compare every image with the base study and align the voxels of each study so that it is as similar as possible to the reference image.
      • The input data are dynamic series of 3D magnetic resonance images acquired after the injection of a contrast bolus in the abdominal area, to study the perfusion of the liver.
      • The set is composed of 5 studies with 12 slices each.

  14. Use Case: Biomedical Application
      • The workflow is composed of three steps:
        • Rigid co-registration.
        • Elastic co-registration (the most CPU-consuming step).
        • A process that transposes the co-registration results from N studies (with K slices each) into K studies with N slices.
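The third step is a plain transposition. A minimal sketch, assuming each study is represented as a list of slice identifiers of equal length (the identifiers below are made up):

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def transpose_studies(studies: Sequence[Sequence[T]]) -> List[List[T]]:
    """Turn N studies of K slices into K studies of N slices."""
    if not studies:
        return []
    slices_per_study = len(studies[0])
    if any(len(study) != slices_per_study for study in studies):
        raise ValueError("every study must have the same number of slices")
    # zip(*studies) groups the k-th slice of every study together.
    return [list(column) for column in zip(*studies)]


# Sizes taken from the use case: N = 5 studies, K = 12 slices.
studies = [[f"study{n}_slice{k}" for k in range(12)] for n in range(5)]
by_slice = transpose_studies(studies)
print(len(by_slice), len(by_slice[0]))  # 12 5
```

After the transposition, `by_slice[k]` holds slice `k` from every study, in study order.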

  15. Use Case: Execution Times
      • F1: AMD Opteron 2.4 GHz with 1 GB of RAM (Fura Agent)
      • F2: AMD Opteron 2.2 GHz with 1 GB of RAM (Fura Agent)
      • GN: Pentium Xeon 2 GHz with 512 MB of RAM (Globus Node)
      • Gigabit Ethernet network

  16. Conclusions
      • We have analyzed previous works; some of them have good features but do not fit our needs.
      • WINGS has been designed in a modular way, which allows new components to be added to the system through plug-ins.
      • We have implemented a Globus plug-in oriented to GT middleware.
      • Currently, Fura, Globus Toolkit (pre-WS services), and sub-workflow execution plug-ins have been developed, which makes it possible to launch cross-middleware tests with the two specified grid systems.

  17. Thanks for your attention!
