Sharing, integrating and executing different workflows in heterogeneous multi-cloud systems
Peter Kacsuk, MTA SZTAKI, kacsuk@sztaki.hu
SCI-BUS is supported by the FP7 Capacities Programme under contract nr RI-283481.
Motivations 1
• In many cases large simulations are organized as scientific workflows (WFs) that run on Distributed Computing Infrastructures (DCIs)
• However, there are too many different
  • WF formalisms
  • WF languages
  • WF engines
• If a community has selected a WF system, it is locked into that system:
  • They cannot share their WFs with other communities (even in the same scientific field)
  • They cannot utilize WFs developed by other communities
Motivations 2
• A WF engine is typically connected to one particular DCI (Distributed Computing Infrastructure)
• As a result, a community that has selected a WF system is also locked into that DCI
• Porting the WF to another DCI requires extra effort
• Parallel execution of the same WF in several DCIs is usually not possible
What do we want to achieve?
[Figure: diagram labelled "Cyberspace" with three groups: Workflows (Bio1, Bio2, BioN); WF systems (Kepler, Taverna, Galaxy); Infrastructures (Amazon, XSEDE, BOINC).]
Users should be able to access and use any WF and any infrastructure in an interoperable way, no matter which WF system is their home system.
What does WF interoperability mean?
• If a user has developed WF Y in WF system B, she (or other users) should be able to
  • Re-use WF Y as part of another WF (e.g. WF X) developed in WF system A (coarse-grained interoperability, CGI)
Coarse-grained interoperability (CGI)
Coarse-grained interoperability (CGI) = nesting of different workflow systems to achieve interoperability of WF execution frameworks
[Figure: nested workflows A and Y, shown together with DCI 1, DCI 2 and DCI 3.]
Features of CGI approach
• Advantages
  • No restrictions on the embedded WFs
  • You can run the embedded WFs in their native DCI (even in parallel, so a high degree of DCI parallelism is easy to achieve)
  • Easy to implement and connect a new WF system
• Drawbacks
  • Black-box approach: you cannot
    • modify the embedded workflow
    • control and observe the internal operation of the embedded WF
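The nesting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `EmbeddedWorkflow`, `run_host_workflow` and `fake_engine_b` are hypothetical names, not part of SHIWA or of any WF system, and the runner is a stand-in for launching a real engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signature for "run this foreign workflow with these inputs".
EngineRunner = Callable[[str, Dict[str, str]], Dict[str, str]]


@dataclass
class EmbeddedWorkflow:
    """A host-workflow node that wraps a complete foreign workflow."""
    name: str
    engine: str         # which WF system executes it
    workflow_file: str  # opaque to the host: never parsed or modified


def run_host_workflow(
    nodes: List[EmbeddedWorkflow],
    runners: Dict[str, EngineRunner],
    inputs: Dict[str, str],
) -> Dict[str, str]:
    """Run the nodes in order, passing data between them as plain values."""
    data = dict(inputs)
    for node in nodes:
        # Black box: the host sees only the inputs and outputs of the embedded WF.
        outputs = runners[node.engine](node.workflow_file, data)
        data.update(outputs)
    return data


def fake_engine_b(workflow_file: str, inputs: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for launching WF system B; a real runner would call its engine."""
    return {"result": f"{workflow_file} processed {inputs['sample']}"}


if __name__ == "__main__":
    wf_x = [EmbeddedWorkflow(name="Y", engine="B", workflow_file="wf_y.xml")]
    print(run_host_workflow(wf_x, {"B": fake_engine_b}, {"sample": "s1"}))
```

Connecting a new WF system in this sketch means adding one more runner, which mirrors the "easy to connect" advantage; the host never looks inside `workflow_file`, which mirrors the black-box drawback.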
What does WF interoperability mean?
• If a user has developed WF Y in WF system B, she (or other users) should be able to
  • Run WF Y under another WF system (e.g. WF system A) (fine-grained interoperability, FGI)
  • Further develop WF Y under another WF system (e.g. WF system A) (FGI)
Fine-grained interoperability (FGI)
Fine-grained, or white-box, WF interoperability makes it possible to transform a WF from one WF system into another and to develop it further in the new system.
[Figure: WF Y is transformed into the Interoperable Workflow Intermediate Representation (IWIR); IWIR is then transformed into WF X.]
Features of FGI approach
• Advantages
  • White-box approach: you can
    • modify the embedded workflow
    • control and observe the internal operation of the embedded WF
• Drawbacks
  • There are some restrictions on the WFs that can be transformed
  • You can run the embedded WFs only in the native DCIs of the target WF system
  • Not easy to implement and connect a new WF system
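The two-step translation through an intermediate representation can be sketched as follows. The dictionary formats here are invented for illustration and are far simpler than the real IWIR language; `system_b_to_ir` and `ir_to_system_a` are hypothetical names.

```python
from typing import Dict, List, Tuple


def system_b_to_ir(wf_b: Dict) -> Dict:
    """Import: hypothetical system-B format -> neutral intermediate form."""
    tasks: List[str] = [step["id"] for step in wf_b["steps"]]
    links: List[Tuple[str, str]] = [
        (predecessor, step["id"])
        for step in wf_b["steps"]
        for predecessor in step.get("after", [])
    ]
    return {"tasks": tasks, "links": links}


def ir_to_system_a(ir: Dict) -> Dict:
    """Export: intermediate form -> hypothetical system-A format."""
    return {
        "jobs": [{"name": task} for task in ir["tasks"]],
        "edges": [{"src": src, "dst": dst} for src, dst in ir["links"]],
    }


if __name__ == "__main__":
    wf_y = {"steps": [{"id": "fetch"}, {"id": "align", "after": ["fetch"]}]}
    wf_x = ir_to_system_a(system_b_to_ir(wf_y))
    # White box: the translated workflow is ordinary data in system A's
    # format, so it can be extended there.
    wf_x["jobs"].append({"name": "report"})
    wf_x["edges"].append({"src": "align", "dst": "report"})
    print(wf_x)
```

With an intermediate form, each WF system needs one importer and one exporter rather than a translator for every pair of systems. The restriction noted above also shows up here: anything the intermediate form cannot express is lost in translation.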
What does infrastructure interoperability mean?
[Figure: workflow nodes mapped to Cloud 1 ... Cloud N.]
• If a user has developed WF X in WF system A, she (or other users) should be able to
  • Run WF X in any DCI without significant porting effort
  • Run different nodes of WF X in different DCIs (if these nodes are in parallel branches, they can run simultaneously in different DCIs)
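A minimal sketch of running parallel branches on different DCIs at the same time, using only the Python standard library. The `submit` function is a placeholder for a real submission call, and the DCI names are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def submit(dci: str, node: str) -> str:
    """Placeholder: a real version would submit the node's job to the DCI and wait."""
    return f"{node} finished on {dci}"


def run_parallel_branches(assignment: Dict[str, str]) -> Dict[str, str]:
    """Run every node on its assigned DCI concurrently and collect the results."""
    workers = max(1, len(assignment))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            node: pool.submit(submit, dci, node)
            for node, dci in assignment.items()
        }
        return {node: future.result() for node, future in futures.items()}


if __name__ == "__main__":
    # Two nodes in parallel branches, each mapped to a different (hypothetical) DCI.
    print(run_parallel_branches({"branch_a": "cloud-1", "branch_b": "cloud-n"}))
```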
EU projects that develop solutions for these goals
• SHIWA
  • To solve WF and DCI interoperability issues
  • Duration: 2 years (July 2010 – June 2012)
• SCI-BUS
  • To provide the required gateway technology
  • Duration: 3 years (Oct 2011 – Sep 2014)
What does a WF developer need?
• Access to a large set of ready-to-run scientific WF applications
• A portal/desktop to parameterize and run these applications, and to develop them further
• Access to a large set of various DCIs on which these WF applications can run
[Figure: a WF App. Repository and a Portal on top of the e-science infrastructure: supercomputer based SGs (DEISA, TeraGrid), cluster based service grids (SGs) (EGEE, OSG, etc.), desktop grids (DGs) (BOINC, Condor, etc.), clouds, grid systems, local clusters, supercomputers.]
Reference production infrastructure of SHIWA
Application developers:
• Publish WF applications in a repository to be continued/used by other application developers
• Use the portal/desktop to develop complex applications (executable on various DCIs) based on WFs stored in the repository
[Figure: the SHIWA App. Repository and the SHIWA Portal on top of supercomputer based SGs (DEISA, TeraGrid), cluster based service grids (SGs) (EGEE, OSG, etc.), desktop grids (DGs) (BOINC, Condor, etc.), clouds, grid systems, local clusters, supercomputers.]
SHIWA Repository
Facilitates publishing and sharing workflows
• Supports:
  • Abstract workflows with multiple implementations of 10 workflow systems
  • Storing execution-specific data
• Available:
  • from the SHIWA Portal
  • as a standalone service at: repo.shiwa-workflow.eu
SHIWA Bundle and SHIWA Desktop for WF interoperability
[Figure: WS-PGRADE, MOTEUR, Triana and ASKALON, each with a SHIWA Desktop, connected to the SHIWA App. Repository via SHIWA Bundles.]
SHIWA Bundle and SHIWA Desktop
• SHIWA Bundle:
  • An object (stored as a zip file) containing everything needed to expose a workflow for use
  • Provides a common language/format for workflow engines
  • Workflows are stored as SHIWA Bundles in the SHIWA Repository
• SHIWA Desktop connects a user’s desktop workflow environment to the SHIWA Repository
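The "everything in one zip file" idea can be illustrated with Python's `zipfile` module. The layout below (a `manifest.json` plus one definition file) is an assumption made for this sketch; it is not the actual SHIWA Bundle format, and `build_bundle` and `read_bundle` are hypothetical helpers.

```python
import io
import json
import zipfile
from typing import Dict, Tuple


def build_bundle(name: str, engine: str, definition: bytes) -> bytes:
    """Pack a workflow definition and a small manifest into one zip archive."""
    manifest = {"name": name, "engine": engine, "definition": "workflow.def"}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.json", json.dumps(manifest))
        archive.writestr(manifest["definition"], definition)
    return buffer.getvalue()


def read_bundle(data: bytes) -> Tuple[Dict[str, str], bytes]:
    """Unpack the manifest and the workflow definition from a bundle."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        definition = archive.read(manifest["definition"])
    return manifest, definition


if __name__ == "__main__":
    bundle = build_bundle("WF Y", "engine-b", b"<workflow/>")
    print(read_bundle(bundle))
```

A self-describing archive of this kind lets a repository store workflows from different engines side by side, because the manifest says which engine can execute the enclosed definition.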
Extension of WF interoperability with DCI interoperability
[Figure: WS-PGRADE, MOTEUR, Triana and ASKALON, each with a SHIWA Desktop, share SHIWA Bundles through the SHIWA App. Repository; a DCI Bridge with a BES interface gives access to supercomputer based SGs, cluster based service grids (SGs), desktop grids (DGs) (BOINC, Condor, etc.), clouds, grid systems, supercomputers and local clusters.]
Accessing DCI Bridge
• BES requires JSDL for job submission
• Therefore we need a JSDL generator to help WF engines create the JSDL for the jobs generated for WF nodes
[Figure: a Workflow Engine emits jobs J1 to J4 in a non-JSDL format; a JSDL Translator turns them into jobs in JSDL for the DCI Bridge (a production service), which connects to DCI 1 ... DCI n.]
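What such a JSDL generator produces can be sketched with the standard library. The example emits only a few core JSDL 1.0 elements (`JobDefinition`, `JobDescription`, `JobIdentification`, `Application`, `POSIXApplication`); the input dictionary format and the function name `job_to_jsdl` are assumptions, and any extensions a real DCI Bridge expects are not shown.

```python
import xml.etree.ElementTree as ET
from typing import Dict

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def job_to_jsdl(job: Dict) -> str:
    """Translate a simple engine-specific job description into a JSDL document."""
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    root = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    description = ET.SubElement(root, f"{{{JSDL_NS}}}JobDescription")

    identification = ET.SubElement(description, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(identification, f"{{{JSDL_NS}}}JobName").text = job["name"]

    application = ET.SubElement(description, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(application, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = job["executable"]
    for argument in job.get("args", []):
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = argument

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical job generated by a WF engine for one WF node.
    print(job_to_jsdl({"name": "J1", "executable": "/bin/echo", "args": ["hello"]}))
```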
Extension of WF interoperability with DCI interoperability (2)
[Figure: the same architecture as before, with JSDL Translators added between the WF systems (WS-PGRADE, MOTEUR, Triana, ASKALON) and the BES interface of the DCI Bridge, which gives access to supercomputer based SGs, cluster based service grids (SGs), desktop grids (DGs) (BOINC, Condor, etc.), clouds, grid systems, supercomputers and local clusters.]
Where are we?
• Workflow interoperability is done by
  • SHIWA Bundle
  • SHIWA Desktop
  • SHIWA Repository
• DCI interoperability is done by
  • DCI Bridge
  • JSDL Translator
• All of them are production services
• What else do we need?
  • A reference service through which anyone can try the technology
  • The reference service is the SHIWA portal
SHIWA portal: WS-PGRADE/gUSE generic-purpose gateway framework
• Based on Liferay
• General-purpose, workflow-oriented portal framework
• Supports the development and execution of workflow-based applications
• Enables the multi-cloud, multi-DCI execution of any WF
• Provides access to
  • the internal repository
  • the external SHIWA Repository
Creating and running WS-PGRADE workflows Step 1: Edit workflow
Step 2: Configuring the workflow
[Figure: workflow nodes configured to run on Cloud 1 ... Cloud N.]
Seamless access to various types of DCIs
[Figure: on the client machine, a Web UI (HTML) and the WF Graph editor; on the portal server machine, the Liferay-based WS-PGRADE portal with WF Interpreter, WF Storage, File Storage, WF Repository and Information System; the DCI-Bridge, reached through a BES interface, with BOINC, UNICORE, Cloud, gLite, GT5 and ARC plugins; DCIs including ARC Grid, GT5 Grid, BOINC Grid and Cloud Broker.]
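The one-interface, many-plugins structure of the diagram can be sketched as a small registry. The class and method names (`DciPlugin`, `Bridge`, `RecordingPlugin`) are hypothetical and do not reflect the DCI Bridge's real API; the plugin here only records what it receives.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DciPlugin(ABC):
    """One plugin per DCI type; hides that DCI's own submission protocol."""

    @abstractmethod
    def submit(self, job_description: str) -> str:
        """Submit a job description and return a job identifier."""


class RecordingPlugin(DciPlugin):
    """Stand-in plugin that records jobs instead of contacting a real DCI."""

    def __init__(self, dci_name: str) -> None:
        self.dci_name = dci_name
        self.submitted: List[str] = []

    def submit(self, job_description: str) -> str:
        self.submitted.append(job_description)
        return f"{self.dci_name}-{len(self.submitted)}"


class Bridge:
    """Single entry point that forwards each job to the plugin of the target DCI."""

    def __init__(self) -> None:
        self._plugins: Dict[str, DciPlugin] = {}

    def register(self, dci_type: str, plugin: DciPlugin) -> None:
        self._plugins[dci_type] = plugin

    def submit(self, dci_type: str, job_description: str) -> str:
        if dci_type not in self._plugins:
            raise ValueError(f"no plugin registered for {dci_type!r}")
        return self._plugins[dci_type].submit(job_description)


if __name__ == "__main__":
    bridge = Bridge()
    bridge.register("boinc", RecordingPlugin("boinc"))
    bridge.register("cloud", RecordingPlugin("cloud"))
    print(bridge.submit("cloud", "<JobDefinition/>"))
```

In this arrangement the workflow interpreter only ever talks to the bridge, so supporting a new DCI type means writing one plugin rather than changing the interpreter.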
WFs in the clouds
• This issue is solved by the SCI-BUS project by integrating WS-PGRADE/gUSE with the CloudBroker Platform
• Motivation:
  • Cloud resources are getting more and more popular
  • Clouds are more reliable than grids
  • WFs with cloud access are capable of satisfying the compute needs of complex scientific computations
  • Clouds can provide a vast amount of resources
• Aim:
  • Provide access to cloud resources in a transparent way
Integrated WS-PGRADE/CloudBroker Platform to access multi-clouds
[Figure: WS-PGRADE 1 ... WS-PGRADE n connect through the CloudBroker Platform to IaaS Cloud 1 ... IaaS Cloud N (multi-cloud); nodes labelled SEQ.]
• Supported clouds: Amazon, IBM, OpenStack, Eucalyptus, OpenNebula
• SaaS solution:
  • Preregistered services/jobs can be run from WS-PGRADE (supported from gUSE 3.5.0)
• IaaS solution:
  • Any services/jobs can be submitted from WS-PGRADE (supported from gUSE 3.5.1)
CloudBroker Platform
• Web-based application repository for the deployment and execution of scientific and technical software in the cloud
• Offers these stored applications as a SaaS service for end users
• On demand, pay per use, browser / programmatic / command-line access, cross-domain
• Uses infrastructure as a service (IaaS) from resource providers and offers these IaaS resources to users
CloudBroker Platform Architecture
[Figure: end users, software vendors and resource providers reach the CloudBroker Platform through user tools (CLI, web browser UI, Java client library, CloudBroker integration) and a REST web service API. The platform hosts health, engineering, chemistry, biology and other applications, and runs them on IBM, OpenStack, Amazon, Eucalyptus and other clouds.]
Integration features
Support for commercial clouds with costs (prices configured in the CloudBroker Platform):
• Estimated job cost before submission
• Actual job and workflow cost after execution
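As a rough illustration of estimating a job's cost before submission, the sketch below assumes a very simple pricing model: a fixed fee plus a price per core-hour. The real CloudBroker Platform pricing model is not described here, so `ResourcePrice`, its fields and the numbers are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class ResourcePrice:
    """Hypothetical price entry configured for one cloud resource."""
    per_core_hour: Decimal
    fixed_fee: Decimal = Decimal("0")


def estimate_job_cost(price: ResourcePrice, cores: int, expected_hours: Decimal) -> Decimal:
    """Estimate before submission from the expected runtime."""
    return price.fixed_fee + price.per_core_hour * cores * expected_hours


def workflow_cost(job_costs: Iterable[Decimal]) -> Decimal:
    """Workflow cost is the sum of the costs of its jobs."""
    return sum(job_costs, Decimal("0"))


if __name__ == "__main__":
    price = ResourcePrice(per_core_hour=Decimal("0.10"), fixed_fee=Decimal("0.05"))
    estimate = estimate_job_cost(price, cores=4, expected_hours=Decimal("2.5"))
    print(estimate, workflow_cost([estimate, estimate]))
```

`Decimal` is used instead of floating point so that summed currency amounts do not pick up rounding noise. The actual cost after execution would apply the same formula to the measured runtime.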
Accessible Cloud Resources
Access provided by the CloudBroker Platform
• Commercial:
  • Amazon EC2
  • IBM
• Open source/free:
  • OpenStack
  • OpenNebula
  • Eucalyptus
• Currently accessible within SCI-BUS:
  • MTA SZTAKI OpenNebula (400 cores)
  • BIFI OpenStack (50 cores)
Collaboration within and among communities based on gUSE
[Figure: two gUSE Portals, each with its own gUSE WF Repo, upload WFs as SHIWA bundles to the SHIWA Repository; the clouds shown are Cloud 1 (OpenNebula), Cloud 2 (Amazon) and Cloud n (OpenStack).]
Success story: SHIWA solution for the LINGA experiment
[Figure: multi-workflow management with sub-workflows.]
Maturity of implementation
• Production services:
  • SHIWA Repository
  • SHIWA Bundle and SHIWA Desktop
  • CGI approach, connected WF systems:
    • ASKALON, Galaxy, MOTEUR, Pegasus, Taverna, Triana, WS-PGRADE
  • SHIWA portal based on gUSE
  • CloudBroker Platform, connected clouds:
    • Amazon, IBM, Eucalyptus, OpenNebula, OpenStack
• Prototype services:
  • FGI approach, connected WF systems:
    • ASKALON, MOTEUR, Triana, WS-PGRADE
• New EU project ER-Flow supports 6 user communities
Recent WS-PGRADE/gUSE releases
History since v3.4.0
• Nov 2011: v3.4.0 (DCI Bridge)
• Feb 2012: v3.4.1 (usage statistics portlet)
• March 2012: v3.4.2 (support for new EMI release)
• April 2012: v3.4.3 (support for Liferay 6.1)
• …
• Aug 2012: v3.5.0 (SaaS cloud access via CBP)
• Sep 2012: v3.5.1 (IaaS cloud access via CBP)
• Oct 2012: v3.5.2 (SHIWA workflow repository export/import)
• March 2013: v3.5.3 (REST support, EMI-UI v1/v2 support, …)
• April 2013: v3.5.4 (cloud cost estimation/reporting)
• April 2013: v3.5.5 (robot certificates)
• May 2013: v3.5.6 (improved SHIWA workflow repository export/import)
Where to find further information?
• gUSE/WS-PGRADE:
  • http://www.guse.hu/
• gUSE on SourceForge:
  • http://sourceforge.net/projects/guse/
  • http://sourceforge.net/projects/guse/forums/forum/
  • http://sourceforge.net/projects/guse/develop
• SCI-BUS web page:
  • http://www.sci-bus.eu/
• SHIWA web page:
  • http://www.shiwa-workflow.eu/
• ER-Flow web page:
  • http://www.erflow.eu
Summary
[Figure: puzzle pieces labelled Kepler WF, Galaxy WF, gUSE WF system and OpenNebula Cloud.]
• We have created a technology that makes it possible to combine many different WFs, WF systems and DCIs in many different ways
• It is like a puzzle: you put together the required pieces to create the final picture