Building an European Research Community through Interoperable Workflows and Data
ER-flow project
Gabor Terstyanszky, University of Westminster, UK
EGI Technical Forum, 17-21 September 2012
ER-flow is supported by the FP7 Capacities Programme under contract No. RI-261585.
ER-flow Project
Partners:
• Technology providers: CNRS, EGI.eu, MTA-SZTAKI, UoW
• Research communities:
- Astro-Physics: INAF
- Computational Chemistry: LMU + TUD
- Helio-Physics: TCD + UCL
- Life Science: AMC
Duration: September 2012 – August 2014
Project Aim and Services
Aim:
• To provide a simulation platform for research communities to enable seamless execution of workflows of different workflow systems through workflow interoperability
• To investigate data interoperability issues in the workflow domain and propose solutions
Services:
• To support the whole workflow lifecycle: editing, uploading, browsing, downloading and executing workflows
• To provide a coarse-grained workflow interoperability solution
• To provide GUIs to manage workflows
Key actors:
• researchers
• workflow engine developers
• workflow developers
Project Objectives
Objective No. 1: To further build a European community of workflow developers and users involving a wide range of research communities, both those which already use workflow systems and those which are new to this technology.
Objective No. 2: To migrate workflow-based scientific applications of the supported research communities to the European Grid Infrastructure through the SHIWA Simulation Platform, and to use these applications both for production runs and to promote e-Science workflow solutions for research communities.
Objective No. 3: To disseminate the workflow interoperability solution of the SHIWA project among the selected research communities and identify further research communities that need the simulation platform to run their experiments.
Objective No. 4: To define requirements of the supported research communities on interoperability of scientific data in the workflow domain and to identify existing and missing protocols and standards needed to support this interoperability.
Objective No. 5: To write a study on the interoperability of scientific data in the workflow domain, make recommendations on how to achieve data and workflow interoperability with existing protocols and standards, and identify the research, development and standardisation issues that must be solved in order to achieve workflow interoperability in data-intensive research.
Coarse-Grained Interoperability: submitting non-native workflow
[Diagram: Workflow Engine A, submission client, Submission Service, Workflow Engine B, DCI; WF is a workflow of Workflow Engine B]
• non-native workflow: WF
- non-native workflows are black boxes which are managed as legacy code applications
Coarse-Grained Interoperability: meta-workflow submission
[Diagram: submission client, Submission Service, Workflow Engine A, Workflow Engine B, DCI; meta-workflow nodes J1, WF2, J3 and WF4; legend: workflows of Workflow Engine A, workflow of Workflow Engine B]
• native workflows: J1, J3 and WF2
• non-native workflows: WF4
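The meta-workflow idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class and function names are hypothetical, the nodes are run in a simple linear order (the slide does not state the dependencies), and "execution" only returns a descriptive string. It is not the SHIWA Submission Service API. The point it shows is the dispatch rule: nodes written for the host engine run natively, while a node written for another engine is handed over whole, as a black box, to a submission service.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    """One node of a meta-workflow (hypothetical model)."""
    name: str
    engine: str  # the workflow engine this node was written for


@dataclass
class SubmissionService:
    """Hypothetical stand-in for the submission service on the slide."""
    submitted: List[Tuple[str, str]] = field(default_factory=list)

    def submit(self, engine: str, workflow: str) -> str:
        # The service never looks inside the workflow: it is a black box
        # that is passed on to the engine it was written for.
        self.submitted.append((engine, workflow))
        return f"{workflow}: run as black box by {engine}"


def run_meta_workflow(nodes: List[Node], host_engine: str,
                      service: SubmissionService) -> List[str]:
    """Run nodes in list order (an assumption; real dependencies may differ)."""
    results = []
    for node in nodes:
        if node.engine == host_engine:
            results.append(f"{node.name}: run natively by {host_engine}")
        else:
            results.append(service.submit(node.engine, node.name))
    return results


# Node names follow the slide: J1, J3 and WF2 are native, WF4 is non-native.
meta_workflow = [
    Node("J1", "Workflow Engine A"),
    Node("WF2", "Workflow Engine A"),
    Node("J3", "Workflow Engine A"),
    Node("WF4", "Workflow Engine B"),
]
service = SubmissionService()
results = run_meta_workflow(meta_workflow, "Workflow Engine A", service)
for line in results:
    print(line)
```

Only WF4 reaches the submission service in this sketch, which mirrors the native versus non-native split listed on the slide.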
Data Interoperability in Workflow Domain
[Diagram: meta-workflow with nodes J1, WF2 and J3; the data passed between them is marked "????"]
Challenges:
• How to manage different data formats of different workflow systems?
- Solution: Virtual Data Object concept – abstract data presentation
• How to transfer data among jobs and workflows?
- Solution: supporting standard data protocols to transfer data
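The slide does not define the Virtual Data Object, so the following Python sketch is only one possible reading of "abstract data presentation": a small wrapper that records the format of a piece of data, plus a registry of converters that presents the data in the format the next job or workflow expects. The class name fields, the `present_as` function, the `CONVERTERS` registry and the CSV/TSV formats are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class VirtualDataObject:
    """Hypothetical abstract view of data exchanged between workflow nodes."""
    payload: str      # in-memory stand-in for the real data
    data_format: str  # e.g. "csv" or "tsv" (illustrative formats)


Converter = Callable[[str], str]

# Registry of (source format, target format) -> converter (assumed design).
CONVERTERS: Dict[Tuple[str, str], Converter] = {
    ("csv", "tsv"): lambda text: text.replace(",", "\t"),
    ("tsv", "csv"): lambda text: text.replace("\t", ","),
}


def present_as(vdo: VirtualDataObject, wanted_format: str) -> VirtualDataObject:
    """Return the data in the format the consuming node expects."""
    if vdo.data_format == wanted_format:
        return vdo  # nothing to convert
    key = (vdo.data_format, wanted_format)
    if key not in CONVERTERS:
        raise ValueError(f"no converter from {key[0]} to {key[1]}")
    return VirtualDataObject(CONVERTERS[key](vdo.payload), wanted_format)


# Output of one node (CSV) presented to a node that expects TSV.
produced = VirtualDataObject("a,b\n1,2", "csv")
consumed = present_as(produced, "tsv")
```

With such a wrapper, a producing node and a consuming node only need to agree on the abstract object, not on each other's native format.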
CGI Infrastructure
[Diagram components:
• SHIWA Science Gateway: SHIWA Portal, SHIWA Repository (WF1 ... WFn), GEMLCA admin, GEMLCA Repository (WF1 ... WFm, WE1 ... WEp), WS-PGRADE Workflow editor, WS-PGRADE Workflow engine, GEMLCA Service, GEMLCA with GIB
• Proxy Server: SHIWA Proxy Server
• Workflow Engines: ASKALON WE, Galaxy WE, GWES WE, Kepler WE, MOTEUR WE, Pegasus WE, PGRADE WE, ProActive WE, Taverna WE, Triana WE
• DCIs: ARC DCI, gLite DCI, Globus DCI, Unicore DCI]
SHIWA Science Gateway:
• native WE: WS-PGRADE
• portal: WS-PGRADE v3.5
• repository: GEMLCA + SHIWA repo
• submitter: GEMLCA with GIB
• proxy management: SHIWA Proxy Server
Resources:
• local resources: invocation of locally deployed WEs
- WE submission to local cluster
• remote resources: through remotely pre-deployed WEs to ARC, gLite, Globus and Unicore DCIs
CGI Developer Scenario: Specifying Workflow Engine
Actor: workflow engine developer
• step 1: specify WE data
• step 2: upload WE binary, dependencies
• step 3: deploy WE
[Diagram components: SHIWA Science Gateway, SHIWA Portal, SHIWA Repository (WF1 ... WFn), GEMLCA admin, GEMLCA Repository (WF1 ... WFm, WE1 ... WEp), WS-PGRADE Workflow editor, WS-PGRADE Workflow engine, GEMLCA Service, GEMLCA with GIB, Proxy Server (SHIWA Proxy Server)]
CGI Developer Scenario: Specifying Workflows
Actor: workflow developer
• step 1: specify WF data
• step 2: upload WF
• step 3: deploy WF
[Diagram components: SHIWA Science Gateway, SHIWA Portal, SHIWA Repository (WF1 ... WFn), GEMLCA Repository (WF1 ... WFm, WE1 ... WEp), WS-PGRADE Workflow editor, WS-PGRADE Workflow engine, GEMLCA Service, GEMLCA with GIB, Proxy Server (SHIWA Proxy Server)]
CGI User Scenario: Native WE - PGRADE
Actor: user
• step 1: search WF
• step 2: edit WF
• step 3: retrieve WF data
• step 4: submit WF
• step 5: retrieve WF (WE + WF)
• step 6: retrieve proxy
• step 7: run WF
[Diagram components:
• SHIWA Science Gateway: SHIWA Portal, SHIWA Repository (WF1 ... WFn), GEMLCA Repository (WF1 ... WFm, WE1 ... WEp), WS-PGRADE Workflow editor (WF list), WS-PGRADE Workflow engine, GEMLCA Service, GEMLCA with GIB
• Proxy Server: SHIWA Proxy Server
• Workflow Engines: ASKALON WE, Galaxy WE, GWES WE, Kepler WE, MOTEUR WE, Pegasus WE, PGRADE WE, ProActive WE, Taverna WE, Triana WE
• DCIs: ARC DCI, gLite DCI, Globus DCI, Unicore DCI]
CGI User Scenario: Native WE - MOTEUR
Actor: user
• step 1: search WF
• step 2: edit WF
• step 3: submit WF
• step 4: retrieve WF (WE + WF)
• step 5: retrieve proxy
• step 6: run WF
[Diagram components:
• SHIWA Science Gateway: SHIWA Repository (WF1 ... WFn), GEMLCA Repository (WF1 ... WFm, WE1 ... WEp), GEMLCA Service, GEMLCA with GIB
• Proxy Server: SHIWA Proxy Server
• MOTEUR side: MOTEUR Workflow editor, MOTEUR Workflow Engine, GEMLCA Client, GEMLCA UI
• Workflow Engines: ASKALON WE, Galaxy WE, GWES WE, Kepler WE, MOTEUR WE, Pegasus WE, PGRADE WE, ProActive WE, Taverna WE, Triana WE
• DCIs: ARC DCI, gLite DCI, Globus DCI, Unicore DCI]
ER-flow Services
Technical support:
• SHIWA Simulation Platform
• workflow repository
- uploading new workflows
- browsing, searching, selecting and downloading workflows
• portal
- creating, editing, submitting and monitoring workflows
Application support:
• training e-scientists to use the simulation platform
• porting workflows to the simulation platform
• helping e-scientists to run workflows on the simulation platform
ER-flow & Research Communities
Supported research communities:
• Astro-Physics: 14 workflows
• Computational Chemistry: 20 workflows
• Helio-Physics: 14 workflows
• Life Science: 20 workflows
Supported research communities:
• number of users: minimum 250
• number of executed workflows: 3000
Further research communities:
• at least four more research communities
• Hydrometeorology
• ???