Workflow execution in EGEE VOs, P-GRADE portal for EGEE VOs
Peter Kacsuk
MTA SZTAKI, Hungary
Kacsuk@sztaki.hu
Technology concerns of Grid systems
• Fast evolution of Grid systems and middleware:
  • GT1, GT2, OGSA, OGSI, GT3, WSRF, GT4, …
• Many Grid systems are built on these different technologies:
  • EGEE (LCG-2, gLite), NorduGrid, UK NGS, Grid2003, etc.
User concerns of Grid systems
• How to cope with the variety of these Grid systems?
• How to develop/create new Grid applications?
• How to execute Grid applications?
• How to observe the application execution in the Grid?
• How to tackle performance issues?
• How to port legacy applications
  • to Grid systems?
  • between Grid systems?
• How to execute Grid applications over several Grids in a transparent way?
What does the user want?
• Fast development of Grid workflow applications
• Various workflow components:
  • Sequential code
  • MPI code
  • Legacy code
• Parallel execution of workflow components in the Grid
• Simultaneous execution of workflow components in different Grids
• Monitored execution of a workflow in the Grid
• Porting of applications among Grids
Portal technology could be an answer for the users
Properties of the P-GRADE Portal
• General-purpose, graphical, workflow-oriented Grid portal to support the development and execution of workflow-oriented Grid applications
• Supported services – functionalities
  • MyProxy – proxy credential management
  • GT2/GT3 GRAM – job execution
  • Mercury – job/resource monitoring
  • PROVE – workflow/job execution visualization
  • BDII and MDS-2 – information system access
• Multi-Grid portal
• GridSphere based
  • Easy to expand with new portlets
  • Easy to tailor to end-user needs
Workflow support in the P-GRADE Portal
• Contains
  • a built-in workflow editor and
  • a workflow execution manager
• Components of the workflow can be
  • a job to be executed as a Condor job in GT2/GT3 Grids (like LCG-2, GridLab, etc.)
  • a GEMLCA service to be executed in GT3 and GT4 Grids (UK OGSA test-bed project)
• Portal release, tutorial and demo are available at: http://www.lpds.sztaki.hu/pgportal
What is a workflow?
• The workflow is a graph where
  • Nodes are jobs (or services)
  • Arcs represent file transfers between the jobs (services)
• Semantics of the workflow:
  • A job can be executed once all the file transfers represented by its incoming arcs are completed
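The graph semantics can be illustrated with a minimal Python sketch. The `Workflow` class and its method names are hypothetical and are not part of the portal. As a simplifying assumption, a file transfer counts as completed as soon as the job that produces the file has finished.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Illustrative model only: nodes are jobs, arcs are file transfers."""

    jobs: set[str] = field(default_factory=set)
    # Each arc (src, dst) means: a file produced by src must reach dst.
    arcs: set[tuple[str, str]] = field(default_factory=set)

    def add_transfer(self, src: str, dst: str) -> None:
        """Register both jobs and the file transfer between them."""
        self.jobs.update((src, dst))
        self.arcs.add((src, dst))

    def ready_jobs(self, finished: set[str]) -> set[str]:
        """Unfinished jobs whose incoming transfers all start at finished jobs.

        Assumption: a transfer is complete once its source job has finished.
        """
        return {
            job
            for job in self.jobs - finished
            if all(src in finished for (src, dst) in self.arcs if dst == job)
        }


wf = Workflow()
wf.add_transfer("A", "C")
wf.add_transfer("B", "C")
wf.add_transfer("C", "D")
print(wf.ready_jobs(set()))        # A and B have no incoming arcs
print(wf.ready_jobs({"A", "B"}))   # C becomes executable
```

Job C stays blocked while either A or B is still unfinished, which is the rule stated on the slide.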
Two-level parallelism by a workflow
• The workflow concept enables the efficient solution of complex problems in a distributed environment like the Grid
• The semantics of the workflow enable two levels of parallelism:
  • Parallel execution inside a workflow node (jobs can be parallel)
  • Parallel execution among workflow nodes
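Parallelism among workflow nodes can be sketched in Python as follows. This is an illustration under stated assumptions, not the portal's scheduler: `run_workflow`, `deps` and `run_job` are hypothetical names, and the sketch runs jobs wave by wave, which is more conservative than starting each job as soon as its own inputs are available. Parallelism inside a node (for example an MPI job) would live inside `run_job`.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(deps, run_job):
    """Run jobs in waves; jobs within one wave execute concurrently.

    deps maps each job name to the set of jobs it needs files from.
    run_job is any callable taking a job name (it may itself be parallel).
    Returns the list of waves in execution order.
    """
    finished, waves = set(), []
    with ThreadPoolExecutor() as pool:
        while len(finished) < len(deps):
            # A job is runnable when all jobs it depends on have finished.
            wave = sorted(
                job for job in deps
                if job not in finished and deps[job] <= finished
            )
            if not wave:
                raise ValueError("workflow graph contains a cycle")
            list(pool.map(run_job, wave))  # wait for the whole wave
            finished.update(wave)
            waves.append(wave)
    return waves


deps = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}
print(run_workflow(deps, lambda job: None))
```

In this example A and B have no dependencies on each other, so they form one wave and can run at the same time.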
Ultra-short range weather forecast (Hungarian Meteorology Service)
• Forecasting dangerous weather situations (storms, fog, etc.) is a crucial task in the protection of life and property
• Processed information: surface level measurements, high-altitude measurements, radar, satellite, lightning, results of previously computed models
• Requirements:
  • Execution time < 10 min
  • High resolution (1 km)
[Workflow diagram; node multiplicity labels: 25 x, 10 x, 5 x, 25 x]
Why is it called P-GRADE Portal?
• P-GRADE is a Parallel Grid Run-time and Application Development Environment
  • It contains a workflow editor and execution support
• The P-GRADE workflow editor is compatible with the workflow editor of the portal
• Workflows developed in P-GRADE can be used in the P-GRADE portal
  • They can be
    • submitted to the Grid by the portal
    • modified by the workflow editor of the portal
• P-GRADE is the development environment and the portal is the Grid execution environment
Principles of the P-GRADE portal
[Diagram labels: Certificate server; Portal server; Remote clusters to be controlled; CERTIFICATE (upload, download); EDITOR (open, save/upload); WORKFLOW MANAGER (submit, output); Workflow (result)]
P-GRADE Portal Login
• Current release is 2.1
• A username and a password are required for login
• Apply for them at the service provider:
  • SZTAKI for HunGrid and SEE-Grid
  • Univ. of Westminster for UK NGS
Certificate Manager (certificates portlet)
• Certificates can be obtained from the MyProxy server using the “Download” button after entering the certificate and server details
• Login and proxy downloading require different usernames and passwords
• To submit jobs, users must have valid certificates
• The Certificate Manager handles the uploading, downloading, displaying and allocating of users’ certificates
• Downloaded certificates have to be allocated to a Grid using the “Set for Grid” button
  • This operation displays the details of the certificate and the list of available Grids
• Different Grids (e.g. SEEGRID and HunGrid) require different user certificates
  • The Certificate Manager handles the different certificates of different Grids
• Certificate details can be displayed using the “Details” button
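The bookkeeping described on this slide (download a certificate, allocate it to a Grid, require a valid certificate before submission) can be sketched in Python. Everything here is hypothetical: the class names, fields and methods are illustrative and do not reflect the portlet's implementation or the MyProxy API.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ProxyCertificate:
    subject: str       # illustrative label, not a real certificate subject
    expires: datetime  # the certificate is valid before this moment


class CertificateManager:
    """Hypothetical stand-in for the portlet's certificate bookkeeping."""

    def __init__(self) -> None:
        self._downloaded: dict[str, ProxyCertificate] = {}
        self._grid_to_cert: dict[str, str] = {}

    def download(self, name: str, cert: ProxyCertificate) -> None:
        """Models the "Download" button: store a certificate under a name."""
        self._downloaded[name] = cert

    def set_for_grid(self, name: str, grid: str) -> None:
        """Models the "Set for Grid" button: allocate a certificate to a Grid."""
        if name not in self._downloaded:
            raise KeyError(f"certificate {name!r} has not been downloaded")
        self._grid_to_cert[grid] = name

    def can_submit(self, grid: str, now: datetime) -> bool:
        """Jobs may be submitted only with an unexpired certificate for the Grid."""
        name = self._grid_to_cert.get(grid)
        return name is not None and self._downloaded[name].expires > now


mgr = CertificateManager()
mgr.download("see", ProxyCertificate("user-see", datetime(2030, 1, 1)))
mgr.set_for_grid("see", "SEEGRID")
print(mgr.can_submit("SEEGRID", datetime(2029, 1, 1)))
print(mgr.can_submit("HunGrid", datetime(2029, 1, 1)))
```

Keeping a separate Grid-to-certificate mapping mirrors the point that different Grids require different user certificates.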
Resource Manager (settings portlet)
• The Resource Manager helps to add and delete resources of different Grids
• To add a new resource to a Grid, users have to select the Grid using the “Resources” button
• The portal displays the list of available resources of the selected Grid
• Users can add resources from the list using the “Load default” button or edit them manually
Main interactions 2.
[Diagram labels: Certificate server; Portal server; Remote clusters to be controlled; CERTIFICATE (upload, download); EDITOR (open, save/upload); WORKFLOW MANAGER (submit, output); Workflow (result)]
Workflow Editor
• The workflow is a directed graph
• Every node is a job
• The arrows represent the file transfers among the jobs (and sites)
Workflow Editor
• Every job has a properties window
  • This window contains the most important information about the job, e.g. type, name, required number of processes, and the Grid and resource name for execution
Monitoring System (information system portlet)
• The portal uses the MDS-2 and LCG-2 information systems
• Users can select a Grid and a Virtual Organization to be displayed
Monitoring System (information system portlet)
• Detailed view of the selected Grid
Workflow Editor
• Every job has ports
  • A green box represents an input port
  • Every port has a properties window to define its special settings
[Diagram: file transfers between the User Desktop, the P-GRADE Portal Server, the executing resources (CE) running Job i, Job j and Job k, and the locations of remote files (SE)]
• Step 0, Step 4: the eventual communication between the user desktop and the locations of the remote file(s) is outside the scope of the Portal
• Step 1: local input files, job executables and the workflow graph are uploaded to the P-GRADE Portal Server by the Workflow Editor
• Step i2: the executable and the local input files are copied upon submission of job i; remote input files are also copied upon submission of job i
• Step i3: remote output files and local output files are copied upon termination of job i
• Step 4: local output files may be downloaded upon termination of the whole workflow
Workflow Editor
• A grey box represents an output port
• Every port has a properties window to define its special settings
Workflow Editor
• The Workflow pull-down menu allows prepared workflows to be saved and opened again later
Main interactions 3.
[Diagram labels: Certificate server; Portal server; Remote clusters to be controlled; CERTIFICATE (upload, download); EDITOR (open, save/upload); WORKFLOW MANAGER (submit, output); Workflow (result)]
Workflow Management (workflow portlet)
• The portlet presents the status, size and output of the available workflows in the “Workflow” list
• It has a Quota manager to control the users’ storage space on the server
• The portlet also contains the “Abort”, “Attach”, “Details”, “Delete” and “Delete all” buttons to handle the execution of workflows
  • The “Attach” button opens the workflow in the Workflow Editor
  • The “Details” button gives an overview of the selected workflow
Workflow Execution (workflow portlet)
• The jobs are executed on the selected Grids and resources
  • The blocks correspond to the jobs of the workflow
  • The arrows represent the file transfers among the sites
• After submission, the status of the workflow’s jobs is initialized
  • White represents the INIT status
  • Red represents the RUNNING status
  • Green represents the FINISHED status
• The portal animates the workflow on-line according to the status of each job
• When the workflow has completed, the results can be downloaded in zip format
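The status-to-color convention can be written down as a small Python sketch. The status names and colors come from the slide; the `JobStatus` enum and the two helper functions are hypothetical and only illustrate how a display could be derived from per-job states.

```python
from enum import Enum


class JobStatus(Enum):
    """Job states named on the slide, each paired with its display color."""

    INIT = "white"
    RUNNING = "red"
    FINISHED = "green"


def color_view(statuses):
    """Map each job name to the color used to draw its block."""
    return {job: status.value for job, status in statuses.items()}


def workflow_finished(statuses):
    """Assumption: results become downloadable once every job is FINISHED."""
    return all(status is JobStatus.FINISHED for status in statuses.values())


statuses = {
    "A": JobStatus.FINISHED,
    "B": JobStatus.RUNNING,
    "C": JobStatus.INIT,
}
print(color_view(statuses))
print(workflow_finished(statuses))
```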
Workflow Execution (workflow portlet)
• The portal displays the list of jobs and their status
• The current status of each job is represented by a color
• It also provides access to their logs and outputs, and visualizes them
White/Red/Green color means the job is initialised/running/finished
Main interactions 4.
[Diagram labels: Certificate server; Portal server; Remote clusters to be controlled; CERTIFICATE (upload, download); EDITOR (open, save/upload); WORKFLOW MANAGER (submit, output); Workflow (result)]
Workflow Execution (workflow portlet)
• When the jobs are finished, the results can be downloaded in zip format
• The “Output” button displays the output of each job
White/Red/Green color means the job is initialised/running/finished
On-Line Monitoring (workflow portlet)
• The portal monitors and displays workflows
• The portal also monitors and visualizes the jobs and processes of workflows
• The portal also generates a statistics view
Today’s reality: technology-specific Grid portals
[Diagram: a Web browser and three Grid portals in front of Grids made up of nodes labelled GS; Grid labels: NorduGrid, GT-3, LCG-2, HCG]
Future Grid portals: functionality-specific portals
[Diagram: a Web browser and three portals (Medical Grid portal, Computational Grid portal, Physics Grid portal) in front of nodes labelled GS]
Callout: This is the P-GRADE Portal!
Future multi-Grid portals
[Diagram: nodes labelled GS and WS; Grid labels: GT3, Jini]
The problem of current portals
• They are tightly connected and tailored to only one particular Grid
  • If the user wants to move to another Grid, she has to learn the new environment
• They do not support simultaneous access to several Grids
• P-GRADE portal release 2.1 solves these problems
Multi-Grid portals
Portal classification
MISI (current) Portals
[Diagram labels: LCG-2; P-GRADE-Portal; Rome, London, Athens]
MIMI Portal: P-GRADE portal (2.1)
• The user can choose where to execute the workflow
[Diagram labels: GridLab; SEE-Grid; P-GRADE-Portal; Rome, London, Athens]
MIMC Portal: P-GRADE portal (2.1)
• Different jobs can be executed in different Grids
[Diagram labels: GridLab; SEE-Grid; P-GRADE-Portal; London, Rome, Athens]
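The difference between choosing one Grid for a whole workflow and placing individual jobs in different Grids can be sketched in Python. The function names and the job names are hypothetical; only the Grid names SEE-Grid and GridLab are taken from the slides, and the portal's actual data model is not shown in the source.

```python
def assign_grids(jobs, default_grid, per_job=None):
    """Return a job -> Grid mapping.

    Without overrides the whole workflow runs in default_grid (the user
    chooses where to execute it). Entries in per_job place single jobs in
    other Grids, so different jobs can run in different Grids.
    """
    per_job = per_job or {}
    return {job: per_job.get(job, default_grid) for job in jobs}


def group_by_grid(assignment):
    """Invert the mapping: Grid -> sorted list of jobs submitted there."""
    groups = {}
    for job, grid in assignment.items():
        groups.setdefault(grid, []).append(job)
    return {grid: sorted(names) for grid, names in groups.items()}


jobs = ["pre", "sim", "post"]
one_grid = assign_grids(jobs, "SEE-Grid")
mixed = assign_grids(jobs, "SEE-Grid", {"sim": "GridLab"})
print(group_by_grid(one_grid))
print(group_by_grid(mixed))
```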