P-GRADE Portal: Towards a User-friendly Grid Environment. Gergely Sipos. MTA SZTAKI, Hungary sipos@sztaki.hu www.lpds.sztaki.hu/pgportal pgportal@lpds.sztaki.hu. Technology concerns of Grid systems. Fast evolution of Grid systems and middleware:
P-GRADE Portal: Towards a User-friendly Grid Environment Gergely Sipos MTA SZTAKI, Hungary sipos@sztaki.hu www.lpds.sztaki.hu/pgportal pgportal@lpds.sztaki.hu
Technology concerns of Grid systems • Fast evolution of Grid systems and middleware: • GT1, GT2, OGSA, GT3 (OGSI), GT4 (WSRF), LCG-2, gLite, … • Many Grid systems are built based on these different technologies • EGEE (LCG-2), UK NGS (GT2), Open Science Grid (GT3), etc.
Grid systems for HPC – User concerns • How to cope with the variety of Grid systems? • How to develop/create new Grid applications? • How to execute Grid applications? • How to observe the application execution in the Grid? • How to tackle performance issues? • How to execute Grid applications over several Grids in a transparent way? • The P-GRADE Grid Portal gives the answer!
Properties of the P-GRADE Portal • General purpose, workflow-oriented computational Grid portal. Supports the development and execution of workflow-based Grid applications. • Support for multi-grid workflows • GridSphere based • Easy to expand with new portlets (e.g. application-specific portlets) • Easy to tailor to end-user needs • Grid middleware services supported by the portal:
What is a P-GRADE Portal workflow? • A directed acyclic graph where • Nodes represent jobs (executable batch programs) • Ports represent input/output files the jobs expect/produce • Arcs represent file transfers between the jobs • Semantics of the workflow: • A job can be executed if all of its input files are available • local input files: on the portal server • remote input files: on storage elements
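The execution rule above (a job can run once all of its input files are available) can be sketched in a few lines of Python. This is only an illustration of the semantics, not the portal's workflow manager: the job names, file names and the `ready_waves` helper are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical workflow: each job lists the files it expects and produces.
JOBS: Dict[str, Dict[str, Set[str]]] = {
    "prepare": {"inputs": {"raw.dat"}, "outputs": {"grid.dat"}},
    "model_a": {"inputs": {"grid.dat"}, "outputs": {"a.out"}},
    "model_b": {"inputs": {"grid.dat"}, "outputs": {"b.out"}},
    "merge": {"inputs": {"a.out", "b.out"}, "outputs": {"result.dat"}},
}


def ready_waves(jobs: Dict[str, Dict[str, Set[str]]],
                initial_files: Set[str]) -> List[List[str]]:
    """Group jobs into waves; a job joins a wave once all its inputs exist."""
    available = set(initial_files)
    pending = dict(jobs)
    waves: List[List[str]] = []
    while pending:
        # Every job whose input files are all available may start now.
        wave = sorted(name for name, job in pending.items()
                      if job["inputs"] <= available)
        if not wave:
            raise ValueError(f"jobs can never start: {sorted(pending)}")
        for name in wave:
            # Finished jobs make their output files available to later jobs.
            available |= pending.pop(name)["outputs"]
        waves.append(wave)
    return waves


waves = ready_waves(JOBS, {"raw.dat"})
print(waves)  # [['prepare'], ['model_a', 'model_b'], ['merge']]
```

Jobs that land in the same wave do not depend on each other, so they could be submitted at the same time.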
Two levels of parallelism by a workflow • The P-GRADE Portal workflow concept enables the efficient parallelization of complex problems • The semantics of the workflow enable two levels of parallelism: • Parallel execution among workflow nodes: multiple jobs can run in parallel • Parallel execution inside a workflow node: the job can be a parallel program
Ultra-short range weather forecast (Hungarian Meteorology Service) • Forecasting dangerous weather situations (storms, fog, etc.), a crucial task in the protection of life and property • Processed information: surface level measurements, high-altitude measurements, radar, satellite, lightning, results of previously computed models • Requirements: • Execution time < 10 min • High resolution (1 km) (Figure: workflow graph with job groups labelled 25 x, 10 x, 5 x and 25 x)
The problem of current portals • They are tightly connected and tailored to only one particular Grid (e.g. NGS portal, NorduGrid portal) • If the user wants to move to another Grid: • (She has to obtain a certificate for the new Grid) • She has to register for the new Grid • She has to get an account for its portal • She has to learn the new environment • She has to copy the grid files & modify the application • P-GRADE Portal releases 2.1 and above solve these problems: • (Obtain a certificate for the new Grid) • Register for the new Grid • Map some of the jobs of your workflow onto resources of this Grid
Multi-Grid P-GRADE Portal • The portal can be connected to multiple grids • Different jobs of a workflow can be executed in different grids (Figure: a P-GRADE Portal connected to an EGEE Grid, e.g. VOCE, and to the UK NGS; sites labelled London, Rome, Athens)
The typical P-GRADE Portal scenario, Part 1 - development phase (Figure: certificate servers, Grid services and portal server; actions: DEFINE GRID ENVIRONMENT, OPEN EDITOR, OPEN & EDIT or DEVELOP WORKFLOW, SAVE WORKFLOW)
The typical P-GRADE Portal scenario, Part 2 - execution phase (Figure: certificate servers, Grid services and portal server; actions: DOWNLOAD PROXY CERTIFICATES, TRANSFER FILES, SUBMIT JOBS, MONITOR JOBS, VISUALIZE JOBS and WORKFLOW PROGRESS, DOWNLOAD RESULTS)
Developing workflows with the P-GRADE Portal • Main steps: • Define the Grid environment • Define the workflow
The typical P-GRADE Portal scenario, Development phase – step 1 (Figure: certificate servers, Grid services and portal server; action: DEFINE THE GRID ENVIRONMENT)
Resource Manager (settings portlet) • Defines which computational resources my workflows will use • Two levels: • Define grids or VOs (administrator) • Name (e.g. EGrid) • Information system (e.g. egrid-2.egrid.it) • Define computational resources for each grid: • Automatically from the information system (only from MDS-2) • Centrally by the administrator • Individually by each user
Resource Manager (settings portlet – user view) • List of available grids • Defining computational resources for such a grid
Resource Manager (settings portlet – user view) • Every computational resource is identified by a • host name • port number (or use default) • local jobmanager (queue name) • e.g. egrid-3.egrid.it/jobmanager-fork
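A resource identifier of this shape can be split into its three parts as follows. This is a minimal sketch, not the portal's own code: the `ComputeResource` class and `parse_resource` function are hypothetical, and the `host:port/jobmanager` syntax for an explicit port is an assumption, since the slide only shows the form without a port.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComputeResource:
    host: str
    port: Optional[int]  # None means "use the default port"
    jobmanager: str


def parse_resource(text: str) -> ComputeResource:
    """Parse 'host[:port]/jobmanager' (the ':port' part is an assumed syntax)."""
    contact, slash, jobmanager = text.partition("/")
    if not slash or not contact or not jobmanager:
        raise ValueError(f"expected 'host[:port]/jobmanager', got {text!r}")
    host, colon, port_text = contact.partition(":")
    # int() raises ValueError if the port part is present but not numeric.
    port = int(port_text) if colon else None
    return ComputeResource(host, port, jobmanager)


resource = parse_resource("egrid-3.egrid.it/jobmanager-fork")
print(resource)
```

Keeping the port optional mirrors the "or use default" choice offered by the portlet.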
The typical P-GRADE Portal scenario, Development phase – step 2 (Figure: certificate servers, Grid services and portal server; actions: OPEN EDITOR, OPEN & EDIT or DEVELOP or IMPORT WORKFLOW, SAVE WORKFLOW)
Workflow development: opening the workflow editor • The editor is a Java Web Start application: dynamic download and installation!
Workflow Editor: defining the graph • The aim is to define a DAG of batch jobs: • Drag & drop components: jobs and ports • Define their properties • Connect ports by channels (no cycles, no loops, no conditions)
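The "no cycles" constraint on channels can be checked with a standard topological sort (Kahn's algorithm). The sketch below is an illustration only and says nothing about how the editor itself validates graphs; the function name and the job names in the example are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def topological_order(jobs: Iterable[str],
                      channels: Iterable[Tuple[str, str]]) -> List[str]:
    """Return jobs in a valid execution order; raise if channels form a cycle."""
    successors: Dict[str, List[str]] = {job: [] for job in jobs}
    indegree: Dict[str, int] = {job: 0 for job in successors}
    for producer, consumer in channels:
        successors[producer].append(consumer)
        indegree[consumer] += 1

    # Start with jobs that have no incoming channel.
    queue = deque(job for job, degree in indegree.items() if degree == 0)
    order: List[str] = []
    while queue:
        job = queue.popleft()
        order.append(job)
        for nxt in successors[job]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Jobs on a cycle never reach indegree 0, so they are missing from order.
    if len(order) != len(successors):
        raise ValueError("channels contain a cycle; the workflow must be a DAG")
    return order


order = topological_order(["A", "B", "C"], [("A", "B"), ("B", "C")])
print(order)  # ['A', 'B', 'C']
```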
Workflow Editor: defining the jobs • Define the job: • Executable file • Executable type • Number of required processors • Command line parameters • The resource to be used for the execution: • Grid • (Comp. resource)
Which resource to use? • I still don’t know which resource to use! • The information system portlet helps characterize resources!
Automatic resource selection (since P-GRADE Portal v2.2) • Describe the requirements of the job • Select an LCG-2 middleware based Grid (e.g. VOCE) for it • The workflow manager will use the broker of that Grid during the execution to find the best resource for the job
Workflow Editor: defining jobs in v2.2 • Select an LCG-2 based Grid (*_LCG_2_BROKER)! • Ignore the resource field! • Define optional requirements using the built-in JDL editor!
Workflow Editor: JDL editor in v2.2 • JDL: look at the LCG-2 Users’ manual!
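For orientation, a JDL description is a list of `Name = value;` attribute lines. The Python sketch below renders such a list as text. It is a hedged illustration: the `render_jdl` helper and the `forecast.sh` script name are hypothetical, and the requirement and rank expressions are only examples of the kind of constraint a broker can evaluate, so check attribute names against the LCG-2 Users’ manual before relying on them.

```python
from typing import Dict


def render_jdl(attributes: Dict[str, str]) -> str:
    """Render attribute/expression pairs as JDL 'Name = value;' lines."""
    return "\n".join(f"{name} = {value};" for name, value in attributes.items())


# String values carry their own quotes; expressions are written bare.
jdl = render_jdl({
    "Executable": '"forecast.sh"',  # hypothetical script name
    "StdOutput": '"std.out"',
    "StdError": '"std.err"',
    # Illustrative expressions only; verify them against the manual.
    "Requirements": "other.GlueCEPolicyMaxCPUTime > 60",
    "Rank": "-other.GlueCEStateEstimatedResponseTime",
})
print(jdl)
```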
Workflow Editor: defining the ports • Type: • input: the job requires • output: the job produces • File type: • local: from/to my desktop • remote: from/to a storage resource • File: location of the file • Storage type: • Permanent: final result of the WF • Volatile: just inter-job data transfer
Location of files • Input file: • Local file: client side location: c:\experiments\11-04.dat • Remote file: • Grid Unique IDentifier (GUID): guid:1fd75fdf-dccc-4603-998b-e17facb0d034 • RLS logical file name: lfn:/sipos_11_04.dat • LFC logical file name (NOT IN VOCE!): lfn:/grid/egrid/sipos/11-04.dat • Output file: • Local file: client side location: result.dat • Remote file: • RLS logical file name: lfn:/sipos_11_04-result.dat • LFC logical file name (NOT IN VOCE!): lfn:/grid/egrid/sipos/11-04-result.dat
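The three location styles on this slide can be told apart by their prefix. The following Python sketch shows one way to do that; the `classify_location` function and its return labels are hypothetical, and the portal may distinguish file types differently (for instance through the port's file type setting rather than the string itself).

```python
def classify_location(location: str) -> str:
    """Classify a file location by its prefix, using the slide's examples."""
    if location.startswith("guid:"):
        return "remote (GUID)"
    if location.startswith("lfn:"):
        return "remote (logical file name)"
    # Anything else is treated as a client side path.
    return "local (client side path)"


examples = [
    r"c:\experiments\11-04.dat",
    "guid:1fd75fdf-dccc-4603-998b-e17facb0d034",
    "lfn:/sipos_11_04.dat",
    "lfn:/grid/egrid/sipos/11-04.dat",
]
for location in examples:
    print(f"{location} -> {classify_location(location)}")
```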
Local vs. remote files • Only the permanent files! (Figure: portal server, Grid services, computational resources and storage resources; transfers: LOCAL INPUT FILES & EXECUTABLES, REMOTE INPUT FILES, REMOTE OUTPUT FILES, LOCAL OUTPUT FILES)
Workflow Editor: saving the workflow • The workflow has been defined! Let’s execute it!
Executing workflows with the P-GRADE Portal • Main steps: • Download proxies • Submit workflow • Observe workflow progress • If an error occurs, correct the graph • Download result
The typical P-GRADE Portal scenario, Execution phase – step 1 (Figure: certificate servers, Grid services and portal server; action: DOWNLOAD PROXY CERTIFICATES)
Certificate Manager: certificates portlet • To access GSI-based Grids, the portal server application needs proxy certificates • “Certificates” portlet: • to upload X.509 certificates into MyProxy servers • to download short-term proxy credentials into the portal server application
Certificate Manager: downloading a proxy • MyProxy server access details: • Hostname (egrid-1.egrid.it) • Port number (7512) • User name (from upload) • Password (from upload) • Proxy parameters: • Lifetime • Comment
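The same access details are what a command-line MyProxy client needs. The sketch below collects them in a small Python class and builds an argument list for the `myproxy-logon` client. It is an assumption that this client is installed; the portal performs the download itself and is not claimed to call it. The `ProxyRequest` class, the user name `sipos` and the 12-hour default lifetime are hypothetical. The password is deliberately not put on the command line, since the client prompts for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProxyRequest:
    hostname: str
    username: str
    port: int = 7512           # port number shown on the slide
    lifetime_hours: int = 12   # hypothetical default lifetime

    def command(self) -> List[str]:
        """Build an argument list for myproxy-logon (assumed to be installed)."""
        if not 0 < self.port < 65536:
            raise ValueError(f"invalid port: {self.port}")
        if self.lifetime_hours <= 0:
            raise ValueError("lifetime must be positive")
        return ["myproxy-logon",
                "-s", self.hostname,
                "-p", str(self.port),
                "-l", self.username,
                "-t", str(self.lifetime_hours)]


request = ProxyRequest(hostname="egrid-1.egrid.it", username="sipos")
print(" ".join(request.command()))
```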
Certificate Manager: associating the proxy with a grid • This operation displays the details of the certificate and the list of available Grids
Certificate Manager: browsing proxies • Multiple proxies can be available on the portal server at the same time! (Figure labels: comp. resources of HUNGRID, comp. resources of SEE-GRID)
The typical P-GRADE Portal scenario, Execution phase - step 2 (Figure: certificate servers, Grid services and portal server; actions: TRANSFER FILES, SUBMIT JOBS)
Workflow Management (workflow portlet) • The portlet presents the status, size and output of the available workflows in the “Workflow” list • It has a Quota manager to control the users’ storage space on the server • The portlet also contains the “Abort”, “Attach”, “Details”, “Delete” and “Delete all” buttons to handle the execution of workflows • The “Attach” button opens the workflow in the Workflow Editor • The “Details” button gives an overview of the jobs of the workflow
Workflow Execution (observation by the workflow portlet) • White/Red/Green color means the job is initialised/running/finished
Workflow Execution • I still don’t know what’s happening inside my workflow!
The typical P-GRADE Portal scenario, Execution phase – step 3 (Figure: certificate servers, Grid services and portal server; actions: MONITOR JOBS, VISUALIZE JOBS and WORKFLOW PROGRESS)
On-Line Monitoring both at the workflow and job levels (workflow portlet) • The portal monitors and displays workflows
On-Line Monitoring both at the workflow and job levels (workflow portlet) • The portal also monitors and visualizes parallel jobs(if they were developed with the P-GRADE Environment) • The portal also generates a statistical view
Rescuing a failed workflow 1. (from v2.2) • A job failed during workflow execution • Read the error log to find out why
Rescuing a failed workflow 2. (from v2.2) • Map the failed job onto a different resource or download a new proxy for it • Don’t touch the finished jobs! • The execution can continue from the point of failure!
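The rescue rule (leave finished jobs alone, continue from the point of failure) amounts to resubmitting only the jobs that have not finished. A minimal Python sketch, with hypothetical job names, state labels and a hypothetical `jobs_to_resubmit` helper:

```python
from typing import Dict, List


def jobs_to_resubmit(states: Dict[str, str]) -> List[str]:
    """Select every job that is not finished; finished jobs are left untouched."""
    return [job for job, state in states.items() if state != "finished"]


# Hypothetical workflow state after a failure in one job.
states = {
    "prepare": "finished",
    "model_a": "failed",
    "model_b": "finished",
    "merge": "initialised",
}
pending = jobs_to_resubmit(states)
print(pending)  # ['model_a', 'merge']
```

Before resubmission, the failed job would first be mapped onto a different resource or given a new proxy, as the slide describes.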
The typical P-GRADE Portal scenario, Execution phase – step 5 (Figure: certificate servers, Grid services and portal server; action: DOWNLOAD RESULTS)
Grid systems for HPC – User concerns • How to cope with the variety of Grid systems? • How to develop/create new Grid applications? • How to execute Grid applications? • How to observe the application execution in the Grid? • How to tackle performance issues? • How to execute Grid applications over several Grids in a transparent way?
References • Official portal of: • SEE-GRID infrastructure • VOCE infrastructure • HUNGRID infrastructure • P-GRADE Portal is available as a service for: • Croatian Grid • UK National Grid Service • EGrid (Italy)
How to access the P-GRADE Portal? • If you are interested in using the P-GRADE Portal: • Take a look at www.lpds.sztaki.hu/pgportal (slideshows, manuals, etc.) • Get an account for one of its production installations: • VOCE portal – SZTAKI • SEEGRID portal – SZTAKI • HUNGrid portal – SZTAKI • NGS portal – University of Westminster • If you are the administrator of a Globus/LCG-2 based Grid/VO, ask SZTAKI to install the P-GRADE Portal for you! • If you know the administrator of a P-GRADE Portal, you can ask him/her to give access to your Grid through his/her portal installation!
What more we can offer • GEMLCA-specific P-GRADE Portal: • Share legacy applications as jobs with members of a community • Portal service for the UK NGS: www.cpc.wmin.ac.uk/ngsportal • LCG-2 extension will be added by the end of 2005 • Collaborative P-GRADE Portal: • Develop workflows together with your colleagues in real time! • Execute the different jobs with different users’ certificates • Will be available in 2006