WS-PGRADE: Supporting parameter sweep applications in workflows Péter Kacsuk, Krisztián Karóczkai,Gábor Hermann, Gergely Sipos, and József Kovács MTA SZTAKI
Content • Motivations • Lessons learnt from P-GRADE portal • Lessons learnt from CancerGrid • Workflow concept of gUSE/WS-PGRADE • Parameter sweep support of gUSE • CancerGrid • Executing PS nodes of gUSE workflows in desktop grids • Conclusions
Popularity of P-GRADE portal • It has been used in many EGEE and EGEE-related VOs: • GILDA, VOCE, SEE-GRID, BalticGrid, BioInfoGrid, EGRID, etc. • It has been used in many national grids: • UK NGS, Grid-Ireland, Turkish Grid, Croatian Grid, Grid Malaysia, etc. • It has been used as the GIN VO Resource Testing Portal • It became OSS at the beginning of January 2008: https://sourceforge.net/projects/pgportal/
Downloads of OSS P-GRADE portal • 828 downloads so far
Lessons learnt from P-GRADE portal • Popular because it provides • Easy-to-use but powerful workflow system (graphical editor, wf manager, etc.) • Easy-to-use parameter sweep concept support • Easy-to-use MPI program execution support • Grid virtualization: • Multi-grid/multi-VO access mechanism for LCG-2, gLite, GT2 and GT4
Introducing three levels of parallelism • Parallel execution inside a workflow node: each job can be a parallel program • Parallel execution among workflow nodes: multiple jobs run in parallel • Parameter study execution of the workflow: multiple instances of the same workflow run with different data files
Parameter study workflow • GEN: a generator grid job produces the input parameter space • SEQ: parameter sweep grid jobs (the swept part could be any workflow) • COLL: a collector grid job evaluates the results of the simulation
3-phase PS execution in P-GRADE portal • First phase: executing once all the Generators • Second phase: executing all generated workflows in parallel • Last phase: executing once all the Collectors (see the sketch below)
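A minimal Python sketch of this three-phase pattern, with local processes standing in for grid jobs; the function names and the toy computation are illustrative, not the P-GRADE API:

```python
# Phase 1: run the generator(s) once to produce the parameter space.
# Phase 2: run the generated sweep jobs in parallel.
# Phase 3: run the collector(s) once on the gathered results.
from concurrent.futures import ProcessPoolExecutor

def generator():
    """GEN: produce the input parameter space."""
    return [{"x": x} for x in range(8)]

def simulate(params):
    """SEQ: one parameter sweep job (could itself be a whole workflow)."""
    return params["x"] ** 2

def collector(results):
    """COLL: evaluate the results of the simulation."""
    return max(results)

if __name__ == "__main__":
    space = generator()                           # phase 1: once
    with ProcessPoolExecutor() as pool:           # phase 2: in parallel
        results = list(pool.map(simulate, space))
    print(collector(results))                     # phase 3: once
```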
CancerGrid workflow needs more • Usage of generators and collectors at any node of the WF, without any ordering restrictions • Usage of PS execution at node level at any node of the WF, without any ordering restrictions
CancerGrid workflow needs more • N = 30K, M = 100 => about 0.5 year execution time (see the arithmetic below) • [Workflow diagram: two generator jobs; node multiplicities grow from x1 through xN (N = 30K) up to NxM = 3 million]
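For scale, the slide's own numbers imply only a few seconds per job, assuming the half-year figure refers to sequential execution:

```latex
N \times M = 30\,000 \times 100 = 3 \times 10^{6} \ \text{jobs}, \qquad
\frac{0.5\ \text{year}}{3 \times 10^{6}\ \text{jobs}}
  \approx \frac{1.58 \times 10^{7}\ \text{s}}{3 \times 10^{6}}
  \approx 5\ \text{s per job},
```

so the half-year wall time comes from sheer job count rather than per-job cost, which is exactly what massively parallel desktop grid execution can attack.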
Solution of the problem • We need an environment where the user can develop and execute such a workflow • The environment should contain a broker that decides where to execute the nodes (see the routing sketch below): • MPI nodes on SG clusters • Nodes with very short execution time on local resources • Seq. nodes with a small number of invocations at SGs • Seq. nodes called many times at DGs • Such an environment for SGs is: • gUSE: provides middleware based on a high-level set of services • WS-PGRADE: provides a workflow user interface
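A minimal sketch of such a routing rule; the node attributes, thresholds, and target names are assumptions for illustration, not the gUSE broker API:

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    is_mpi: bool        # is the node an MPI program?
    invocations: int    # how many times will the node be submitted?
    runtime_s: float    # estimated runtime of one invocation, in seconds

def route(node: Node) -> str:
    """Map a workflow node to an execution target, per the slide's rules."""
    if node.is_mpi:
        return "SG cluster"        # MPI nodes on service-grid clusters
    if node.runtime_s < 1.0:
        return "local resource"    # very short jobs stay local
    if node.invocations > 10_000:
        return "desktop grid"      # seq. nodes called many times go to DGs
    return "service grid"          # remaining sequential nodes go to SGs

print(route(Node("docking", is_mpi=False, invocations=3_000_000, runtime_s=5.0)))
# -> desktop grid
```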
gUSE and WS-PGRADE • gUSE (grid User Support Environment) • is a grid virtualization environment • exposes the grid as a workflow • enables the execution of workflows simultaneously in many grids, regardless of their middleware • WS-PGRADE is the user interface that supports • editing, configuring, and publishing workflows (as grid applications)
PS workflow concept of WS-PGRADE • Any node of the workflow can be: • PS job • Generator • Collector • There are two kinds of relationship between input files of PS nodes: • Cross product • Dot product
Workflow graph overview in WS-PGRADE • Node: job, service call (WS, legacy), or embedded workflow • Nodes are connected through input ports and output ports • [Screenshot: the Workflow Editor as it appears to the user]
Configuring the workflow (legend: cross product, dot product) • Specify the number of input files on the external input ports (m, n, h in the diagram) • A generator job produces multiple data items (*K) on its output port within one job submission step • Specify the dot or cross product relation of input ports to define the number of job submissions • Specify a job to be a collector by defining a gathering input port; the job's execution is postponed until all input files have arrived at that port
Animation: the number of generated output files • In case of cross product, a separate job submission is generated for each possible input file combination: ports with m and n files yield m*n submissions • The generator job runs h times and each run generates K files on the output port, i.e. h*K files in total • In case of dot product, the job is submitted with input files having a common index number in each input port: S = max(m*n, h*K) submissions • A cross product of the same ports would instead yield m*n*h*K submissions • The collector runs once (see the worked example below)
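A worked example of these counting rules in Python; the file names are placeholders, and the dot product rule S = max(m*n, h*K) is taken from the slide (shorter lists presumably reused), not from the gUSE implementation:

```python
from itertools import product

# Hypothetical file lists on the external input ports (m = 2, n = 3, h = 4).
port_m = [f"m{i}" for i in range(2)]
port_n = [f"n{i}" for i in range(3)]
port_h = [f"h{i}" for i in range(4)]
K = 5  # the generator emits K files per run

# Cross product: one job submission per input-file combination.
cross_mn = list(product(port_m, port_n))
assert len(cross_mn) == 2 * 3                    # m*n = 6 submissions

# The generator runs h times, producing h*K files on its output port.
gen_out = [f"g{run}_{k}" for run in range(len(port_h)) for k in range(K)]
assert len(gen_out) == 4 * 5                     # h*K = 20 files

# Dot product: inputs paired by common index; per the slide,
# the number of submissions is S = max(m*n, h*K).
S = max(len(cross_mn), len(gen_out))             # S = 20

# A cross product of the same two ports yields every combination instead.
assert len(cross_mn) * len(gen_out) == 2 * 3 * 4 * 5   # m*n*h*K = 120
print(S)
```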
The user concern • I have a large workflow containing: • Sequential nodes to be executed once • Sequential nodes to be executed many times (PS) • MPI nodes to be executed once • MPI nodes to be executed many times (PS) • I want to execute this workflow as fast as possible using as many resources as possible
Executing the CancerGrid workflow on mixed resources • Nodes of the workflow are mapped to different targets: the private DG of the CancerGrid project, a Web Service, a local resource, and the EDGeS VO of EGEE • [Workflow diagram: the same CancerGrid workflow as before, with two generator jobs and node multiplicities from x1 and xN (N = 30K) up to NxM = 3 million]
Putting everything together • gUSE/WS-PGRADE provides transparent access to SGs/DGs • [Architecture diagram: WS-PGRADE and the Appl. Repository sit on top of gUSE; gUSE connects through GlobalDEG and LocalDEG components to service grids (EGEE, OSG) and to desktop grids (University DG, Volunteer DG)]
Family of P-GRADE products and their use • P-GRADE • Parallelizing applications for clusters and grids • P-GRADE portal • Creating simple workflow and parameter sweep applications for grids • P-GRADE/GEMLCA portal • Creating workflow applications using legacy codes and community codes from repository • gUSE/WS-PGRADE • Creating complex workflow and parameter sweep applications to run on clusters, service grids and desktop grids • Creating workflow applications using embedded workflows, legacy codes and community workflows from workflow repository
Conclusions • gUSE and WS-PGRADE solve all the limitations of the P-GRADE portal: • The implementation of gUSE is highly scalable and can be distributed on a cluster or even across different grid sites • Stress tests show that it can simultaneously serve thousands of jobs (it currently manages ~100,000 jobs in CancerGrid) • Its workflow concept is much more expressive than that of the P-GRADE portal (recursive WFs, generic PS support, etc.) • WS-PGRADE provides two user interfaces: • Developer (creates and exports WFs into the WF repository of gUSE) • End-user (imports and executes WFs from the WF repository) • gUSE provides grid virtualization at the workflow level: nodes of a WF can be executed by • Web Services, local resources, service grids, and desktop grids (see the EDGeS project)