A Performance Study of Grid Workflow Engines

Corina Stratan (Parallel and Distributed Systems Group, Politehnica University of Bucharest, Romania); Alexandru Iosup and Dick Epema (PDS Group, Delft University of Technology, The Netherlands). IEEE/ACM Grid 2008, Tsukuba, JP.

Why are Grid Workflows Interesting? • Grids promise reliable and easy-to-use computational infrastructure for e-Science • Full automation from experiment design to final result • Often, automation = workflows • Jobs comprising inter-related computing and data-transfer tasks

Why is the Performance of Real Grid Workflow Engines Interesting? • For our users • Is this system suitable for its users? • Are other systems better? • For focusing on the right research problems • What are the interesting problems? System configuration? Which workflow characteristics? Other problems… • For simulation studies • Unrealistic assumptions limit the applicability of results. How scalable are Grid Workflow Engines (GWFEs)? What overheads do they have?

Problem: How to Assess the Performance of Grid Workflow Engines? • What do we want to assess? • Is testing in real environments appropriate? • What performance metrics are important? • What workflows to use? Our goal is to develop and validate a methodology for assessing GWFEs.

Outline • Introduction • Methodology for Testing GWFEs • The Methodology in Practice • Conclusion and Future Work

2. Methodology for Testing GWFEs: What to Assess? • Traditional: raw performance metrics • Runtime, wait time, etc. • In addition, for Grids (failure-prone, complex environments): • Overhead: What is the cost of using a GWFE? • Stability: Does the system behave consistently? • Scalability: Does the system support grid-size workloads? • Reliability: What is the impact of dynamic resource availability?

2. Methodology for Testing GWFEs: Is Testing in Real Environments Appropriate? • Our approach (novel): Testing complete grid middleware stacks in real grid environments. • Alternatives • Simulation [Ahmad & Kwok, JPDC’99] • Mathematical analysis • Testing GWFEs in isolation (think unit vs. integration testing)

2. Methodology for Testing GWFEs: What Performance Metrics are Important? • Overhead components: Oi, Oa, Os, Ost, Of • Raw performance: Makespan (MS), Speed-Up vs. Single/Infinite Machine, … • Stability: internal (MS IQR/Median), overall (MS Range/Median) • Scalability, Reliability [see article]. [Figure: Workflow Tasks, Grid Workflow Engine, Grid Resource Manager]
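The two stability ratios named on this slide can be illustrated with a short Python sketch. It is only a sketch under assumptions: the function name `stability_ratios` is hypothetical, the quartile method (inclusive) is a choice made here rather than one stated on the slide, and the slide does not say over which sample of makespans each ratio is taken, so the function simply accepts any list of makespan values.

```python
import statistics


def stability_ratios(makespans):
    """Return (IQR/median, range/median) for a list of makespan values.

    Hypothetical helper: the slide names the two ratios but not the
    quartile method or the sample they are computed over.
    """
    if len(makespans) < 2:
        raise ValueError("need at least two makespan samples")
    median = statistics.median(makespans)
    # Assumption: inclusive quartiles; other methods give slightly
    # different IQR values on small samples.
    q1, _, q3 = statistics.quantiles(makespans, n=4, method="inclusive")
    iqr_ratio = (q3 - q1) / median
    range_ratio = (max(makespans) - min(makespans)) / median
    return iqr_ratio, range_ratio


if __name__ == "__main__":
    # Made-up makespans in seconds, for illustration only.
    sample = [100, 110, 120, 130, 140]
    print(stability_ratios(sample))
```

Both ratios are dimensionless, so they can be compared across workloads with very different absolute makespans; smaller values indicate more consistent behavior.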

2. Methodology for Testing GWFEs: What Workflows to Use? • No accepted workload; no real system traces. • Sources: related simulation work, Standard Task Graph Set, our investigation of test workflows from 2 long-term grid traces [CG Symp.’08], our model of grid bags-of-tasks validated with 7 long-term grid traces [HPDC’08]. [Figure axes: number of graph nodes; graph traversal height]
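The two workflow characteristics named on this slide, number of graph nodes and graph traversal height, can be computed for a workflow DAG roughly as follows. This is a sketch under assumptions: "traversal height" is interpreted here as the number of tasks on the longest dependency chain, the workflow is given as a dictionary mapping each task to its successors, and the function name `workflow_shape` is hypothetical.

```python
from collections import deque


def workflow_shape(successors):
    """Return (node_count, height) of a workflow DAG.

    `successors` maps each task to the tasks that depend on it.
    Assumption: height = number of tasks on the longest chain.
    """
    nodes = set(successors)
    for children in successors.values():
        nodes.update(children)

    indegree = {n: 0 for n in nodes}
    for children in successors.values():
        for child in children:
            indegree[child] += 1

    # Kahn-style topological traversal, tracking the level of each task.
    level = {n: 1 for n in nodes}
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for child in successors.get(node, ()):
            level[child] = max(level[child], level[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if visited != len(nodes):
        raise ValueError("workflow graph contains a cycle")
    return len(nodes), max(level.values(), default=0)


if __name__ == "__main__":
    # Diamond-shaped workflow: a -> {b, c} -> d
    diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
    print(workflow_shape(diamond))
```

A bag-of-tasks (no dependencies) has height 1 under this interpretation, while a pure chain has height equal to its node count.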

3. The Methodology in Practice (Selected Results): Experimental Setup • Testing complete grid middleware stacks • Generic GWFE: a baseline GWFE implementation • 15 PCs, 2x P4 @ 3.2 GHz, 2 GB RAM, 1 Gbps Ethernet • Tools: MonALISA, ServMark = DiPerF + GrenchMark.

3. The Methodology in Practice (Selected Results): Overhead: Impact of Workload (WL) Size and Type • Setup: DAGMan, empty jobs, C-4 (left) / many (right). • Oi >> Ost = Of. Internal state update very important. • S-1, S-3: many frequent updates lower system throughput.

3. The Methodology in Practice (Selected Results): Raw Perf.: Performance vs. Consumption • Karajan performs better than DAGMan, but quickly runs out of resources. [Figure panels: Karajan, DAGMan]

3. The Methodology in Practice (Selected Results): Stability: Internal and Overall Stability • Setup: DAGMan, 10 independent runs, C-4, 10 WFs. • System is: • Internally stable • Overall not stable • Need to react to system dynamics to favor under-served workflows.

Conclusion and Future Work • Methodology for testing Grid Workflow Engines • Goals • Metrics • Workflows • Testing grid middleware stacks, not GWFEs in isolation! • Analysis of two widely used GWFEs vs. a baseline GWFE • Future work • Apply the method to more middleware stacks, in more environments • Design domain-specific workloads and assess the performance impact of the inter-domain differences (do different domains raise different challenges?)

Thank you! Questions? Remarks? Observations? • Contact: A.Iosup@gmail.com [google “Iosup”] • Web site: http://www.pds.ewi.tudelft.nl (PDS group articles & software) • Have (workflow-based) grid traces? Help build our community’s Grid Workloads Archive: http://gwa.ewi.tudelft.nl • Additional References • [HPDC’08] A. Iosup, O. Sonmez, S. Anoep, and D.H.J. Epema, The Performance of Bags-of-Tasks in Large-Scale Distributed Computing Systems, In IEEE HPDC’08, 2008. • [CG Symp.’08] S. Ostermann, R. Prodan, T. Fahringer, and A. Iosup, On the Characteristics of Grid Workflows, In CoreGRID Symp. 2008.
