GrenchMark:

GrenchMark: Synthetic workloads for Grids • A. Iosup, D.H.J. Epema • PDS Group, ST/EWI, TU Delft • First Demo at TU Delft

Evaluating Grid scheduler performance • Grid scheduler performance • Qualitative metrics: supported application types, advanced fault tolerance, advanced ... • Quantitative metrics: resource consumption, system performance, success rate • Other metrics: cost, single-number system description • Needs • Applications, workloads, more workloads…

Synthetic workloads for Grid schedulers • Good synthetic workloads for Grid schedulers • Specific scheduler comparison requirements (metrics, job inter-arrival time, ...) • Many different types of representative Grid applications • Traditional software engineering requirements (flexibility, extensibility, usability, ...) • Can also be used for... • Functionality testing and system tuning • Application performance testing • Systems design and procurement • …

Representative Grid applications • Unitary applications • Just one scheduling unit (otherwise recursive definition) • Examples: sequential, MPI, Java RMI, Ibis, … • Composite applications • Composed of several unitary or composite applications • Examples: parameter sweeps, chains of tasks, DAGs, workflows, …

Outline • Introduction [done] • The GrenchMark framework [here] • Experience with GrenchMark • Extending GrenchMark • Conclusions

The GrenchMark framework • What’s in a name? grid benchmark → would help standardize testing procedures, but it is too early for benchmarks… • GrenchMark: a systematic approach to testing Grid schedulers • A set of metrics for comparing schedulers • A set of representative Grid applications • Both real and synthetic • Easy-to-use tools to create synthetic workloads • Flexible, portable, extensible • Can also be used for testing other Grid components

The GrenchMark framework

GrenchMark: Preliminary notions • Job, workload, workload unit • Job = set of components (support for co-allocation) [Job = one program execution / the basic scheduling unit / … ] • Workload = set of jobs • Workload unit = set of jobs with the same property, generated from one description line (definition useful only for the workload generator) • Other • JDF = job description file • Inter-arrival time
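The notions above can be pictured as a small data model. The following is a minimal sketch, not GrenchMark's actual code: the class names, field names, and the `component_count` helper are all hypothetical and chosen only to mirror the slide's definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """One part of a job; with co-allocation, components may run at different sites."""
    executable: str      # hypothetical field
    processors: int = 1  # hypothetical field


@dataclass
class Job:
    """The basic scheduling unit: a set of components."""
    name: str
    components: List[Component] = field(default_factory=list)


@dataclass
class WorkloadUnit:
    """Jobs sharing one property, all generated from a single description line."""
    description_line: str
    jobs: List[Job] = field(default_factory=list)


@dataclass
class Workload:
    """A set of jobs, grouped here by the workload unit that produced them."""
    units: List[WorkloadUnit] = field(default_factory=list)

    def jobs(self) -> List[Job]:
        # Flatten all units into one list of jobs.
        return [job for unit in self.units for job in unit.jobs]

    def component_count(self) -> int:
        # Total number of components over all jobs in the workload.
        return sum(len(job.components) for job in self.jobs())
```

Under this sketch, a figure such as "100 jobs / 411 components" would correspond to `len(workload.jobs())` and `workload.component_count()`.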

GrenchMark: Some more notions • Job site type / per site description • Single – run at one site • Fixed – run at several sites, all specified ("my license is on those machines") • Un-fixed – run at several sites, all unspecified ("I can run anywhere, just give me the resources") • Semi-fixed – run at several sites, some specified ("I prefer those machines, but I can work anywhere") • Inter-arrival time distributions: Constant, Uniform, Normal, Exponential(λ), Poisson(Mean), HExp2, HPoisson2, Weibull, LogNormal, Gamma (~)
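To illustrate how an inter-arrival time distribution turns into job submission times, here is a minimal sketch using only Python's standard `random` module. It is not GrenchMark's generator: the function names and parameter names (`make_sampler`, `arrival_times`, `value`, `rate`, ...) are assumptions made for this example, and only a subset of the listed distributions is covered (Poisson, HExp2 and HPoisson2 are left out).

```python
import random


def make_sampler(kind, rng, **params):
    """Return a zero-argument function that draws one inter-arrival time (seconds)."""
    if kind == "constant":
        return lambda: params["value"]
    if kind == "uniform":
        return lambda: rng.uniform(params["low"], params["high"])
    if kind == "normal":
        # Clamp at zero: an inter-arrival time cannot be negative.
        return lambda: max(0.0, rng.gauss(params["mean"], params["stddev"]))
    if kind == "exponential":
        # rate is lambda, in arrivals per second; the mean gap is 1 / rate.
        return lambda: rng.expovariate(params["rate"])
    if kind == "weibull":
        return lambda: rng.weibullvariate(params["scale"], params["shape"])
    if kind == "lognormal":
        return lambda: rng.lognormvariate(params["mu"], params["sigma"])
    if kind == "gamma":
        return lambda: rng.gammavariate(params["shape"], params["scale"])
    raise ValueError("unsupported distribution: %s" % kind)


def arrival_times(sampler, n_jobs, start=0.0):
    """Accumulate sampled gaps into absolute submission times."""
    times = []
    t = start
    for _ in range(n_jobs):
        t += sampler()
        times.append(t)
    return times


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a workload can be regenerated
    sampler = make_sampler("exponential", rng, rate=0.2)  # mean gap of 5 s
    print(arrival_times(sampler, 5))
```

Passing an explicit `random.Random` instance keeps the draw sequence reproducible, which matters when the same synthetic workload has to be replayed against different schedulers.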

WL description: an example • Describe the workload to be generated in a few lines • Very simple language + custom extensions (Native) (ExternalFile field) • Support • Co-allocation • Start time, inter-arrival time • Mixes of jobs

The GrenchMark process • Generate: $ ./wl-gen.py wl-desc.in • Sample run: • 4 lines of description • 100 jobs / 411 components • 100 files / 132 directories • 300 KB data • Submit: $ ./wl-submit.py out/wl-to-submit.wl • Sample run: • defined inter-arrival rate, submission delay +/- 0.01 s • 100 JDFs • Semi-automated

GrenchMark status • Already done in Python [http://www.python.org] • Generator + Globus, KOALA generators + RSL printer • Submitter • Results analyzer (crude) • Applications • Unitary, 3 types: sequential, MPI, Ibis (Java) • +35 different applications • Ongoing work • Composite applications • Automated results analyzer

Demo: Generating mixes of jobs • 10 jobs • 8 MPI, multi-component jobs (need co-allocation) • 2 sequential

Outline • Introduction [done] • The GrenchMark framework [done] • Experience with GrenchMark [here] • Extending GrenchMark • Conclusions

GrenchMark for testing KOALA… • Testing • 3 different runners: drunner, grunner, krunner • Pre-release status: supposed stable • Workloads with different job requirements, inter-arrival rates, co-allocated vs. single-site jobs… • Evaluate • Job success rate, KOALA’s overhead and bottlenecks • Results • +5,000 jobs successfully run • 2 major bugs on the first day, +10 bugs overall (all fixed) • KOALA is officially released (full credit to KOALA developers, 10x for testing with GrenchMark)

GrenchMark for testing KOALA: A full workload example • KOALA test workload, run 10 times: • Globus MPICH-G2 / MPI jobs • Components: 4 and 8 • Component sizes: 4, 8 and 16 • Inter-arrival time: Poisson(5s), spikes • Co-allocation, 1 site • Submit time: 1 day • Generate: $ wl-gen.py --duration=86400000 wl-desc.in • Submit: $ wl-submit.py –onefile out/wl-desc.in • Total: 3200 jobs, 19200 components / 3k files, 4k dirs • Timing: 30s generate / 86,400s submit (1 day)
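The totals on this slide can be cross-checked with a few lines of arithmetic. This is only a consistency check under two assumptions that the slide does not state outright: that `--duration` is given in milliseconds (which matches the quoted 86,400 s submit time), and that the 3200 jobs are split evenly between the 4-component and 8-component variants.

```python
# Value passed to --duration on the slide; assumed to be in milliseconds.
DURATION_MS = 86_400_000

duration_s = DURATION_MS / 1000       # seconds
duration_days = duration_s / 86_400   # 86,400 seconds per day

# Totals quoted on the slide.
jobs_total = 3200

# Assumption: half of the jobs have 4 components, the other half have 8.
jobs_with_4 = jobs_total // 2
jobs_with_8 = jobs_total - jobs_with_4
components_total = jobs_with_4 * 4 + jobs_with_8 * 8

print(duration_s, duration_days, components_total)
```

Under these assumptions the computed duration is one day and the component count matches the 19200 quoted on the slide.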

… and DAS-2’s functionality • Already done • Evaluate for KOALA + Globus + DAS-2 • job success rate, turnaround time, middleware overhead, types and sources of errors • Results • 5 workloads • 500 jobs • A. Iosup, J. Maassen, R.V. van Nieuwpoort, D.H.J. Epema, Synthetic Grid Workloads with Ibis, KOALA, and GrenchMark, 2005 (submitted). • Currently • examining DAS-2 support for composite applications

Outline • Introduction [done] • The GrenchMark framework [done] • Experience with GrenchMark [done] • Extending GrenchMark [here] • Conclusions

Extending GrenchMark (1) • Motto: Extending GrenchMark is easy! • Need: • Good knowledge of the application type • Good understanding of workflows, or good understanding of grid middleware • Minimal Python knowledge (follow the official 2-hour tutorial: http://docs.python.org/tut/tut.html)

Extending GrenchMark (2) • Write your own Job Generators • a function with a predefined name in a Python module • auto-loaded • Write your own Unit Generators • a function with a predefined name in a Python module • auto-loaded • Interface with C/C++, Ruby, Perl, Java, … • define your own protocol • Write your own printers • a function with a predefined name in a Python module • auto-loaded
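The extension mechanism described above, "a function with a predefined name in a Python module, auto-loaded", can be sketched as follows. This is an illustration of the general pattern only: the predefined name `generate_job`, its arguments, and the returned dictionary are hypothetical and are not taken from GrenchMark. To keep the example self-contained, the plugin module is built from a source string; a real loader would more likely import files found in a plugin directory.

```python
import types

# Hypothetical predefined name the framework looks for in each plugin module.
GENERATOR_FUNCTION_NAME = "generate_job"

# Source of a hypothetical user-written plugin module.
PLUGIN_SOURCE = '''
def generate_job(job_id, params):
    """Describe one sequential job as a plain dictionary."""
    return {
        "id": job_id,
        "type": "sequential",
        "executable": params.get("executable", "/bin/true"),
    }
'''


def load_plugin(module_name, source):
    """Create a module object from source text (stand-in for importing a file)."""
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)
    return module


def find_generator(module):
    """Look up the predefined function name; fail clearly if it is missing."""
    func = getattr(module, GENERATOR_FUNCTION_NAME, None)
    if not callable(func):
        raise AttributeError(
            "%s does not define %s()" % (module.__name__, GENERATOR_FUNCTION_NAME)
        )
    return func


if __name__ == "__main__":
    plugin = load_plugin("my_sequential_plugin", PLUGIN_SOURCE)
    generate = find_generator(plugin)
    print(generate(1, {"executable": "/bin/hostname"}))
```

Relying on a predefined function name means a plugin author writes one function and registers nothing; the cost is that a misspelled name is only detected at load time, which is why the lookup above raises an explicit error.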

Outline • Introduction [done] • The GrenchMark framework [done] • Experience with GrenchMark [done] • Extending GrenchMark [done] • Conclusions [here]

Conclusions and future work • GrenchMark: generates diverse workloads of Grid applications; easy to use, flexible, portable, extensible, … • Experience: used GrenchMark to test KOALA’s functionality and performance; used GrenchMark to test some DAS Grid functionality; +5,000 jobs generated and run … and counting. • (more) advertisement: Have specific Grid application types you would like to test? Test with GrenchMark!

Thank you! Questions? Remarks? Observations? All welcome! • GrenchMark: http://grenchmark.st.ewi.tudelft.nl/ [10x Paulo] • Alexandru IOSUP, TU Delft • A.Iosup@ewi.tudelft.nl • http://www.pds.ewi.tudelft.nl/~iosup/index.html [google: “iosup”]


GrenchMark : A Framework for Analyzing, Testing, and Comparing Grids