On Grid Performance Evaluation using Synthetic Workloads. Carsten Franke, Alexander Papaspyrou, Lars Schley, Baiyi Song, and Ramin Yahyapour. Alexandru Iosup, Dick Epema. PDS Group, ST/EWI, TU Delft. JSSPP 2006.
Outline • A Short Introduction to Grid Computing • On Grid Performance Evaluation • Experimental Environments • Performance Indicators • General Workload Modeling • Grid-Specific Workload Modeling • The GrenchMark Framework • Future Work • Conclusions
A Short Introduction to Grid Computing • Typical grid environment • Applications [!] • Unitary, composite • Data • Resources • Compute (Clusters) • Storage • (Dedicated) Network • Virtual Organizations, Projects • Groups, Users • Grids vs. parallel production environments • Dynamic • Heterogeneous • Very large-scale (world) • No central administration → Most resource management problems are NP-hard
Experimental Environments Real-World Testbeds • Real-World Testbed • DAS, NorduGrid, Grid3/OSG, Grid’5000… • Pros • True performance, also shows “it works!” • Infrastructure in place • Cons • Time-intensive • Exclusive access (repeatability) • Controlled environment problem (limited scenarios) • Workload structure (little or no realistic data) • What to measure (new environment)
Experimental Environments Simulated and Emulated Testbeds • Simulated and Emulated Testbeds • GridSim, SimGrid, GangSim, MicroGrid … • Essentially trade-off precision vs. speed • Pros • Exclusive access (repeatability) • Controlled environment (unlimited scenarios) • Cons • Synthetic Grids: What to generate? How to generate? Clusters, Disks, Network, VOs, Groups, Users, Applications, etc. • Workload structure (little or no realistic data) • What to measure (new environment) • Validity of results (accuracy vs. time)
Grid Performance Evaluation: Current Practice • Performance Indicators • Define my own metrics, or use U (utilization) and AWT/ART, or both • Workload Structure • Run my own workload, or use traces that are not validated by peer researchers; do not make comparisons! • Run benchmarks from typical parallel production environments • Mostly assume that all users are created equal. Need a common performance evaluation framework for grids.
Grid Performance Evaluation: Current Issues • Performance Indicators • What are the metrics for the new environment? • Workload Structure • Which general aspects are important? • Which Grid-specific aspects need to be addressed? Need a common performance evaluation framework for grids.
Performance Indicators • Time-, Resource-, and System-Related Metrics • Traditional: utilization, A(W)RT, A(W)WT, A(W)SD • New: waste, fairness (or service quality reliability) • Workload Completion and Failure Metrics “In Grids, functionality may be even more important than performance” • Workload Completion (WC) • Task and Enabled Task Completion (TC, ETC) • System Failure Factor (SFF)
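The traditional indicators listed above can be computed from a job trace in a few lines. The sketch below is illustrative only: the `Job` record, the choice of weights (processors times runtime) for the weighted variants, and the reading of Workload Completion as the fraction of completed jobs are assumptions, not definitions taken from the slides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Job:
    """Hypothetical trace record; all times are in seconds."""
    submit: float
    start: float
    end: float
    cpus: int
    completed: bool = True


def wait_time(job: Job) -> float:
    return job.start - job.submit


def response_time(job: Job) -> float:
    return job.end - job.submit


def slowdown(job: Job) -> float:
    # Assumes a strictly positive runtime.
    return response_time(job) / (job.end - job.start)


def average(jobs: List[Job], metric: Callable[[Job], float],
            weighted: bool = False) -> float:
    """Plain or weighted average of a per-job metric.

    Assumption: the weighted variants use the job's resource
    consumption (cpus * runtime) as the weight.
    """
    weights = [j.cpus * (j.end - j.start) if weighted else 1.0 for j in jobs]
    return sum(w * metric(j) for w, j in zip(weights, jobs)) / sum(weights)


def utilization(jobs: List[Job], total_cpus: int) -> float:
    """Consumed CPU time divided by the capacity offered over the trace span."""
    span = max(j.end for j in jobs) - min(j.submit for j in jobs)
    used = sum(j.cpus * (j.end - j.start) for j in jobs)
    return used / (total_cpus * span)


def workload_completion(jobs: List[Job]) -> float:
    """Assumed reading of WC: fraction of jobs that completed."""
    return sum(1 for j in jobs if j.completed) / len(jobs)
```

For example, `average(jobs, wait_time)` gives AWT and `average(jobs, wait_time, weighted=True)` gives AWWT under the weighting assumption above.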
General Aspects for Workload Modeling • User/Group/VO model • Detailed modeling for top-5/10 users, then clustering (Use squash area to group) • Submission patterns • Yearly, monthly, weekly, daily • Do daily patterns exist? (Are Grids truly global?) • Temporal patterns • Repeated submission (batches of jobs) • Job dependencies (composite applications common in grids (?)) • Feedback • Empirical rules (don’t submit jobs when the system is busy), but also reactive submission tools, co-allocators, evolving applications, etc.
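One way to experiment with the daily submission patterns mentioned above is to draw arrival times from a Poisson process whose rate varies over the day. The sketch below uses the standard thinning technique for non-homogeneous Poisson processes; the cosine-shaped rate, the peak hour, and all parameter values are hypothetical and are not taken from the slides or from any real grid trace.

```python
import math
import random
from typing import Callable, List


def daily_rate(t: float, mean_rate: float = 0.01,
               amplitude: float = 0.8, peak_hour: float = 14.0) -> float:
    """Hypothetical arrival rate (jobs/s) at time t (s), peaking at peak_hour."""
    phase = 2.0 * math.pi * ((t / 3600.0 - peak_hour) / 24.0)
    return mean_rate * (1.0 + amplitude * math.cos(phase))


def generate_arrivals(duration: float, rate_fn: Callable[[float], float],
                      rate_max: float, seed: int = 0) -> List[float]:
    """Sample arrival times in [0, duration) by thinning.

    Candidates are drawn from a homogeneous Poisson process with rate
    rate_max (an upper bound on rate_fn) and kept with probability
    rate_fn(t) / rate_max.
    """
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate_max)
        if t >= duration:
            break
        if rng.random() * rate_max <= rate_fn(t):
            arrivals.append(t)
    return arrivals
```

A week of submissions could then be generated with `generate_arrivals(7 * 86400, daily_rate, rate_max=0.018)`. If grids turn out to be truly global, setting `amplitude` close to 0 flattens the daily cycle.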
Grid-Specific Workload Modeling: Computation Management • Processor co-allocation • Fixed, non-fixed, semi-fixed jobs • Job flexibility • Moldable, evolvable, flexible, *-ble… • Other aspects • Background load: define top jobs (by consumption), model the rest as background load • Project stage
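The background-load idea above (model the top jobs by consumption explicitly, lump the rest together) can be sketched as follows. The function name, the `(job_id, cpu_seconds)` input format, and the 80% coverage threshold are assumptions made for illustration.

```python
from typing import List, Tuple


def split_top_and_background(jobs: List[Tuple[str, float]],
                             top_share: float = 0.8) -> Tuple[List[str], float]:
    """Split jobs into explicitly modeled top consumers and background load.

    jobs: (job_id, cpu_seconds) pairs.
    Returns the ids of the largest jobs that together cover at least
    top_share of the total consumption, plus the CPU seconds left over,
    which would be modeled as aggregate background load.
    """
    ranked = sorted(jobs, key=lambda job: job[1], reverse=True)
    total = sum(cpu for _, cpu in ranked)
    top_ids = []
    covered = 0.0
    for job_id, cpu in ranked:
        if covered >= top_share * total:
            break
        top_ids.append(job_id)
        covered += cpu
    return top_ids, total - covered
```

The same split could be applied per user or per VO instead of per job, depending on which level of the workload is modeled in detail.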
Grid-Specific Workload Modeling: Data Management • Clearly Defined I/O Requirements • Files, streams, … • Data location and size • Replicas • Replica location • Other aspects • …
Grid-Specific Workload Modeling: Network Management • Clearly Defined Network Requirements • Bandwidth, latency, … • Communication pattern • Special Situations • Dedicated paths, other QoS • Other aspects • Background load
Grid-Specific Workload Modeling: Locality/Origin Management • Job issuer and execution site: not all VOs are created equal! • Two-level view: Which VO generates the next job? Within a VO, which user generates the next job? • Three-level view, Multi-level view (Project, VO, Group, User) • (Usage) Service Level Agreements • Use my system 50% for 7 days, or 20% for 30 days • Dedicated paths, other QoS • Other aspects • Background load pertaining to same (u)SLA
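The two-level view above can be read as a pair of weighted draws: first pick the VO that issues the next job, then pick a user inside that VO. The sketch below illustrates that reading; the function name, the dictionary layout, and the weights in the usage note are hypothetical.

```python
import random
from typing import Dict, Tuple


def next_job_origin(vo_weights: Dict[str, float],
                    user_weights: Dict[str, Dict[str, float]],
                    rng: random.Random) -> Tuple[str, str]:
    """Two-level draw of the next job's origin.

    vo_weights: relative share of jobs per VO.
    user_weights: for each VO, relative share of jobs per user.
    """
    vos = list(vo_weights)
    # Level 1: which VO generates the next job?
    vo = rng.choices(vos, weights=[vo_weights[v] for v in vos])[0]
    users = list(user_weights[vo])
    # Level 2: within that VO, which user generates it?
    user = rng.choices(users, weights=[user_weights[vo][u] for u in users])[0]
    return vo, user
```

With `vo_weights = {"physics": 0.7, "bio": 0.3}` and per-VO user shares, repeated calls yield an origin stream in which VOs and users are deliberately not created equal. A three-level or multi-level view would add further draws (Project, Group) in the same style.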
Grid-Specific Workload Modeling: Failure Modeling • Error level • Infrastructure • Middleware • Application • User • Fault tolerance scheme for submitted jobs • Capture the system feedback in the model • Other aspects • …
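As a rough illustration of the error levels and of one possible fault tolerance scheme, the sketch below simulates a job that is resubmitted after each failed attempt. Simple resubmission, the per-level failure probabilities, and the function name are assumptions; the slides do not prescribe a particular scheme.

```python
import random
from typing import Dict, List, Tuple

ERROR_LEVELS = ("infrastructure", "middleware", "application", "user")


def run_with_retries(failure_prob: Dict[str, float], max_attempts: int,
                     rng: random.Random) -> Tuple[bool, List[str]]:
    """Simulate a job under a resubmit-on-failure scheme.

    failure_prob maps an error level to the probability that a single
    attempt fails at that level (the probabilities should sum to at most 1).
    Returns (succeeded, error levels of the failed attempts).
    """
    failures = []
    for _ in range(max_attempts):
        draw = rng.random()
        cumulative = 0.0
        failed_at = None
        for level in ERROR_LEVELS:
            cumulative += failure_prob.get(level, 0.0)
            if draw < cumulative:
                failed_at = level
                break
        if failed_at is None:
            return True, failures
        failures.append(failed_at)
    return False, failures
```

The list of failed levels is the kind of system feedback that could be fed back into the workload model, for example to delay or redirect later submissions.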
Grid-Specific Workload Modeling: Economic Models • Pricing • Application cost • Application utility • Other aspects • …
GrenchMark: a Framework for Analyzing, Testing, and Comparing Grids • What’s in a name? grid benchmark → working towards a generic tool for the whole community: help standardize the testing procedures, but it is too early for benchmarks; we use synthetic grid workloads instead • What’s it about? A systematic approach to analyzing, testing, and comparing grid settings, based on synthetic workloads • A set of metrics for analyzing grid settings • A set of representative grid applications • Both real and synthetic • Easy-to-use tools to create synthetic grid workloads • Flexible, extensible framework
GrenchMark Overview: Easy to Generate and Run Synthetic Workloads
… but More Complicated Than You Think • Workload structure • User-defined and statistical models • Dynamic job arrival • Burstiness and self-similarity • Feedback, background load • Machine usage assumptions • Users, VOs • Metrics • A(W) Run/Wait/Resp. Time • Efficiency, MakeSpan • Failure rate [!] • (Grid) notions • Co-allocation, interactive jobs, malleable, moldable, … • Measurement methods • Long workloads • Saturated / non-saturated system • Start-up, production, and cool-down scenarios • Scaling workload to system • Applications • Synthetic • Real • Workload definition language • Base language layer • Extended language layer • Other • Can use the same workload for both simulations and real environments
GrenchMark: Iterative Research Roadmap • Simple functional system • A. Iosup, J. Maassen, R. V. van Nieuwpoort, D. H. J. Epema, Synthetic Grid Workloads with Ibis, KOALA, and GrenchMark, CoreGRID IW, Nov 2005.
GrenchMark: Iterative Research Roadmap • Open-GrenchMark: community effort • Complex extensible system • A. Iosup, D. H. J. Epema, GrenchMark: A Framework for Analyzing, Testing, and Comparing Grids, IEEE CCGrid'06, May 2006.
Take home message • Performance Evaluation of Grid Systems - need a common performance evaluation framework for grids - need real grid traces (scheduling, accounting, monitoring, etc.) - need more research on workload modeling and performance indicators • Performance indicators - failure metrics as important as traditional performance metrics • Workload modeling - generic workload modeling needs validation based on real grid traces - computation/data/network management - locality/origin management - failure modeling - economic models • GrenchMark - generic tool for the whole community - generates diverse grid workloads - easy-to-use, flexible, portable, extensible, …
Thank you! Questions? Remarks? Observations? All welcome! GrenchMark: http://grenchmark.st.ewi.tudelft.nl/