Use of Performance Prediction Techniques for Grid Management

Junwei Cao, University of Warwick, April 2002

Outline • Introduction • Performance Prediction • Local Grid Scheduling • Global Grid Management • A Case Study • Conclusions & Future Work

Introduction • Glossary • Overview • Related Works

Glossary • Grids – computational grids • Resources – multiprocessors or clusters • Applications (tasks) – MPI & PVM parallel programs • Users – developers and end users • Agents, Requests & Services • Performance – execution time

Overview [Diagram labels: Grid Users, Global Grid Management, Local Grid Scheduling, Grid Resources, Performance Evaluation Engine, Application Tools, Resource Tools]

Related Works • Performance evaluation: POEMS, AppLeS, … • Local grid schedulers: Condor, LSF, Ninf, Nimrod, … • Global grid management: Globus, Legion, DPSS, …

Performance Prediction • Methodology • Implementation

PACE Methodology • Input: model parameters • Application Layer – acts as the entry point to the performance study • Subtask Layer – describes the sequential parts within an application • Parallel Template Layer – describes the parallel characteristics of subtasks • Hardware Layer – characterises the communication and computation abilities of a particular system • Output: predicted execution time
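The four layers can be pictured with a toy model. This is only a sketch of the layering idea, not PACE itself: the class names, the `stepwise_template` communication pattern and all parameters are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Hardware:
    """Hardware layer: computation and communication abilities (hypothetical units)."""
    sec_per_flop: float
    latency: float        # seconds per message
    sec_per_byte: float

@dataclass
class Subtask:
    """Subtask layer: the sequential work inside one part of the application."""
    flops: float          # floating-point operations per step
    msg_bytes: float      # bytes exchanged with a neighbour per step

def stepwise_template(subtask, hw, n_procs, steps):
    """Parallel template layer: an assumed pattern in which work is split
    evenly and every step ends with one neighbour exchange."""
    compute = subtask.flops / n_procs * hw.sec_per_flop
    comm = 0.0 if n_procs == 1 else hw.latency + subtask.msg_bytes * hw.sec_per_byte
    return steps * (compute + comm)

def predict(subtasks, hw, n_procs, steps):
    """Application layer: entry point that sums the subtask predictions."""
    return sum(stepwise_template(s, hw, n_procs, steps) for s in subtasks)

hw = Hardware(sec_per_flop=0.5, latency=1.0, sec_per_byte=0.25)
work = [Subtask(flops=8, msg_bytes=4)]
print(predict(work, hw, n_procs=4, steps=3))
```

Swapping in a different `Hardware` instance re-evaluates the same application description for another platform, which is the kind of cross-platform comparison such a layered model makes cheap.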

PACE Toolkit • Application Tools: Source Code Analysis, Object Editor, Object Library, PSL Compiler • Resource Tools: CPU, Network (MPI, PVM), Cache (L1, L2), HMCL Compiler • Evaluation Engine

Summary • Advantages: reasonable accuracy; rapid evaluation time; easy cross-platform comparisons • Limitations: application source code required; static resource configurations

Local Grid Scheduling • Algorithms • Implementation (Titan)

FIFO Algorithm • 2^n − 1 possible processor allocations per task [Figure: allocation chart for Processor 1 to Processor 8]
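Reading the count on this slide as 2^n − 1, the number of non-empty subsets of n processors, a first-come-first-served scheduler can try every subset for each arriving task and keep the one with the earliest predicted completion. The sketch below assumes exactly that; `predict` is a hypothetical stand-in for a performance model that maps a processor count to an execution time, and the function names are illustrative.

```python
from itertools import combinations

def nonempty_subsets(n):
    """Yield all 2**n - 1 non-empty subsets of processors 0..n-1."""
    for k in range(1, n + 1):
        yield from combinations(range(n), k)

def fifo_allocate(free_at, predict):
    """Return (end_time, subset) with the earliest predicted completion.

    free_at[p] is the time at which processor p becomes free.
    predict(k) is an assumed model: execution time on k processors.
    """
    best = None
    for subset in nonempty_subsets(len(free_at)):
        start = max(free_at[p] for p in subset)   # wait for the slowest member
        end = start + predict(len(subset))
        if best is None or end < best[0]:
            best = (end, subset)
    return best

def fifo_schedule(tasks, n_procs):
    """Place tasks strictly in arrival order; tasks are (name, predict) pairs."""
    free_at = [0.0] * n_procs
    plan = []
    for name, predict in tasks:
        end, subset = fifo_allocate(free_at, predict)
        for p in subset:
            free_at[p] = end
        plan.append((name, subset, end))
    return plan

tasks = [("a", lambda k: 12.0 / k), ("b", lambda k: 4.0)]
print(fifo_schedule(tasks, n_procs=3))
```

For the eight processors in the figure this means 255 candidate allocations per task. The search is exhaustive per task but never reorders tasks.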

Genetic Algorithm • Heuristic • Evolutionary • Near-optimal with respect to: makespan, idle time, deadlines
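A minimal sketch of how a genetic algorithm could weigh these three criteria follows. It is not Titan's implementation: the encoding (one processor per task), the equal weights, the tournament selection and the mutation rate are all assumptions made for illustration, and `durations` stands in for predicted execution times.

```python
import random

def evaluate(assign, durations, deadlines, n_procs, weights=(1.0, 1.0, 1.0)):
    """Weighted cost of a schedule; lower is better.

    assign[i] is the processor running task i; tasks that share a
    processor run in index order.
    """
    clock = [0.0] * n_procs
    lateness = 0.0
    for task, proc in enumerate(assign):
        clock[proc] += durations[task]
        lateness += max(0.0, clock[proc] - deadlines[task])
    makespan = max(clock)
    idle = sum(makespan - c for c in clock)
    w_m, w_i, w_d = weights
    return w_m * makespan + w_i * idle + w_d * lateness

def evolve(durations, deadlines, n_procs, generations=30, pop_size=20, seed=0):
    """Return (best assignment, best cost per generation)."""
    rng = random.Random(seed)
    n = len(durations)

    def cost(a):
        return evaluate(a, durations, deadlines, n_procs)

    pop = [[rng.randrange(n_procs) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        nxt = [pop[0][:]]                          # elitism: keep the best
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 2), key=cost)  # tournament selection
            b = min(rng.sample(pop, 2), key=cost)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]              # one-point crossover
            if rng.random() < 0.2:                 # mutation
                child[rng.randrange(n)] = rng.randrange(n_procs)
            nxt.append(child)
        pop = nxt
    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history

best, history = evolve([4, 2, 2, 3], [5, 5, 3, 9], n_procs=2)
print(best, history[-1])
```

Because the best individual is always carried over, the recorded cost never increases from one generation to the next; the result is near-optimal at best, not guaranteed optimal.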

Titan Implementation [Diagram labels: Requests, Results, Service, Communication Module, Task Management, Task Execution, Resource Monitoring, GA Scheduling, PACE Evaluation Engine]

Global Grid Management • Methodology • Implementation (ARMS) • Metrics

Agent-based Methodology • Agent structure: communication layer, decision-making layer, local management layer • Agent hierarchy • Service advertisement • Service discovery • Agent Capability Tables [Figure: hierarchy of agents (A) and a user]

Optimisation Strategies • Advertisement: data-push & data-pull; periodic & event-driven • Discovery: local services; services in ACTs; lower or upper agents • Optimisation: modelling; simulation [Figure: agents (A), M and a user]
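The discovery order listed here (local service first, then services recorded in the Agent Capability Table, then a neighbouring agent) can be sketched as follows. This is an illustrative toy, not the ARMS code: the `Agent` class, the `free_at` service description and the connection counting are assumptions, and only data-push advertisement to the upper agent is shown.

```python
class Agent:
    def __init__(self, name, free_at, upper=None):
        self.name = name
        self.free_at = free_at   # assumed service info: when the local resource is free
        self.upper = upper       # upper agent in the hierarchy, if any
        self.act = {}            # Agent Capability Table: agent name -> advertised free_at

    def advertise(self):
        """Data-push: send local service information to the upper agent."""
        if self.upper is not None:
            self.upper.act[self.name] = self.free_at

    def discover(self, exec_time, deadline, connections=0):
        """Return (agent name, connections used) for the first service that
        can finish by the deadline, or (None, connections) if none is found."""
        # 1. Local service.
        if self.free_at + exec_time <= deadline:
            return self.name, connections
        # 2. Services recorded in the ACT.
        for name, free_at in self.act.items():
            if free_at + exec_time <= deadline:
                return name, connections + 1
        # 3. Forward the request to the upper agent.
        if self.upper is not None:
            return self.upper.discover(exec_time, deadline, connections + 1)
        return None, connections

root = Agent("A1", free_at=50)
a2 = Agent("A2", free_at=30, upper=root)
a3 = Agent("A3", free_at=5, upper=root)
a3.advertise()
a2.advertise()
print(a2.discover(exec_time=10, deadline=20))
```

More advertisement fills the tables and shortens discovery, at the price of extra advertisement connections; that is the trade-off the optimisation strategies address.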

ARMS Implementation • Service information: PACE models, makespan • Request information: application binary, PACE model, deadline • Matchmaking: estimation (FIFO), deadline [Figure: M, agents (A), T and a user]

Metrics • Average advance time of application execution completions (compared to required deadlines) • Average processor utilisation rate – utilised time / total time • Load balancing level – mean square deviation of processor utilisation rates • Average discovery speed – number of requests / number of discovery connections • Average discovery efficiency – number of requests / (number of discovery connections + number of advertisement connections)
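The metrics can be written out directly from their definitions. In this sketch the function names are illustrative, and "mean square deviation" is taken to mean the square root of the mean squared deviation from the average rate, which is an assumption; dropping the square root gives the variance instead.

```python
from math import sqrt
from statistics import mean

def advance_time(deadlines, completions):
    """Average time by which executions finish ahead of their deadlines
    (negative values mean the deadline was missed)."""
    return mean([d - c for d, c in zip(deadlines, completions)])

def utilisation_rates(utilised_times, total_time):
    """Per-processor utilisation rate: utilised time / total time."""
    return [u / total_time for u in utilised_times]

def utilisation_deviation(rates):
    """Spread of the utilisation rates around their mean (assumed definition)."""
    avg = mean(rates)
    return sqrt(mean([(r - avg) ** 2 for r in rates]))

def discovery_speed(requests, discovery_connections):
    return requests / discovery_connections

def discovery_efficiency(requests, discovery_connections, advertisement_connections):
    return requests / (discovery_connections + advertisement_connections)

rates = utilisation_rates([8, 4, 6, 6], total_time=10)
print(mean(rates), utilisation_deviation(rates))
print(advance_time([10, 20], [8, 21]))
print(discovery_speed(10, 20), discovery_efficiency(10, 20, 20))
```

A smaller deviation means the load is spread more evenly. Efficiency counts advertisement connections as well, so heavy advertisement can raise speed while lowering efficiency.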

A Case Study • Design • Demonstrations • Results

Experiment Design • Resources: S1 (SGI Origin 2000, 16), S2 (SGI Origin 2000, 16), S3 (Sun Ultra 10, 16), S4 (Sun Ultra 10, 16), S5 (Sun Ultra 5, 16), S6 (Sun Ultra 5, 16), S7 (Sun Ultra 5, 16), S8 (Sun Ultra 1, 16), S9 (Sun Ultra 1, 16), S10 (Sun Ultra 1, 16), S11 (Sun SPARCstation 2, 16), S12 (Sun SPARCstation 2, 16) • Applications: sweep3d, fft, improc, closure, jacobi, memsort, cpi

Experiment 1 [Figure: resources, each scheduled with FIFO]

Experiment 2 [Figure: resources, each scheduled with GA]

Experiment 3 [Figure: resources, each scheduled with GA]

Application Execution • Both the GA and agents contribute to improved application execution.

Resource Utilisation • S11 & S12 benefit mainly from the GA. • S1 & S2 benefit mainly from agents.

Load Balancing • The GA contributes more to local grid load balancing. • Agents contribute more to global grid load balancing.

Discovery Speed & Efficiency • No advertisement: low speed, low efficiency • Reasonable advertisement: high speed, high efficiency • Too much advertisement: very high speed, very low efficiency [Chart series: discovery speed (×100), discovery efficiency (×100)]

Conclusions & Future Work

Conclusions • Performance prediction capabilities are essential to grid management. • A genetic algorithm is applied for local grid scheduling. • Global grid management is achieved using an agent-based methodology. • The agent-based framework is scalable, flexible, extensible and easy to enhance further.

Future Work • Impact of prediction accuracy on grid management and scheduling • Transaction-based application performance modelling • Integration with Globus and NWS • Beyond discovery: enabling agent negotiation and coordination

JAG: Java Agents for Grids • Agents: Agent Coordination, Agent Negotiation, Agent Discovery, Agent Communication, Knowledge Representation • Services: Performance Prediction (PACE), Application Scheduling (AppLeS), Information Service (Globus), Grid Monitoring (NWS), …

Questions are welcome …
