Use of Performance Prediction Techniques for Grid Management Junwei Cao University of Warwick April 2002
Outline • Introduction • Performance Prediction • Local Grid Scheduling • Global Grid Management • A Case Study • Conclusions & Future Work
Introduction • Glossary • Overview • Related Works
Glossary • Grids – computational grids • Resources – multiprocessors or clusters • Applications (tasks) – MPI & PVM parallel programs • Users – developers and end users • Agents, Requests & Services • Performance – execution time
Overview (diagram labels: Grid Users, Application Tools, Global Grid Management, Performance Evaluation Engine, Local Grid Scheduling, Resource Tools, Grid Resources)
Related Works • Performance evaluation: POEMS, AppLeS, … • Local grid schedulers: Condor, LSF, Ninf, Nimrod, … • Global grid management: Globus, Legion, DPSS, …
Performance Prediction • Methodology • Implementation
PACE Methodology • Application Layer: acts as the entry point to the performance study • Subtask Layer: describes the sequential parts within an application • Parallel Template Layer: describes the parallel characteristics of subtasks • Hardware Layer: characterises the communication and computation abilities of a particular system (diagram labels: Model Parameters, Predicted Execution Time)
PACE Toolkit (diagram labels: Application Tools, Evaluation Engine, Resource Tools; Source Code Analysis, Object Editor, Object Library, HMCL Compiler, PSL Compiler; CPU, Network (MPI, PVM), Cache (L1, L2))
Summary • Advantages: reasonable accuracy; rapid evaluation time; easy cross-platform comparisons • Limitations: application source code required; static resource configurations
Local Grid Scheduling • Algorithms • Implementation (Titan)
FIFO Algorithm (diagram labels: 2^n - 1; Processor 1 to Processor 8)
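The "2^n - 1" label on this slide matches the number of non-empty subsets of n processors, so one plausible reading is that each task at the head of the FIFO queue is tried against every possible processor allocation. The sketch below illustrates that reading only. `predict_time` is a hypothetical stand-in for a PACE evaluation, and the function names and cost formula are assumptions, not part of Titan.

```python
from itertools import combinations

def candidate_subsets(n):
    """Yield every non-empty subset of n processors (2**n - 1 in total)."""
    for k in range(1, n + 1):
        yield from combinations(range(n), k)

def predict_time(work, num_procs):
    """Hypothetical performance model: ideal speed-up plus a per-processor overhead."""
    return work / num_procs + 0.5 * (num_procs - 1)

def fifo_allocate(work, free_at):
    """Return (end_time, subset) with the earliest predicted completion."""
    best = None
    for subset in candidate_subsets(len(free_at)):
        # A parallel task can start only when all of its processors are free.
        start = max(free_at[p] for p in subset)
        end = start + predict_time(work, len(subset))
        if best is None or end < best[0]:
            best = (end, subset)
    return best

def fifo_schedule(tasks, n):
    """Allocate tasks strictly in arrival order on n processors."""
    free_at = [0.0] * n
    schedule = []
    for work in tasks:
        end, subset = fifo_allocate(work, free_at)
        for p in subset:
            free_at[p] = end
        schedule.append((subset, end))
    return schedule
```

For the 8 processors shown on the slide this search visits 255 subsets per task, which is cheap only because each evaluation is a fast model prediction rather than a real run.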
Genetic Algorithm • Heuristic • Evolutionary • Near-optimal with respect to: makespan, idle time, deadlines
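To make the idea concrete, here is a minimal genetic-algorithm sketch for mapping tasks to processors. It is an illustration under assumptions, not the Titan scheduler: the encoding (one processor index per task), the operators, and all parameter values are hypothetical, and the cost function uses makespan only, whereas the slide also lists idle time and deadlines.

```python
import random

def makespan(assignment, durations, num_procs):
    """Latest finishing time when task i runs on processor assignment[i]."""
    loads = [0.0] * num_procs
    for task, proc in enumerate(assignment):
        loads[proc] += durations[task]
    return max(loads)

def evolve(durations, num_procs, pop_size=30, generations=100,
           mutation_rate=0.1, seed=0):
    """Search for a low-makespan assignment; needs at least two tasks."""
    rng = random.Random(seed)
    n = len(durations)

    def cost(assignment):
        return makespan(assignment, durations, num_procs)

    population = [[rng.randrange(num_procs) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        survivors = population[: pop_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)            # one-point crossover
            child = mum[:cut] + dad[cut:]
            for i in range(n):                   # random-reset mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(num_procs)
            children.append(child)
        population = survivors + children
    return min(population, key=cost)
```

Because the better half of each generation survives unchanged, the best makespan found never gets worse from one generation to the next. The result is near-optimal in the heuristic sense only: nothing guarantees the true optimum.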
Titan Implementation (diagram labels: Requests, Results, Service; Communication Module, Task Management, Task Execution, Resource Monitoring, GA Scheduling, PACE Evaluation Engine)
Global Grid Management • Methodology • Implementation (ARMS) • Metrics
Agent-based Methodology • Agent structure: communication layer, decision-making layer, local management layer • Agent hierarchy • Service advertisement • Service discovery • Agent Capability Tables (diagram: a user and a hierarchy of agents, each labelled A)
Optimisation Strategies • Advertisement: data-push & data-pull; periodic & event-driven • Discovery: local services; services in ACTs; lower or upper agents • Optimisation: modelling; simulation (diagram: a user, agents labelled A, and a component labelled M)
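The discovery order listed on this slide (local service first, then services recorded in the Agent Capability Table, then another agent in the hierarchy) can be sketched as follows. This is a simplified illustration: the `Agent` class, the single numeric `capability` value, and the choice to escalate only to the upper agent are all assumptions, and dispatch to lower agents is omitted.

```python
class Agent:
    """Toy agent with one local service and an Agent Capability Table (ACT)."""

    def __init__(self, name, capability, parent=None):
        self.name = name
        self.capability = capability  # hypothetical scalar service description
        self.parent = parent          # upper agent in the hierarchy, if any
        self.act = {}                 # ACT: agent name -> advertised capability

    def advertise(self, to_agent):
        """Data-push advertisement: record this agent's service in another ACT."""
        to_agent.act[self.name] = self.capability

    def discover(self, requirement):
        """Return the name of an agent whose service meets the requirement."""
        # 1. Local service.
        if self.capability >= requirement:
            return self.name
        # 2. Services known through the ACT.
        for name, capability in self.act.items():
            if capability >= requirement:
                return name
        # 3. Escalate to the upper agent.
        if self.parent is not None:
            return self.parent.discover(requirement)
        return None
```

A short usage example: with `root = Agent("S1", 10)`, `mid = Agent("S2", 4, parent=root)` and `leaf = Agent("S3", 2, parent=mid)`, the call `leaf.discover(8)` climbs two levels before `S1` answers. After `root.advertise(leaf)`, the same request is resolved from the leaf's own ACT, which is the trade-off between advertisement and discovery connections that the slide's strategies aim to tune.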
ARMS Implementation • Service information: PACE models; makespan • Request information: application binary; PACE model; deadline • Matchmaking: estimation (FIFO); deadline (diagram: a user, agents labelled A, and components labelled M and T)
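A rough sketch of the matchmaking step, assuming it compares an estimated completion time against the requested deadline. The estimate used here (current makespan of the resource plus the predicted execution time of the application on it) is a simple FIFO-style assumption, and the function names and dictionary layout are hypothetical.

```python
def estimate_completion(resource_makespan, predicted_time):
    """FIFO-style estimate: the new task starts once queued work has finished."""
    return resource_makespan + predicted_time

def matchmake(makespans, predicted_times, deadline):
    """Pick the resource with the earliest estimated completion within the deadline.

    makespans:       {resource name: time at which its current queue finishes}
    predicted_times: {resource name: predicted execution time on that resource}
    Returns a resource name, or None if no resource can meet the deadline.
    """
    estimates = {
        name: estimate_completion(makespans[name], predicted_times[name])
        for name in makespans
    }
    feasible = {name: t for name, t in estimates.items() if t <= deadline}
    if not feasible:
        return None
    return min(feasible, key=feasible.get)
```

For instance, a fast but busy resource (makespan 50, predicted time 10) beats an idle but slow one (makespan 5, predicted time 70) for a deadline of 65, because only the first estimate falls inside the deadline.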
Metrics • Average advance time of application execution completions (compared to required deadlines) • Average processor utilisation rate – utilised time / total time • Load balancing level – mean square deviation of processor utilisation rates • Average discovery speed – number of requests / number of discovery connections • Average discovery efficiency – number of requests / (number of discovery connections + number of advertisement connections)
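The metrics above can be computed as in the sketch below. Function and parameter names are hypothetical. The slide does not give the exact formula for the load balancing level, so the deviation is taken here as the root of the mean squared deviation (the population standard deviation), which is an assumption.

```python
from statistics import mean, pstdev

def average_advance_time(deadlines, completions):
    """Mean of (deadline - completion); positive means finishing early."""
    return mean(d - c for d, c in zip(deadlines, completions))

def utilisation_rates(busy_times, total_time):
    """Per-processor utilisation: utilised time / total time."""
    return [busy / total_time for busy in busy_times]

def utilisation_deviation(rates):
    """Spread of utilisation rates; 0 means perfectly balanced load.
    Assumption: root of the mean squared deviation from the average rate."""
    return pstdev(rates)

def discovery_speed(requests, discovery_connections):
    """Requests served per discovery connection."""
    return requests / discovery_connections

def discovery_efficiency(requests, discovery_connections,
                         advertisement_connections):
    """Requests served per connection of any kind."""
    return requests / (discovery_connections + advertisement_connections)
```

Discovery efficiency can never exceed discovery speed, since advertisement connections only add to the denominator. That is why heavy advertisement can raise speed while lowering efficiency.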
A Case Study • Design • Demonstrations • Results
Experiment Design • Resources: S1 (SGI Origin 2000, 16), S2 (SGI Origin 2000, 16), S3 (Sun Ultra 10, 16), S4 (Sun Ultra 10, 16), S5 (Sun Ultra 5, 16), S6 (Sun Ultra 5, 16), S7 (Sun Ultra 5, 16), S8 (Sun Ultra 1, 16), S9 (Sun Ultra 1, 16), S10 (Sun Ultra 1, 16), S11 (Sun SPARCstation 2, 16), S12 (Sun SPARCstation 2, 16) • Applications: sweep3d, fft, improc, closure, jacobi, memsort, cpi
Experiment 1 (diagram labels: FIFO)
Experiment 2 (diagram labels: GA)
Experiment 3 (diagram labels: GA)
Application Execution • Both the GA and agents contribute to the improvement in application execution.
Resource Utilisation • S11 & S12 benefit mainly from the GA. • S1 & S2 benefit mainly from agents.
Load Balancing • The GA contributes more to local grid load balancing. • Agents contribute more to global grid load balancing.
Discovery Speed & Efficiency • No advertisement: low speed, low efficiency • Reasonable advertisement: high speed, high efficiency • Too much advertisement: very high speed, very low efficiency (chart series: discovery speed (*100), discovery efficiency (*100))
Conclusions • Performance prediction capabilities are essential to grid management. • A genetic algorithm is applied to local grid scheduling. • Global grid management is achieved using an agent-based methodology. • The agent-based framework is scalable, flexible, extensible and easy to enhance further.
Future Work • Impact of prediction accuracy on grid management and scheduling • Transaction-based application performance modelling • Integration with Globus and NWS • Agent capabilities beyond discovery: negotiation and coordination
JAG: Java Agents for Grids (diagram labels: Agent Coordination, Agent Negotiation, Agent Discovery, Agent Communication; Performance Prediction, PACE; Application Scheduling, AppLeS; Information Service, Globus; Grid Monitoring, NWS; Knowledge Representation; …)