Application of Methods of Queuing Theory to Scheduling in GRID

Application of Methods of Queuing Theory to Scheduling in GRID A Queuing Theory-based mathematical model is presented, and an explicit form of the optimal control procedure obtained as the solution to the problem of maximizing the system throughput.

Why Queuing Theory? • Indeed, there are queues in real GRIDs • The services GRIDs offer to end users much resemble the services offered by telephone networks, the typical subject of study in Queuing Theory • The complexity of the associated processes leaves little options but to use the probabilistic techniques

Complexity: The Principal Limiting Factor to Modeling • GRIDs are very complicated systems themselves • GRIDs are composed of smaller complicated systems • Computer hardware • Networks • Software • GRIDs are embedded into the larger complicated systems: • Scientific organizations • R&D activities • Globalization processes

Stopping Decomposition as Soon as Possible to Avoid Unnecessary Complexity • Demarcate the phenomena specific to scheduling in GRID, and the generic phenomena • Model complicated behavior of the components with probabilistic techniques • Find the most general expression of the effects

Ultimate Stopper of Decomposition No entity in the modeled system should be decomposed, if the system persists when that entity is replaced with another similar one.

Implications • There is no need to develop detailed models of computers, networks, software or interaction external to GRID • There is no need to model the intra-GRID interaction, which does not directly affect scheduling • Information about how long it will take to process a demand on each node is all we need to know about the demand.

Mathematical Concepts Involved • Probability • Poisson Process • Multivariate Distribution • Linear Programming • Convergence “By Law”

Simplified Model: • There is a finite number of classes of demands (all demands from the same class have equal complexity) • Sub-Model of Structure: • Set of N nodes with queues • Sub-Model of Flow of Demands • Poisson process of arrivals with intensity  • M classes of demands • Sub-Model of Scheduling Procedure • Recognizes distinct classes of demands and routes the demands to the nodes it chooses

Sub-Model: Structure

Sub-Model: Flow of Demands • Demands from class j arrive with intensity j= pj (1 +…+m= ) • Upon arrival, a demand from class j is routed to node i with probability si,j • A demand from class j requires i,junits of processing time, if routed to node i • The computing time is “incompressible”: processing two demands with complexities T1 and T2 at a particular node requires T1+T2 time units independently of the order (or level of parallelism) in which they are processed

Two Important Facts About Poisson Processes • Let X1 and X2 be independent Poisson processes with intensity 1 and 2.Then X1+ X2 is a Poisson process with intensity 1+2. • Suppose a Poisson process X with intensity  is split into X1 and X2. With probability p events are passed to X1 and otherwise to X2. Then X1 and X2 are Poisson processes with intensities p and (1-p).

Flow of Demands & Scheduling Procedure

Sub-Model: Scheduling Procedure • The GRID operates in a stable environment • Routing of any demand in each moment depends on the current state of the system only • For all nodes load i<1  • The system can operate in the stationary mode • The stationary mode is stable

Stationary Mode

Implications of Stationary Operation • Incoming demands of class j are routed to node i with stationary probability si,j • Load of node i has the form i =   si,j i,j pj < 1

Optimization Problem

Linear Programming • It is possible to rewrite the constraints in the folowing form: • ’i =  si,j i,j pj • ’i  ’ • ’min • Now it is an LP problem

From Simplified to Real-World Model • How to handle non-discrete distributions of demands? • How to handle errors in classification (imperfect information)? • What about non-stationary modes? • Short-term excesses are not fatal because of stability • Long-term changes in distribution of demands can render the S.P. non-optimal

Approximating Actual Distribution of Demands with A Discrete Distribution

A Better Approximation

Simplified s is a matrix s: NxM[0,1] : NxM[0,) i =   si,j i,j pj Marginal s is a function si: RM[0,1] : multivariate random value (in RM ) i = E isi() What Happens When M?

Handling Imperfect Information • Average values of i,j can be used • The scheduling procedure should be iteratively re-evaluated when more information becomes available • In the real world applications, the exact distribution of demands is unknown, but can be approximated from the history of the system operation

A Comparison • Let  be an exponentially distributed random value with average 1 • i,j =1+ • Trivial procedure distributes demands with equal probability to any node • An optimized procedure is obtained as shown

Scheduling: Trivial vs. Optimized Maximum Throughput Optimized Trivial Num. of Nodes

Conclusions • The exact upper bound of throughput for a given GRID can be estimated • A scheduling procedure which achieves this limit can be constructed from a solution of an LP problem • The optimal scheduling procedure should be non-deterministic • Trivial and deterministic schedulers are generally unlikely to achieve the theoretical limit

References • L. Kleinrock, “Queueing Systems”, 1976 • Andrei Dorokhov, “Simulation simple models and comparison with queueing theory”http://csdl.computer.org/comp/proceedings/hpdc/2003/1965/00/19650034abs.htm • Atsuko Takefusa, Osamu Tatebe, Satoshi Matsuoka, Youhei Morita, “Performance Analysis of Scheduling and Replication Algorithms on Grid Datafarm Architecture for High-Energy Physics Applications” • GNU Linear Programming Kit, http://www.fsf.org

My Special Thanks To: • Dr. V.A. Ilyin for directing my work in the field of GRID systems • Prof. A.N. Shiryaev for directing my work in the Theory of Probability

Application of Methods of Queuing Theory to Scheduling in GRID

Application of Methods of Queuing Theory to Scheduling in GRID

Presentation Transcript

Queuing Theory

Queuing Theory

Queuing Theory

Queuing Theory

Queuing Theory

Queuing Theory

APPLICATION OF QUEUING THEORY TO WASHU DINING

Queuing Theory

Queuing Theory

Queuing Theory

Application Scheduling in a Grid Environment

Queuing Theory

QUEUING THEORY

Queuing Theory

Queuing Theory

Queuing theory

Queuing Theory

Queuing Theory

Queuing Theory

Queuing Theory

QUEUING THEORY