Aspects of Grid Management: an abstract ORC-based approach
A Stewart, P Kilpatrick, M Clint*, R Perrott, T Harmer (QUB), J Gabarró (UPC)
[Figure: manager and Grid application (e.g. component); caption: functionality + adaptability]
• The dynamic behaviour of a manager can be described in Misra's orchestration language ORC.
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies
Overview
• ORC is a small language (three combinators plus recursion) that can succinctly describe some essential features of dynamic component management.
• Current work:
    • Adding facilities for reasoning about the reliability of ORC expression evaluations.
    • The goal is to provide a framework for determining the likelihood of success of different management strategies.
ORC Example 1
• Consider a computation that extracts weather data from a web site and then uses this data in a further computation.
• Here the data may be supplied by either a BBC site or a CNN site:
    Q(x) where x :∈ { BBC.w | CNN.w }
• The asymmetric operator (where) involves non-deterministic thread selection and thread termination: the first value published binds x, and the remaining call is terminated.
[Figure: sites BBC and CNN each return w; the selected value x is passed to Q(x)]
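The selection-and-termination behaviour of the asymmetric operator can be imitated in Python with asyncio. This is only a sketch: `site`, `first_of`, `q` and `example1` are hypothetical names, and the delays stand in for real network latency. It is not how an ORC implementation works internally.

```python
import asyncio


async def site(name: str, delay: float) -> str:
    """Hypothetical stand-in for a site call such as BBC.w or CNN.w."""
    await asyncio.sleep(delay)  # simulated response time (assumption)
    return f"weather from {name}"


async def first_of(*calls):
    """Run all calls concurrently, keep the first result, cancel the rest."""
    tasks = [asyncio.create_task(c) for c in calls]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # corresponds to thread termination
    if pending:
        # Let the cancellations settle before returning.
        await asyncio.gather(*pending, return_exceptions=True)
    # If several calls finished together, the choice among them is arbitrary.
    return next(iter(done)).result()


async def q(x: str) -> str:
    """Hypothetical consumer Q(x) of the weather data."""
    return f"Q({x})"


async def example1(bbc_delay: float, cnn_delay: float) -> str:
    # Q(x) where x :∈ { BBC.w | CNN.w }
    x = await first_of(site("BBC", bbc_delay), site("CNN", cnn_delay))
    return await q(x)
```

Calling `asyncio.run(example1(0.01, 0.2))` uses the BBC value because that call responds first; swapping the delays selects CNN instead.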
ORC Example 2
• Grid resources may be known to be busy at certain times, so time-dependent orchestrations can be specified:
    Atimer >t> ( if (12.00 < t < 18.00) >> s1.f(x) |
                 if ¬(12.00 < t < 18.00) >> s2.f(x) )
• Here Atimer returns the current time.
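A minimal Python sketch of the same time-dependent choice follows. The function name `orchestrate` and the callables `s1_f` and `s2_f` are hypothetical stand-ins for the calls s1.f and s2.f, and the time is passed in as an argument rather than read from a timer site.

```python
from datetime import time
from typing import Callable


def orchestrate(
    t: time,
    x: float,
    s1_f: Callable[[float], float],
    s2_f: Callable[[float], float],
) -> float:
    """Route the call f(x) to s1 or s2 depending on the time t."""
    # Exactly one of the two guards holds, so exactly one site is called.
    in_window = time(12, 0) < t < time(18, 0)
    if in_window:
        return s1_f(x)
    return s2_f(x)
```

Both comparisons are strict, as in the ORC expression, so a call made at exactly 12.00 or 18.00 goes to s2.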
ORC Example 3
• A grid site may fail to respond to a call. In such circumstances the site, or an alternative site, may be (re)called:
    FindW ≡ ( if (x = signal) >> FindW |
              if ¬(x = signal) >> let(x) )
            where x :∈ { BBC.w | CNN.w | Rtimer(t) }
• Here Rtimer(t) returns a signal after t time units.
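The timeout-and-retry pattern of FindW can be sketched in Python with asyncio. The names `SIGNAL`, `rtimer` and `find_w` are hypothetical, and the recursion is written as a loop. As in the ORC definition, there is no bound on the number of retries.

```python
import asyncio

SIGNAL = object()  # stands for the signal published by Rtimer(t)


async def rtimer(t: float):
    """Publish SIGNAL after t time units (seconds here, an assumption)."""
    await asyncio.sleep(t)
    return SIGNAL


async def find_w(sites, t: float):
    """Call every site plus a timer; retry the round if the timer wins.

    `sites` is a list of zero-argument callables, each returning a fresh
    coroutine, so that a site can be called again in a later round.
    """
    while True:
        tasks = [asyncio.create_task(make_call()) for make_call in sites]
        tasks.append(asyncio.create_task(rtimer(t)))
        done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        for task in pending:
            task.cancel()  # terminate the calls that lost the race
        if pending:
            await asyncio.gather(*pending, return_exceptions=True)
        x = next(iter(done)).result()
        if x is not SIGNAL:
            return x  # let(x): a site responded in time
        # x is the signal: no site responded within t, so call FindW again.
```

If no site ever responds within t, the loop never terminates, which mirrors the unbounded recursion in the ORC expression.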
Reliability
• An essential feature of ORC is that a site call may or may not respond. In a similar way, a grid site may be operational or unresponsive (due to excessive load or network failure).

    Orc expression    Meaning
    E(S)              returns a result if s is operational; otherwise no response
Performance
• Pr(S_s): probability that a call to s succeeds.
• Pr(F_s): probability that a call to s fails.
• In a grid, success might be interpreted as any of the following:
    • Site s and its network are operational.
    • Site s is operational and the network has acceptable bandwidth.
    • Site s is operational, can meet job requirements and has acceptable bandwidth.
Conditional Probability
• Sites which are known to be currently operational are more likely to be operational in the immediate future.
• Let Pr(S_s | S_s) denote the probability that a call to s succeeds given that a recent call to s also succeeded.
• Example: the reliability of
    s >> s
  is given by
    Pr(S_s) * Pr(S_s | S_s)
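As a worked illustration of the product above, the sketch below computes the reliability of a chain of repeated calls s >> s >> ... >> s. It assumes that each call depends only on the outcome of the call immediately before it, which is an assumption of this sketch and is not stated on the slide. The function name and the numbers in the usage note are illustrative only.

```python
def repeated_call_reliability(p_first: float, p_given_prev: float, n: int) -> float:
    """Reliability of n sequential calls to the same site s.

    p_first      -- Pr(S_s), probability that the first call succeeds
    p_given_prev -- Pr(S_s | S_s), probability of success given that the
                    previous call succeeded
    n            -- number of calls in the chain (n >= 1)

    Assumes each call depends only on the call immediately before it.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return p_first * p_given_prev ** (n - 1)
```

With illustrative values Pr(S_s) = 0.9 and Pr(S_s | S_s) = 0.98, the expression s >> s has reliability 0.9 * 0.98 = 0.882, compared with 0.9 * 0.9 = 0.81 if the two calls were treated as independent.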
Examples
• The reliability of the expression
    s
  is
    Pr(S_s)
• The reliability of the expression
    let(r) where r :∈ { s | t }
  is
    Pr(S_s) + Pr(F_s) * Pr(S_t)
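The second formula can be checked numerically. The sketch below assumes that Pr(F_s) = 1 - Pr(S_s) and that the calls to s and t succeed or fail independently of each other; both are assumptions of this sketch. The function names are hypothetical.

```python
from itertools import product


def either_reliability(p_s: float, p_t: float) -> float:
    """Pr(S_s) + Pr(F_s) * Pr(S_t), with Pr(F_s) taken as 1 - Pr(S_s)."""
    return p_s + (1.0 - p_s) * p_t


def either_by_enumeration(p_s: float, p_t: float) -> float:
    """Cross-check: sum the probability of every outcome in which at
    least one of the two sites responds (independence assumed)."""
    total = 0.0
    for s_ok, t_ok in product([True, False], repeat=2):
        weight = (p_s if s_ok else 1.0 - p_s) * (p_t if t_ok else 1.0 - p_t)
        if s_ok or t_ok:
            total += weight
    return total
```

For illustrative values Pr(S_s) = 0.9 and Pr(S_t) = 0.8, both functions give 0.9 + 0.1 * 0.8 = 0.98, up to floating-point rounding.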
[Figure: Markov chain for r :∈ { s | t }, with states numbered 0, 1 and 2 and transitions labelled S_s, F_s, S_t and F_t]
Current and Future Work
• Continue development of a framework for estimating the reliability of general ORC expressions.
• Experiment with, for example, the GRID'5000 testbed to compare empirical results with reliability theory.
• Integrate the work on ORC with the ASSIST/muskel models of the group at Pisa by recasting manager features of the latter in ORC.