Outline
• Distributed scheduling
• Motivations
• Design issues
• Distributed scheduling algorithms
Motivations
• In a locally distributed system, there is a good chance that several computers are heavily loaded while others are idle or lightly loaded
• If jobs can be moved between computers (in other words, if the load is distributed more evenly), the overall performance of the system can be improved
COP 5611 - Operating Systems
Motivations – cont.
Distributed Scheduling
• A distributed scheduler is a resource management component of a distributed operating system that judiciously and transparently redistributes the system's load among the computers to maximize overall performance
Issues in Load Distribution
• Load estimation
  • Queue lengths
  • CPU utilization
• Load distributing algorithms
  • Static
  • Dynamic
  • Adaptive
Issues in Load Distribution – cont.
• Load balancing vs. load sharing
  • Load sharing algorithms try to reduce the likelihood of an unshared state (one computer is idle while others are overloaded) by transferring tasks
  • Load balancing algorithms attempt to equalize the loads at all computers
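The difference between the two goals can be made concrete with a small check on per-node queue lengths. This is only an illustrative sketch: the function names, the convention that a queue length counts the running task, and the `tolerance` parameter are assumptions, not part of the slides.

```python
def is_unshared(queue_lengths):
    """True if some node is idle while another node has tasks waiting.

    Assumption: a queue length counts the running task too, so a value
    above 1 means at least one task is waiting behind the running one.
    """
    has_idle = any(q == 0 for q in queue_lengths)
    has_waiting = any(q > 1 for q in queue_lengths)
    return has_idle and has_waiting


def is_balanced(queue_lengths, tolerance=1):
    """True if all queue lengths lie within `tolerance` of each other.

    `tolerance` is a hypothetical knob; exact equality is rarely reachable.
    """
    return max(queue_lengths) - min(queue_lengths) <= tolerance


print(is_unshared([0, 3, 1]), is_balanced([0, 3, 1]))
print(is_unshared([1, 3, 1]), is_balanced([1, 3, 1]))
```

The second call shows a state that load sharing would leave alone (no node is idle) but that load balancing would still try to even out.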
Issues in Load Distribution – cont.
• Preemptive vs. non-preemptive transfers
  • Preemptive task transfers involve the transfer of tasks that are partially executed
    • Such a transfer is generally expensive because the entire task state must be moved: a virtual memory image, a process control block, unread I/O buffers and messages, file pointers, timers that have been set, and so on
  • Non-preemptive transfers involve the transfer of tasks that have not started executing yet
  • Environment transfer
Components of a Load Distributing Algorithm
• Four components
  • Transfer policy
    • Determines when a node needs to send tasks to other nodes or can receive tasks from other nodes
  • Selection policy
    • Determines which task(s) to transfer
  • Location policy
    • Finds suitable nodes for load sharing
Components of a Load Distributing Algorithm – cont.
• Four components – continued
  • Information policy
    • Demand-driven
    • Periodic
    • State-change driven
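A transfer policy and a selection policy can be sketched in a few lines. The slides do not fix a particular rule, so the single queue-length threshold and the "pick the newest task" choice below are assumptions made for illustration, and both function names are hypothetical.

```python
def classify(queue_length, threshold):
    """Threshold-based transfer policy (assumed rule, not from the slides).

    A node above the threshold wants to send work, a node below it can
    accept work, and a node exactly at the threshold does neither.
    """
    if queue_length > threshold:
        return "sender"
    if queue_length < threshold:
        return "receiver"
    return "ok"


def select_task(queue):
    """Selection policy sketch: pick the most recently arrived task.

    Assumption: the last element of `queue` has not started running, so
    moving it is a cheap non-preemptive transfer.
    """
    return queue[-1] if queue else None


print(classify(5, threshold=3), select_task(["t1", "t2", "t3"]))
```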
Stability
• The queuing-theoretic perspective
  • The CPU queues grow without bound if the arrival rate is greater than the rate at which the system can perform work
  • A load distributing algorithm is effective under a given set of conditions if it improves performance relative to that of a system not using load distribution
• Algorithmic stability
  • An algorithm is unstable if it can perform fruitless actions indefinitely with finite probability
  • Processor thrashing
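The queuing-theoretic condition amounts to comparing the arrival rate with the total service capacity. The sketch below assumes identical nodes, rates in tasks per unit time, and capacity pooled across nodes (that is, ideal load sharing); the function names are hypothetical.

```python
def utilization(arrival_rate, service_rate, nodes=1):
    """Offered load divided by total capacity.

    Assumption: `nodes` identical computers, each completing
    `service_rate` tasks per unit time, with work shared ideally.
    """
    return arrival_rate / (nodes * service_rate)


def queues_grow_without_bound(arrival_rate, service_rate, nodes=1):
    """True when work arrives faster than the system can perform it."""
    return utilization(arrival_rate, service_rate, nodes) > 1.0


print(queues_grow_without_bound(12, 5, nodes=2))
print(queues_grow_without_bound(8, 5, nodes=2))
```

Any work the load distributing algorithm itself consumes (polling, transfers) reduces the capacity left for tasks, which is why overhead matters most when utilization is already close to 1.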
Sender-Initiated Algorithms
• In sender-initiated algorithms, an overloaded node initiates the load distribution
• Transfer policy
• Selection policy
• Location policy
  • Random
  • Threshold
  • Shortest
• Information policy
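The three location policies named above can be sketched as follows. The slides only name them, so the details here are assumptions: `queue_length` is a hypothetical callable standing in for a poll message, `nodes` excludes the sender itself, and a node counts as a suitable receiver when its queue length is below the threshold.

```python
import random


def locate_random(nodes, rng):
    """Random: pick any node without polling it first."""
    return rng.choice(nodes)


def locate_threshold(nodes, queue_length, threshold, poll_limit, rng):
    """Threshold: poll random nodes and take the first suitable one."""
    for node in rng.sample(nodes, min(poll_limit, len(nodes))):
        if queue_length(node) < threshold:
            return node
    return None  # poll limit reached; the task stays local


def locate_shortest(nodes, queue_length, threshold, poll_limit, rng):
    """Shortest: poll a random set and take the least loaded node in it."""
    polled = rng.sample(nodes, min(poll_limit, len(nodes)))
    best = min(polled, key=queue_length)
    return best if queue_length(best) < threshold else None


loads = {"a": 5, "b": 1, "c": 3}
rng = random.Random(0)
print(locate_threshold(list(loads), loads.get, 2, 3, rng))
print(locate_shortest(list(loads), loads.get, 3, 3, rng))
```

Both calls poll all three nodes here, so each returns `b`, the only node below the threshold in the first call and the shortest queue in the second.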
Sender-Initiated Algorithms – cont.
Sender-Initiated Algorithms – cont.
• Performance analysis
  • Instability at high system loads
    • When system loads are high, sender-initiated algorithms can make the system unstable
    • At high system loads, no node is likely to be lightly loaded, so the probability that a sender will find a receiver is very low
    • However, the polling activity increases as the rate at which work arrives increases
  • Performance at low system loads
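A back-of-the-envelope model shows why polling becomes fruitless at high load. The sketch assumes each poll independently hits a lightly loaded node with probability `p_receiver`; that independence, and the function name, are simplifying assumptions rather than anything stated in the slides.

```python
def p_find_receiver(p_receiver, poll_limit):
    """Probability that at least one of `poll_limit` polls finds a receiver.

    Assumption: polls are independent and each succeeds with
    probability `p_receiver`.
    """
    return 1.0 - (1.0 - p_receiver) ** poll_limit


# Light load: half the nodes can accept work.
print(p_find_receiver(0.5, 3))
# Heavy load: only 5% of the nodes can accept work.
print(p_find_receiver(0.05, 5))
```

Under this model a sender at heavy load usually spends its whole poll limit and still keeps the task, so the polling cost is added to a system that has no spare capacity to absorb it.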
Receiver-Initiated Algorithms
• In receiver-initiated algorithms, an underloaded node initiates the load distribution
• Transfer policy
• Selection policy
• Location policy
• Information policy
Receiver-Initiated Algorithms – cont.
Receiver-Initiated Algorithms – cont.
• Performance analysis
  • At high system loads, the probability of finding a sender is high, so a receiver can generally find a sender within a few polls
  • At low system loads, there are few senders but more receiver-initiated polls; these polls do not cause system instability because spare CPU cycles are available
• A drawback
  • Most transfers will be preemptive and thus expensive
Empirical Comparison of Sender-Initiated and Receiver-Initiated Algorithms
Symmetrically Initiated Algorithms
• In symmetrically initiated algorithms, both senders and receivers search for receivers and senders, respectively, for task transfers
• The above-average algorithm
  • Transfer policy
  • Location policy
    • Sender-initiated component
    • Receiver-initiated component
  • Selection policy
  • Information policy
Symmetrically Initiated Algorithms – cont.
• Sender-initiated component
  • A sender broadcasts a TooHigh message, sets a TooHigh timeout alarm, and listens for an Accept message
  • A receiver that receives a TooHigh message cancels its TooLow timeout, sends an Accept message to the sender, and increases its load value
  • On receiving an Accept message, if the node is still a sender, it chooses the best task to transfer and transfers it
  • If no Accept message has been received before the timeout, the sender broadcasts a ChangeAverage message to increase the average load estimates at the other nodes
Symmetrically Initiated Algorithms – cont.
• Receiver-initiated component
  • A receiver broadcasts a TooLow message, sets a TooLow timeout alarm, and starts listening for a TooHigh message
  • If a TooHigh message is received, the receiver cancels its TooLow timeout, sends an Accept message to the sender, and increases its load value
  • If no TooHigh message is received before the timeout, the receiver broadcasts a ChangeAverage message to decrease the average load estimates at the other nodes
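The message handling of the above-average algorithm can be sketched with one small class. This is a simplified illustration, not the algorithm's definitive form: timers and broadcasting are left out, and the `margin` field, the adjustment step of 1, and all method names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: int
    average: float       # this node's estimate of the system-wide average
    margin: float = 1.0  # hypothetical width of the acceptable range

    def role(self):
        if self.load > self.average + self.margin:
            return "sender"
        if self.load < self.average - self.margin:
            return "receiver"
        return "ok"

    def on_too_high(self, sender):
        """Receiver side: answer a TooHigh broadcast with Accept."""
        if self.role() != "receiver":
            return None
        self.load += 1  # count the task that is about to arrive
        return ("Accept", self.name, sender)

    def on_accept(self):
        """Sender side: transfer a task only if still a sender."""
        if self.role() != "sender":
            return False
        self.load -= 1
        return True

    def on_timeout(self):
        """No partner answered: ask the other nodes to shift the average."""
        role = self.role()
        if role == "sender":
            return ("ChangeAverage", +1)
        if role == "receiver":
            return ("ChangeAverage", -1)
        return None


s, r = Node("s", load=6, average=3.0), Node("r", load=0, average=3.0)
print(r.on_too_high("s"), s.on_accept(), s.load, r.load)
```

Raising the receiver's load value before the task arrives keeps it from accepting more TooHigh offers than it can absorb.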
Symmetrically Initiated Algorithms – cont.
• Performance analysis
  • Instability at high system loads
    • Due to the sender-initiated component
Comparison
Adaptive Algorithms
• A stable symmetrically initiated algorithm
  • Each node keeps a senders list, a receivers list, and an OK list
  • Nodes in the system are classified as sender/overloaded, receiver/underloaded, or OK using the information gathered through polling
A Stable Symmetrically Initiated Algorithm – cont.
• Sender-initiated component
  • The sender polls the node at the head of its receivers list
  • The polled node moves the sender to the head of its senders list and replies with a message indicating whether it is a receiver, a sender, or an OK node
  • The sender updates its record of the polled node based on the reply
  • If the polled node is a receiver, the sender transfers a task to it
  • The polling process stops if the sender's receivers list becomes empty or the number of polls reaches PollLimit
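The sender-initiated polling loop can be sketched with three deques. The sketch makes several assumptions: `poll` is a hypothetical callable standing in for the poll/reply exchange, a node that answers "receiver" stays at the head of the receivers list, and reclassified nodes go to the head of their new list.

```python
from collections import deque


def sender_poll(receivers, senders, ok, poll, poll_limit):
    """Return a node to transfer a task to, or None if polling gives up.

    `poll(node)` is assumed to return "receiver", "sender", or "ok".
    The three deques are updated in place from the replies.
    """
    polls = 0
    while receivers and polls < poll_limit:
        node = receivers.popleft()  # head of the receivers list
        polls += 1
        reply = poll(node)
        if reply == "receiver":
            receivers.appendleft(node)  # still a receiver; keep it at the head
            return node
        if reply == "sender":
            senders.appendleft(node)
        else:
            ok.appendleft(node)
    return None


status = {"a": "sender", "b": "ok", "c": "receiver"}
receivers, senders, ok = deque(["a", "b", "c"]), deque(), deque()
print(sender_poll(receivers, senders, ok, status.get, poll_limit=5))
print(list(receivers), list(senders), list(ok))
```

Because stale entries are removed from the receivers list as they are discovered, later polls waste fewer messages on nodes that cannot accept work.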
A Stable Symmetrically Initiated Algorithm – cont.
• Receiver-initiated component
  • Nodes are polled in the following order
    • Head to tail in the senders list
    • Tail to head in the OK list
    • Tail to head in the receivers list
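The polling order above can be written as a single function over the three lists. The function name is hypothetical, and the lists are assumed to be ordered with the most recently updated entry at the head.

```python
def receiver_poll_order(senders, ok, receivers):
    """Order in which a receiver polls other nodes when looking for a sender.

    Senders are tried head to tail; the OK and receivers lists are tried
    tail to head.
    """
    return list(senders) + list(reversed(ok)) + list(reversed(receivers))


print(receiver_poll_order(["s1", "s2"], ["o1", "o2"], ["r1", "r2"]))
# ['s1', 's2', 'o2', 'o1', 'r2', 'r1']
```

With the most recent entries at the head, this order tries the freshest sender information first and, in the other two lists, the stalest entries first, since those are the most likely to have changed state.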
A Stable Sender-Initiated Algorithm
• This algorithm uses the sender-initiated component of the stable symmetrically initiated algorithm
• Each node is augmented with an array called the statevector
  • It records how the node is currently classified at every other node in the system
  • It is updated based on the information exchanged during polling
• The receiver-initiated component is replaced by the following protocol
  • When a node becomes a receiver, it informs all the nodes that are misinformed about its status
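The statevector protocol can be sketched with a dictionary. The representation is an assumption: `statevector` maps each other node to the list ("sender", "receiver", or "ok") in which that node currently holds this node, and both function names are hypothetical.

```python
def nodes_to_inform(statevector):
    """Nodes that do not yet list this node as a receiver."""
    return [node for node, listed_as in statevector.items()
            if listed_as != "receiver"]


def become_receiver(statevector):
    """Notify only the misinformed nodes and record their new view.

    Returns the nodes that were notified; sending the actual message
    is left out of this sketch.
    """
    notified = nodes_to_inform(statevector)
    for node in notified:
        statevector[node] = "receiver"
    return notified


statevector = {"a": "sender", "b": "receiver", "c": "ok"}
print(become_receiver(statevector))
print(statevector)
```

Only nodes with an outdated view are contacted, so a node that changes state often does not have to broadcast to the whole system each time.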
Comparison
Performance Under Heterogeneous Workloads
Selecting a Suitable Load Sharing Algorithm
• The best algorithm depends on the system under consideration
  • For example, if the system never attains high loads, sender-initiated algorithms will give improved performance
  • Stable scheduling algorithms should be used for systems that can reach high loads
  • For systems with heterogeneous workloads, adaptive stable algorithms are preferable
Other Requirements of Load Distributing
• Scalability
  • The algorithm should work well in large distributed systems
• Location transparency
• Determinism
• Preemption
• Heterogeneity
Case Studies
• The V-System
• The Sprite system
• The Condor system
• The Stealth distributed scheduler