Processor Allocation: Strategies and Implementation in Distributed Systems

PROCESSOR ALLOCATION • Presented by: Wiwek Deshmukh • Csc 8320 • Instructor: Dr. Yanqing Zhang

Outline • Allocation Models • Design Issues • Implementation Issues • Example Allocation Algorithms • Processor Allocation in Parallel Databases • References

Processor Allocation • A distributed system consists of multiple processors, organized e.g. as personal workstations, a public processor pool, or a hybrid system. • Some algorithm is required to decide which process should run on which processor.

Allocation Models, Assumptions & Goals • All machines are identical, or at least code-compatible. • The system is fully interconnected. • When is new work created? • Processor allocation strategies: - Non-migratory - Migratory: allow better load balancing but are more complex.

Allocation Models, Assumptions & Goals • What are we trying to optimize? (May differ for different systems.) • CPU utilization • Mean response time • Response ratio: which is better, a 1-sec job that takes 5 secs or a 1-min job that takes 70 secs?
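
The response-ratio question can be worked through numerically. The sketch below assumes the usual definition (time taken on the loaded system divided by time taken on an unloaded machine); the function name `response_ratio` is illustrative only.

```python
def response_ratio(elapsed_seconds: float, standalone_seconds: float) -> float:
    """Time taken on the loaded system divided by time on an unloaded machine."""
    return elapsed_seconds / standalone_seconds


short_job = response_ratio(5, 1)    # 1-sec job that takes 5 secs
long_job = response_ratio(70, 60)   # 1-min job that takes 70 secs

# prints: short job: 5.00, long job: 1.17
print(f"short job: {short_job:.2f}, long job: {long_job:.2f}")
```

By mean response time the short job looks better (5 secs vs. 70 secs); by response ratio the long job is served far better (about 1.17 vs. 5).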

Design Issues • Deterministic vs. Heuristic Algorithms. • Centralized vs. Distributed Algorithms. • Optimal vs. Suboptimal Algorithms. • Local vs. Global Algorithms. (Transfer Policy) • Sender-initiated vs. Receiver-initiated Algorithms. (Location Policy)

Design Issues (Contd …) • Deterministic algorithms are appropriate when everything about process behavior is known in advance. • Armed with this information, it is possible to make a perfect assignment. • In practice a reasonable approximation is often available: today’s work is (statistically) just like yesterday’s. • When system load is completely unpredictable, we use ad hoc techniques, i.e. heuristics.

Design Issues (Contd …) • Centralized vs. Distributed Algorithms. • Centralized algorithms raise scalability and robustness concerns. • Some centralized algorithms have been proposed only for lack of suitable decentralized alternatives.

Design Issues (Contd …) • Optimal vs. Suboptimal Algorithms. • Are we trying to find the best solution or an acceptable one ? • Optimal solutions can be obtained in both centralized and decentralized systems. • Optimal solutions are more expensive.

Design Issues (Contd …) • Local vs. Global Algorithms. • What is the transfer policy ? • The choice here is whether or not to base the transfer decision entirely on local information.

Design Issues (Contd …) • Sender-initiated vs. Receiver-initiated Algorithms. • What is the location policy ? • Location policy cannot be local. • Who takes the initiative in locating more CPU cycles ?

Implementation Issues • How do we measure load ? • How do we measure CPU utilization ? • What happens when the kernel executes critical code ? • You may end up underestimating the true CPU usage.

Implementation Issues (Contd …) • How is overhead dealt with? • Many algorithms ignore the overhead they cause themselves. • If system performance improves by only 10%, it may not be worth it. • The cost of moving the process may eat up all the gain. • The algorithm must account for the CPU time, memory, and network bandwidth it consumes itself.

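Implementation Issues (Contd …) • What is the complexity of the software? • Eager et al. (1986) studied three algorithms, all related to location policy. • Algo-1: Send the process to a randomly chosen machine. • Algo-2: Probe a randomly chosen machine; if it is underloaded, send the process there, otherwise probe another. • Algo-3: Probe k machines and send the process to the least loaded one. • The gain in performance of Algo-3 over Algo-2 is only marginal. • Conclusion of Eager et al.: a simple algorithm that delivers almost the same gain as a complex one is preferable.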
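
The three location policies studied by Eager et al. can be sketched as follows. This is a minimal illustration, not their implementation: the `loads` dictionary (machine name to queue length), the `threshold`, and the `max_probes` limit are all assumptions made for the example.

```python
import random


def algo1_random(loads, rng):
    """Algo-1: send the process to a randomly chosen machine, no questions asked."""
    return rng.choice(list(loads))


def algo2_probe(loads, threshold, max_probes, rng):
    """Algo-2: probe random machines one at a time; take the first underloaded one."""
    for _ in range(max_probes):
        candidate = rng.choice(list(loads))
        if loads[candidate] < threshold:   # assumed meaning of "underloaded"
            return candidate
    return None  # no taker found: keep the process local


def algo3_probe_k(loads, k, rng):
    """Algo-3: probe k distinct machines and pick the least loaded of them."""
    probed = rng.sample(list(loads), k)
    return min(probed, key=lambda machine: loads[machine])
```

Algo-3 pays for k probes on every transfer, while Algo-2 usually stops after the first few; this is one way to see why its extra gain is small relative to its cost.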

Implementation Issues (Contd …) • Does the stability of the system affect performance? • Machines run their algorithms asynchronously from one another. • The system is rarely in equilibrium. • What happens when tables are still being updated?

A Graph-Theoretic Deterministic Algorithm • The goal is to minimize network traffic. • Intended for systems consisting of processes with known CPU and memory requirements. • A known matrix gives the average traffic between each pair of processes. • The assignment is needed when the number of CPUs, k, is smaller than the number of processes.

A Graph-Theoretic Deterministic Algorithm (Contd … ) • The system is represented as a weighted graph. • Find a way to partition the graph into k disjoint sub-graphs which meet the constraints. • (eg: Total CPU usage & Memory requirements below some limit for each sub-graph.) • Each sub-graph represents 1 processor.

A Graph-Theoretic Deterministic Algorithm (Contd …) • [Figure: weighted process graph partitioned by two alternative sets of cuts; edge weights are the traffic between process pairs.] • Using the green cuts: total network traffic = 30 (13 + 17). • Using the red cuts: total network traffic = 28 (13 + 15). • Look for clusters that are tightly coupled.
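
Comparing two partitions comes down to summing the weights of the edges that cross processor boundaries. The sketch below uses a small made-up graph (the process names and weights are hypothetical, not the graph from the slide) to show the calculation.

```python
def cut_traffic(traffic, assignment):
    """Sum the weights of edges whose endpoints sit on different processors."""
    return sum(
        weight
        for (p, q), weight in traffic.items()
        if assignment[p] != assignment[q]
    )


# Hypothetical traffic matrix, stored as weighted edges between processes.
traffic = {("A", "B"): 3, ("B", "C"): 2, ("A", "D"): 1, ("C", "D"): 4, ("B", "D"): 2}

plan_1 = {"A": 0, "B": 0, "C": 1, "D": 1}   # process -> processor number
plan_2 = {"A": 0, "B": 1, "C": 1, "D": 1}

print("plan_1:", cut_traffic(traffic, plan_1))   # prints: plan_1: 5
print("plan_2:", cut_traffic(traffic, plan_2))   # prints: plan_2: 4
```

A full allocator would also check each sub-graph against the CPU and memory limits before accepting a partition; that check is left out here.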

A Centralized Algorithm (using Heuristic) • A heuristic algorithm does not require any advance information. • Up-Down Algorithm (Mutka and Livny, 1987) • A coordinator maintains a usage table with one entry per workstation. (initially zero) • Allocation decisions are based on the table.

A Centralized Algorithm (using Heuristic) Contd … • This algorithm is designed to distribute computing power more fairly rather than optimizing CPU usage. • When a machine needs a processor, it asks the coordinator. • If no processor is available, the request is denied and a note is made in the table entry.

A Centralized Algorithm (using Heuristic) Contd … • When a machine runs processes on other people’s machines, it accumulates penalty points. (added to table entry) • When it has pending requests, penalty points are subtracted. • Usage table entry can be zero, positive or negative. • Positive score: The workstation is a net user. • Negative score: The workstation needs resources.

A Centralized Algorithm (using Heuristic) Contd … • Heuristic Used: When a processor is free, it is assigned to the workstation having a pending request and a table entry with lowest score. • Thus the intention of the algorithm is to allocate capacity fairly.
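
The coordinator's bookkeeping can be sketched as below. This is an illustrative reading of the Up-Down rules on these slides, not Mutka and Livny's code: the class name, the one-point-per-tick penalty rate, and the alphabetical tie-break are assumptions.

```python
class UpDownCoordinator:
    def __init__(self, workstations):
        self.score = {w: 0 for w in workstations}        # usage table, initially zero
        self.remote_cpus = {w: 0 for w in workstations}  # processors each one is using
        self.pending = set()                             # workstations with unmet requests

    def request(self, workstation):
        """Record a request that could not be satisfied immediately."""
        self.pending.add(workstation)

    def tick(self):
        """Advance one time unit and update every table entry."""
        for w in self.score:
            self.score[w] += self.remote_cpus[w]   # penalty for using others' machines
            if w in self.pending:
                self.score[w] -= 1                 # credit while a request is pending

    def processor_freed(self):
        """Grant a free processor to the pending workstation with the lowest score."""
        if not self.pending:
            return None
        winner = min(sorted(self.pending), key=lambda w: self.score[w])
        self.pending.remove(winner)
        self.remote_cpus[winner] += 1
        return winner
```

Because heavy users accumulate positive scores and waiting workstations drift negative, the lowest-score rule favors whoever has been served least, which is the fairness goal stated above.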

A Centralized Algorithm (using Heuristic) Contd … • Disadvantages: 1. Poor scalability. 2. Single point of failure. • Solution: Reduce the degree of centralization, or use semi-centralized techniques.

A Hierarchical Algorithm • These algorithms retain the simplicity of centralized algorithms but scale better. • The processors are organized in a logical hierarchy independent of the physical structure of the network. • Eg: Hierarchy as in an academic institution.

A Hierarchical Algorithm (Contd …) • For each group of k machines, we assign one manager machine. (Dept. head) • If there are too many Dept. heads, create another level on top. (Deans) • The number of levels increases logarithmically with the number of workers.
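
The logarithmic growth is easy to check by counting levels directly. The sketch below assumes one manager per group of at most k nodes, repeated until a single manager remains; `manager_levels` is a name chosen for this example, and k must be at least 2.

```python
def manager_levels(workers: int, k: int) -> int:
    """Count the manager levels needed above `workers` machines with fan-out k."""
    levels = 0
    nodes = workers
    while nodes > 1:
        nodes = -(-nodes // k)   # ceiling division: one manager per k nodes
        levels += 1
    return levels


for n in (10, 1000, 1_000_000):
    print(n, "workers ->", manager_levels(n, 10), "levels")
# prints:
# 10 workers -> 1 levels
# 1000 workers -> 3 levels
# 1000000 workers -> 6 levels
```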

A Hierarchical Algorithm (Contd …) • Avoid a single (vulnerable) manager at the top. • Truncate the tree at the top to form a committee with ultimate authority. • When a ruling committee member crashes, other members promote someone one level down as replacement.

A Hierarchical Algorithm (Contd …) • The system is self-repairing. • Jobs can be created at any level of the hierarchy. • Job with S processes. Allocate R ≥ S processors. • R must be large enough but not too large.

A Hierarchical Algorithm (Contd …) • Disadvantages: • Multiple requests are likely to be in various stages of the allocation algorithm. • Potential out-of-date estimates. • Race conditions & Deadlocks could occur.

A Sender-Initiated Distributed Heuristic Algorithm • Typical algorithms are the ones described by Eager et al. (1986). • According to their results, the most cost-effective algorithm was Algo-2. • What happens under conditions of heavy load?

A Receiver-Initiated Distributed Heuristic Algorithm • A complementary algorithm is one initiated by an underloaded receiver. • When a process finishes, a workstation checks to see if it has enough work. • If not, it begins to randomly probe other machines for work. • If work is not found within N probes, it stops temporarily and begins again after the next process finishes.
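
The receiver's probing loop can be sketched as follows. The `peers` dictionary (machine name to queue length), the `min_work` threshold, and the test for "has work to give away" are assumptions made for this example.

```python
import random


def look_for_work(my_queue_len, min_work, peers, max_probes, rng):
    """Called when a process finishes. Returns a peer to take work from, or None."""
    if my_queue_len >= min_work:
        return None                       # enough work already, no probing needed
    for _ in range(max_probes):           # at most N probes
        peer = rng.choice(list(peers))
        if peers[peer] > min_work:        # hypothetical "has spare work" test
            return peer
    return None                           # give up until the next process finishes
```

Under light load most peers fail the spare-work test, so nearly every call burns all N probes, which is the futile effort this approach is known for.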

Comparison of the 2 Distributed Heuristic Algorithms SENDER INITIATED: • What happens under conditions of heavy load ? • Futile attempts made to offload. RECEIVER INITIATED: • What happens under conditions of light load ? • Futile attempts made to get work.

Comparison of the 2 Distributed Heuristic Algorithms (Contd …) • Suggested improvements: • Try to combine both algorithms. • Avoid random polling by keeping a history of past probes, to identify chronically underloaded or overloaded machines.

A Bidding Algorithm • Proposed by Ferguson et al. (1988). • Follows an economic model. • Buyers and sellers of services, with prices set by supply and demand. • The key players are processes, which buy CPU time, and processors, which auction their cycles off to the highest bidder.

A Bidding Algorithm (Contd …) • Each processor advertises its approximate price by putting it in a publicly readable file. • This price is what the last customer paid. • Different processors have different prices (e.g. according to speed, memory size, or a math co-processor / floating-point hardware). • Services offered may also be published, e.g. expected mean response time.

A Bidding Algorithm (Contd …) • When a process wants to start a child process, it checks which processors offer the services it needs and which of those it can afford. • It then computes the best candidate and makes a bid. • Each processor collects the bids sent to it and executes the process that made the highest bid. • The published price of the processor is then updated to reflect the going rate.
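
One round of this market can be sketched as below. It is a simplified illustration rather than Ferguson et al.'s protocol: the `Processor` class, the tuple format of bids, and the rule that the "best candidate" is the cheapest affordable processor are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    name: str
    price: float                                   # published: what the last customer paid
    services: set = field(default_factory=set)     # advertised capabilities
    bids: list = field(default_factory=list)       # (amount, process name) pairs

    def close_auction(self):
        """Run the highest bidder and publish its bid as the new going rate."""
        if not self.bids:
            return None
        amount, process = max(self.bids)
        self.bids.clear()
        self.price = amount
        return process


def choose_processor(processors, needed, budget):
    """Pick the cheapest affordable processor that offers the needed services."""
    candidates = [
        p for p in processors if needed <= p.services and p.price <= budget
    ]
    return min(candidates, key=lambda p: p.price, default=None)
```

A parent process would call `choose_processor`, append its bid to the chosen processor's `bids`, and learn the outcome when that processor calls `close_auction`.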

A Bidding Algorithm (Contd … ) • Problems and Potential Questions that arise: • Where do processes get money from ? • What is money ? • Is disk space also chargeable ? • How about laser printer output ? and the list goes on.

An Intelligent Agent for Adaptive Processor Allocation in Parallel Databases • [Figure labels: BLOCKED, READY, Query Host, Processor 1, Processor 2, …, Processor n, db1, db2, …, dbn.]

An Intelligent Agent for Adaptive Processor Allocation in Parallel Databases PHASE-BASED ALLOCATION: • The query operations are processed in a number of phases. • A set of operations is chosen and allocated to the processors in such a way that the operations complete at approximately the same time. • When all processors finish, the next phase starts.

An Intelligent Agent for Adaptive Processor Allocation in Parallel Databases NON PHASE-BASED ALLOCATION: • In contrast, this attempts to execute the next awaiting operation as soon as a processor becomes idle. • The awaiting operation(s) is allocated to currently idle processor(s) without waiting for others to complete.

An Intelligent Agent for Adaptive Processor Allocation in Parallel Databases OBSERVATION: • The phase-based algorithm performs better under light load conditions (e.g. load < 0.7). • A cross-over point exists in terms of the load, beyond which the other algorithm performs better. • This point can be determined using supervised learning. • An intelligent agent can then decide which algorithm to use under which load condition.
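
Once a cross-over load is known, the agent's decision reduces to a threshold test. The sketch below hard-codes the 0.7 example value from the observation above as a default; in the approach described, that value would come from a supervised learner rather than a constant, and the function name is illustrative.

```python
def choose_policy(load: float, crossover: float = 0.7) -> str:
    """Select the allocation policy for the current load (0.0 = idle, 1.0 = saturated)."""
    return "phase-based" if load < crossover else "non-phase-based"


print(choose_policy(0.3))   # prints: phase-based
print(choose_policy(0.9))   # prints: non-phase-based
```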

References • A. S. Tanenbaum, Distributed Operating Systems. • Randy Chow & Theodore Johnson, Distributed Operating Systems & Algorithms. • http://ieeexplore.ieee.org • Catherine McCann, Processor Allocation Policies for Message-Passing Parallel Computers, Measurement and Modeling of Computer Systems (1994). • K. H. Lin, Y. Jiang, C. H. C. Leung, An Intelligent Agent for Adaptive Processor Allocation in Parallel Databases, IEEE International Conference on Intelligent Processing Systems (ICIPS '97), 1997.
