Static Process Scheduling Yi Sun
Overview
• Before execution, processes need to be scheduled and allocated resources
• Objective
  • Enhance an overall system performance metric
    • Process completion time and processor utilization
    • In distributed systems: location and performance transparency
• In distributed systems
  • Local scheduling (on each node) + global scheduling
  • Communication overhead
  • Effect of the underlying architecture
  • Dynamic behavior of the system
Process Interaction Models
• Precedence process model: Directed Acyclic Graph (DAG)
  • Represents precedence relationships between processes
  • Minimize the total completion time of the task (computation + communication)
• Communication process model
  • Represents the need for communication between processes
  • Optimize the total cost of communication and computation
• Disjoint process model
  • Processes can run independently and complete in finite time
  • Maximize utilization of processors and minimize turnaround time of processes
Process Models
[Figure: partitioning 4 processes onto two nodes; label: communication overhead]
System Performance Model
• Attempt to minimize the total completion time (makespan) of a set of interacting processes
System Performance Model (Cont.)
• Related parameters
  • OSPT: optimal sequential processing time; the best time that can be achieved on a single processor using the best sequential algorithm
  • CPT: concurrent processing time; the actual time achieved on an n-processor system with the concurrent algorithm and the specific scheduling method being considered
  • OCPTideal: optimal concurrent processing time on an ideal system; the best time that can be achieved with the concurrent algorithm being considered on an ideal n-processor system (no inter-processor communication overhead) scheduled by an optimal scheduling policy
  • Si: the ideal speedup from using a multiple-processor system over the best sequential time
  • Sd: the degradation of the system due to actual implementation compared to an ideal system
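Read together, the parameter definitions above suggest the following factorization of the overall speedup. This is a sketch: the symbol S for the overall speedup is an assumption introduced here, and the ratios for Si and Sd are inferred from their verbal definitions.

```latex
S \;=\; \frac{\mathrm{OSPT}}{\mathrm{CPT}}
  \;=\; \underbrace{\frac{\mathrm{OSPT}}{\mathrm{OCPT}_{\mathrm{ideal}}}}_{S_i}
  \times
  \underbrace{\frac{\mathrm{OCPT}_{\mathrm{ideal}}}{\mathrm{CPT}}}_{S_d}
  \;=\; S_i \times S_d
```

Si captures what the algorithm could achieve on an ideal machine, and Sd (at most 1) captures what is lost on the real system.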
System Performance Model (Cont.)
[Figure: processes P1 to P4, with OCPTideal marked]
• Pi: the computation time of the concurrent algorithm on node i
• RP ≥ 1
System Performance Model (Cont.)
• RP: the smaller, the better
• RC: the larger, the better
System Performance Model (Cont.)
• RP: Relative processing (algorithm)
  • How much speedup is lost by replacing the best sequential algorithm with one better adapted to concurrent implementation, which may have a greater total processing need
  • Loss of parallelism due to algorithm conversion
  • Increase in total computation requirement
• Sd
  • Degradation of parallelism due to algorithm implementation
• RC: Relative concurrency (algorithm?)
  • How far from optimal the usage of the n processors is
  • RC = 1: best use of the processors
  • Theoretical loss of parallelism
• ρ: loss of parallelism when implemented on a real machine (system architecture + scheduling)
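A small worked example may help tie RP, RC, and the efficiency loss together. The sketch below assumes the usual definitions from the referenced textbook, which are not spelled out on the slides: RP = ΣPi / OSPT, RC = ΣPi / (n × OCPTideal), ρ = (CPT − OCPTideal) / OCPTideal, and S = (RC / RP) × (1 / (1 + ρ)) × n. The function name and the sample numbers are hypothetical.

```python
def speedup_factors(p, ospt, ocpt_ideal, cpt):
    """Return (RP, RC, rho, S) for per-node computation times p.

    Assumed definitions (not stated explicitly on the slides):
      RP  = sum(Pi) / OSPT
      RC  = sum(Pi) / (n * OCPTideal)
      rho = (CPT - OCPTideal) / OCPTideal
      S   = (RC / RP) * (1 / (1 + rho)) * n
    """
    n = len(p)                      # number of nodes
    total = sum(p)                  # total work of the concurrent algorithm
    rp = total / ospt               # relative processing (>= 1)
    rc = total / (ocpt_ideal * n)   # relative concurrency (<= 1)
    rho = (cpt - ocpt_ideal) / ocpt_ideal  # efficiency loss on the real system
    s = (rc / rp) * (1.0 / (1.0 + rho)) * n
    return rp, rc, rho, s


# Hypothetical numbers: four nodes, best sequential time 10,
# ideal concurrent time 4, measured concurrent time 5.
rp, rc, rho, s = speedup_factors(p=[4, 3, 3, 2], ospt=10, ocpt_ideal=4, cpt=5)
print(rp, rc, rho, s)
```

Under these definitions the product collapses back to OSPT / CPT, so the factors only explain where the speedup is lost; they do not change its value.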
Efficiency Loss
• Impact factors: scheduling, system, and communication
Workload Distribution
• Performance can be further improved by workload distribution
• Load sharing: static workload distribution
  • Dispatch processes to idle processors statically upon arrival
  • Corresponds to the processor-pool model
• Load balancing: dynamic workload distribution
  • Migrate processes dynamically from heavily loaded processors to lightly loaded processors
  • Corresponds to the migration workstation model
• Modeled by queueing theory: X/Y/c
  • Process arrival time distribution: X; service time distribution: Y; number of servers: c
  • λ: arrival rate; μ: service rate; γ: migration rate
  • γ depends on channel bandwidth, migration protocol, and the context and state information of the process being transferred
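As a rough illustration of the X/Y/c notation, the sketch below compares the mean turnaround time of two isolated M/M/1 workstations with that of a pooled M/M/2 queue, using the standard M/M/1 and Erlang C formulas. The function names and the rates are illustrative assumptions, and treating the two cases as the no-migration and negligible-overhead extremes is a simplification rather than the slides' own analysis.

```python
from math import factorial


def mm1_turnaround(lam, mu):
    """Mean time in system for an M/M/1 queue (arrival rate lam, service rate mu)."""
    if lam >= mu:
        raise ValueError("queue is unstable: need lam < mu")
    return 1.0 / (mu - lam)


def mmc_turnaround(lam, mu, c):
    """Mean time in system for an M/M/c queue, via the Erlang C formula."""
    a = lam / mu          # offered load
    rho = a / c           # per-server utilization
    if rho >= 1:
        raise ValueError("queue is unstable: need lam < c * mu")
    tail = a ** c / factorial(c) / (1 - rho)
    p_wait = tail / (sum(a ** k / factorial(k) for k in range(c)) + tail)
    return p_wait / (c * mu - lam) + 1.0 / mu   # mean wait + mean service time


# Hypothetical rates: each workstation sees lam = 0.5 and serves at mu = 1.0.
isolated = mm1_turnaround(0.5, 1.0)        # two separate M/M/1 queues
pooled = mmc_turnaround(2 * 0.5, 1.0, 2)   # one shared M/M/2 queue
print(isolated, pooled)
```

With these rates the pooled queue has the shorter mean turnaround (about 1.33 versus 2.0 time units), which is the intuition behind sharing load when transfer overhead is small.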
Processor-Pool and Workstation Queueing Models
[Figure: queueing models for static load sharing and dynamic load balancing]
• M stands for a Markovian distribution
Comparison of Performance for Workload Sharing
[Figure: performance comparison; annotations: communication overhead, negligible communication overhead]
Static Process Scheduling
• Static process scheduling: deterministic scheduling policy
  • Schedule a set of partially ordered tasks on a non-preemptive multiprocessor system of identical processors to minimize the overall finishing time (makespan)
  • Optimizing the makespan is NP-complete
  • Need approximate or heuristic algorithms
  • Attempt to balance and overlap computation and communication
• The mapping of processes to processors is determined before execution
• Once a process starts, it stays on its processor until completion
• Needs prior knowledge about process behavior (execution time, precedence relationships, communication patterns)
• The scheduling decision is centralized and non-adaptive
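Precedence Process and Communication System Models
[Figure: precedence process model and communication system model; labels: execution time, number of messages to communicate, communication overhead for one message]
• Communication overhead for A (on P1) and E (on P3) = 4 × 2 = 8, the product of the number of messages to communicate and the communication overhead for one message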
Precedence Process Model
• Precedence process model: NP-complete
  • A program is represented by a DAG (Figure 5.5 (a))
    • Node: task with a known execution time
    • Edge: weight showing the message units to be transferred
  • Communication system model: Figure 5.5 (b)
• Scheduling strategies
  • List Scheduling (LS): no processor remains idle if there is an available task that it could process (no communication overhead)
  • Extended List Scheduling (ELS): LS first, then add communication overhead
  • Earliest Task First (ETF) scheduling: the earliest schedulable task is scheduled first
• Critical path: longest execution path
  • Lower bound of the makespan
  • Try to map all tasks in a critical path onto a single processor
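The strategies above can be sketched with a small greedy scheduler. This is not the textbook's algorithm or its Figure 5.5 data: the task graph is made up, the function names are hypothetical, and a single uniform per-message overhead (msg_cost) stands in for the per-link overheads of the communication system model. With msg_cost = 0 it behaves like list scheduling; with msg_cost > 0 it charges communication delay when choosing the earliest start, in the spirit of ETF.

```python
def critical_path(exec_time, edges):
    """Length of the longest execution path through the DAG (a makespan lower bound)."""
    preds = {t: [] for t in exec_time}
    for u, v in edges:
        preds[v].append(u)
    longest = {}

    def finish(t):
        # Longest path ending at t, memoized.
        if t not in longest:
            longest[t] = exec_time[t] + max((finish(p) for p in preds[t]), default=0)
        return longest[t]

    return max(finish(t) for t in exec_time)


def greedy_schedule(exec_time, edges, n_procs, msg_cost=0):
    """Place tasks one at a time at the earliest possible start.

    edges maps (u, v) to the number of message units sent from u to v;
    msg_cost is the assumed overhead per unit between different processors.
    Ties are broken by the order of exec_time, which acts as the priority list.
    """
    preds = {t: [] for t in exec_time}
    for u, v in edges:
        preds[v].append(u)
    proc_free = [0] * n_procs
    placed = {}  # task -> (processor, start, finish)
    while len(placed) < len(exec_time):
        best = None
        for t in exec_time:
            if t in placed or any(p not in placed for p in preds[t]):
                continue  # already scheduled, or a predecessor is still unscheduled
            for q in range(n_procs):
                ready = max(
                    (placed[p][2] + (edges[(p, t)] * msg_cost if placed[p][0] != q else 0)
                     for p in preds[t]),
                    default=0,
                )
                start = max(proc_free[q], ready)
                if best is None or start < best[0]:
                    best = (start, t, q)
        start, t, q = best
        placed[t] = (q, start, start + exec_time[t])
        proc_free[q] = start + exec_time[t]
    return placed, max(f for _, _, f in placed.values())


# Hypothetical four-task DAG: A precedes B and C, which both precede D.
exec_time = {"A": 2, "B": 3, "C": 2, "D": 4}
edges = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 1, ("C", "D"): 1}

bound = critical_path(exec_time, edges)
_, makespan_free = greedy_schedule(exec_time, edges, n_procs=2)
_, makespan_comm = greedy_schedule(exec_time, edges, n_procs=2, msg_cost=2)
print(bound, makespan_free, makespan_comm)
```

In this toy case the schedule without communication cost meets the critical-path bound, while a per-message overhead of 2 makes the scheduler keep every task on one processor, which illustrates why balancing computation against communication matters.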
Communication Process Model
• Communication process model
  • Maximize resource utilization and minimize inter-process communication
  • Undirected graph G = (V, E)
    • V: processes
    • E: weight = amount of interaction between processes
  • Cost equation
    • e = process execution cost (cost to run process j on processor i)
    • C = communication cost (C = 0 if i = j, that is, when both processes are on the same processor)
  • Again, NP-complete
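To make the cost equation concrete, the sketch below sums execution cost plus communication cost for one assignment and then searches all assignments by brute force. It is an illustration under assumptions: the function names and the three-process data are hypothetical, and the communication cost is taken to be the same for any pair of distinct processors. The exhaustive search grows exponentially with the number of processes, which is consistent with the problem being NP-complete in general.

```python
from itertools import product


def assignment_cost(exec_cost, comm_cost, assign):
    """Total cost of an assignment (process -> processor index).

    exec_cost[j][p]  : cost to run process j on processor p
    comm_cost[(i, j)]: cost paid only if i and j sit on different processors
    """
    run = sum(exec_cost[j][assign[j]] for j in assign)
    talk = sum(c for (i, j), c in comm_cost.items() if assign[i] != assign[j])
    return run + talk


def best_assignment(exec_cost, comm_cost, n_procs):
    """Exhaustively try every mapping; only feasible for very small inputs."""
    names = list(exec_cost)
    candidates = (dict(zip(names, combo))
                  for combo in product(range(n_procs), repeat=len(names)))
    best = min(candidates, key=lambda a: assignment_cost(exec_cost, comm_cost, a))
    return best, assignment_cost(exec_cost, comm_cost, best)


# Hypothetical data: three processes, two processors.
exec_cost = {"a": [3, 9], "b": [7, 2], "c": [6, 4]}
comm_cost = {("a", "b"): 4, ("b", "c"): 1}

best, cost = best_assignment(exec_cost, comm_cost, n_procs=2)
print(best, cost)
```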
Stone’s two-processor model to achieve minimum total execution and communication cost
• Example: Figure 5.7 (execution cost not considered)
  • Partition the graph by drawing a line cutting through some edges
  • Results in two disjoint graphs, one for each processor
  • Set of removed edges: cut set
  • Cost of cut set: sum of the weights of the edges
    • Total inter-process communication cost between processors
  • Of course, the cost of the cut set is 0 if all processes are assigned to the same node
  • Computation constraints (no more than k, distribute evenly, …)
• Example: Figure 5.8 (execution cost considered)
  • Maximum flow and minimum cut in a commodity-flow network
  • Find the maximum flow from source to destination
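The max-flow/min-cut idea can be sketched as follows. This is one common way to build the commodity-flow network, assumed here rather than taken from Figure 5.8: processor A is the source, processor B is the sink, the edge from A to a process carries that process's execution cost on B, the edge from a process to B carries its execution cost on A, and communicating processes are joined by edges weighted with their communication cost. The maximum flow is found with a plain Edmonds-Karp search; the function name, node labels, and data are hypothetical.

```python
from collections import defaultdict, deque


def min_cut_assignment(exec_a, exec_b, comm):
    """Return (minimum total cost, set of processes assigned to processor A)."""
    S, T = "_A", "_B"  # assumed labels for the two processor nodes
    cap = defaultdict(lambda: defaultdict(int))
    for p in exec_a:
        cap[S][p] += exec_b[p]   # cut when p ends up on B's side: pay cost on B
        cap[p][T] += exec_a[p]   # cut when p ends up on A's side: pay cost on A
    for (u, v), w in comm.items():
        cap[u][v] += w           # undirected communication edge,
        cap[v][u] += w           # modeled as two directed edges
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = {S: None}
        queue = deque([S])
        while queue and T not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if T not in parent:
            break  # no augmenting path left: the flow is maximal
        path = []
        v = T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        flow += push
    # Processes still reachable from the source lie on A's side of the minimum cut.
    on_a = {p for p in exec_a if p in parent}
    return flow, on_a


# Hypothetical data: execution costs on A and on B, plus communication costs.
exec_a = {"a": 3, "b": 7, "c": 6}
exec_b = {"a": 9, "b": 2, "c": 4}
comm = {("a", "b"): 4, ("b", "c"): 1}

total_cost, on_a = min_cut_assignment(exec_a, exec_b, comm)
print(total_cost, sorted(on_a))
```

Unlike exhaustive search, this runs in polynomial time, but the construction is specific to two processors.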
Minimum-Cost Cut
• Only the cuts that separate A and B are feasible
Discussion – Static Process Scheduling
• Once a process is assigned to a processor, it remains there until its execution has completed
• Needs prior knowledge of execution time and communication behavior
• Not realistic
Reference
• Randy Chow and Theodore Johnson, Distributed Operating Systems & Algorithms, 1997