Static Process Scheduling CSc8320 Chapter 5.2 Yunmei Lu 2011-10-03
Outline • Definition and Goal • Models • Precedence Process Model • Communication Process Model • Future Work • References
What is Static Process Scheduling (SPS)? • Scheduling a set of partially ordered tasks on a non-preemptive multiprocessor system of identical processors so as to minimize the overall finishing time (makespan) [1]
Implications? • The mapping of processes to processors is determined before any process executes. • Process behavior, execution times, precedence relationships, and communication patterns must be known before execution. • Non-preemptive: once started, a process stays on its processor until it completes.
Goal? • Minimize the overall finishing time (makespan) on a non-preemptive multiprocessor system of identical processors • Find a scheduling algorithm that best balances and overlaps computation and communication
Other Characteristics? • Optimizing the makespan is NP-complete • Approximate or heuristic algorithms are therefore needed… • In the classical definition, inter-processor communication is considered negligible; in a distributed system it is not.
Models? • Precedence Process Model (PPM) • Communication Process Model (CPM)
Precedence Process Model (PPM) • A program is represented by a directed acyclic graph (DAG) (figure a in the following slide). • Precedence constraints among the tasks in a program are explicitly specified. • It is paired with a communication system model showing unit communication delays between processors (figure b in the following slide). • The communication cost between two tasks = the unit communication cost in the communication system graph multiplied by the number of message units on the DAG edge.
Example of DAG • In figure a, each node denotes a task with a known execution time. • An edge represents a precedence relationship between two tasks; the arrow points from a predecessor task to its successor. • The edge label shows the number of message units to be transferred. [Chow and Johnson 1997]
Precedence process and communication system models • The communication cost between A (on p1) and E (on p3) is 4 * 2 = 8. [Chow and Johnson 1997] • Figure b is an example of a communication system model with three processors (p1, p2, p3); the unit communication costs are non-negligible for inter-processor communication and negligible (zero weight on the internal edge) for intra-processor communication.
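The cost rule above can be made concrete with a minimal sketch, assuming the DAG message units and the processor unit costs are kept in plain dictionaries; the task names, processor names, and numbers below are illustrative only.

```python
# Communication cost between two tasks = unit inter-processor cost * message units.
# The dictionaries mirror the slide's example: A sends 2 message units to E,
# and the unit cost between p1 and p3 is 4.
message_units = {("A", "E"): 2}           # DAG edge labels (message units)
unit_cost = {("p1", "p3"): 4}             # communication system graph (unit costs)

def comm_cost(src_task, dst_task, src_proc, dst_proc):
    if src_proc == dst_proc:
        return 0                          # intra-processor communication is negligible
    units = message_units.get((src_task, dst_task), 0)
    return unit_cost.get((src_proc, dst_proc), 0) * units

print(comm_cost("A", "E", "p1", "p3"))    # 4 * 2 = 8, as in the slide
```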
Precedence Process Model • Algorithms: • List Scheduling (LS): a simple greedy heuristic: no processor remains idle if there is a ready task it could process; communication is not considered (a sketch follows below). • Extended List Scheduling (ELS): the actual schedule produced by LS when communication delays are taken into account. • Earliest Task First scheduling (ETF): the task that becomes schedulable earliest (with communication delay considered) is scheduled first. [Chow and Johnson 1997]
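A minimal sketch of List Scheduling on identical processors, ignoring communication as in the classical formulation; the task set, execution times, and precedence edges are assumed for illustration and do not come from the figures.

```python
import heapq

exec_time = {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2}                  # assumed values
preds = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"], "E": ["C"]}   # assumed DAG

def list_schedule(exec_time, preds, num_procs=2):
    done = {}                                    # task -> finish time
    procs = [(0, p) for p in range(num_procs)]   # (time the processor becomes free, id)
    heapq.heapify(procs)
    remaining = set(preds)
    schedule = []
    while remaining:
        # a task is ready once all of its predecessors have finished
        ready = [t for t in remaining if all(p in done for p in preds[t])]
        free_at, proc = heapq.heappop(procs)     # next processor to become free
        # greedy choice: the ready task whose predecessors finish earliest
        task = min(ready, key=lambda t: max((done[p] for p in preds[t]), default=0))
        start = max(free_at, max((done[p] for p in preds[task]), default=0))
        finish = start + exec_time[task]
        done[task] = finish
        schedule.append((task, proc, start, finish))
        heapq.heappush(procs, (finish, proc))
        remaining.remove(task)
    return schedule, max(done.values())          # the schedule and its makespan

sched, makespan = list_schedule(exec_time, preds)
print(sched)
print("makespan:", makespan)
```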
Algorithms (critical path) • The critical path is the longest execution path in the DAG. [Chow and Johnson 1997] • Dashed lines represent waiting for communication.
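A small sketch of how the critical-path length can be computed on a DAG, reusing the same illustrative task data as the previous sketch.

```python
# Length of the critical path = longest chain of execution times through the DAG.
from functools import lru_cache

exec_time = {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2}                  # assumed values
preds = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"], "E": ["C"]}   # assumed DAG

@lru_cache(maxsize=None)
def longest_finish(task):
    # earliest finish time of `task` if it only ever waits for its predecessors
    return exec_time[task] + max((longest_finish(p) for p in preds[task]), default=0)

print(max(longest_finish(t) for t in exec_time))   # 12: the path A -> C -> E
```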
Communication Process Model (CPM) • Modeled by an undirected graph G: nodes represent processes and the weight on an edge is the amount of communication between the two connected processes. • There are no precedence constraints among processes. • Processors are not identical (they differ in speed and hardware). • Scheduling goal: maximize resource utilization and minimize inter-process communication. [Chow and Johnson 1997]
Communication Process Model • The problem is to find an optimal assignment of m processes to the processors in P with respect to a target cost function (the Module Allocation problem): • P: the set of processors. • ej(Pi): computation cost of executing process pj on processor Pi. • ci,j: communication overhead between processes pi and pj (incurred when they are assigned to different processors). • A uniform communication speed between processors is assumed. [Chow and Johnson 1997]
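The slide does not reproduce the target function itself; a sketch in the notation above, following the standard module-allocation formulation (to be checked against the textbook), is:

```latex
% Cost of an assignment A (process -> processor): total computation cost on the
% assigned processors plus communication overhead for every pair of processes
% placed on different processors. Minimize Cost(A) over all assignments.
\[
  \mathrm{Cost}(A) \;=\; \sum_{j=1}^{m} e_j\bigl(A(p_j)\bigr)
  \;+\; \sum_{\{i,j\}:\; A(p_i)\neq A(p_j)} c_{i,j}
\]
```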
Communication Process Model • Stone's two-processor model, used to achieve the minimum total execution and communication cost. [Chow and Johnson 1997] • Figure (a) shows the execution time of each process on either processor; figure (b) shows the inter-process communication costs.
How to map processes to processors? • Partition the graph by drawing a line that cuts through some edges. • This results in two disjoint subgraphs, one per processor. • The set of removed edges is the cut set. • The cost of the cut set is the sum of the weights of its edges, which is the total inter-processor communication cost. • The cut cost is 0 if all processes are assigned to the same processor, but such an assignment defeats the purpose. • Computation constraints are therefore added (no more than k processes per processor, distribute the load evenly, …). [Chow and Johnson 1997]
How to map processes to processors? • Minimum-cost cut: the weight assigned to the edge between processor node A and process i is the cost of executing process i on B, so cutting that edge (i.e. placing i on B) pays exactly that cost. [Chow and Johnson 1997]
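As a rough illustration of this construction, here is a sketch that builds the two-terminal graph and asks networkx for a minimum A-B cut; all execution and communication costs are made-up numbers, not the values in the textbook figure.

```python
import networkx as nx

exec_cost = {                        # exec_cost[process] = {processor: execution cost}
    1: {"A": 5, "B": 10},
    2: {"A": 2, "B": 4},
    3: {"A": 9, "B": 2},
}
comm_cost = {(1, 2): 6, (2, 3): 2}   # inter-process communication weights

G = nx.DiGraph()
def undirected_edge(u, v, w):
    # model an undirected weighted edge as two arcs with equal capacity
    G.add_edge(u, v, capacity=w)
    G.add_edge(v, u, capacity=w)

for p, costs in exec_cost.items():
    # Stone's construction: the edge between processor node A and process p carries
    # p's execution cost on B (and vice versa), so cutting it, i.e. placing p on the
    # other processor's side, pays exactly that cost.
    undirected_edge("A", p, costs["B"])
    undirected_edge("B", p, costs["A"])
for (i, j), w in comm_cost.items():
    undirected_edge(i, j, w)

cut_value, (a_side, b_side) = nx.minimum_cut(G, "A", "B")
print("total execution + communication cost:", cut_value)
print("assigned to A:", sorted(p for p in a_side if p != "A"))
print("assigned to B:", sorted(p for p in b_side if p != "B"))
# For these numbers, processes 1 and 2 end up on A and process 3 on B, at cost 11.
```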
Extension of Stone's two-processor model • To generalize the problem beyond two processors, Stone uses a repetitive approach based on the two-processor algorithm to solve n-processor problems. • Treat (n-1) processors as one super-processor. • The processors inside the super-processor are then broken down further, based on the results of the previous step. [Chow and Johnson 1997]
Problems? • Too complex • The objectives of minimizing computation cost and minimizing communication cost often conflict • Therefore, heuristic solutions are used instead
Some heuristic solutions [Chow and Johnson 1997]
Problem and Solution • Merging processes eliminates inter-processor communication but may impose a higher computation burden on a single processor and thus reduce concurrency. • Solution (a sketch follows below): • Merge only processes whose communication cost is higher than a certain threshold C. • Constrain the number of processes in a cluster, e.g. the total execution cost of the processes in a single cluster cannot exceed another threshold X. [Chow and Johnson 1997]
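A minimal sketch of this threshold-based clustering, with assumed execution costs, communication weights, and thresholds chosen so that the outcome matches the three clusters discussed on the next slide:

```python
# Merge processes whose pairwise communication cost exceeds C, but refuse any merge
# that would push a cluster's total execution cost past X. All numbers are assumed.
exec_cost = {1: 5, 2: 4, 3: 6, 4: 3, 5: 2, 6: 7}
comm = {(1, 6): 12, (3, 5): 11, (2, 4): 10, (1, 2): 4}
C, X = 9, 15

clusters = {p: {p} for p in exec_cost}                      # every process starts alone
for (i, j), cost in sorted(comm.items(), key=lambda kv: -kv[1]):   # heaviest edges first
    if cost <= C or clusters[i] is clusters[j]:
        continue
    merged = clusters[i] | clusters[j]
    if sum(exec_cost[p] for p in merged) > X:               # cluster-size constraint
        continue
    for p in merged:
        clusters[p] = merged

print(sorted(sorted(c) for c in {frozenset(c) for c in clusters.values()}))
# -> [[1, 6], [2, 4], [3, 5]] for these assumed values
```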
Cluster of processes • For C = 9, we get three clusters: (2,4), (1,6), and (3,5). • Clusters (2,4) and (1,6) must be mapped to processors A and B. • Cluster (3,5) can be assigned to either A or B, depending on whether the goal is to minimize computation cost or communication cost. • Assigning (3,5) to A has a lower communication cost but a higher computation cost. • If we assign (3,5) to A, the total cost = 41 (computation cost = 17 on A and 14 on B; communication cost = 6 + 4 = 10). [Chow and Johnson 1997]
Summary of static process scheduling • Non-preemptive: once a process is assigned to a processor, it remains there until its execution is complete. • Prior knowledge of execution times and communication behavior is required. • The scheduling decision is centralized and non-adaptive. • Not very effective • Not realistic • Finding the optimal solution is NP-hard, so heuristic algorithms are used in practice.
Future work • With the advances in processor and networking hardware, parallel processing can be carried out on a wide spectrum of platforms. This diversity of platforms makes the scheduling problem even more complex and challenging. • Designing scheduling algorithms for efficient parallel processing should take the following aspects into account:
Cont… • Performance: the scheduling algorithm should produce high-quality solutions. • Time complexity: important insofar as solution quality is not compromised; a fast algorithm is needed to find good solutions efficiently. • Scalability: the algorithm must consistently give good performance even for large inputs; given more processors, it should produce solutions of comparable quality in a shorter time. • Applicability: it must be usable in practical environments, so it should make realistic assumptions about the program and multiprocessor models, such as arbitrary computation and communication weights.
Cont… • The goals above are conflicting and thus pose a number of challenges to researchers. • To meet these challenges, several newer ideas have been explored: • Genetic algorithms • Randomization approaches • Parallelization techniques • Extending DAG scheduling to heterogeneous computing platforms
References • Randy Chow and Theodore Johnson, "Distributed Operating Systems & Algorithms", Addison-Wesley, pp. 156-163. • Yu-Kwong Kwok and Ishfaq Ahmad, "Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors", ACM Computing Surveys, December 1999. • Sachi Gupta, Gaurav Agarwal, and Vikas Kumar, "Task Scheduling in Multiprocessor System Using Genetic Algorithm", DOI 10.1109/ICMLC.2010.50. • Hongze Qiu, Wanli Zhou, and Hailong Wang, "A Genetic Algorithm-based Approach to Flexible Job-shop Scheduling Problem", DOI 10.1109/ICNC.2009.609. • Xueyan Tang and Samuel T. Chanson, "Optimizing Static Job Scheduling in a Network of Heterogeneous Computers", ICPP, IEEE, 2000, pp. 373-382.