An analysis of a climate forecast application's needs and the development, testing, and comparison of scheduling heuristics for efficient execution. Provides generic scheduling schemes for similar applications.
Scheduling for a Climate Forecast Application
Andreea Chis, under the guidance of Frédéric Desprez and Eddy Caron
ANR-05-CIGC-11
Contents
1. Introduction
2. Related Works
3. Scheduling Heuristics
4. Simulation Results
5. Conclusions and Future Works
1. Introduction
General Purpose
• Context: global warming and climate fluctuations
• Numerical simulations using general circulation models of a climate system
  • atmosphere
  • ocean
  • continental surfaces
• Climatologists’ purpose
  • estimate the sensitivity of global warming simulations to the model’s parameterization
• Climate forecast application provided by CERFACS within the LEGO project
Our Goal
• Analyze the application
• Model its needs
  • Execution model
  • Data access pattern
  • Computing needs
• Elaborate, test and compare appropriate scheduling heuristics
• Provide generic scheduling schemes for applications with similar dependence graphs
Application Description
• “Scenario” simulations
  • current climate followed by the 21st century, for 150 years (1800 months)
  • different parameterization of the atmospheric model
Application Description
• One monthly simulation:
  • concatenate_atmospheric_input_files(1)
  • modify_parameters(1)
  • process_coupled_run
    • atmospheric model (ARPEGE)
    • ocean and sea-ice model (OPA)
    • runoff pathway (TRIP)
    • coupler (OASIS)
  • convert_output_format(60)
  • compress_diagonals(30)
  • extract_minimun_information(30)
2. Related Works
Related Works
• Multiple DAGs Scheduling
• Mixed Parallelism
• Pipelined Data Parallel Tasks
Multiple DAGs Scheduling
• Directed Acyclic Graph (DAG)
  • Nodes: tasks
  • Edges: precedence constraints
• Multiple DAGs Scheduling
Multiple DAGs Scheduling
• Composite DAG
Multiple DAGs Scheduling
• Group the DAGs’ tasks into levels of independent tasks
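The level-grouping idea can be illustrated with a minimal sketch. It assumes each task's level is the length of the longest path from an entry task, which is one common choice and not necessarily the one used in the cited works; the function name `group_into_levels` and the two small DAGs are hypothetical.

```python
from collections import defaultdict


def group_into_levels(tasks, edges):
    """Return a list of levels; tasks in the same level are mutually independent."""
    preds = defaultdict(set)
    for src, dst in edges:
        preds[dst].add(src)

    level = {}

    def depth(task):
        # Level = length of the longest path from any entry task (entries are level 0).
        if task not in level:
            level[task] = 1 + max((depth(p) for p in preds[task]), default=-1)
        return level[task]

    levels = defaultdict(list)
    for task in tasks:
        levels[depth(task)].append(task)
    return [levels[i] for i in range(len(levels))]


# Two hypothetical DAGs scheduled together: A is a diamond, B is a two-task chain.
tasks = ["a1", "a2", "a3", "a4", "b1", "b2"]
edges = [("a1", "a2"), ("a1", "a3"), ("a2", "a4"), ("a3", "a4"), ("b1", "b2")]
levels = group_into_levels(tasks, edges)
print(levels)  # [['a1', 'b1'], ['a2', 'a3', 'b2'], ['a4']]
```

Two tasks with the same longest-path depth cannot be connected by a path, so every level can be handed to a scheduler for independent tasks.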
Related Works: Multiple DAGs Scheduling
• Composite DAG and round-robin scheduling policy among DAGs
• Composite DAG and ranking-based composition
Mixed Parallelism
• Parallel scientific application
  • Data parallelism
  • Task parallelism
  • Mixed parallelism
• Scheduling a DAG on a finite number of resources is NP-complete, even for the simple case of mono-processor tasks
• Heuristic approaches
Mixed Parallelism
• A. Radulescu & A. van Gemund (2001): two-step heuristic, CPA (Critical Path and Area-based Scheduling)
  • Processor allocation to tasks: based on a compromise between the critical path length and the processor utilization
  • Task allocation on processors: list scheduling heuristic
Pipelined Data Parallel Tasks
• Computations consisting of a chain of data-parallel tasks that process successive data sets in a pipelined fashion (a particular case of mixed parallelism)
• Two key metrics to be optimized:
  • Latency: time needed to process one data set
  • Throughput: rate at which data sets can be processed
Related Works: Pipelined Data Parallel Tasks
• Aspects to be considered:
  • Clustering of successive stages into modules
    • Reduces communications
    • Improves latency
  • Replicating modules
    • Improves throughput
    • Increases latency
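The two trade-offs can be made concrete with a small numeric sketch. All names and numbers are hypothetical: `time_on` is an assumed table of measured times for one module on a given number of processors, and clustering is assumed to remove the transfer between the merged stages without changing their compute times.

```python
def chain_latency(stage_times, comm_times):
    """Latency of one data set through a chain: compute time plus inter-stage transfers."""
    return sum(stage_times) + sum(comm_times)


def replicated_module(time_on, processors, replicas):
    """Split `processors` evenly among `replicas` copies of a module.

    Returns (latency of one data set, throughput in data sets per time unit).
    """
    per_replica = processors // replicas
    t = time_on[per_replica]
    return t, replicas / t


# Clustering: merging two stages removes the transfer between them (assumption).
separate = chain_latency([40.0, 60.0], [15.0])
clustered = chain_latency([40.0, 60.0], [0.0])

# Replication: hypothetical sub-linear speedup (4 processors: 165, 8 processors: 110).
time_on = {4: 165.0, 8: 110.0}
lat1, thr1 = replicated_module(time_on, processors=8, replicas=1)
lat2, thr2 = replicated_module(time_on, processors=8, replicas=2)

print(separate, clustered)  # 115.0 100.0
print(lat1, lat2)           # 110.0 165.0
print(thr1 < thr2)          # True
```

With perfectly linear speedup, replication would not change throughput; the gain in this sketch comes entirely from the assumed sub-linear speedup of the module.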
3. Scheduling Heuristics
Scheduling Heuristics
• Climate Application Scheduling
• Generic Scheduling Heuristics
Climate Application Scheduling
• Homogeneous platform composed of R resources
• Communication through NFS is assumed to be contention-free
• A task’s execution time is assumed to include the time needed to
  • access the data
  • redistribute it to processors
  • perform the actual computation
  • store the data back
Climate Application Scheduling
[Figure: the monthly task chain concatenate_atmosferic_input_files(1), modify_parameters(1), process_coupled_run, convert_output_format(60), compress_diagonals(30), extract_minimun_information(30), with the tasks labeled "Main processing" and "Post processing"]
Climate Application Scheduling
• We divide the processors into disjoint sets on which multi-processor tasks can execute
• All multi-processor tasks execute on the same number of resources G, which defines a grouping of the resources
• For the given application, G can take 8 possible values (4 → 11)
Climate Application Scheduling
• Case 1
• Case 2
Climate Application Scheduling
• The makespan is computed analytically as a function of
  • the number of resources (R)
  • the grouping (G)
  • the number of months in an independent simulation (NM)
  • the number of independent simulations (NS)
• The grouping G yielding the smallest makespan is chosen
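The selection step can be sketched as a brute-force search over the candidate groupings. The analytical makespan formula is not given in these slides, so the function below is a simplified stand-in that ignores post-processing: it assumes `min(NS, R // G)` groups work in rounds of one monthly task each. The names `simplified_makespan` and `best_grouping`, and the time table `T`, are hypothetical.

```python
import math


def simplified_makespan(R, G, NM, NS, T):
    """Rough makespan for grouping G, main-processing tasks only (assumption)."""
    groups = min(NS, R // G)  # more groups than simulations would sit idle
    if groups == 0:
        return math.inf
    # NS * NM monthly tasks, `groups` of them per round, each round lasting T[G].
    rounds = math.ceil(NS * NM / groups)
    return rounds * T[G]


def best_grouping(R, NM, NS, T):
    """Return the grouping G (a key of T) with the smallest estimated makespan."""
    return min(T, key=lambda G: simplified_makespan(R, G, NM, NS, T))


# Made-up execution times of one multi-processor task on G = 4..11 resources.
T = {G: 100 + 3600 / G for G in range(4, 12)}

G_star = best_grouping(R=20, NM=4, NS=3, T=T)
print(G_star, simplified_makespan(20, G_star, 4, 3, T))  # 10 2760.0
```

Because at most NS chains can progress at once, the round count stays valid as long as the tasks of each round come from different simulations, which a round-robin assignment over the simulations guarantees.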
Climate Application Scheduling
• The constraint of scheduling all multi-processor tasks on the same number of resources is restrictive
• E.g., R = 53, NS = 10, NM = 1800
  • optimal grouping found: G = 7
  • 49 resources for main processing
  • 1 resource for the corresponding post-processing
  • 3 resources unused
• However, using 3 groups of 8 resources and 4 groups of 7 resources yields a 4.5% gain
Climate Application Scheduling
• Possibilities for improvement:
  • Heuristic 1
    • distribute the unused resources evenly among the existing groups
  • Heuristic 2
    • use all resources for multi-processor tasks (distributing the extra resources evenly among the processor groups)
    • all post-processing at the end
  • Heuristic 3
    • use all resources for multi-processor tasks and model the problem as an instance of the knapsack problem
    • all post-processing at the end
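Heuristic 1 can be sketched as a round-robin hand-out of the leftover resources. This is a minimal illustration, assuming one resource is reserved for post-processing (as in the R = 53 example) and that no group may exceed 11 resources; the function name `distribute_unused` and its parameters are hypothetical.

```python
def distribute_unused(R, G, NS, post_resources=1, max_group=11):
    """Start from uniform groups of size G, then spread leftover resources evenly.

    Returns (list of group sizes, number of resources still unused).
    """
    groups = min(NS, (R - post_resources) // G)
    sizes = [G] * groups
    unused = R - post_resources - G * groups
    i = 0
    # Hand out one resource at a time, cycling over the groups.
    while unused > 0 and any(s < max_group for s in sizes):
        if sizes[i % groups] < max_group:
            sizes[i % groups] += 1
            unused -= 1
        i += 1
    return sizes, unused


sizes, unused = distribute_unused(R=53, G=7, NS=10)
print(sizes, unused)  # [8, 8, 8, 7, 7, 7, 7] 0
```

For R = 53 and G = 7 this reproduces the allocation quoted for the example: 3 groups of 8 resources and 4 groups of 7.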
Climate Application Scheduling
• Knapsack problem formulation
  • Items: the 8 possibilities (groupings of resources) for allocating processors to multi-processor tasks (4 → 11)
  • Cost of an item: the number of resources of that grouping
  • Value of a grouping G: 1/T[G], the fraction of a multi-processor task that gets executed in a time unit on G resources
  • Unknowns n_i (i = 4 → 11): the number of groups with i resources in the final solution
  • Constraints
  • Goal: maximize
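The constraint and objective formulas are not preserved in this transcript. One reading consistent with the definitions above is: maximize the sum of n_i / T[i] subject to the sum of i · n_i being at most R. The sketch below also assumes that the number of groups cannot exceed NS, since extra groups would have no simulation to work on; that second constraint, the function name `knapsack_groups`, and the time table `T` are assumptions, not taken from the slides.

```python
def knapsack_groups(R, NS, T):
    """Choose group sizes maximizing sum(1/T[g]) with at most R resources and NS groups.

    Returns (total value, {group size: number of groups of that size}).
    """
    sizes = sorted(T)
    # best[r][k]: best (value, counts) using at most r resources and at most k groups.
    best = [[(0.0, {}) for _ in range(NS + 1)] for _ in range(R + 1)]
    for r in range(1, R + 1):
        for k in range(1, NS + 1):
            cand = (0.0, {})
            for g in sizes:
                if g <= r:
                    value, counts = best[r - g][k - 1]
                    if value + 1.0 / T[g] > cand[0]:
                        new_counts = dict(counts)
                        new_counts[g] = new_counts.get(g, 0) + 1
                        cand = (value + 1.0 / T[g], new_counts)
            best[r][k] = cand
    return best[R][NS]


# Made-up execution times of one multi-processor task on 4..11 resources.
T = {g: 100 + 3600 / g for g in range(4, 12)}
value, counts = knapsack_groups(R=53, NS=10, T=T)
print(value, counts)
```

The table has R × NS entries and each entry tries the 8 group sizes, so the search stays cheap for the platform sizes discussed here.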
Generic Scheduling Heuristics
• We propose generic scheduling heuristics for a class of applications consisting of independent, identical chains of identical DAGs
Generic Scheduling Heuristics
• First approach
  • Create a composite DAG
    • link all entry nodes to a common entry node and all exit nodes to a common exit node
  • Apply mixed-parallelism scheduling heuristics on the composite DAG
    • CPA
      • reduced complexity, O(V(V+E)R)
      • drawback of being a two-step algorithm
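Building the composite DAG can be sketched in a few lines. This is a minimal illustration in which each DAG is a hypothetical `(tasks, edges)` pair and the added nodes are named `ENTRY` and `EXIT`; the representation and the names are assumptions.

```python
def make_composite(dags):
    """Merge several DAGs into one by adding a common entry and a common exit node."""
    tasks, edges = ["ENTRY"], []
    for dag_tasks, dag_edges in dags:
        has_pred = {dst for _, dst in dag_edges}
        has_succ = {src for src, _ in dag_edges}
        tasks.extend(dag_tasks)
        edges.extend(dag_edges)
        # Entry nodes have no predecessor; exit nodes have no successor.
        edges.extend(("ENTRY", t) for t in dag_tasks if t not in has_pred)
        edges.extend((t, "EXIT") for t in dag_tasks if t not in has_succ)
    tasks.append("EXIT")
    return tasks, edges


dag_a = (["a1", "a2"], [("a1", "a2")])
dag_b = (["b1"], [])
composite_tasks, composite_edges = make_composite([dag_a, dag_b])
print(composite_tasks)  # ['ENTRY', 'a1', 'a2', 'b1', 'EXIT']
print(composite_edges)
```

The two added nodes would be given zero execution time, so any single-DAG heuristic such as CPA can be applied to the result unchanged.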
Generic Scheduling Heuristics
• Second approach
  • Exploit the knowledge of the specific structure of the application
  • Exploit the pipelined structure of the application
  • Separate the independent pre- and post-processing tasks and schedule them with algorithms for independent malleable tasks (5/4 approximation in constant time)
Generic Scheduling Heuristics
• Heuristic 1
  • Schedule all pre-processing tasks at the beginning
  • Schedule inter- and main-processing tasks as an interval on the same number of resources
  • Schedule all post-processing tasks at the end
• Heuristic 2
  • Schedule all pre-processing tasks at the beginning
  • Schedule inter- and main-processing tasks separately as a pipeline
  • Schedule all post-processing tasks at the end
Generic Scheduling Heuristics
• Heuristic 3
  • Schedule inter- and main-processing tasks as an interval pipeline on the same number of resources
  • Schedule pre- and post-processing tasks simultaneously with the pipeline, on resources specially reserved for them as well as on resources left unused by the pipeline
  • Schedule the remaining pre- and post-processing tasks at the beginning and at the end of the pipeline, respectively
Generic Scheduling Heuristics
• Heuristic 4
  • Schedule inter- and main-processing tasks separately as a pipeline
  • Schedule pre- and post-processing tasks simultaneously with the pipeline, on resources specially reserved for them as well as on resources left unused by the pipeline
  • Schedule the remaining pre- and post-processing tasks at the beginning and at the end of the pipeline, respectively
4. Simulation Results
Simulation Results
• Behavior of the 4 heuristics tested against CPA applied on the composite DAG
• Tasks’ execution times modeled by Amdahl’s law
• Several configurations tested
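The formula itself is not preserved in this transcript. A common form of Amdahl's law, assumed here, is T(p) = T1 · (α + (1 − α) / p), where T1 is the execution time on one processor and α is the non-parallelizable fraction of the task. The sketch below evaluates it; the function name `amdahl_time` is hypothetical, and reading α as the sequential fraction is an assumption.

```python
def amdahl_time(t1, alpha, p):
    """Execution time on p processors for a task with sequential fraction alpha."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return t1 * (alpha + (1.0 - alpha) / p)


# T1 = 500 with a small sequential fraction: the task speeds up with more processors.
for p in (1, 4, 8, 11):
    print(p, round(amdahl_time(500, 0.1, p), 2))

# alpha = 1.0: a purely sequential task gains nothing from extra processors.
print(amdahl_time(500, 1.0, 8))  # 500.0
```

Under this reading, a larger α for the inter-processing tasks makes them benefit less from extra processors, up to no benefit at all when α = 1.0.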
Simulation Results
• Configuration 1
  • All tasks have the same execution time on 1 processor (500)
  • All tasks have the same coefficient α (0.1)
Simulation Results
• Configuration 2
  • Same as Configuration 1, except $\alpha_{\text{inter-processing}} = 0.8$
Simulation Results
• Configuration 3
  • $T_{1,\text{pre-processing}} = T_{1,\text{post-processing}} = 50$, $T_{1,\text{main-processing}} = T_{1,\text{inter-processing}} = 500$
  • $\alpha = 0.1$, $\alpha_{\text{inter-processing}} = 0.6$
Simulation Results
• Configuration 4
  • $T_{1,\text{pre-processing}} = T_{1,\text{post-processing}} = 50$, $T_{1,\text{main-processing}} = T_{1,\text{inter-processing}} = 500$
  • $\alpha = 0.1$, $\alpha_{\text{inter-processing}} = 1.0$
5. Conclusions and Future Works
Conclusions
• We found a model for the given real application
• We proposed a basic heuristic for this model and 3 improved versions
• We proposed 4 pipeline-based heuristics for the generalized problem and compared them with the approach of applying a mixed-parallelism algorithm on the composite DAG of the application
Future Works
• Enhance the heuristics by taking into account a more precise communication model
• Perform real experiments on Grid’5000 in order to validate the theoretical results
• Analyze other applications using a similar approach, with the long-term goal of deriving application-dependent scheduling schemes that could eventually be implemented as DIET plug-in schedulers