This study focuses on improving the performance of climate prediction applications by modeling and scheduling tasks in the DIET framework. The study examines scheduling at both the cluster and grid level, and explores the use of heuristics for workflow management and parallel task execution.
WP3: Study and modeling of application scheduling and deployment 7/5/2009
WP3: Overview • Tasks • T3.1: Study and modeling • T3.2: Implementation • Summary
Scheduling E. Caron - Meeting #11 - 7/5/09
Ocean-Atmosphere scheduling within DIET • Improve the performance of a climate prediction application • Modeling of the application • Proof of usage of Grid’5000 and DIET • Scheduling on a real application • Scheduling done at two levels • Groups of processors at cluster level • Distribution of scenarios at grid level • Real implementation suffered from technical limitations • Simulations are quite precise, but one resource must be kept for post-processing tasks E. Caron - Ocean-Atmosphere scheduling within DIET - APDCT-08
Cluster Level Scheduling – Experimental result • Experiment: 10 scenarios, 5 clusters, from 11 to 112 resources • Every resource is taken into account • Average makespan strictly decreases as resources are added • The rate of decrease of the average makespan diminishes
Grid Level Scheduling – Result • Comparison with Round Robin on 5 clusters • Maximum speedup: 25% • With a higher load, the algorithm behaves better with a few resources • Convergence on gains • Gain of 25% ≈ 230h on a ≈ 822h long experiment
Workflow Management • Workflow representation • Directed Acyclic Graph (DAG) • Each vertex is a task • Each directed edge represents communication between tasks • Goals • Build and execute workflows • Use different heuristics to solve scheduling problems • Extensibility to address multi-workflow submission and large grid platforms • Manage heterogeneity and variability of the environment • Research topics addressed • Workflow scheduling with parallel tasks • Multiple-workflow scheduling (makespan minimization and fairness optimization)
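The DAG representation above can be illustrated with a minimal sketch. It assumes a workflow given as a list of task names and a list of directed edges, and computes one valid execution order with Kahn's algorithm. The task names and the `topological_order` helper are hypothetical and are not part of DIET or the MA DAG.

```python
from collections import deque

def topological_order(tasks, edges):
    """Return one valid execution order for a workflow DAG (Kahn's algorithm)."""
    successors = {t: [] for t in tasks}
    indegree = {t: 0 for t in tasks}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    # Tasks with no unfinished predecessor are ready to run.
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(tasks):
        raise ValueError("workflow contains a cycle, so it is not a DAG")
    return order

# Hypothetical four-task workflow: one source, two parallel branches, one sink.
tasks = ["prepare", "simulate_a", "simulate_b", "merge"]
edges = [("prepare", "simulate_a"), ("prepare", "simulate_b"),
         ("simulate_a", "merge"), ("simulate_b", "merge")]
order = topological_order(tasks, edges)
print(order)
```

An ordering of this kind only respects precedence constraints. Choosing where each task runs (the mapping) is a separate decision, which is what the scheduling heuristics address.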
Architecture with MA DAG • Specific agent for workflow management (MA DAG) • Two modes: • MA DAG defines a complete scheduling of the workflow (ordering and mapping) • MA DAG defines only an ordering for the workflow execution; the mapping is done in the next step by the client, which goes through the Master Agent to find the servers on which to execute the workflow services • Design of heuristics for mixed parallelism
Parallel and batch submissions • Parallel & sequential jobs → transparent for the user • Submit a parallel job → system dependent • NFS: copy the code? • Numerous batch systems • Batch schedulers' behaviour (queues, scripts, etc.) • Information about the internal scheduling process • Monitoring & performance prediction • Simulation (Simbatch) [Figure labels: agent, SeD_parallel, agent, SeD_batch, SeD_seq, GLUE, OAR, SGE, LSF, PBS, Loadleveler, Front-end, NFS]
Task reallocation in a grid environment • Batch simulator: Simbatch [Y. Caniou, J.S. Gay] • Validated against OAR (less than 2% error) • Based on Simgrid • Grid simulated by Simgrid with several Simbatch instances • Different algorithms studied: MCT, MinMin, MaxMin on batch systems using FCFS or CBF • MaxMin and MinMin cannot try to reschedule more than 30 jobs at each reallocation (choosing the oldest or youngest jobs) • Comparison of job completion times with and without reallocation
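The three heuristics named above can be sketched in their textbook form. This is a simplified illustration, not the Simbatch implementation: it assumes each cluster is a single queue that becomes free at a known time, whereas a real FCFS or CBF batch scheduler would supply the estimated completion times. The job names, runtimes and function names are hypothetical.

```python
def mct(jobs, ready):
    """Minimum Completion Time: take jobs in order, send each to the
    cluster where it would finish earliest.
    jobs: list of (name, {cluster: runtime}); ready: {cluster: free-at time}."""
    ready = dict(ready)
    plan = {}
    for name, runtimes in jobs:
        cluster = min(runtimes, key=lambda c: ready[c] + runtimes[c])
        ready[cluster] += runtimes[cluster]
        plan[name] = (cluster, ready[cluster])
    return plan

def min_min(jobs, ready, pick=min):
    """MinMin: repeatedly compute each pending job's best completion time,
    then schedule the job whose best completion time is smallest."""
    ready = dict(ready)
    pending = dict(jobs)
    plan = {}
    while pending:
        best = {}
        for name, runtimes in pending.items():
            cluster = min(runtimes, key=lambda c: ready[c] + runtimes[c])
            best[name] = (ready[cluster] + runtimes[cluster], cluster)
        name = pick(best, key=lambda n: best[n][0])
        finish, cluster = best[name]
        ready[cluster] = finish
        plan[name] = (cluster, finish)
        del pending[name]
    return plan

def max_min(jobs, ready):
    """MaxMin: same as MinMin, but schedule first the job whose best
    completion time is largest."""
    return min_min(jobs, ready, pick=max)

def makespan(plan):
    return max(finish for _, finish in plan.values())

# Hypothetical waiting jobs with per-cluster runtime estimates (hours).
jobs = [("j1", {"A": 4, "B": 6}), ("j2", {"A": 2, "B": 3}), ("j3", {"A": 8, "B": 5})]
free_at = {"A": 0, "B": 0}
for heuristic in (mct, min_min, max_min):
    print(heuristic.__name__, makespan(heuristic(jobs, free_at)))
```

Capping the number of jobs considered per reallocation, as the slide describes for MinMin and MaxMin, would amount to passing only the 30 oldest or youngest waiting jobs as `jobs`.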
Task reallocation in a grid environment • Traces of one month from Grid’5000 (January to June 2008) on three sites • Reallocation triggered every hour [Figure labels: Meta-Scheduler, Reschedule, Submit, Cancel, Get waiting jobs, Batch 1, Batch 2]
Deployment E. Caron - Meeting #11 - 7/5/09
Deployment Brick: ADAGE • Automatic deployment tool for grid environments • Only one command to deploy • 3 kinds of input information • Resource description • Application description • Control parameters • Planning model (random, round-robin), … • Plug-in for generic application mapping • RR, Random, DIET, Graal-heuristics • Plug-in for each application kind • Description converter • Configuration of application • CCM, MPI, JXTA, P2P, DIET, GFARM, SEQ • Plug-in: from 400 to 4700 C++ lines • META • Enables constraints between any other application kinds (at the generic level)
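The two simplest planning models listed above, round-robin and random, can be sketched as functions that map application processes to nodes. This is only an illustration of the idea; the function names, process names and node names are hypothetical and do not reflect ADAGE's plug-in interface.

```python
import random
from itertools import cycle

def round_robin_plan(processes, nodes):
    """Assign processes to nodes in turn, wrapping around when nodes run out."""
    return {proc: node for proc, node in zip(processes, cycle(nodes))}

def random_plan(processes, nodes, seed=None):
    """Assign each process to a node picked uniformly at random.
    A seed makes the plan reproducible."""
    rng = random.Random(seed)
    return {proc: rng.choice(nodes) for proc in processes}

# Hypothetical application with five processes and three available nodes.
processes = ["p0", "p1", "p2", "p3", "p4"]
nodes = ["node-a", "node-b", "node-c"]
rr = round_robin_plan(processes, nodes)
rnd = random_plan(processes, nodes, seed=1)
print(rr)
print(rnd)
```

Both planners ignore node capacity and network topology, which is the kind of information a more elaborate planner would take from the resource description.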
Identification of the steps of Automatic Deployment [Figure labels: MPI Application Description, CCM Application Description, Resource Description, Generic Application Description, Control Parameters, Deployment Planning, Static Applications, Deployment Tool, Deployment Plan Execution, Application Configuration]
GoDIET / Adage comparison • Deployment on Grid’5000: • Between 25 and 305 nodes • Between 1 and 8 clusters • Heuristic to create the DIET hierarchy automatically
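The slide does not describe the heuristic itself. Purely as an illustration of what building a hierarchy automatically can mean, the sketch below assumes a very simple rule: the first node hosts the Master Agent (MA), and each cluster with enough remaining nodes gets one Local Agent (LA) with the rest of its nodes as server daemons (SeDs). The rule, the `build_hierarchy` name and the node names are assumptions, not the method compared in these slides.

```python
def build_hierarchy(clusters):
    """Build a three-level MA -> LA -> SeD layout from reserved nodes.
    clusters: {cluster_name: [node, ...]}
    Returns {"MA": node, "LAs": {la_node: [sed_node, ...]}}."""
    hierarchy = {"MA": None, "LAs": {}}
    for name, nodes in clusters.items():
        nodes = list(nodes)
        # The very first available node hosts the Master Agent.
        if hierarchy["MA"] is None and nodes:
            hierarchy["MA"] = nodes.pop(0)
        if len(nodes) < 2:
            continue  # not enough nodes left for an LA plus at least one SeD
        hierarchy["LAs"][nodes[0]] = nodes[1:]
    return hierarchy

# Hypothetical reservation spanning two clusters.
clusters = {"c1": ["n1", "n2", "n3", "n4"], "c2": ["m1", "m2", "m3"]}
layout = build_hierarchy(clusters)
print(layout)
```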
GoDIET / Adage comparison: generated DIET hierarchy
ADAGE & LEGO • Clean implementation of the Adage model • UML-like specifications • Separation of planner and application plug-ins from the core • Extension of the internal generic model (GADe) • Support of graph-like generic description • In particular recursive structures like trees (for DIET) • Support of pseudo-dynamic re-deployment • Support of the G5K API • Working and stable tool • Used to deploy CCM, JuxMem & DIET elements • Cf. Demonstrator talk
Grid'5000 Reservation Utility for Deployment Usage • Web: http://grudu.gforge.inria.fr
GRUDU – Resources Allocation • We are able to reserve resources (OAR1 & OAR2) • Time parameters: date and reservation walltime • Queue • OARGrid sub behaviour / Script to launch
GRUDU – Monitoring We are able to monitor the status of the grid/site/a job. We are able to get instantaneous/historical data with Ganglia
GRUDU - KaDeploy/JFTP • GUI for KaDeploy job deployment • File transfer interface (local<->remote/rsync on Grid'5000)
WP3: Summary • Main achievements • Deliverable D3.2 • ADAGE • DIET 2.3 • GRUDU • Perspectives • Automatic consideration of the platform for planning • Self-stabilizing clustering • ADAGE ? • Use of the scheduling of the CERFACS application for a regional atmospheric model: CRIP UJF (IMAG. Grenoble) • Other classes of applications to schedule • Scheduling and data management: creation and use of DAGDA • Collaborations • Salomé (EDF) [thesis in progress] • Université de Picardie Jules Verne • University of Nevada, Las Vegas • University of Hawaii