
Ocean-Atmosphere Scheduling in DIET: Improving Performance in Climate Prediction Applications

This study focuses on improving the performance of climate prediction applications by modeling and scheduling tasks in the DIET framework. The study examines scheduling at both the cluster and grid level, and explores the use of heuristics for workflow management and parallel task execution.

Presentation Transcript


  1. WP3 Study and modeling of application scheduling and deployment 7/5/2009

  2. WP3: Overview • Tasks • T3.1: Study and modeling • T3.2: Implementation • Summary

  3. Scheduling E. Caron - Meeting #11 - 7/5/09

  4. Ocean-Atmosphere scheduling within DIET • Improve performance in a climate prediction application • Modeling of the application • Proof of usage of Grid’5000 and DIET • Scheduling on a real application • Scheduling done at two levels • Groups of processors at cluster level • Distribution of scenarios at grid level • Real implementation suffered from technical limitations • Simulations are quite precise but we need to keep one resource for post-processing tasks E. Caron - Ocean-Atmosphere scheduling within DIET - APDCT-08

  5. Cluster Level Scheduling – Experimental result • Experiment: 10 scenarios, 5 clusters, from 11 to 112 resources • Every resource is taken into account • Average makespan decreases strictly as resources are added • The decrease rate of the average makespan diminishes

  6. Grid Level Scheduling – Result • Comparison with Round Robin on 5 clusters • Maximum speedup: 25% • With a higher load, the algorithm behaves better with a few resources • Convergence of the gains • Gain of 25% ≈ 230h on a ≈ 822h long experiment

  7. Workflow Management • Workflow representation • Directed Acyclic Graph (DAG) • Each vertex is a task • Each directed edge represents communication between tasks • Goals • Build and execute workflows • Use different heuristics to solve scheduling problems • Extensibility to address multi-workflow submission and large grid platforms • Manage heterogeneity and variability of the environment • Research topics addressed • Workflow scheduling with parallel tasks • Multiple workflow scheduling (makespan minimization and fairness optimization)
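
The DAG representation on slide 7 can be illustrated with a minimal sketch. It assumes hypothetical task names (`pre`, `ocean`, `atmosphere`, `post`) and uses Kahn's algorithm to derive one valid execution order; it is not the MA DAG implementation, only an illustration of how vertices (tasks) and directed edges (communications) constrain the ordering.

```python
from collections import deque

def topological_order(edges, tasks):
    """Return one valid execution order for a workflow DAG (Kahn's algorithm)."""
    successors = {t: [] for t in tasks}
    indegree = {t: 0 for t in tasks}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    # Tasks with no unfinished predecessor are ready to run.
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(tasks):
        raise ValueError("workflow contains a cycle, so it is not a DAG")
    return order

# Hypothetical four-task workflow: one fork and one join.
tasks = ["pre", "ocean", "atmosphere", "post"]
edges = [("pre", "ocean"), ("pre", "atmosphere"),
         ("ocean", "post"), ("atmosphere", "post")]
print(topological_order(edges, tasks))
```

A scheduling heuristic would then decide, for each ready task, which resource runs it; the sketch only covers the ordering part.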

  8. Architecture with MA DAG • Specific agent for workflow management (MA DAG) • Two modes: • MA DAG defines a complete schedule for the workflow (ordering and mapping) • MA DAG defines only an ordering for the workflow execution; the mapping is done in the next step by the client, which goes through the Master Agent to find the servers on which to execute the workflow services • Design of heuristics for mixed parallelism

  9. Parallel and batch submissions • [Figure labels: agent, SeD_parallel, SeD_batch, SeD_seq, Front-end, NFS, GLUE, OAR, SGE, LSF, PBS, Loadleveler] • Parallel & sequential jobs → transparent for the user • Submit a parallel job → system dependent • NFS: copy the code? • Numerous batch systems • Batch schedulers behaviour (queues, scripts, etc.) • Information about the internal scheduling process • Monitoring & performance prediction • Simulation (Simbatch)

  10. Task reallocation in a grid environment • Batch simulator: Simbatch [Y. Caniou, J.S. Gay] • Validated against OAR (less than 2% error) • Based on Simgrid • Grid simulated by Simgrid with several Simbatch instances • Different algorithms studied: MCT, MinMin, MaxMin on batch schedulers using FCFS or CBF • MaxMin and MinMin try to reschedule at most 30 jobs at each reallocation (choosing the oldest or youngest jobs) • Comparison of job completion times with and without reallocation
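
The three heuristics named on slide 10 can be sketched as follows. This is a simplified illustration under stated assumptions: jobs are sequential, each job has a known (hypothetical) runtime per cluster, and each cluster is reduced to a single "ready time"; the FCFS/CBF queue policies and the 30-job limit from the slide are not modeled.

```python
def _best_cluster(runtimes, ready):
    """Cluster on which the job would complete earliest, given ready times."""
    return min(runtimes, key=lambda c: ready[c] + runtimes[c])

def mct(jobs, ready):
    """Minimum Completion Time: take jobs in submission order and map each
    one to the cluster where it finishes earliest."""
    ready = dict(ready)
    plan = []
    for name, runtimes in jobs:
        cluster = _best_cluster(runtimes, ready)
        ready[cluster] += runtimes[cluster]
        plan.append((name, cluster, ready[cluster]))
    return plan

def _min_or_max_min(jobs, ready, pick_max):
    ready = dict(ready)
    pending = dict(jobs)
    plan = []
    while pending:
        # Best (earliest) completion time and cluster for every pending job.
        best = {}
        for name, runtimes in pending.items():
            cluster = _best_cluster(runtimes, ready)
            best[name] = (ready[cluster] + runtimes[cluster], cluster)
        chooser = max if pick_max else min
        name = chooser(best, key=lambda n: best[n][0])
        finish, cluster = best[name]
        ready[cluster] = finish
        plan.append((name, cluster, finish))
        del pending[name]
    return plan

def min_min(jobs, ready):
    """MinMin: schedule first the job whose best completion time is smallest."""
    return _min_or_max_min(jobs, ready, pick_max=False)

def max_min(jobs, ready):
    """MaxMin: schedule first the job whose best completion time is largest."""
    return _min_or_max_min(jobs, ready, pick_max=True)

# Hypothetical runtimes (hours) of three jobs on two clusters A and B.
jobs = [("j1", {"A": 4, "B": 7}), ("j2", {"A": 2, "B": 3}), ("j3", {"A": 8, "B": 5})]
ready = {"A": 0, "B": 0}
for heuristic in (mct, min_min, max_min):
    plan = heuristic(jobs, ready)
    print(heuristic.__name__, plan, "makespan:", max(f for _, _, f in plan))
```

Each function returns a list of `(job, cluster, completion time)` tuples, which makes it straightforward to compare completion times between heuristics on the same input.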

  11. Task reallocation in a grid environment • [Figure: Meta-Scheduler (Reschedule) connected to Batch 1 and Batch 2; operations: Submit, Cancel, Get waiting jobs] • Traces of one month from Grid’5000 (January to June 2008) on three sites • Reallocation triggered every hour

  12. Deployment
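
One reallocation pass of the meta-scheduler on slide 11 might look like the sketch below. The model is an assumption made for illustration: each batch system is reduced to a FCFS list of waiting jobs with known durations, a job's estimated completion time is the sum of the durations up to and including it, and a job is cancelled and resubmitted only when appending it to another queue would finish it strictly earlier. The function names are hypothetical and do not come from Simbatch or DIET.

```python
def estimated_completion(queue, index):
    """Completion time of the job at `index` in a FCFS queue of (job, duration)."""
    return sum(duration for _, duration in queue[: index + 1])

def reallocation_pass(queues):
    """Move waiting jobs between batch queues when that shortens their wait.

    queues: {batch_name: [(job, duration), ...]}, modified in place.
    Returns the list of moves as (job, source_batch, target_batch).
    """
    moves = []
    for src in list(queues):
        i = 0
        while i < len(queues[src]):          # "Get waiting jobs"
            job, duration = queues[src][i]
            current = estimated_completion(queues[src], i)
            # Completion time if the job were appended to another batch queue.
            candidates = {
                dst: sum(d for _, d in q) + duration
                for dst, q in queues.items() if dst != src
            }
            dst = min(candidates, key=candidates.get) if candidates else None
            if dst is not None and candidates[dst] < current:
                queues[src].pop(i)                    # "Cancel" on the source
                queues[dst].append((job, duration))   # "Submit" on the target
                moves.append((job, src, dst))
            else:
                i += 1
    return moves

# Hypothetical state of two batch queues (durations in hours).
queues = {"batch1": [("a", 5), ("b", 5), ("c", 5)], "batch2": [("d", 2)]}
print(reallocation_pass(queues))
print(queues)
```

In the setting described on the slide, such a pass would be triggered once per hour; the sketch leaves the timer and the real completion-time estimates from the batch schedulers out of scope.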

  13. Deployment Brick: ADAGE • Automatic deployment tool for grid environments • Only one command to deploy • 3 kinds of input information • Resource description • Application description • Control parameters • Planning model (random, round-robin), … • Plug-in for generic application mapping • RR, Random, DIET, Graal-heuristics • Plug-in for each application kind • Description converter • Configuration of application • CCM, MPI, JXTA, P2P, DIET, GFARM, SEQ • Plug-in: from 400 to 4700 lines of C++ • META • Enable constraints between any other application kinds (at the generic level)
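
The two simplest planning models listed on slide 13 (round-robin and random) can be sketched as below. This is an illustration only: the function names, the process names and the node names are hypothetical, and the real ADAGE planners are C++ plug-ins that work on the generic application description rather than on plain lists.

```python
import random
from itertools import cycle

def round_robin_plan(processes, resources):
    """Map processes to resources in turn, wrapping around when needed."""
    if not resources:
        raise ValueError("at least one resource is required")
    return {p: r for p, r in zip(processes, cycle(resources))}

def random_plan(processes, resources, seed=None):
    """Map each process to a resource chosen uniformly at random."""
    if not resources:
        raise ValueError("at least one resource is required")
    rng = random.Random(seed)  # a seed makes the plan reproducible
    return {p: rng.choice(resources) for p in processes}

# Hypothetical DIET elements to place on two nodes.
processes = ["MA", "LA1", "LA2", "SeD1", "SeD2"]
resources = ["node1", "node2"]
print(round_robin_plan(processes, resources))
print(random_plan(processes, resources, seed=0))
```

Both planners ignore resource constraints and application structure, which is why more specific plug-ins (such as the DIET one on the slide) exist alongside them.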

  14. Identification of the steps of Automatic Deployment • [Figure labels: MPI Application Description; CCM Application Description; Resource Description; Generic Application Description; Control Parameters; Deployment Planning; Deployment Plan Execution; Application Configuration; Static Applications; Deployment Tool]

  15. Comparison GoDIET / ADAGE • Deployment on Grid’5000: • Between 25 and 305 nodes • Between 1 and 8 clusters • Heuristic to automatically create the DIET hierarchy

  16. Comparison GoDIET / ADAGE • Generated DIET hierarchy

  17. ADAGE & LEGO • Clean implementation of the ADAGE model • UML-like specifications • Separation of planner and application plug-ins from the core • Extension of the internal generic model (GADe) • Support of graph-like generic description • In particular recursive structures like trees (for DIET) • Support of pseudo-dynamic re-deployment • Support of the G5K API • Working and stable tool • Used to deploy CCM, JuxMem & DIET elements • Cf. Demonstrator talk

  18. Grid'5000 Reservation Utility for Deployment Usage • Web: http://grudu.gforge.inria.fr

  19. GRUDU – Resources Allocation • We are able to reserve resources (OAR1 & OAR2) • Time parameters: date and reservation walltime • Queue • OARGrid sub behaviour / script to launch

  20. GRUDU – Monitoring • We are able to monitor the status of the grid, a site, or a job • We are able to get instantaneous and historical data with Ganglia

  21. GRUDU – KaDeploy/JFTP • GUI for KaDeploy job deployment • File transfer interface (local<->remote/rsync on Grid'5000)

  22. WP3: Summary • Main achievements • Deliverable D3.2 • ADAGE • DIET 2.3 • GRUDU • Outlook • Automatic consideration of the platform for planning • Self-stabilizing clustering • ADAGE ? • Use of the scheduling of the CERFACS application for a regional atmospheric model: CRIP UJF (IMAG, Grenoble) • Other classes of applications to schedule • Scheduling and data management: creation and use of DAGDA • Collaborations • Salomé (EDF) [PhD thesis in progress] • Université de Picardie Jules Verne • University of Nevada, Las Vegas • University of Hawaii
