Porting MM5 and BOLAM codes to the GRID
Earth Science Workshop, January 30, 2009, Paris, France
The SEE-GRID-SCI initiative is co-funded by the European Commission under the FP7 Research Infrastructures contract no. 211338.
Goal
• Run the MM5 and BOLAM models on the Grid to perform ensemble weather forecasting
• Develop a generic weather model execution framework
  • Support deterministic forecasting
  • Easily adapt to various other forecast models (e.g. WRF, RAMS)
Target workflow
Weather models follow a specific workflow of execution:
Retrieval of initial conditions (HTTP download from NOMADS NCEP-GFS, USA) → Pre-processing → Model run → Post-processing
• The framework should be able to incorporate different codes for pre/post-processing and model execution
• Parametric configuration of initial data retrieval (see the sketch below)
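A minimal sketch of what the parametric retrieval step could look like. The base URL pattern, cycle and forecast-hour parameters are illustrative placeholders, not the actual NOMADS layout, and the original 2009 code would have used Python 2's urllib rather than urllib.request.

```python
import urllib.request

# Hypothetical URL pattern: the real NOMADS NCEP-GFS layout differs.
BASE_URL = "http://nomads.example.gov/gfs/{cycle}/gfs.t{cycle}z.pgrb.f{fhour:03d}"

def fetch_initial_conditions(cycle="00", fhours=range(0, 73, 6), dest_dir="."):
    """Download one GRIB file per forecast hour for the given analysis cycle."""
    for fhour in fhours:
        url = BASE_URL.format(cycle=cycle, fhour=fhour)
        dest = "%s/gfs_%s_f%03d.grb" % (dest_dir, cycle, fhour)
        urllib.request.urlretrieve(url, dest)

if __name__ == "__main__":
    fetch_initial_conditions(cycle="00")
```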
Requirements
• Adopt existing NOA procedures for model execution
• Hide the Grid as much as possible
  • Give the feeling of local execution
• Simplify existing procedures and improve execution times
• Utilize high-level tools that facilitate better-quality code and overcome low-level interactions with the Grid
• Satisfy specific model requirements
  • Use of a commercial compiler that is not available on the Grid
  • Time restrictions for completing application execution
Design Approach
• Keep the existing "command-line" look'n'feel
• Re-use and improve the existing code base (shell scripts)
• Utilize the Python language to replace various parts of the existing workflow (see the sketch below)
• Exploit the Ganga framework for job management and monitoring
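A minimal sketch of the "replace the shell glue with Python, keep the scripts" idea. The hook script names are taken from the architecture slide; the error handling and the config-file argument convention are assumptions.

```python
import subprocess
import sys

# The four hooks remain ordinary shell scripts; Python only orchestrates them.
HOOKS = ["decode.sh", "pre_process.sh", "model_run.sh", "post_process.sh"]

def run_workflow(config_file):
    for hook in HOOKS:
        # Each hook receives the model config file, preserving the
        # command-line look'n'feel of the existing NOA procedures.
        rc = subprocess.call(["sh", hook, config_file])
        if rc != 0:
            sys.exit("hook %s failed with exit code %d" % (hook, rc))

if __name__ == "__main__":
    run_workflow(sys.argv[1])
```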
Utilized Grid Services
• gLite
  • WMS: job management
  • LFC: data management
  • MPICH 1.2.7 on gLite sites
• Ganga
  • Developed at CERN; endorsed by the EGEE RESPECT program
  • Provides a Python programming library and interpreter for object-oriented job management
  • Facilitates high-level programming abstractions for job management (see the sketch below)
  • More information: http://ganga.web.cern.ch/ganga/
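A minimal sketch of how a single model run might be described in the Ganga interpreter, where Job, Executable and the LCG backend are provided by Ganga's Grid Programming Interface. The executable and sandbox file names are placeholders, and the exact backend options depend on the site setup.

```python
# Run inside the ganga interpreter.
j = Job(name="bolam-member-01")
j.application = Executable(exe="./lead_in.sh", args=["member01.conf"])
j.inputsandbox = ["lead_in.sh", "member01.conf"]   # placeholder file names
j.outputsandbox = ["results.tar.gz"]
j.backend = LCG()          # submit through the gLite WMS
j.submit()

# Jobs are persistent objects; status can be polled later.
print(j.id, j.status)
```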
Implementation Details
• MM5 and BOLAM codes compiled locally on the UI with PGI Fortran
  • Three different binaries produced for MM5, for 2, 6 and 12 CPUs respectively
  • MPICH also compiled with PGI; the MPICH libraries were used to build the MM5 binaries
• Binaries were packed and stored on the LFC (see the sketch below)
  • Downloaded to the WNs before execution
  • Packages include terrain data
• Models run daily as cron jobs; notifications are sent to users by email
• Log files and statistics are kept for post-mortem analysis
  • Ganga is also useful for identifying problems after execution
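A minimal sketch of packing the binaries plus terrain data and registering them via the standard lcg-utils tools, driven from Python like the rest of the framework. The VO name, LFN path, tarball contents and binary name are all hypothetical, and lcg-cr additionally accepts a destination SE via -d.

```python
import os
import subprocess

VO = "see"                                       # hypothetical VO name
LFN = "lfn:/grid/see/mm5/mm5-bin-6cpu.tar.gz"    # hypothetical LFC path

def publish_binaries(tarball):
    # On the UI: pack binaries and terrain data, then copy-and-register.
    subprocess.check_call(["tar", "czf", tarball, "mm5.mpp", "terrain/"])
    subprocess.check_call(["lcg-cr", "--vo", VO, "-l", LFN,
                           "file:%s" % os.path.abspath(tarball)])

def fetch_binaries(tarball):
    # On the WN, before execution: download and unpack.
    subprocess.check_call(["lcg-cp", "--vo", VO, LFN,
                           "file:%s" % os.path.abspath(tarball)])
    subprocess.check_call(["tar", "xzf", tarball])
```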
Implemented Architecture
• On the UI: Ganga, the LJM (Python), the Workflow Orchestrator (Python), the Lead-In/Out shell scripts and the model config file
• Initial conditions are retrieved over HTTP from NOMADS NCEP-GFS (USA)
• N jobs are submitted through the WMS to Grid CEs/WNs
• On the WNs: the Decode, Pre-process, Model Run and Post-process shell scripts; the model itself is launched across the WNs via mpiexec (see the sketch below)
• Binaries and results are stored on SEs and catalogued in the LFC; results are returned to the UI
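A minimal sketch of the launch step inside the Model Run hook, assuming the pre-built 6-CPU MM5 binary. The binary name and the PBS_NODEFILE machinefile convention are assumptions; classic MPICH 1.2.7 mpirun syntax is shown, while the diagram's mpiexec wrapper or a site-specific MPI launcher may take different flags.

```python
import os
import subprocess

NPROC = 6  # must match one of the pre-built MM5 binaries (2, 6 or 12 CPUs)

def run_model():
    # Allocated hosts are typically exposed via a machinefile;
    # PBS_NODEFILE is an assumption about the local batch system.
    machinefile = os.environ.get("PBS_NODEFILE", "machines")
    subprocess.check_call([
        "mpirun", "-np", str(NPROC),
        "-machinefile", machinefile,
        "./mm5.mpp",               # placeholder name for the MM5 MPI binary
    ])
```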
Ensemble Forecasting
• Each member is executed as a separate job
  • 10 members in total, for both the MM5 and BOLAM models
  • Each member separately downloads its initial data from the NCEP servers
• The whole ensemble execution is handled by a single compound job
  • Compound job definition, execution and management are handled by Ganga constructs (job splitters); see the sketch below
• The final stage of forecast production and graphics preparation is performed locally on the UI
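A minimal sketch of the compound-job idea using Ganga's ArgSplitter, again inside the ganga interpreter and with a placeholder member script: each generated subjob corresponds to one ensemble member and receives its member index as an argument.

```python
# One master job, split into 10 subjobs, one per ensemble member.
j = Job(name="mm5-ensemble")
j.application = Executable(exe="./run_member.sh")   # placeholder hook script
j.backend = LCG()
j.splitter = ArgSplitter(args=[[str(member)] for member in range(10)])
j.submit()

# The compound job is managed as a unit; members appear as subjobs.
for sj in j.subjobs:
    print(sj.id, sj.status)
```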
Initial Performance Results
• MM5: typical execution time ~2 hrs (including scheduling overheads)
  • Completion times differ per member depending on the total number of processors used
  • The 12-process version takes ~40 min per member but takes longer to get scheduled on a Grid site
• BOLAM: typical execution time for a 10-member ensemble forecast: 60-90 min (including scheduling overheads)
  • One member takes ~25 min to complete on a local cluster with an optimized binary; the whole ensemble would take ~4 hrs locally
• Overall, completion times are non-uniform due to the (un)availability of Grid resources
Adopting the framework for different models
• Models that implement a similar workflow should be easy to adopt
• Ultimately the user should only provide:
  • The four workflow hooks: Decode, Pre-process, Model run, Post-process
  • Model configuration file(s): definition of the initial data sources, forecast region and terrain data (see the sketch below)
  • Model binaries stored on the LFC
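A minimal sketch of what a model configuration file could carry, parsed here with Python's standard configparser (ConfigParser in the Python 2 of the time). All section names, keys and values are hypothetical illustrations of the items listed above.

```python
import configparser

# Hypothetical layout covering initial data source, region, terrain and hooks.
EXAMPLE = """
[initial_data]
source_url = http://nomads.example.gov/gfs/
cycle      = 00

[region]
lat_min = 30.0
lat_max = 48.0
lon_min = 10.0
lon_max = 32.0

[terrain]
lfn = lfn:/grid/see/mm5/terrain.tar.gz

[hooks]
decode       = decode.sh
pre_process  = pre_process.sh
model_run    = model_run.sh
post_process = post_process.sh
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE)
print(config["region"]["lat_min"])
```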
Problems/Pending Issues
• Problems with initial data
  • NCEP servers are sometimes down or cannot generate the requested files
• Grid resource availability impedes timely execution
  • Not all members manage to complete on time; some may still be in the scheduled state when time expires
• Grid robustness and predictability
  • Jobs may be rescheduled to different sites while running, for no apparent reason
  • Central Grid services might be unavailable (WMS, LFC)
• MM5 is sensitive to the execution environment
  • Processes die while the model is in its parallel section
  • MPI is notoriously not well supported by Grid sites (some sites are "better" than others)
Future Work
• The application is still in the pilot phase
• "Super-ensemble" runs planned for April
  • Multi-model, multi-analysis ensemble forecasting combining results from MM5, BOLAM, NCEP/ETA and NCEP/NMM
• Presentation to be delivered at UF4 in Catania
• Anticipating more resources and better support from existing ones
Questions