E-infrastructure shared between Europe and Latin America. European Meteorological Society, 7th EMS / 8th ECAC, El Escorial (Spain), 1-5 Oct 2007. GRID distributed computation of nested climate simulations and data-mining. On behalf of the EELA project.
E-infrastructure shared between Europe and Latin America
European Meteorological Society, 7th EMS / 8th ECAC, El Escorial (Spain), 1-5 Oct 2007
GRID distributed computation of nested climate simulations and data-mining. On behalf of the EELA project.
EELA is a project funded by the European Union under contract 026409.
V. Fernández-Quiruelas (1), J. Fernández (1), A. S. Cofiño (1), C. Baeza (3), M. Carrillo (2), F. García-Torre (1), R. M. San-Martín (2), R. Abarca (3) and J. M. Gutiérrez (1), R. Mayo (4), on behalf of the EELA team.
(1) Dept. of Applied Mathematics and Computing Sciences, University of Cantabria, Spain
(2) Servicio Nacional de Meteorología e Hidrología, Peru
(3) Universidad de Concepción, Chile
(4) CIEMAT, Spain
GRID Computing
• Developed in the mid-1990s.
• Uses distributed, heterogeneous, dynamic and, usually, parallel computational resources.
• Middleware and standard software to build applications (Globus Toolkit, OGSA, …).
• Several research projects (and commercial products) are developing this technology.
Applications draw computing power from a Computational Grid in the same way electrical devices draw power from an electrical grid.
EELA Goals
E-infrastructure shared between Europe and Latin America
• Goal: To build a bridge between consolidated e-Infrastructure initiatives in Europe and emerging ones in Latin America.
• Objectives:
  • Establish a human collaboration network between Europe and Latin America
  • Set up a pilot e-infrastructure in Latin America
  • Identify and promote a sustainable framework for e-Science in Latin America
EELA structure
EELA is structured in four Work Packages:
• WP1. Project administrative and technical management
• WP2. Pilot testbed operation and support
• WP3. Identification and support of Grid-enhanced applications
  • Task 3.1. Biomed Applications
  • Task 3.2. High Energy Physics Applications
  • Task 3.3. Additional Applications:
    • E-Learning
    • Climate
• WP4. Dissemination activities
Partners
EU
• Spain: CIEMAT, CSIC, UPV, RED.ES, UC
• Italy: INFN
• Portugal: LIP
Latin America
• Venezuela: ULA
• Cuba: CUBAENERGIA
• Chile: UTFSM, REUNA, UDEC
• Peru: SENAMHI
• Mexico: UNAM
• Argentina: UNLP
• Brazil: UFRJ, CNEN, CECIERJ/CEDERJ, RNP, UFF
International
• CLARA
• CERN
WP 3.3b. Climate
We deal with a climate challenge with huge socio-economic impact in Latin America: the El Niño phenomenon. The Grid helps to access the infrastructure and know-how in a user-friendly way.
Three different applications have been identified:
• Global atmospheric circulation model (CAM)
• Regional weather model (WRF)
• Data-mining clustering tools (SOM)
Scientific challenge: High-resolution regional simulations over Latin American regions for the strong El Niño events of 1982-1983 and 1997-1998. Comparison with historical local data, including sensitivity studies to SST and parameterizations.
The problem is well suited to execution on the Grid, since many independent simulations will be needed.
WP 3.3b. Climate
We are currently performing CAM simulations perturbing the SST:
pertSST_c(t,x) = SST(t,x) + c * Pattern(x)
where c is a random number in the interval (-2.5, 2.5):
• c = -2.5: ~regular year
• c = 0: ~Niño '97
• c > 0: SST anomalies stronger than Niño '97
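The perturbation formula above can be illustrated with a minimal sketch. It assumes that c is drawn uniformly (the slide only says "random"), that SST is stored as a list of time steps over grid points, and that Pattern depends on the grid point only. The function names and the toy values are hypothetical and are not taken from the EELA code.

```python
import random

def perturb_sst(sst, pattern, c):
    """Return pertSST_c(t, x) = SST(t, x) + c * Pattern(x).

    sst:     list of time steps, each a list of values over grid points x
    pattern: list of values over the same grid points (time-independent)
    """
    if any(len(row) != len(pattern) for row in sst):
        raise ValueError("every SST time step must match the pattern length")
    return [[value + c * p for value, p in zip(row, pattern)] for row in sst]

def draw_coefficients(n, low=-2.5, high=2.5, seed=None):
    """Draw n coefficients c from the interval (low, high).

    A uniform distribution is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n)]

# Toy data (hypothetical values): 2 time steps x 3 grid points.
sst = [[300.0, 301.0, 302.0], [300.5, 301.5, 302.5]]
pattern = [0.0, 1.0, 2.0]

# One perturbed SST field per ensemble member.
ensemble = [perturb_sst(sst, pattern, c) for c in draw_coefficients(4, seed=0)]
```

Each ensemble member is an independent CAM input, which is what makes the experiment a natural fit for many independent Grid jobs.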
Computational challenge
Challenging computational problem with nontrivial dependency relationships among the applications. A cascade of dynamically dependent jobs is adopted.
The cascade of applications interacts with the middleware to:
• Prepare and submit dependent jobs.
• Store and retrieve the generated data sets (data sharing).
• Manage metadata (for the data sets and application status).
• Restart broken experiments.
[Diagram labels: SST PDF; SST + other forcings; CAM; WRF (par 1), WRF (par 2), …, WRF (par n), each followed by a SOM; Storage Elements (SE).]
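As a rough illustration of the cascade, the sketch below models CAM, the WRF runs and the SOMs as a dependency graph and groups them into waves that could be submitted together. It simplifies the real workflow, where WRF can start as soon as CAM has produced data sets rather than after CAM finishes. The job names and helper functions are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def build_cascade(n_params):
    """Map each job name to the set of jobs it depends on."""
    deps = {"CAM": set()}
    for i in range(1, n_params + 1):
        deps[f"WRF(par {i})"] = {"CAM"}            # each WRF run needs CAM output
        deps[f"SOM {i}"] = {f"WRF(par {i})"}       # each SOM needs its WRF run
    return deps

def submission_waves(deps):
    """Group jobs into waves; all jobs in one wave can run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)                        # mark the wave as completed
    return waves

waves = submission_waves(build_cascade(2))
```

With two WRF parameterizations this yields three waves: CAM alone, then both WRF runs, then both SOMs.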
Grid Enabling Layer
The use of this additional software layer has several advantages:
• Easier updates of the model
• Easier programming (shell, Perl, Python, … instead of Fortran)
[Diagram labels: GEL, CAM.]
Info & Data Flow
[Diagram components: CAM, WRF, SE (Storage Element, data), LFC (file catalog), AMGA (metadata catalog; metadata and status information).]
Grid Enabling Layer
GEL tasks:
• Download model-required files from LFC
• Upload model-generated files to LFC
• Extract metadata from model output files and publish them in AMGA
• Upload model restart files to LFC and restart information to AMGA
• Publish model status in AMGA
[Diagram labels: GEL, CAM.]
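The GEL tasks listed above can be sketched as a thin wrapper around the model. This is only an illustration under stated assumptions: `InMemoryCatalog` and `InMemoryMetadata` are hypothetical stand-ins for the LFC and AMGA (they do not reproduce the real client interfaces), the model is represented by a generator that yields its outputs, and the status values and file names are invented for the example.

```python
class InMemoryCatalog:
    """Hypothetical stand-in for the LFC: logical file name -> contents."""
    def __init__(self, files=None):
        self.files = dict(files or {})
    def download(self, lfn):
        return self.files[lfn]
    def upload(self, lfn, data):
        self.files[lfn] = data

class InMemoryMetadata:
    """Hypothetical stand-in for AMGA: entry name -> attribute dictionary."""
    def __init__(self):
        self.entries = {}
    def publish(self, entry, **attrs):
        self.entries.setdefault(entry, {}).update(attrs)

def run_with_gel(model, job_id, input_lfns, catalog, metadata):
    """Run `model` and perform the GEL tasks on its behalf."""
    metadata.publish(job_id, status="running")
    inputs = {lfn: catalog.download(lfn) for lfn in input_lfns}  # download inputs
    try:
        for kind, lfn, data, attrs in model(inputs):
            catalog.upload(lfn, data)                    # upload generated file
            if kind == "restart":
                metadata.publish(job_id, last_restart=lfn)   # restart information
            else:
                metadata.publish(lfn, **attrs)           # metadata of the output
    except Exception:
        metadata.publish(job_id, status="failed")
        raise
    metadata.publish(job_id, status="finished")

def toy_model(inputs):
    """Toy model: yields (kind, lfn, data, metadata) for each file it writes."""
    yield ("output", "lfn:/demo/h0.nc", b"history", {"variable": "precip"})
    yield ("restart", "lfn:/demo/r0.nc", b"restart", {})

catalog = InMemoryCatalog({"lfn:/demo/sst.nc": b"sst"})
metadata = InMemoryMetadata()
run_with_gel(toy_model, "job1", ["lfn:/demo/sst.nc"], catalog, metadata)
```

Keeping these steps outside the model is what allows them to be written in a scripting language while the Fortran code stays almost untouched.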
Info & Data Flow (past)
[Diagram components: UI, Portal, CAM, WRF, SE (data), LFC, AMGA (metadata and status information).]
If a submitted job is no longer running, the User queries AMGA to find out whether it finished successfully. If it did not, the User identifies the last restart file and restarts the CAM job. While a CAM job is running, the User queries AMGA about the data sets CAM has produced so far and then triggers the WRF jobs, and likewise for the SOMs.
Last stable status Using GENIUS to interact with the applications (CAM+WRF).
&camexp
 absems_data = 'lfn:/grid/eela/.../abs_ems_factors.nc'
 aeroptics   = 'lfn:/grid/eela/.../AerosolOptics.nc'
 bnd_topo    = 'lfn:/grid/eela/.../topo-from-cami.nc'
 bndtvaer    = 'lfn:/grid/eela/.../AerosolMass.nc'
 bndtvo      = 'lfn:/grid/eela/.../pcmdio3.nc'
 bndtvs      = 'lfn:/grid/eela/.../sst_HadOIBl.nc'
 caseid      = 'nino82d'
 iyear_ad    = 1982
 start_ymd   = 19820101
 ncdata      = 'lfn:/grid/eela/.../cami.nc'
 mfilt       = 1,4,1
 ...
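Because the namelist refers to its input files by logical file name, a wrapper could in principle find the files to fetch by scanning the namelist. The sketch below shows one way to do that, assuming values are written with straight single quotes. The regular expression, the function name and the demo paths are hypothetical, and this is not a full Fortran namelist parser.

```python
import re

# Matches assignments such as: ncdata = 'lfn:/some/path/file.nc'
LFN_ASSIGNMENT = re.compile(r"(\w+)\s*=\s*'(lfn:[^']+)'")

def find_lfn_inputs(namelist_text):
    """Return {variable: logical file name} for every lfn: value found."""
    return dict(LFN_ASSIGNMENT.findall(namelist_text))

# Demo namelist with made-up paths (the real paths are elided in the slide).
example = """
&camexp
 bnd_topo  = 'lfn:/grid/demo/topo.nc'
 caseid    = 'nino82d'
 start_ymd = 19820101
 ncdata    = 'lfn:/grid/demo/cami.nc'
/
"""

inputs = find_lfn_inputs(example)
```

Plain settings such as `caseid` and `start_ymd` are ignored; only the entries that point into the catalog are returned.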
Specific requirements
Climate modeling poses specific challenges for the GRID:
• Big storage requirements
• CPU-intensive
• Dependent jobs
• Long-lasting jobs
• A climate modeling Experiment may consist of several model Realizations, each of which is likely to be composed of several Jobs
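The Experiment, Realization and Job hierarchy can be pictured as a small data model. The sketch below is a hypothetical illustration: the class fields, the status values and the `progress` helper are assumptions made for the example, not part of the EELA software.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    status: str = "pending"  # hypothetical values: pending, running, finished, failed

@dataclass
class Realization:
    name: str
    jobs: list = field(default_factory=list)

    def finished(self):
        """A realization is finished once it has jobs and all of them finished."""
        return bool(self.jobs) and all(j.status == "finished" for j in self.jobs)

@dataclass
class Experiment:
    name: str
    realizations: list = field(default_factory=list)

    def progress(self):
        """Fraction of finished jobs across all realizations."""
        jobs = [j for r in self.realizations for j in r.jobs]
        if not jobs:
            return 0.0
        return sum(j.status == "finished" for j in jobs) / len(jobs)

# Two realizations of two jobs each; only the first job has finished.
experiment = Experiment("nino82", [
    Realization("r1", [Job("r1-a", "finished"), Job("r1-b", "running")]),
    Realization("r2", [Job("r2-a"), Job("r2-b")]),
])
```

Tracking state at all three levels is the kind of bookkeeping that the slide says the middleware does not provide by itself.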
Info & Data Flow (now)
[Diagram components: CAM, coordinator, SE (data), LFC, AMGA (metadata and status information).]
Info & Data Flow (future)
[Diagram components: UI, Portal, CAM, WRF, coordinator, SE (data), LFC, AMGA (metadata and status information).]
Conclusions
• The EU-funded EELA project aims at establishing an e-infrastructure and scientific collaboration between European and Latin American countries.
• Within the climate task, we have implemented a sequence of climate applications, CAM+WRF(+SOM), which runs integrated in the EELA testbed, providing regional simulations for a given SST and other forcings.
• The applications interact with the GRID services, transferring data and status information. They can be easily managed by the User through a web portal.
• Climate modeling poses specific requirements on the GRID which are not solved by the current middleware: dependent and long-lasting jobs, and experiments composed of simulations split into jobs.
References
CAM Application: http://www.ccsm.ucar.edu/models/atm-cam/
• J. R. McCaa, M. Rothstein, B. E. Eaton, J. M. Rosinski, E. Kluzek, M. Vertenstein (2004) User's Guide to the NCAR Community Atmosphere Model (CAM 3.0). Climate and Global Dynamics Division, NCAR, Boulder, Colorado, 88 pp.
• W. D. Collins, C. M. Bitz, et al. (2006) "The Community Climate System Model: CCSM3", Journal of Climate, Special Issue on CCSM, 19(11).
WRF Application: http://www.wrf-model.org/
• W. C. Skamarock, J. B. Klemp, J. Dudhia, D. O. Gill, D. M. Barker, W. Wang and J. G. Powers (2005) A Description of the Advanced Research WRF Version 2. NCAR Technical Note, 88 pp.
• J. Michalakes, J. Dudhia, D. Gill, T. Henderson, J. Klemp, W. Skamarock and W. Wang (2004) "The Weather Research and Forecast Model: Software Architecture and Performance", Proceedings of the 11th ECMWF Workshop on the Use of High Performance Computing in Meteorology, 25-29 October 2004, Reading, U.K. Ed. George Mozdzynski.
SOM Application (grid version): http://www.meteo.unican.es
• F. Luengo, A. S. Cofiño and J. M. Gutiérrez (2004) "GRID Oriented Implementation of Self-Organizing Maps for Data Mining in Meteorology", Lecture Notes in Computer Science, 2970, 163-171.
CAM output Monthly accumulated precipitation over Perú
Acronyms
Survival guide for reading Grid documents
AMGA : ARDA Metadata Grid Application
ARDA : A Realisation of Distributed Analysis for LHC
CE : Computing Element
CLARA : Cooperación Latino-Americana de Redes Avanzadas
EGEE : Enabling Grids for E-sciencE
GGF : Global Grid Forum
GIIS : Grid Index Info Service
GILDA : Grid INFN Laboratory for Dissemination Activities
GMA : Grid Monitoring Architecture
GRAM : Grid Resource Allocation Manager
GRIS : Grid Resource Info Service
GSI : Grid Security Infrastructure
INFN : Istituto Nazionale di Fisica Nucleare
JDL : Job Description Language
LCG : LHC Computing Grid
LDAP : Lightweight Directory Access Protocol
LFC : Logical File Catalog
LHC : Large Hadron Collider
MDS : Monitoring & Discovery System
NREN : National Research and Education Network
OGSA : Open Grid Services Architecture
PKI : Public Key Infrastructure
RB : Resource Broker
R-GMA : Relational GMA
SE : Storage Element
VOMS : Virtual Organization Membership Service
WS : Web Service
CAM & WRF
The Community Atmosphere Model (CAM) and the Weather Research and Forecasting (WRF) model are state-of-the-art atmosphere models (global and regional, respectively) developed at NCAR.
Output format: NetCDF
The models need to be adapted to interact with the GRID (i.e. with the middleware). Doing this inside the models would require deep modifications. Instead, we only slightly modified the model source code so that it calls other applications, which interact with the GRID on behalf of the model: the Grid Enabling Layer.
The CAM namelist needs to be prepared and provided here by the user. (not very user-friendly, yet)
Currently, the regions are prepared off-line and uploaded to the LFC.
All CAM simulations are available for regionalization with WRF, but the coupler CAM -> WRF is NOT yet implemented
However, we have other input data sets for WRF available in the catalog.
The output file can be downloaded in NetCDF format or accessed via a THREDDS- or OPeNDAP-aware application.
For instance, the ToolsUI Java application from Unidata can load the OPeNDAP (DODS) address and access only the requested portions of the data.
NOAA provides user-friendly access to set up regional domains for WRF.
This Java Web Start application could in the future be launched from our web portal as starting point to design a regional simulation
CAM Application Workflow
[Diagram: the submitter on the UI generates CAM.jdl, inserts an entry in the CAM collection (AMGA) and submits CAM.jdl through the Resource Broker to a Computing Element, where CAM runs. Steps are numbered 1 to 3 in the original figure. Labels on the links between CAM and AMGA:
• Upload info: insert entry in WRF collection
• Update status: insert run on CAM collection
• Restart checkpoint: insert entry in CHECKPOINT collection
• Update history: insert entry in HISTORYCAM collection]
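The "Generate CAM.jdl" step could look roughly like the sketch below, which writes a small job description using common JDL attributes (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox). The actual contents of the EELA CAM.jdl are not shown in the slides, so the wrapper script name, arguments and sandbox files here are hypothetical.

```python
def make_jdl(executable, arguments, input_files, output_files):
    """Build the text of a simple JDL job description."""
    def jdl_list(items):
        # JDL lists are written as {"a", "b", ...}
        return "{" + ", ".join(f'"{item}"' for item in items) + "}"

    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {jdl_list(input_files)};",
        f"OutputSandbox = {jdl_list(['std.out', 'std.err'] + list(output_files))};",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical wrapper script and namelist file names.
jdl = make_jdl(
    executable="run_cam.sh",
    arguments="nino82d",
    input_files=["run_cam.sh", "namelist"],
    output_files=[],
)
```

Only small control files travel in the sandboxes in this sketch; the large model inputs and outputs would go through the LFC and Storage Elements, as described in the data-flow slides.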