Integrated Grid workflow for mesoscale weather modeling and visualization
Zhizhin, M., A. Polyakov, D. Medvedev, A. Poyda, S. Berezin
Space Research Institute of the Russian Academy of Sciences
Abstract
• For the model input and output we use a scalable parallel storage and data mining system called ActiveStorage. It can store different types of weather data, provided they conform to the same Common Data Model (Unidata CDM): NCEP reanalysis, NCDC station weather data, and MM5 model output.
• MM5 is a mesoscale weather forecast model. For its input boundary conditions the model takes basic parameters such as elevation, air pressure, and temperature. It can ingest reanalysis and direct observation data. As output the model provides high-resolution regional weather grids.
• To make the MM5 input data and the modeling results accessible on the Grid to the Earth Science community, we have developed a set of grid services (resources and activities) inside the OGSA-DAI (versions 2 and 3) grid service container.
• To visualize the weather data we have developed a special plugin for NASA World Wind that reads the data directly from the OGSA-DAI resources and plots it over the 3D globe as contour lines, filled areas, or vector fields.
ActiveStorage, Modeling, Data Mining and Visualization Services
ActiveStorage
• ActiveStorage is a generic storage for arrays of primitive data types.
• Its data model is based on Unidata’s Common Data Model, used in netCDF, HDF5 and OpenDAP.
• Basically, ActiveStorage is a SQL Server database with CLR stored procedures and a client library.
• The stored procedures and the client library provide an abstraction layer for data access.
• Large arrays are split into chunks and can be spread across several parallel database servers for better performance.
ActiveStorage components
• SQL Server 2005/2008 DB
• Metadata tables
• Data and directory tables
• Client library
• Stored procedures
Common Data Model This is the Common Data Model (CDM) used in the recent versions of OpenDAP, netCDF and HDF5. Its purpose is the representation of multidimensional scientific data.
How it works
1. The application passes a multi-dimensional data request to the client library.
2. The client library issues commands to the database server.
3. The server selects the requested data from several chunks.
4. The server returns the data parts to the client library.
5. The client library assembles the data parts into one multi-dimensional array.
[Diagram labels: Application, Client library, SQL Server DB]
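The request path on this slide can be illustrated with a small in-memory sketch. It is not the ActiveStorage client library: the function names `split_into_chunks` and `read_region`, the dictionary standing in for the database, and the chunk shape are all assumptions made for illustration. It only shows how a region that spans several chunks can be reassembled into one array.

```python
from itertools import product

import numpy as np


def split_into_chunks(array, chunk_shape):
    """Split an N-d array into chunks keyed by the index of their first element."""
    chunks = {}
    ranges = [range(0, n, c) for n, c in zip(array.shape, chunk_shape)]
    for chunk_origin in product(*ranges):
        slices = tuple(slice(o, o + c) for o, c in zip(chunk_origin, chunk_shape))
        chunks[chunk_origin] = array[slices].copy()
    return chunks


def read_region(chunks, chunk_shape, origin, count):
    """Assemble the region [origin, origin + count) from the chunks that overlap it.

    The region is assumed to lie inside the stored array.
    """
    dtype = next(iter(chunks.values())).dtype
    result = np.empty(count, dtype=dtype)
    # Origins of every chunk touched by the request along each dimension.
    ranges = [
        range((o // c) * c, o + n, c)
        for o, n, c in zip(origin, count, chunk_shape)
    ]
    for chunk_origin in product(*ranges):
        chunk = chunks[chunk_origin]
        src, dst = [], []
        for co, o, n, size in zip(chunk_origin, origin, count, chunk.shape):
            lo = max(co, o)              # first requested index inside this chunk
            hi = min(co + size, o + n)   # one past the last requested index
            src.append(slice(lo - co, hi - co))  # position within the chunk
            dst.append(slice(lo - o, hi - o))    # position within the result
        result[tuple(dst)] = chunk[tuple(src)]
    return result


# Hypothetical 6 x 8 array stored as 4 x 3 chunks.
data = np.arange(6 * 8).reshape(6, 8)
chunks = split_into_chunks(data, (4, 3))
region = read_region(chunks, (4, 3), origin=(2, 1), count=(3, 6))
```

Here the requested region overlaps all six chunks, so the copy loop runs once per chunk and writes each part into its place in the result.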
Parallel query processing
[Diagram labels: Application, Client library, SQL Server DB 1, SQL Server DB 2]
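A rough sketch of the idea behind this slide, under stated assumptions: chunks are placed on servers round-robin (the deck does not say which placement policy ActiveStorage uses), each "server" is a plain dictionary, and `fetch` stands in for one database query. The client queries all servers concurrently and merges the parts.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # elements per chunk (hypothetical value)


def distribute(values, n_servers):
    """Place chunk k on server k % n_servers (round-robin, an assumed policy)."""
    servers = [dict() for _ in range(n_servers)]
    for k, start in enumerate(range(0, len(values), CHUNK)):
        servers[k % n_servers][k] = values[start:start + CHUNK]
    return servers


def fetch(server, chunk_ids):
    """Stand-in for one database query: return the requested chunks held here."""
    return {k: server[k] for k in chunk_ids if k in server}


def read_range(servers, start, stop):
    """Read values[start:stop] (stop > start) by querying all servers at once."""
    wanted = range(start // CHUNK, (stop - 1) // CHUNK + 1)
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        parts = list(pool.map(lambda s: fetch(s, wanted), servers))
    merged = {}
    for part in parts:
        merged.update(part)
    # Concatenate chunks in index order, then trim to the requested range.
    flat = [v for k in sorted(merged) for v in merged[k]]
    offset = start - wanted.start * CHUNK
    return flat[offset:offset + (stop - start)]


values = list(range(20))
servers = distribute(values, 2)
result = read_range(servers, 3, 14)
```

With real database servers the per-server queries are I/O bound, which is the case where issuing them from a thread pool is expected to help.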
Parallel query performance
[Chart: 1 database server vs. 4 parallel database servers]
NCEP/NCAR Weather Reanalysis
• Continually updated gridded data set
• Incorporates observations and global climate model output
• 74 weather parameters
• 5000 netCDF files, 30 – 500 MB each
• Time coverage:
  • 1948 – 2008
  • 4 values per day (6-hourly)
• Grids:
  • Regular grid, 2.5 x 2.5 degrees
  • T62 Gaussian grid, 192 x 94 points
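For orientation, the regular 2.5 x 2.5 degree grid has 73 latitude rows and 144 longitude columns. The sketch below maps a geographic point to the nearest grid node. It assumes latitudes are ordered from 90N down to 90S and longitudes run eastward from 0E; the function names are hypothetical.

```python
def grid_shape(step_deg=2.5):
    """Number of (latitude, longitude) nodes on a regular global grid."""
    n_lat = int(180 / step_deg) + 1  # both poles are grid rows
    n_lon = int(360 / step_deg)      # 360E wraps back onto 0E
    return n_lat, n_lon


def grid_index(lat, lon, step_deg=2.5):
    """Nearest grid node (row, column) for a latitude and longitude in degrees.

    Assumes row 0 is 90N and column 0 is 0E (an assumption, see above).
    """
    _, n_lon = grid_shape(step_deg)
    row = round((90.0 - lat) / step_deg)
    col = round((lon % 360.0) / step_deg) % n_lon
    return row, col


shape = grid_shape()
moscow = grid_index(55.75, 37.62)
```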
NCDC Integrated Surface Database
Data sources: fixed ground stations, ships, mobile stations, buoys
• 1901 – 2008 time coverage.
• 30 million sensors.
• 1.7 billion observations.
• 470 000 ASCII files packed with gzip.
• 50 GB packed; 400 GB unpacked.
Sample record. It consists of a control data section (including date, time, lat, lon), a mandatory data section, a section marker, and an additional data section made up of group markers and parameter groups:
0189010020999992007022817004+80050+016250FM-12+000899999V0202201N008019999999N0090001N1+00631+00541098651ADDGA1031+003009999KA1120N+99999...
When you’ve downloaded and unpacked the data...
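As an illustration of the fixed-width layout, the sketch below extracts a few fields from the sample record. The character offsets reflect my reading of the ISD format (station id, date and time, latitude and longitude scaled by 1000, air temperature scaled by 10) and should be checked against the official format document before use. Only the part of the record up to the end of the mandatory section is used, and `parse_isd` is a hypothetical name.

```python
from datetime import datetime

# Control section (60 characters) followed by the mandatory section (45 characters),
# copied from the sample record on the slide.
RECORD = (
    "0189010020999992007022817004+80050+016250FM-12+000899999V020"
    "2201N008019999999N0090001N1+00631+00541098651"
)


def parse_isd(record):
    """Extract station, time, position and air temperature from one ISD record.

    Offsets are assumptions based on the published fixed-width layout.
    """
    raw_temp = record[87:92]
    return {
        "station": record[4:10],
        "time": datetime.strptime(record[15:27], "%Y%m%d%H%M"),
        "lat": int(record[28:34]) / 1000,   # degrees, north positive
        "lon": int(record[34:41]) / 1000,   # degrees, east positive
        # +9999 is the missing-value code for temperature
        "air_temp_c": None if raw_temp == "+9999" else int(raw_temp) / 10,
    }


obs = parse_isd(RECORD)
```

The additional data section after the `ADD` marker is variable-length and is not handled here.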
MATLAB script using ActiveStorage library
import ru.wdcb.mdb.NcConnector
import com.microsoft.sqlserver.jdbc.SQLServerDriver

% Open the NCEP_01 database through JDBC and look up the 'air' variable
s = 'jdbc:sqlserver://localhost:1433;databaseName=NCEP_01;user=guest;password=guest';
connector = NcConnector();
ncid = connector.nc_open(s,0);
varid = connector.nc_inq_varid(ncid,'air');

% Time series: 80000 time steps at a single grid point
origin = [0 0 10 10];
count = [80000 1 1 1];   % named count so MATLAB's built-in size() is not shadowed
stride = [1 1 1 1];
A = connector.nc_get_vars_short(ncid,varid,origin,count,stride);
plot(A, 'DisplayName', 'A', 'YDataSource', 'A');

% Global field: one time step over the full 73 x 144 grid
figure
origin = [0 0 0 0];
count = [1 1 73 144];
stride = [1 1 1 1];
B = connector.nc_get_vars_short(ncid,varid,origin,count,stride);
B = reshape(B,[73 144]);
imagesc(B);
figure(gcf);
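The origin, extent and stride vectors passed to `nc_get_vars_short` follow the netCDF strided-read convention. The NumPy sketch below shows what such a request selects. It uses an in-memory array instead of ActiveStorage, assumes the dimension order (time, level, lat, lon) suggested by the script, and `get_vars` is a hypothetical helper, not part of the library.

```python
import numpy as np


def get_vars(array, origin, count, stride):
    """Select count[d] elements along each dimension d, starting at origin[d]
    and stepping by stride[d], as in a netCDF strided read."""
    slices = tuple(
        slice(o, o + n * s, s) for o, n, s in zip(origin, count, stride)
    )
    return array[slices]


# Hypothetical variable: 4 time steps, 2 levels, 73 x 144 grid.
air = np.arange(4 * 2 * 73 * 144).reshape(4, 2, 73, 144)

# Time series at one grid point (all 4 time steps here).
series = get_vars(air, (0, 0, 10, 10), (4, 1, 1, 1), (1, 1, 1, 1)).ravel()

# One global field, reshaped to 2-D for plotting.
field = get_vars(air, (0, 0, 0, 0), (1, 1, 73, 144), (1, 1, 1, 1)).reshape(73, 144)

# Every second grid node in both directions.
coarse = get_vars(air, (0, 0, 0, 0), (1, 1, 37, 72), (1, 1, 2, 2))
```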
Activities for data export
• XML output stream
  • We have a plugin for NASA World Wind to visualize XML-formatted data
  • Can easily be transformed with XSLT into a web page or another XML document, e.g. for MS Excel
  • Can be used as input for the ESSE fuzzy logic search engine
• NetCDF binary data file
  • Standard for scientific data storage in files
  • Several visualization programs support NetCDF
  • Compatible with the Unidata Common Data Model standard
Data flow management by OGSA-DAI
[Figure panels: OGSA-DAI query from a single data source; OGSA-DAI query from distributed data sources]
Parallel mesoscale weather model MM5
• Same-Source Parallel MM5
  • The source code for the parallel MPI and the single-process MM5 models is the same
  • Automated parallel code generation from MM5 sources by ANL:
    • FLIC compiler
    • RSL library for model domain segmentation and message exchange
• We have ported the MM5 code to the MS Windows Server 2008 HPC platform
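Domain segmentation can be pictured as cutting the model grid into rectangular tiles, one per MPI process. The sketch below is a generic block decomposition, not the algorithm used by RSL; the function names and the process layout are assumptions for illustration.

```python
def split_extent(n_points, n_parts):
    """Split n_points into n_parts contiguous [start, stop) ranges whose
    sizes differ by at most one point."""
    base, extra = divmod(n_points, n_parts)
    bounds, start = [], 0
    for part in range(n_parts):
        size = base + (1 if part < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def decompose(n_rows, n_cols, proc_rows, proc_cols):
    """Map each process rank to its tile: ((row_start, row_stop), (col_start, col_stop))."""
    rows = split_extent(n_rows, proc_rows)
    cols = split_extent(n_cols, proc_cols)
    return {
        r * proc_cols + c: (rows[r], cols[c])
        for r in range(proc_rows)
        for c in range(proc_cols)
    }


# Hypothetical 10 x 6 domain on a 2 x 3 process grid.
tiles = decompose(10, 6, 2, 3)
```

A real parallel model also exchanges the boundary rows and columns of neighbouring tiles at every time step, which is the message exchange part that this sketch leaves out.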
Visualizing data from ActiveStorage with NASA World Wind
A NASA World Wind plugin developed at Moscow State University retrieves data from ActiveStorage via an OGSA-DAI service. Several kinds of visualization are available:
- isolines
- color map
- vector field
OGSA-DAI services can also be used by other applications to retrieve data from ActiveStorage.
NASA World Wind as a grid client
Using OGSA-DAI services and a special API plugin, NASA World Wind can visualize both the MM5 input and output datasets.