The NOAA Operational Model Archive and Distribution System NOMADS. Glenn K. Rutledge Meteorologist / Physical Scientist National Oceanic and Atmospheric Administration National Climatic Data Center
The NOAA Operational Model Archive and Distribution System (NOMADS) Glenn K. Rutledge Meteorologist / Physical Scientist National Oceanic and Atmospheric Administration National Climatic Data Center Perspectives on Building Communities for Effective Development and Application of Cyberinfrastructure SuperComputing 2004 Pittsburgh PA, November 12, 2004 Image: Unidata IDV
Overview • To overcome a deficiency in model data access, some of the Nation's top scientists are actively engaged in a grass-roots framework to share data and research findings over the Internet. • NCDC, NCEP and GFDL initiated the NOAA Operational Model Archive and Distribution System. • NOMADS is a distributed data services pilot for format-independent access to climate and weather models and data.
NOMADS Goals • provide distributed access to models and associated data • promote model evaluation and product development • foster research within the geoscience communities (ocean, weather, and climate) • study multiple earth systems using collections of distributed data • develop institutional partnerships via distributed open technologies
Model Data Access • The user's experience is often frustrating: • - What data of interest exist? • - Are they going to be useful to me? • - How can I obtain them in a usable form? • Time and effort are wasted on data access and format issues. • As a result, atmosphere/ocean/climate data are underutilized and model inter-comparison is nearly impossible.
Why Now? • What are the goals facing the GeoScience community? • Is it just access to high volume data (satellite, radar, and model)? • How will Agencies and Institutions address interoperability? • Should it be system, data or both? • Have the scientific requirements been adequately defined? • Do top-down approaches adequately promote science? • How can Agencies and Institutions develop partnerships, while allowing for attribution, with diverse goals and agendas? • Community building with science-driven requirements!
NOMADS Uses • Climate model output and observations are vital to providing timely assessments of climate change and impacts: • climate model output statistics from NWP run in quasi-real time to identify time-dependent biases in observations; • assess the effect of missing data; • for observing network design and operation, models can be used to guide the spatial and temporal sampling frequency to resolve distributions for specific variables (temp, precip).
Uses (cont.) • Long-term stewardship of NWP analysis and forecasts. • Accurate estimates of future climate variability and trends. • Improved climate and weather assessments. • Promote collaboration of GCM and NWP researchers using large data volumes by sub-setting. • Inter-comparisons of weather and climate data sets.
The Partnerships CDC COLA co-PI FSL GFDL co-PI LLNL NCAR NCDC PI NCEP co-PI PMEL co-PI Unidata -THREDDS BADC CEOS DataGrid CEOP LEAD NASA/GCMD NASA/SI-ESIP NSSL w/UW/SSEC NWS/COMET/HQ GMU / Alabama / others
International Partners • WCRP/WGCM JCS Data Management • British Atmospheric Data Center & NERC Data Grid (e-Science) • "Climate Action Partnership" (CAP) with Commerce, Energy, State, and EPA with Australia BOM. • GODAE, IOOS, GOOS, GCOS, HAOC, NVODS projects. • European PRogram for Integrated Earth System Modeling (PRISM) • Committee on Earth Observation Satellites (CEOS) • Coordinated Enhanced Observing Period (CEOP) • Interagency Working Group on Earth Observations (IWGEO)
Tools for Users • Pare down large file sizes of high resolution data and products. • (re-) Group different data sets to create needed products – such as initialization files for model development, analysis, and intercomparison. • Subset the data: • in parameter space • in physical space • in temporal space
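The three kinds of subsetting listed above map onto an OPeNDAP-style constraint expression such as the `ulwrfsfc[1:1][140:140][200:200]` request shown later in this talk. The following is a minimal sketch, not the NOMADS implementation: the helper names `hyperslab` and `constraint` are hypothetical, and the index order (time, latitude, longitude) is an assumption that depends on how a given dataset is laid out.

```python
def hyperslab(start, stop=None, stride=1):
    """Format one index range in DAP2 style: [start:stop] or [start:stride:stop]."""
    if stop is None:
        stop = start  # a single index selects one point along this axis
    if start < 0 or stop < start or stride < 1:
        raise ValueError("invalid index range")
    if stride == 1:
        return f"[{start}:{stop}]"
    return f"[{start}:{stride}:{stop}]"


def constraint(variables, time_idx, lat_idx, lon_idx):
    """Subset in parameter space (variables), temporal space (time_idx)
    and physical space (lat_idx, lon_idx), assuming (time, lat, lon) order."""
    slab = hyperslab(*time_idx) + hyperslab(*lat_idx) + hyperslab(*lon_idx)
    return ",".join(name + slab for name in variables)


# One variable, one time step, one grid point
print(constraint(["ulwrfsfc"], (1, 1), (140, 140), (200, 200)))
```

Keeping the index ranges as plain tuples makes it easy to request the same slab for several variables in one call.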
Framework • NOMADS uses the open-source, HTTP-based OPeNDAP. • OPeNDAP is a binary-level protocol designed for the transport of scientific data subsets over the Internet. • Data formats: CF, DIF, GRIB, GRIB2, BUFR, HDF, NetCDF, ASCII, ... libraries built as necessary. • APIs: Java-OPeNDAP, C++-OPeNDAP, NetCDF, GRIB, BUFR, THREDDS, Python.
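Because OPeNDAP runs over HTTP, a client selects what it wants by appending a suffix to the dataset URL. The sketch below lists the commonly used DAP2 response suffixes; the helper name `service_urls` and the `example.org` dataset URL are hypothetical placeholders, and an individual server may expose more or fewer responses than these.

```python
# Common DAP2 response suffixes (assumed to be supported by the target server)
DAP2_SUFFIXES = {
    "dds": "dataset descriptor: variable names, types and dimensions",
    "das": "dataset attributes: units, long names and other metadata",
    "ascii": "requested data rendered as plain text",
    "dods": "requested data in the binary transport encoding",
}


def service_urls(dataset_url):
    """Return one URL per response type for a dataset URL."""
    base = dataset_url.rstrip("/")
    return {suffix: f"{base}.{suffix}" for suffix in DAP2_SUFFIXES}


for suffix, url in service_urls("http://example.org/dods/model/run").items():
    print(f"{suffix:5s} {url}  ({DAP2_SUFFIXES[suffix]})")
```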
System Architecture (diagram labels) • Components: NOAAPort Data Ingest*; Data Access; CEOS-Grid (exploratory); Obs, Eta, GFS, RUC; DAB/NOMADS Ingest Processes; Hi-Res GFS, Eta, NARR and GDAS; NOMADS Access; NCDC ftp; AAB Ingest Processes; NCDC Archive; Unidata IDD; DSI 6172, DSI 6173, DSI 6174, DSI 6175; Dual Redundant Ingest; AAB Archive Processes • Data Management: • - Data & Directory structures “merged” • - Daily Data Ingest inter-comparison • - QC and Monitoring • - Index File generation • - Control and DODS metadata generation • - CVS Backup (code) • - HDSS/HAS Injection * NOMADS provides NCDC NOAAPort ingest processing
NOMADS System Architecture Data Access Globus Data Management NOMADS HAS Interface / QC / Metadata NOMADS on-line volume: 16.7TB NOMADS Archive Volume: ~400TB+ Data Ingest -NOAAPort Direct -NCEP ftp -Unidata IDD NSA_GKR_Mar04
The Big Picture The Grids include: CEOS-DataGrid, DOE’s ESG, NASA’s IPG, ADG, & UK’s NERC DataGrid CEOS Server Exploratory Grid effort * In process
NCDC/NCEP Data Availability • NCDC NOMADS Archive • - NCEP NWP Operational Model Output from NOAAPort • - POR: 2002 to Real-Time (to 1999 exists) • - Eta (12km) and Global Forecast Models (1 degree) • - RUC-II 40km; (FSL NOMADS has real-time 20km) • - Limited Navy NOGAPS (1/2 degree FY05) • - NCDC Reference Data Sets (Reynolds SSTs, GHCN) • NCEP Real-Time NOMADS • - Global Forecast System GFS (AVN/MRF) (1 degree) • - Hourly Eta at 12km • - Regional Spectral Model (RSM) • - Climate Data Assimilation System (CDAS) • - AMIP Climate Monitoring • - NCEP/NCAR Global Reanalysis 1&2
Model Input: GDAS Global Spectral Forecast Model and the Spectral Statistical Interpolation Cycling Analysis System (GDAS): - NOAA-15/16 AMSU-A/B TOVS 1B Radiances (IEEE) - Analysis Bias Corrected Information / Obs Toss List - SFC U/A, ACRS, Aircft (BUFR) - 6HR fcst guess from previous run (BUFR) - ERSCAT Sat obs / HIRS 14/15, MSU TOVS (IEEE) - Guess prep and fcst guess output (BUFR) - Analysis ready QC’ed Obs. (prepBUFR) - Profiler, TOVS, Wind Obs. (BUFR) - SFC Analysis Restart Files - SST’s (GRIB), Radar VAD Winds (BUFR)
Model Input: GDAS (cont.) • NOMADS saves the minimum data necessary to regenerate model output products as close as possible to NCEP operations. • The analysis files will be in the model's own coordinate system. • Files are constructed with computer and computational efficiency in mind, and not in standard coordinate systems. • Programs to convert these files will be made available: • spectral to gaussian • gaussian to lat/lon • sigma to pressure
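Of the three conversions listed above, sigma to pressure is the simplest to illustrate. The sketch below is not the NCEP conversion program. It assumes a pure sigma coordinate defined as sigma = p / p_surface (model top at zero pressure) and interpolates linearly in ln(p) between the two bracketing levels; the function names and the sample numbers are hypothetical.

```python
import math


def sigma_to_pressure(sigma_levels, surface_pressure_hpa):
    """Pressure of each sigma level in one column, assuming sigma = p / p_surface."""
    return [sigma * surface_pressure_hpa for sigma in sigma_levels]


def interp_to_pressure(values, pressures_hpa, target_hpa):
    """Interpolate a column of values to target_hpa, linearly in ln(p).

    Returns None when the target lies outside the column (for example,
    below the surface), leaving any extrapolation policy to the caller.
    """
    pairs = list(zip(pressures_hpa, values))
    for (p1, v1), (p2, v2) in zip(pairs, pairs[1:]):
        if min(p1, p2) <= target_hpa <= max(p1, p2):
            if p1 == p2:
                return v1
            weight = (math.log(target_hpa) - math.log(p1)) / (math.log(p2) - math.log(p1))
            return v1 + weight * (v2 - v1)
    return None


# Made-up column: three sigma levels over a 1000 hPa surface
column_p = sigma_to_pressure([1.0, 0.5, 0.25], 1000.0)
print(column_p)
print(interp_to_pressure([10.0, 20.0, 30.0], column_p, 500.0))
```

Operational models often use hybrid or offset sigma definitions, so the first function would need to change for those cases.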
N.A. Regional Reanalysis • Create a long-term set of consistent climate data on a regional scale on a North American domain • Superior to NCEP/NCAR Global Reanalysis (GR): • - use of a regional model (the Eta model) • - advances in modeling and data assimilation since 1995, especially: precipitation assimilation; direct assimilation of radiances; land-surface (NOAH) model updates
NARR: Eta-NOAH Upgrades • Assimilation of Hourly Precipitation • - Hourly 4-km radar/gage analysis (Stage IV) • Cold Season Processes (Koren et al 1999) • - patchy snow cover; frozen soil (new state variable); snow density (new state variable) • Bare Soil Evaporation Refinements • - parameterize upper sfc crust cap on evap • Soil Heat Flux • - new soil thermal conductivity (Peters-Lidard '98) • - under snowpack (Lunardini, 1981) • - vegetation reduction of thermal cond. • Surface Characterization • - maximum snow albedo database (Robinson '85) • - dynamic thermal roughness length refinements • Vegetation • - deeper rooting depth in forests • - canopy resistance refinements
NOMADS Archive and Users (chart: existing and projected volume by month, 2004) * 5-year retention of forecasts; long-term retention for analyses.
NCDC Web Interface • Three primary methods for data access: • - Web Interface • - GDS OPeNDAP • - ftp w/ on-the-fly GRIB subsetting • On-line or off-line (archive) • Server-side data computations.
Promoting Model Collaborations NCDC Web Interface (cont.) The NCDC Web Interface was originally developed at NCEP: NOMADS leverages efforts across the community.
NOMADS “Web Plotter” • NCDC NOMADS ingests 150K grids per day. POR 2002 to present. • Any one of these is accessible in seconds via: • - OPeNDAP • - GDS • - ftp • - Web Plotter • - LAS (soon)
Promoting Model to Obs. Intercomparisons NCDC Reference Datasets • NCDC reference datasets also available: • - CARDS (IGRA) • - GHCN • - NARR • - Ocean WAVE • - And others • Accessible via OPeNDAP
Using NOMADS • Web Interface: click - plot - retrieve - analyze... • DODS+Web: wget -O "$RESULT" "http://nomads.ncdc.noaa.gov:9090/dods/gdas/gdas2003092118/.ascii?ulwrfsfc[1:1][140:140][200:200]" • The above W3C Web data request returns OLR at a specific lat/lon for a specific date-time group (dtg). Time to receipt: ~1 sec. Volume: 200 bytes. • Traditional access: 1) ftp the data: 2.4Mb 2) decode (degrib) and format the data 3) develop code to locate the specific variable, earth-located point, and time • NOMADS Clients: open (url) (variable) (...) • The power and ease of NOMADS scripts...
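The wget request on this slide can also be issued from a script. The following is a minimal Python sketch using only the standard library; the function names are hypothetical, and the host and dataset are the ones quoted on the slide, which may no longer respond. The fetch function is defined but not called here so that the example runs without network access.

```python
from urllib.request import urlopen


def point_request_url(dataset_url, variable, t, y, x):
    """Build an ASCII point request: one time index and one grid point."""
    return f"{dataset_url}.ascii?{variable}[{t}:{t}][{y}:{y}][{x}:{x}]"


def fetch_text(url, timeout=30):
    """Fetch the response body as text (requires a reachable server)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("ascii", errors="replace")


url = point_request_url(
    "http://nomads.ncdc.noaa.gov:9090/dods/gdas/gdas2003092118/",
    "ulwrfsfc", 1, 140, 200,
)
print(url)
# text = fetch_text(url)  # uncomment only when the server is reachable
```

The point of the comparison on the slide still holds in this form: the client asks for one value and the server does the decoding and locating.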
Wget w/ DODS expression Outgoing Longwave Radiation Flux SFC Watts/m**2
* January Mean 500 Height (1981 to 1989) minus (1990 to 1998)
* Mean & Standard Deviation for all 10 ensembles
* Time required: 60 secs
'reinit'
'!date'
* baseURL = 'http://motherlode.ucar.edu:9090/dods/_expr_'
* GKR 2/13/03 New NCAR URL
baseURL = 'http://dataportal.ucar.edu:9191/dods/'
* Server-side expression: difference of the two January averages
expr = 'ave(z,t=387,t=483,12)-ave(z,t=495,t=591,12)'
xdim = '0:360'
ydim = '20:90'
zdim = '500:500'
tdim = '1nov1978:1nov1978'
* Open one server-side result per ensemble member (A to J)
'sdfopen 'baseURL'_expr_{C20C/C20C_A}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_B}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_C}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_D}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_E}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_F}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_G}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_H}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_I}{'expr'}{'xdim','ydim','zdim','tdim'}'
'sdfopen 'baseURL'_expr_{C20C/C20C_J}{'expr'}{'xdim','ydim','zdim','tdim'}'
'define resa = result.1'
'define resb = result.2'
'define resc = result.3'
'define resd = result.4'
'define rese = result.5'
'define resf = result.6'
'define resg = result.7'
'define resh = result.8'
'define resi = result.9'
'define resj = result.10'
say 'got data'
'set lev 500'
'set lat 20 90'
* Ensemble mean, then squared deviation of each member from it
'define mean = (resa + resb + resc + resd + rese + resf + resg + resh + resi + resj)/10'
'define d1 = (pow(resa-mean,2))'
'define d2 = (pow(resb-mean,2))'
'define d3 = (pow(resc-mean,2))'
'define d4 = (pow(resd-mean,2))'
'define d5 = (pow(rese-mean,2))'
'define d6 = (pow(resf-mean,2))'
'define d7 = (pow(resg-mean,2))'
'define d8 = (pow(resh-mean,2))'
'define d9 = (pow(resi-mean,2))'
'define d10 = (pow(resj-mean,2))'
'define stddev = pow((d1 + d2 + d3 + d4 + d5 + d6 + d7 + d8 + d9 + d10)/10,0.5)'
* Shaded mean with standard deviation contours on a north polar projection
'set gxout shaded'
'set mproj nps'
'display mean'
'draw title January Mean 500 Height (1981 to 1989) minus (1990 to 1998)'
'set string 3 bc 1'
'draw string 5.5 .5 Mean & Standard Deviation for all 10 ensembles: C20C Climate of the 20th Century Folland/Kinter'
*'cbarn'
'set gxout contour'
'set ccolor 0'
'display stddev'
'!date'

Above is the complete script for generating the mean and standard deviation at 500mb, analyzing 20 years of “Climate of the 20th Century” from NCAR. Traditional vs. NOMADS method: • Data volume transported: 100Gb vs. 2Kb • Time to access data: 2 days vs. 60 sec • Code development: days vs. minutes • Lines of code (Fortran-based vs. NOMADS script): 1000 vs. 50
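The ten sdfopen calls in the script differ only in the ensemble member letter, so the server-side expression URLs can be generated in a loop. The sketch below reproduces the `_expr_{dataset}{expression}{dimensions}` pattern exactly as it appears in the script; the function name `expr_url` is hypothetical, and the sketch only builds the strings without contacting the server.

```python
def expr_url(base_url, dataset, expr, xdim, ydim, zdim, tdim):
    """Assemble a server-side analysis URL in the pattern used by the script."""
    return (
        f"{base_url}_expr_"
        f"{{{dataset}}}"                      # {dataset}
        f"{{{expr}}}"                         # {expression}
        f"{{{xdim},{ydim},{zdim},{tdim}}}"    # {x,y,z,t ranges}
    )


BASE_URL = "http://dataportal.ucar.edu:9191/dods/"
EXPR = "ave(z,t=387,t=483,12)-ave(z,t=495,t=591,12)"

# Ensemble members C20C_A through C20C_J, as in the script
member_urls = [
    expr_url(BASE_URL, f"C20C/C20C_{letter}", EXPR,
             "0:360", "20:90", "500:500", "1nov1978:1nov1978")
    for letter in "ABCDEFGHIJ"
]

for member_url in member_urls:
    print(member_url)
```

Generating the member names programmatically keeps the ten requests consistent and makes it simple to change the expression or the region in one place.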
Enabling private sector access: An example NOMADS Ensemble Access NOMADS Ensemble Probabilities on the fly • No need for image generation of ensembles... OPeNDAP constraint expression URL is: http://nomad3.ncep.noaa.gov:9090/dods/enshires/archive/ens20040809/ensc0_00z_1x1.ascii?pratesfc[3:3][125:125][277:277]
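Once a client has retrieved the same grid point from each ensemble member with a request like the one above, a probability can be computed locally as the fraction of members exceeding a threshold. This is a minimal sketch of that idea, not the method used by NOMADS; the function name is hypothetical, and the member values and threshold are made-up placeholders rather than real model output.

```python
def exceedance_probability(member_values, threshold):
    """Fraction of ensemble members whose value is strictly above threshold."""
    if not member_values:
        raise ValueError("at least one ensemble member is required")
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)


# Placeholder precipitation rates for five members (illustrative only)
placeholder_rates = [0.0, 0.0001, 0.0003, 0.0, 0.0002]
print(exceedance_probability(placeholder_rates, threshold=0.00015))
```

Because only one number per member crosses the network, the probability can be recomputed for any threshold without requesting new images from the server.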
Advancing Collaborations GO-ESSP • A superset of the original NOMADS group of data managers has formed the Global Organization for Earth System Science Portals • GO-ESSP http://esportal.gfdl.noaa.gov • Unidata • ESG (NCAR, LLNL) • OPeNDAP • COLA • NOMADS (GFDL, PMEL, NCDC, NCEP, others) • NASA/GCMD • BADC, BODC • WMO
Advancing Collaborations GO-ESSP • The Global Organization for Earth System Science Portals (GO-ESSP) is a collaboration designed to build the infrastructure needed to create web portals that provide access to observed and simulated data within the climate and weather communities. • The infrastructure created within GO-ESSP will provide a flexible framework that allows interoperability between front-end and back-end software components. GO-ESSP is an international collaboration involving software developers from both Europe and the United States.
Group on Earth Observations Earth Observation Summit • Affirmed need for timely, quality, long-term, global information as a basis for sound decision making. • Recognized need to support: • Comprehensive, coordinated, and sustained Earth observation system or systems; • Coordinated effort to address capacity-building needs related to Earth observations; • Exchange of observations in a full and open manner with minimum time delay and minimum cost; and • Preparation of a 10-year Implementation Plan, building on existing systems and initiatives, by European ministerial in late 2004 • Established ad hoc Group on Earth Observations (GEO) to develop Plan • Invited other governments to join.
Group on Earth Observations IEOS Data Management In draft form: community input needed!
What’s Next? 1/2 • Operational Forecasting • - Ensemble Predictions: flow-dependent prediction of weather and climate risk (nowcasting, medium range and seasonal). • Atmospheric and Oceanic Research • - Scalar and Vector processing and Workstation models • - Model output statistics; data assimilation techniques • Global Climate Change and Advanced Analysis • - Clouds, initial conditions, true coupled simulations. • - Long term climate monitoring: in-situ analysis, trends, data homogeneity, extremes, downscaling, reducing uncertainty... • On-demand Data Mining and Product Generation.
What’s Next? 2/2 • Data extraction into high volume data archives for the generation of products “on-demand”. • Advanced data mining algorithms for pre-generation, or executed by (authorized) users also on-demand. • Access to mined physical processes or signatures through data mining. • Search and location tools, portals, and long-term metadata management.
For more information... • For NOMADS Program Information see: http://www.ncdc.noaa.gov/oa/climate/nomads/nomads.html • For NOMADS Model Data Access: NOAA NCDC Main Page Upper-Air Data and Products http://nomads.ncdc.noaa.gov • Or contact: Glenn.Rutledge @ noaa.gov • Selected Publications on NOMADS and distributed data access: http://www.ncdc.noaa.gov/oa/model/publications/publications.html QUESTIONS ?