
Gridded Data Sub-setting Services through the RDA at NCAR

Doug Schuster, Steve Worley, Bob Dattore, Dave Stepaniak


Presentation Transcript


  1. Gridded Data Sub-setting Services through the RDA at NCAR Doug Schuster, Steve Worley, Bob Dattore, Dave Stepaniak

  2. Gridded Data Sub-setting Services Through the RDA at NCAR • Research Data Archive (RDA) Overview • Problem Background • Required Infrastructure • Current Services • Future Directions

  3. RDA Overview • Total archive volume over 1.3 PB • 8000+ unique users annually • Operational and Reanalysis model outputs • Meteorological and Oceanographic Observations • Remote Sensing Observations • Topography/Bathymetry, Vegetation, Land Use

  4. Problem Background • Data Volume

  5. Problem Background • Large computational/storage resources needed • Store data • Extract desired data from large grids/files • Convert data to desirable format(s) • Scientific data centers have these resources • Individual researchers generally don’t

  6. Problem Background • Goals • Make data more accessible and easier to use for individual researchers • Reasonable access volumes • Desired data formats • User defined parameters/grids • Researchers stay focused on research

  7. Required Infrastructure • Command Line Interface • Web Interface • Powerful Computing (NCAR HPC/DAV) • Large Disk Storage (500 TB) • Generalized Software Tools: control system (RDAMS), sub-setting, format conversion • Rich and Detailed Metadata Databases (RDADB)

  8. Required Infrastructure • Rich Metadata Databases (key ingredient) • Metadata DB: Support Efficient Backend Processing, Provide Scalability, Drive Interfaces • File attribute metadata: Name, Dataset, Location, Format • File content metadata: T(C,D,T,L,L), RH(C,D,T,L,L), Vort(C,D,T,L,L), Vis(C,D,T,L,L), PcpR(C,D,T,L,L)
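To illustrate how file-level metadata can support backend processing, here is a minimal sketch of a catalog lookup. It is not the RDADB schema: the `FileRecord` class, the `find_files` function, the dataset name, and the file paths are all hypothetical, and the content metadata is reduced to parameter names plus a time range as a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileRecord:
    # File attribute metadata (field names follow the slide: Name, Dataset, Location, Format)
    name: str
    dataset: str
    location: str
    fmt: str
    # File content metadata, reduced to parameter names and a time range (assumption)
    parameters: frozenset
    start: datetime
    end: datetime


def find_files(catalog, dataset, parameter, start, end):
    """Return records in `dataset` that hold `parameter` and overlap [start, end]."""
    return [
        rec for rec in catalog
        if rec.dataset == dataset
        and parameter in rec.parameters
        and rec.start <= end
        and rec.end >= start
    ]


# Hypothetical catalog entries, for illustration only
catalog = [
    FileRecord("a.grb", "example-ds", "/data/a.grb", "GRIB",
               frozenset({"T", "RH"}), datetime(2000, 1, 1), datetime(2000, 1, 31)),
    FileRecord("b.grb", "example-ds", "/data/b.grb", "GRIB",
               frozenset({"Vort"}), datetime(2000, 1, 1), datetime(2000, 1, 31)),
    FileRecord("c.grb", "example-ds", "/data/c.grb", "GRIB",
               frozenset({"T"}), datetime(2000, 2, 1), datetime(2000, 2, 29)),
]

hits = find_files(catalog, "example-ds", "T",
                  datetime(2000, 1, 15), datetime(2000, 1, 20))
print([rec.name for rec in hits])
```

Because the lookup touches only metadata, a request can be narrowed to the relevant files before any large data file is opened, which is one way a metadata database can keep backend processing efficient.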

  9. Current Services • Sub-setting available on 13 datasets • ERA-I, CFSR, Operational Model, EaSM • Also available on select observation sets • Sub-setting options • Parameter selection • Spatial region selection (limited availability) • Available output formats • Native GRIB formats • NetCDF format
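The two sub-setting options listed above can be sketched on a small in-memory grid. This is an illustrative assumption about how such a selection might work, not the RDA implementation: `select_parameters` and `subset_region` are hypothetical helpers, the grid is a plain nested list, and longitude wrap-around is deliberately not handled.

```python
def select_parameters(fields, wanted):
    """Parameter selection: keep only the requested fields, in the requested order."""
    return {name: fields[name] for name in wanted if name in fields}


def subset_region(lats, lons, grid, south, north, west, east):
    """Spatial region selection: keep points inside the bounding box.

    grid[i][j] is the value at (lats[i], lons[j]). Boxes that wrap across
    the 0/360 longitude seam are not handled in this sketch.
    """
    rows = [i for i, lat in enumerate(lats) if south <= lat <= north]
    cols = [j for j, lon in enumerate(lons) if west <= lon <= east]
    sub_lats = [lats[i] for i in rows]
    sub_lons = [lons[j] for j in cols]
    sub_grid = [[grid[i][j] for j in cols] for i in rows]
    return sub_lats, sub_lons, sub_grid


# Made-up coordinates and values, for illustration only
lats = [30.0, 35.0, 40.0, 45.0]
lons = [250.0, 255.0, 260.0]
fields = {
    "T": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    "RH": [[10, 20, 30], [40, 50, 60], [70, 80, 90], [100, 110, 120]],
}

chosen = select_parameters(fields, ["T"])
sub_lats, sub_lons, sub_t = subset_region(
    lats, lons, chosen["T"], south=34.0, north=41.0, west=254.0, east=261.0
)
print(sub_lats, sub_lons, sub_t)
```

Selecting parameters first and cropping second means the spatial step only runs on fields the user asked for.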

  10. Current Services

  11. Current Services • Sub-set requests • Processed in delayed mode • User notified by email when request is ready • Download data via server provided wget scripts
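The delayed-mode workflow above (queue the request, process it later, notify the user, hand back a wget script) can be sketched as follows. Everything here is hypothetical: the `RequestQueue` class, the request ID, the example.org addresses, and the `notify` callback that stands in for email. Real server-provided scripts may include authentication steps that are omitted here.

```python
from collections import deque


def build_wget_script(urls):
    """Return a shell script with one wget command per output file."""
    lines = ["#!/bin/sh"]
    lines += [f"wget '{url}'" for url in urls]
    return "\n".join(lines) + "\n"


class RequestQueue:
    def __init__(self, notify):
        self._pending = deque()
        self._status = {}
        self._notify = notify  # callable(email, request_id); stands in for sending email

    def submit(self, request_id, email, urls):
        """Accept a request without processing it yet (delayed mode)."""
        self._pending.append((request_id, email, urls))
        self._status[request_id] = "queued"

    def status(self, request_id):
        return self._status[request_id]

    def process_next(self):
        """Process one queued request; return its download script, or None if idle."""
        if not self._pending:
            return None
        request_id, email, urls = self._pending.popleft()
        # Sub-setting and format conversion would run here in a real system.
        script = build_wget_script(urls)
        self._status[request_id] = "ready"
        self._notify(email, request_id)
        return script


sent = []  # records notifications instead of sending mail
queue = RequestQueue(notify=lambda email, rid: sent.append((email, rid)))
queue.submit("req-1", "user@example.org", ["https://example.org/out/file1.nc"])
before = queue.status("req-1")
script = queue.process_next()
after = queue.status("req-1")
print(before, after, sent)
print(script)
```

Decoupling submission from processing lets long-running extractions proceed in the background while the user waits only for the notification.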

  12. Current Services

  13. Current Services

  14. Future Directions • Spatial Interpolation • Faster Request Processing (NWSC) • Include More RDA Datasets • Improved Access Portals • Additional Output Formats • Web Service Access

  15. Summary • Data Analysis Research Challenges • Large and Growing Data Volumes • Numerous Formats • RDA: Supply “User Friendly” Data • Parameter and Spatial Sub-Setting • Format Conversion • Improved and Additional Services • http://dss.ucar.edu • schuster@ucar.edu
