Data for Climate and Energy Studies Steven Worley Computational and Information Systems Laboratory NCAR
Topics Scope of the NCAR Research Data Archive (RDA) Discovery and Access Highlights User ranked popular datasets Examples Near-term service improvements NCAR-CSM Symposium on Climate and Energy
Scope of the NCAR Research Data Archive (RDA)
Focus on atmospheric, oceanographic, and related geoscience observational data and derived analyses.
• Some weather forecast data
• Do not specialize in climate prediction datasets
Active stewardship program to maintain and grow the RDA for 40+ years.
• Large variety, 600+ datasets, ~400 TB, 4M files
Discovery and Access Highlights
• Primary design feature for web portal
• Data Discovery - Find Data!
Discovery and Access Highlights
Multiple Methods - simple to interoperable
1. Find the files in our lists and download
  • Through your browser - limit 2 GB
  • We create a ‘wget’ script for you - run it in the background on your machine - no limit
2. You select temporal, spatial, and parameter domains
  • We build a file list for you
  • Download options as in 1
3. Data is not online to the web, but is on archive storage
  • We automatically stage the data online, then you download
4. You select temporal, spatial, and parameter domains - we build CURL commands - you get only the grids you select
About CURL
• Client URL Library functions
• Readily available on Linux OS
• We use the HTTPS protocol - others are available
• Applies well to the WMO GRIB data format
• Users modify the CURL commands and script them to perform routine data extractions from the RDA
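The last bullet, scripting CURL commands for routine extractions, can be sketched as follows. This is only an illustration under stated assumptions: the endpoint `https://rda.example.org/subset`, the query field names, the dataset identifier, and the parameter abbreviations are all hypothetical placeholders, not the actual RDA interface. The sketch only builds the command line from temporal, spatial, and parameter selections; it does not contact any server.

```python
import shlex
from urllib.parse import urlencode

# Hypothetical endpoint used for illustration only; not a real RDA URL.
BASE_URL = "https://rda.example.org/subset"


def build_curl_command(dataset, start, end, bbox, params, outfile):
    """Return a curl argument list for one subset request.

    dataset : hypothetical dataset identifier
    start, end : ISO dates bounding the temporal domain
    bbox : (west, south, east, north) spatial domain in degrees
    params : list of parameter abbreviations to extract
    outfile : local file that curl should write
    """
    query = {
        "dataset": dataset,
        "start": start,
        "end": end,
        "bbox": ",".join(str(v) for v in bbox),
        "params": ",".join(params),
    }
    url = f"{BASE_URL}?{urlencode(query)}"
    # --fail makes curl exit non-zero on HTTP errors, which matters in scripts.
    return ["curl", "--fail", "--silent", "--show-error",
            "--output", outfile, url]


if __name__ == "__main__":
    cmd = build_curl_command(
        dataset="example-dataset",
        start="2009-01-01",
        end="2009-01-02",
        bbox=(-130, 20, -60, 55),
        params=["TMP", "UGRD"],
        outfile="subset.grb",
    )
    # Print a shell-safe command line that could be pasted into a cron script.
    print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting problems; `shlex.join` is used only for display.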
User-ranked popular datasets
Top 30 datasets/groups, FY09
~6000 unique users annually
One example
Final Global Analysis from NOAA/NCEP
• 4x daily
• Updated in the RDA 1x/day
• 1° horizontal resolution
• 26 vertical pressure levels, plus surface
• Series starts in 1999
• Over 55 parameter fields
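To give a feel for what these figures imply when planning a subset request, here is a rough sizing sketch. The 4x daily frequency and 26 pressure levels come from the slide; the 360 x 181 grid dimensions are an assumption about how a 1° global latitude-longitude grid is laid out, and the function names are made up for this example.

```python
from datetime import date

ANALYSES_PER_DAY = 4      # from the slide: 4x daily
PRESSURE_LEVELS = 26      # from the slide: 26 vertical pressure levels
# Assumption: a 1-degree global grid with both poles included.
NLON, NLAT = 360, 181


def analyses_between(start, end):
    """Number of analysis times from start to end, both dates inclusive."""
    return ((end - start).days + 1) * ANALYSES_PER_DAY


def values_per_field(levels=PRESSURE_LEVELS):
    """Grid values in one parameter at one analysis time over `levels` levels."""
    return NLON * NLAT * levels


if __name__ == "__main__":
    n_times = analyses_between(date(2009, 1, 1), date(2009, 12, 31))
    print("analysis times in 2009:", n_times)
    print("values per upper-air field per analysis:", values_per_field())
    print("values per surface field per analysis:", values_per_field(levels=1))
```

Multiplying the number of analysis times by the values per field and the number of selected parameters gives an order-of-magnitude count of grid values in a request, which helps decide between a browser download and a scripted one.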
Re-analyses
Table 1: Global atmospheric and oceanographic reanalyses are among the many valuable data resources provided by external organizations that employ the expertise of RDA consultants. Those listed are the most recent major reanalyses available in the Research Data Archive. Most time periods are ongoing; that is, providers continue to produce the products going forward in time. In general, all reanalyses are also available at lower temporal and horizontal resolutions than those shown above. Most reanalyses also have variables on vertical model coordinate levels, as well as large numbers of surface-specific fields and vertically integrated values.
http://www.earthobservations.org/documents/geonewsletter/art008001_trenberth_article.pdf
Near-term service improvements
• Current and soon-to-be workflow
Complete User Community
• Advantages:
  • Fast access to online data - limited part of RDA
  • Access to all RDA content metadata
  • Access to RDA data processing services
HPC User Community
• Advantages:
  • Access to full RDA
  • Fast computing
  • No login required
HPC User Community
• Disadvantages:
  • No access to online data
  • Use MSS as a file server
  • No direct access to RDA metadata
  • No direct access to RDA data processing services
  • Require separate account to access RDA web server
Complete User Community
• Disadvantages:
  • Slow access to MSS data - delayed mode
  • Have to create a separate RDA account and log in
  • Data processing requests take a long time to finish
  • Slow download speeds for some users
Complete User Community
• Improvements:
  • Fast access to full RDA
  • Expanded data processing services available
  • Single CISL account - no separate RDA account
  • Faster download speeds - grid-based tools, e.g. GridFTP
  • Single “first point of contact” for user support
Resolved all the disadvantages
• New Challenges:
  • GPFS and HPSS don’t have generic file use logging
    • Need for metrics & services
  • HPSS doesn’t have sophisticated file access control
    • Some RDA assets have limited access policies
  • Abandon a functional RDA registration system - retool a 20K+ user DB
  • Of course, there will be more!
  • Big transition while maintaining RDA content building and services
HPC User Community
• Improvements:
  • Fast access to full RDA
  • Access to all RDA content metadata
  • Access to RDA data processing services
  • Single CISL account
  • Single “first point of contact”
End
• Scope of the NCAR Research Data Archive (RDA)
• Discovery and Access Highlights
• User-ranked popular datasets
• Examples
• Near-term service improvements
http://dss.ucar.edu/