Explore key steps in scientific investigations in atmospheric sciences, featuring data access, analysis, and community sharing. Learn how to search and discover data through a web-based information server with salient features and a brief archive history. Discover datasets used for global observations and reanalyses, along with operational analyses. Understand the concerns and highlights of data archives, including distribution and access. Delve into outstanding features of atmospheric models and reanalyses, including data distribution methods and user demographics. Explore future trends in data access and analysis, including serving diverse user groups and developing online data systems for enhanced scientific research.
Scientific Investigations: Support from Research Data Archives • For Computing in Atmospheric Sciences 2001 • 29 October 2001 • Steven Worley • National Center for Atmospheric Research, Scientific Computing Division
Key Steps of Scientific Investigations • Formulate the questions and review the state of understanding • Search and discover data • Access data • Analyze data • Community sharing and archive • Document new understandings
Search and Discover Data • How? Web-based Information Server • Salient Features • 2.5K+ HTML pages (metadata) • All datasets are described (500+) • Location of all data files in MSS • Higher level information • Catalogs • Project-specific descriptions • Always current dataset descriptions
Features • Organization Navigation • Archive Navigation • Pull down menus • Search • Project Links
Dataset Page • Title and Brief description • Systematic Navigation • Metadata highlights • Period of Record • Usage • Variables • Related Sites (NOAA) • Contact Person • Related Datasets
Brief Archive History and Specifications • Started in the mid-1960s (35 years) • Managed by nine people • 211K data files • 17 TB in the MSS • 530 datasets, all sizes
Global Observations • Usages: • Input for global atmospheric reanalysis • Basic long term climate assessment and case studies
Operational and Composite Analyses • Daily SLP is a small but very popular dataset, e.g. NAO evaluations • Two main operational centers provide the best current analyses
Concerns • Restricted distribution • U.S. non-profits and UCAR members only • Need online authentication and authorization for easy access • Key Aspects • Medium size archive: 170 Gigabytes • Complex: multiple products, temporal resolutions, and spatial resolutions
Highlights • Frequent updates to FNL, 1°, daily via FTP • High resolution N. America product, ETA at 40 km • No distribution restrictions or cost
Reanalyses • Notes: • ERA-15 is finished, ERA-40 is running now • NCEP II, primarily experimental run
Outstanding Features • Three different coordinate surfaces • Very long analysis, 2+ Terabytes size • Unrestricted distribution • CD-ROMS are very popular
Countries Receiving Reanalysis CDROMs • Highlights • Over 8900 CDROMs, 1997 to 09/2001 • Recipients: U.S. 46%, Japan 11%, (Canada, UK) 4% each, (Germany, India) 3% each, (Australia, S. Korea, Spain, Mexico, Norway, Russia, France) 2% each
Reanalysis Users for 2001 (4th qtr estimated) • 209 From the MSS [157 Jan.-Sep.] • 47 On CDROM [35] • 48 Custom data orders on FTP or tape [36] • 540 From the online server [406] • 844 Total served
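The full-year user counts on this slide are estimates built from the bracketed January to September figures. The slide does not say how the fourth quarter was estimated. The sketch below assumes a single scaling factor of 1.33 (roughly 12 months over 9 months), which happens to reproduce the slide's numbers. The factor and all variable names are illustrative assumptions, not something stated in the talk.

```python
# Jan.-Sep. 2001 user counts, taken from the bracketed figures on the slide.
JAN_SEP_USERS = {
    "MSS": 157,
    "CDROM": 35,
    "Custom orders (FTP or tape)": 36,
    "Online server": 406,
}

# Assumption: the slide does not state how the 4th quarter was estimated.
# A factor of 1.33 (about 12 months / 9 months) reproduces its figures.
ANNUAL_FACTOR = 1.33


def estimate_annual(jan_sep_counts, factor=ANNUAL_FACTOR):
    """Scale nine-month counts to a full-year estimate, rounded to whole users."""
    return {source: round(count * factor) for source, count in jan_sep_counts.items()}


annual_users = estimate_annual(JAN_SEP_USERS)
total_users = sum(annual_users.values())

for source, users in annual_users.items():
    print(f"{source:30s} {users:5d}")
print(f"{'Total served':30s} {total_users:5d}")
```

Under this assumption the per-channel estimates and their sum agree with the slide, which suggests the total is simply the sum of the four access channels.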
Reanalysis Data Distributed for 2001 (4th qtr estimated) • 9616 GB from the MSS [7230 GB Jan.-Sep.] • 808 GB on CD-ROM [935 CDROMs @ 650 MB/CDROM] • 1383 GB custom orders, FTP and tape [1040 GB] • 88 GB from the online server [66 GB] • 11895 GB, 11.9 TB total
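The volume figures can be cross-checked with a little arithmetic. This is a minimal sketch under three assumptions that the slide does not state: the bracketed 935 is a count of CD-ROMs shipped in January to September, gigabytes are decimal (1000 MB), and the full-year estimate uses a scaling factor of 1.33. All names are illustrative.

```python
MB_PER_CDROM = 650     # capacity quoted on the slide
MB_PER_GB = 1000       # assumption: decimal gigabytes
ANNUAL_FACTOR = 1.33   # assumption: Jan.-Sep. scaled to a full year

# Assumption: the bracketed 935 is the number of CD-ROMs shipped Jan.-Sep.
CDROMS_JAN_SEP = 935

# Jan.-Sep. 2001 volumes in GB, from the bracketed figures on the slide.
jan_sep_gb = {
    "MSS": 7230,
    "CD-ROM": CDROMS_JAN_SEP * MB_PER_CDROM / MB_PER_GB,
    "Custom orders (FTP and tape)": 1040,
    "Online server": 66,
}

# Scale each channel to a full-year estimate and round to whole GB.
annual_gb = {source: round(gb * ANNUAL_FACTOR) for source, gb in jan_sep_gb.items()}
total_gb = sum(annual_gb.values())

for source, gb in annual_gb.items():
    print(f"{source:30s} {gb:6d} GB")
print(f"{'Total':30s} {total_gb:6d} GB ({total_gb / 1000:.1f} TB)")
```

With these assumptions the script reproduces every figure on the slide. It also shows how heavily the MSS dominates the volume compared with the online server.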
GCIP Model Data Center Collection • High resolution atmospheric models focused on energy and hydrology cycles • Critical data for N. American mesoscale studies • Complete archive is about 1 Terabyte • GCIP: GEWEX Continental-Scale International Project • GEWEX: Global Energy and Water Cycle Experiment
University of Miami Ocean Model Data • 6-yr Mean T at 5 meters • MICOM: Miami Isopycnic Coordinate Ocean Model, 1/12th degree, 70N to 28S, 16-20 layers
Dataset Sizes and Scales • Today • ~800 unique users • ~12 Terabytes data transferred • 2 Terabyte dataset size • Example: NCEP/NCAR Reanalysis • Near Future (excludes TB-PB Level 0 and 1 satellite data and the super-scale experimental models) • Number of users, ~ same • Data transferred, 5x to 10x more? • Dataset size, 2-20 TB • Examples: • Ocean and Atmosphere models • ECMWF Reanalysis (ERA-40)
Access to Data • Methods • NCAR computers • From the local MSS • Web data server • Custom data packages by request (FTP, tape, CDROM) • Users • World-class programmers • Research scientists • Graduate students • Undergraduate students
Data Access in the future • Do we continue doing what we are doing? “Absolutely” • Why? It works • Over 1000 users annually • Very diverse skills • The archive is a heterogeneous collection • Many formats (ASCII, Binary, GRIB, BUFR, netCDF, HDF) • Many sizes (1 MB to 2 TB) • Capable of serving large and small projects • Maintain a variety of flexible methods
Data Access in the future • Keys to handling future larger collections • Plan to create useful data products • Condensed datasets from high resolution output • Group the most popular variables and products together • Serve many, e.g. CDROMs and WWW • Continue to develop emerging online data systems • User-driven subset selection with graphics and data download options • Server-side elementary analysis • Multi-dataset comparisons • Statistical summaries and basic meteorological calculations • Our development is the “Community Data Portal”
Data Analysis • Tools • NCAR Command Language (NCL) software • Features in brief • I/O for many ‘standard’ data formats • Easy adaptations to read any format • 100s of meteorological functions • “Publication quality” graphics • The Community Data Portal (CDP) is capable of analysis • NCL is one of several middleware packages
Community Sharing • Support for the scientist • A place to distribute new data results • Possibly with authentication and authorization control • E.g. model outputs • Spin off benefit • New data resources for the archive • Many users can then use new product
NCEP Operational Analyses blended with QSCAT Satellite data • Wind Stress Curl, 01/24/2000 1800 UTC (figure panels a, b, c) • NCEP Operational ONLY • NCEP + QSCAT swaths • OI blend of NCEP + QSCAT • Blending by Colorado Research Associates • We archive all three products.
Key Steps of Scientific Investigations • Formulate the questions and review the state of understanding • Search and discover data • Access data • Analyze data • Community sharing and archive • Document new understandings