Earth Science Data and Information System (ESDIS) Project

Earth Science Data and Information System (ESDIS) Project. John Moses, AURA Data Systems Working Group, September 27, 2010. Data Preservation - Goal: Preserve NASA’s Earth Science data for future generations. Three aspects of preservation

Presentation Transcript


  1. Earth Science Data and Information System (ESDIS) Project. John Moses, AURA Data Systems Working Group, September 27, 2010

  2. Data Preservation - Goal
     • Preserve NASA’s Earth Science data for future generations
     • Three aspects of preservation
       • Maintaining bits with no loss as they move across systems and media, as well as over time
       • Ensuring readability over time
       • Providing for long-term understandability
     • While NASA is not a “permanent archive” agency,
       • It maintains a “research archive” for as long as data are used for scientific research or until responsibility is transitioned to permanent archives
       • Critical data are backed up off-site

  3. Data Preservation - Approach
     What we do for the operational phase of the mission we will continue to do:
     • Maintain bits with no loss
       • Compute and store checksums at every stage
       • Copy data periodically onto newer media and ensure that neither storage media nor readers become obsolete
     • Ensure readability over time
       • Maintain currency of storage and reader hardware (as stated above)
       • Maintain format-dependent read software tools, or
       • Eliminate dependence on specialized software libraries
       • Develop machine- and human-understandable documentation of internal details of file structures to enable future users to write read software
     • Provide for long-term understandability
       • Maintain documentation and ancillary data associated with data products
     • Work out the details of these items with PIs and other key individuals well ahead of the end of missions
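
The "compute and store checksums at every stage" step can be illustrated with a minimal Python sketch. It is not taken from the presentation: the choice of SHA-256 and the helper names `sha256_of` and `copy_with_verification` are assumptions made for illustration only.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_verification(source: Path, destination: Path, expected: str) -> str:
    """Copy a file to new storage, checking the stored checksum before and after.

    `expected` is the checksum recorded at an earlier stage (hypothetical
    bookkeeping; the slides do not say how checksums are stored).
    """
    # Stage 1: confirm the source still matches what was recorded earlier.
    if sha256_of(source) != expected:
        raise ValueError(f"{source} no longer matches its stored checksum")
    # Stage 2: copy, then confirm the new copy is bit-identical.
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise ValueError(f"{destination} was corrupted during the copy")
    return actual
```

Checking the source before the copy as well as the destination after it separates two failure modes: silent decay on the old medium and corruption introduced by the migration itself.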

  4. Data Preservation Planning
     • ESDIS guidelines for long-term retention and preservation of critical EOSDIS observational records involve:
       • Identifying, organizing and securing the critical records of the observational data and information created by the mission’s distributed research community
       • Organizing the data and information for preservation at the right time:
         • Often deferred until late in the mission life so we capture validated results
         • But not too late - while we still have the knowledge and experience of the Principal Investigator team and science community
     • Documentation Criteria:
       • Enough to understand and reconstitute what happened with the dataset (e.g., production history, software build results, versions of toolkit and support libraries used)
       • Sufficient to allow regeneration of higher level science products (e.g., how-to-build handbooks, records of input auxiliary datasets, references to published verification and validation results)
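
One way to picture the documentation criteria above is as a small machine-readable production record stored alongside each product. The sketch below is an assumption for illustration, not an ESDIS format: the class `ProductionRecord` and all of its field names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProductionRecord:
    """Hypothetical record of how one product version was generated."""

    product: str
    product_version: str
    software_build: str          # identifier of the software build used
    toolkit_version: str         # version of the toolkit used
    support_libraries: dict[str, str] = field(default_factory=dict)
    auxiliary_inputs: list[str] = field(default_factory=list)
    validation_references: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize to sorted, indented JSON so records diff cleanly."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "ProductionRecord":
        """Rebuild a record from the JSON written by `to_json`."""
        return cls(**json.loads(text))
```

A plain-text format such as JSON is used here because it stays readable without specialized software, which matches the readability goal stated earlier in the talk.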

  5. Current Aura Mission Archive
     • Critical Data
       • The Level 0 and Level 1 calibrated and geo-located radiance data for use in developing refined climate records… and any other data sets or products needed to interpret them.
       • Ancillary datasets needed to generate higher-level products
       • The EOS standard Level 2 and Level 3 products
     • Readability over Time
       • DAACs will migrate data to new media as part of ongoing technology refresh
       • ESDIS will continue to sponsor maintenance of HDF, HDF EOS5 extension libraries and reader software at the DAACs
     • Long-term understandability
       • DAACs will continue to archive information about the data products, including metadata, readme’s, DIFs, ATBDs and web pages

  6. Aura Mission Archive Issues
     • Additional information must be organized
       • Data, software, documentation, and engineering reports are coming from distributed teams – SIPS, PIs, Science Team members, algorithm developers, instrument vendors
       • Get involved in the process: all have a vested interest in making sure their contribution is properly acknowledged, organized and preserved for future use
     • Fill in the knowledge gaps and missing links
       • Look for what is not explicitly recorded in documentation, reports and publications, readme’s, ATBDs, ICDs and Working Agreements
     • TBD distribution services and new datasets
       • Services for the additional information from the Science Teams
       • Archive of validation campaign datasets from AVDC

  7. Instrument Archive Process

  8. Envisioning Future Users
     • Think of the most likely uses of the data – what will future researchers do with the data?
       • e.g., Standard observational products: for detecting geophysical phenomena, comparison to other instruments and models, as input to models
       • Tracking down suspicious-looking artifacts
       • Improving uncertainty attributes, biases
       • Regenerating results or deriving new products
         • Involves reuse of software and ancillary data
         • Depends on confidence in L1b, L2, L3 products
         • e.g., What is the critical data – L0 or L1b?

  9. Archive Level of Service
     • Users of the Aura products will likely find some of the additional information useful:
       • Content from SIPS & Science Team’s web sites
       • Production history
       • Production software source code
     • Information archived but not needed online for immediate access
       • Lower level data (e.g., Level 0, Orbit & Attitude)
       • Pre-flight instrument engineering data and reports

  10. Backup

  11. Major Types of Additional Information
     • “Instrument/sensor characteristics including pre-flight or pre-operational performance measurements (e.g., spectral response, noise characteristics, etc.)
     • Instrument/sensor calibration data and method
     • Processing algorithms and their scientific basis, including complete description of any sampling or mapping algorithm used in creation of the product (e.g., contained in peer-reviewed papers, in some cases supplemented by thematic information introducing the data set or derived product)
     • Complete information on any ancillary data or other data sets used in generation or calibration of the data set or derived product
     • Processing history including versions of processing source code corresponding to versions of the data set or derived product held in the archive
     • Quality assessment information
     • Validation record, including identification of validation data sets
     • Data structure and format, with definition of all parameters and fields
     • In the case of earth based data, station location and any changes in location, instrumentation, controlling agency, surrounding land use and other factors which could influence the long-term record
     • A bibliography of pertinent Technical Notes and articles, including refereed publications reporting on research using the data set
     • Information received back from users of the data set or product” [1]
     Footnotes:
     [1] Joint NASA-NOAA Workshop, USGCRP, LTA Workshop Report, 1998
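
The categories quoted above lend themselves to a simple completeness check when gathering material from distributed teams. The following Python sketch is illustrative only: the short category keys and the function `missing_categories` are hypothetical names, not part of the workshop report or any ESDIS tool.

```python
# Short keys for the eleven categories in the quoted list (names are assumptions).
REQUIRED_CATEGORIES = (
    "instrument_characteristics",
    "calibration",
    "processing_algorithms",
    "ancillary_data",
    "processing_history",
    "quality_assessment",
    "validation_record",
    "data_format",
    "station_history",  # applies only to earth-based data
    "bibliography",
    "user_feedback",
)


def missing_categories(collected: dict[str, list[str]]) -> list[str]:
    """Return the categories that have no collected items yet.

    `collected` maps a category key to the list of documents or files
    gathered for it; an absent key or an empty list counts as missing.
    """
    present = {name for name, items in collected.items() if items}
    return [name for name in REQUIRED_CATEGORIES if name not in present]
```

For example, a team that has so far gathered only calibration reports would see the other ten categories returned, in the order of the quoted list.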