Science Data in the Science Mission Directorate (SMD). Jeffrey J.E. Hayes, Program Executive for MO & DA, Heliophysics Division. August 17, 2011.
Science Data: Guiding Principles
SMD has stewardship responsibility for the integrity and preservation of science data assets as a national resource, and for ensuring their usability by the U.S. (and world) scientific community. This has two implications:
• Open Data: universal access, sharing, and collaboration for the science community, educators, students, and the general public (implies standards).
• One Size Doth Not Fit All:
  • Organize around science discipline and allow for diversity and tailored approaches;
  • Each science division is responsible for its data environment, often through partnerships with other federal agencies and international partners;
  • Coordination and federation as appropriate under the auspices of the SMD Chief Scientist.
Science Data: Preservation Aspects
SMD has set up a series of discipline-specific data centers (a different meaning from the OCIO definition) to carry out the following:
• Maintaining bits with no loss as they move across systems and media, as well as over time (forward/backward compatibility with evolving technology);
• Ensuring readability over time (again, an evolving-technology issue);
• Providing for long-term understandability (this involves metadata and documentation that remain usable long after a mission has ended and its expertise has dispersed);
• Making publicly funded data accessible to the public (this is the law, under evolving presidential and NARA requirements).
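The first item, keeping bits intact as they move between systems and media, is commonly checked with a fixity manifest: record a checksum for every file, then re-verify after each copy or migration. The following is a minimal sketch of that general technique, not a description of any SMD data center's actual tooling. The function names (`sha256_of`, `build_manifest`, `verify_copy`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative POSIX path) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest, copy_root):
    """Compare a copy against a manifest; return a list of (name, problem)."""
    copy_root = Path(copy_root)
    problems = []
    for name, expected in manifest.items():
        candidate = copy_root / name
        if not candidate.is_file():
            problems.append((name, "missing"))
        elif sha256_of(candidate) != expected:
            problems.append((name, "checksum mismatch"))
    return problems
```

In this sketch the manifest would be built once from the source holdings and stored alongside them; an empty list from `verify_copy` means every file listed in the manifest arrived unchanged. It does not detect extra files in the copy, and it says nothing about readability or understandability, which the remaining items address.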
Science Data: SMD Policy on Science Data
• Project Data Management Plans are developed at the initial stages of a science mission/project and describe all data handling aspects, including data reduction, pipeline processing, distribution, and delivery to the public archive. Active data repositories are usually co-located with the science expertise.
• Data management issues are addressed throughout the implementation process to ensure that appropriate resources are applied, since data management is a critical element of mission success.
• Science mission processing systems provide all data functions during the operational phase of the mission, including producing the higher-level data products and serving them to the community.
• Timely delivery of data to public archives is a key performance measure throughout the life of the mission.
Science data
• Science missions are responsible for identifying and organizing the data, metadata, and other information for long-term preservation and use in post-mission research. SMD concurs on these identifications.
• These data constitute the “final” archive record for the mission, which in general will include:
  • Level 0 data, along with the software tools, algorithms, and ancillary data needed to regenerate higher-level data products;
  • Engineering reports and other relevant “housekeeping” data;
  • The best data products from the mission, with sufficient documentation to understand and reconstitute each product.
• The final archive record is organized late in mission life, so that it incorporates recalibrations, reprocessing, and other validated results from the operational phase, but early enough to capture knowledge and lessons learned from the PI teams and the science community.
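The contents of the “final” archive record listed above can be treated as a checklist. The sketch below shows one way a mission might track which components have been assembled; it is purely illustrative. The class `FinalArchiveRecord`, the component names in `REQUIRED_COMPONENTS`, and the example mission and file names are all hypothetical and do not come from any SMD specification.

```python
from dataclasses import dataclass, field

# Hypothetical component names, taken from the slide's list of contents.
REQUIRED_COMPONENTS = (
    "level0_data",
    "software_tools",
    "algorithms",
    "ancillary_data",
    "engineering_reports",
    "housekeeping_data",
    "best_data_products",
    "product_documentation",
)


@dataclass
class FinalArchiveRecord:
    """Tracks which parts of a mission's final archive record exist."""

    mission: str
    # Maps a component name to the list of items collected for it.
    components: dict = field(default_factory=dict)

    def add(self, component, item):
        """Register an item under one of the required components."""
        if component not in REQUIRED_COMPONENTS:
            raise ValueError(f"unknown component: {component}")
        self.components.setdefault(component, []).append(item)

    def missing_components(self):
        """Return required components that have no items yet, in order."""
        return [c for c in REQUIRED_COMPONENTS if not self.components.get(c)]


# Example with made-up mission and file names.
record = FinalArchiveRecord(mission="EXAMPLE-1")
record.add("level0_data", "telemetry_2011.l0")
record.add("best_data_products", "calibrated_v3.cdf")
print(record.missing_components())
```

A check like this only confirms that each category is populated; whether the documentation is sufficient to reconstitute a product remains a judgment for the science team and the archive.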
Science data
After the mission ceases operations, research archives continue to maintain and serve the science data for as long as the data are used for scientific research. This condition carries a caveat: we are now seeing requirements from the National Archives that may require the retention, in perpetuity, of all data collected using public funding. The implications of this are being worked.
Research archives (aka data centers in SMD parlance) are generally aggregated by science discipline, or by communities of practice, so that they can provide science expertise in using the data. They are not necessarily located at NASA or government centers; some are run at universities or non-profits under grants or contracts.
These archives retain responsibility for all aspects of the preservation of their holdings: maintaining the integrity of the resources by safeguarding them against loss or corruption, and providing off-site back-up and recovery facilities.
Note that some data centers house the results of computer models. The same logic applies to these results as to data collected by missions, and they are considered equally important to the communities they serve.
Science Centers Breakdown
• Earth Science Data Systems Program: DAACs, Advancing Collaborative Connections for Earth System Science (ACCESS), etc.
• Astrophysics: Science Archive Research Centers, Astrophysics Data System (ADS) bibliographic database, NASA/IPAC Extragalactic Database (NED), etc.
• Planetary Data System: topical nodes, data nodes, et al.
• Heliophysics: Mission Science Centers, Solar Data Analysis Center, Space Physics Data Facility, VxOs, etc.
• National Space Science Data Center: legacy long-term archive, and deep archive (safekeeping back-up) for Space Physics and Planetary Sciences.
Science Centers Challenges:
• SMD (broadly speaking) has two classes of missions:
  • Competed missions, such as Explorers, which the science community proposes in response to an Announcement of Opportunity (AO); these are PI missions;
  • Strategic missions, which the agency undertakes on the recommendation of the National Academy of Sciences.
• The two classes differ in how data are archived, preserved, and maintained:
  • PI missions tend to have the data center at the host institution (until final close-out of the mission, when the data are delivered to NASA). These active data centers are distributed all over the country and are hosted by FFRDCs, for-profit industry, universities, and non-NASA government labs;
  • Strategic missions tend to have their data centers hosted at NASA centers, but there are enough exceptions that this holds only as a loose generalization.
• IT evolves rapidly, but we (NASA) should not impose new requirements on missions, because such a requirement:
  • Was not solicited in the AO (procurement law);
  • Represents a loss to the baseline science by imposing an unfunded mandate on the mission (i.e., science analysis is lost to pay for the new requirement).