This paper explores the importance of data management and preservation in maintaining the trust and integrity of scientific research. It discusses the challenges of preserving digital data and provides suggestions for institutional data providers and investigators.
Science Data Management: Implications for the Ethos of Science
Ruth Duerr, Mark A. Parsons
On the Ethos of Science
• A fundamental component of science is trust
• Society must trust that the outputs of science are accurate and unbiased
• Scientists must be able to trust the work of others in the field
Science Data Management and Preservation: Implications for the Ethos of Science. Presented April 7, 2005 at the 2005 AAG meeting, Denver, CO
Concepts of the Scientific Method
• Results should be repeatable
• Results should be published in peer-reviewed journals
• Source materials should be explicitly acknowledged
• Data and information should be available
On the Importance of Data
“… a scholar’s contribution is measured by the sum of the original data that he contributes. Hypotheses come and go but data remain.”
- Santiago Ramón y Cajal, Advice to a Young Investigator (1897)
What is the Problem?
Digital data has changed the paradigm
On the Importance of Data Preservation
“Preservation without access is pointless; access without preservation is impossible!”
- heard in the halls of NSIDC, 2004
Digital Data is Difficult to Preserve
“digital objects require constant and perpetual maintenance, and they depend on elaborate systems of hardware, software, data and information models, and standards that are upgraded or replaced every few years”
- NSF and Library of Congress, August 2003
Making Data Available
• Historically, the scientist acquired and published the data directly
• Now many datasets come from large institutional programs
• Many datasets are so large that publishing them in a conventional journal (even an electronic journal) is not feasible
• In many cases, several versions of the same dataset exist
Trusting Provided Data
Three main components to ensuring the integrity of data:
• The data must demonstrate scientific integrity
• The data repository must be trustworthy
• The data must not have been altered since creation (or any alterations must be well described)
Suggestions for Institutional Data Providers
• Follow the OAIS (Open Archival Information System) Reference Model
• Implement a method to detect any corruption of the data, whether intentional or inadvertent (fixity in the OAIS model)
• Institute peer review of datasets
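The fixity suggestion above can be illustrated with checksums. The following is a minimal sketch, not a method described in the presentation: it assumes SHA-256 digests recorded in a simple in-memory manifest at ingest time, and the function names (file_digest, build_manifest, verify_manifest) are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading in chunks so large files fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's relative path to its digest, as recorded at ingest time."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(directory, manifest):
    """Return the relative paths that are missing or whose digest no longer matches."""
    root = Path(directory)
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or file_digest(target) != expected:
            problems.append(rel_path)
    return problems
```

An archive would typically store the manifest alongside the data and rerun the verification on a schedule. A plain digest detects inadvertent corruption; detecting intentional tampering also requires that the manifest itself be protected, for example by keeping it on separate storage.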
Suggestions for Investigators
• Publish small datasets
• Ensure large datasets are transferred to a data center
Suggestions for Investigators (continued)
• Use data citations to reference the institutional datasets you used
• What is a data citation?
  • A mechanism to properly credit the creator of a data set
  • A mechanism to credit the publisher of the data set
  • A mechanism to allow your readers to find the data you used in your paper
Suggestions for Investigators (continued)
• What do they look like?
  • Like a book or paper reference (see examples below)
  • Hall, D.K., G.A. Riggs, and V.V. Salomonson. 2000, updated daily. MODIS/Terra Snow Cover 5-Min L2 Swath 500m V004, September - December 2003. Boulder, CO, USA: National Snow and Ice Data Center. Digital media.
  • Armstrong, R., J. Francis, J. Key, J. Maslanik, T. Scambos, and A. Schweiger. 1998. Polar Pathfinder sampler: Combined AVHRR, SMMR-SSM/I, and TOVS time series and full-resolution samples. Compiled by S. Khalsa. Boulder, CO, U.S.A.: National Snow and Ice Data Center. CD-ROM.
• If Digital Object Identifiers (DOIs) are available for the data, they should be included in the citation.
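The citation pattern in the examples above can be sketched as a small record type. This is an illustrative assumption rather than an NSIDC specification: the field names, the DataCitation class, and the placement of the DOI at the end of the citation are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Fields inferred from the example citations; names are hypothetical."""
    authors: str
    year: str
    title: str
    place: str
    publisher: str
    medium: str
    doi: Optional[str] = None  # included only when the dataset has one

    def format(self) -> str:
        # Each element ends with a period, mirroring a book or paper reference.
        parts = [
            self.authors.rstrip(".") + ".",
            self.year.rstrip(".") + ".",
            self.title.rstrip(".") + ".",
            f"{self.place}: {self.publisher.rstrip('.')}.",
            self.medium.rstrip(".") + ".",
        ]
        if self.doi:
            parts.append(f"doi:{self.doi}")
        return " ".join(parts)


citation = DataCitation(
    authors="Hall, D.K., G.A. Riggs, and V.V. Salomonson",
    year="2000, updated daily",
    title="MODIS/Terra Snow Cover 5-Min L2 Swath 500m V004, September - December 2003",
    place="Boulder, CO, USA",
    publisher="National Snow and Ice Data Center",
    medium="Digital media",
)
print(citation.format())
```

Keeping the elements as separate fields, rather than one free-text string, makes it easier to render the same record in different journal styles.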
For More Information
• About NSIDC in general
  • http://nsidc.org
  • nsidc@nsidc.org
• About data management or archiving at NSIDC
  • rduerr@nsidc.org
  • (303) 735-0136