Data in the NEES Data Repository: Conditions for Current and Future Use and Re-Use. Stanislav Pejša, NEEScomm Data Curator, NEES. Quake Summit 2012, Boston, Massachusetts, July 12, 2012. This work is licensed under a Creative Commons Attribution 3.0 Unported License.
Table of Contents • Reference models • Data Flow in NEES Data Repository • NEES Data Goals • Data Archiving • Quality Assurance • Access and Sharing • Data Re-Use • Data Preservation
I2S2 Research Lifecycle • Research activity • Administrative activity • Publication • Archive activity http://www.ukoln.ac.uk/projects/I2S2/documents/I2S2-ResearchActivityLifecycleModel-110407.pdf
DCC Curation Lifecycle http://www.dcc.ac.uk/resources/curation-lifecycle-model
OAIS Functional Model 6 functional entities • Ingest • Archival storage • Data management • Administration • Preservation planning • Access
Data Flow in NEES http://nees.org/warehouse/experiment/1622/project/637
NEES Data Goals
Aligned with NSF Data Management Plan (DMP) requirements*. Each goal is paired with the DMP requirement it addresses.
• All research data and documentation will be archived (DMP: types of data and other materials to be produced during the project)
• Archived data will be of high quality (DMP: standards to be used for data and metadata format and content)
• Archived data will be accessible and shareable (DMP: policies for access and sharing)
• Archived data will be re-usable (DMP: policies and provisions for re-use)
• Archived data will be preserved (DMP: plans for archiving data and for preservation of access to them)
* http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/gpg_2.jsp#dmp
Data Archiving
• Who: research team, site personnel, curator, NEEScomm
• What: sensor measurements, sensor calibrations, observations, analyses, numerical simulations, images and videos, reports (including publications and presentations), logs
• When:
  • Dates are stated in the Data Sharing and Archiving Policies (1 month, 6 months, 12 months)
  • For as long as the data are useful ~ indefinitely ~ for 20 years
• Where: Project Warehouse http://nees.org/warehouse/welcome
• Why:
  • increases researcher’s impact
  • saves work, time, money
  • good practice
  • advances science
Information Package
An Information Package is discoverable through descriptive information. It consists of:
• Content Information: the original target of preservation
  • Content Data Object (bits)
  • Representation Information: needed to make the object understandable to the community (record)
• Preservation Description Information: information needed to preserve the Content Information
  • Provenance
  • Context
  • Reference (Identification)
  • Fixity: protects the Content Information (CI) from undocumented alteration
  • Access rights
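The package structure above, and the role of fixity in particular, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the field names, and the choice of SHA-256 as the fixity digest are hypothetical and are not taken from the NEES Data Repository or prescribed by OAIS.

```python
import hashlib
from dataclasses import dataclass


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class PreservationDescription:
    # Hypothetical field names mirroring the five PDI categories on the slide.
    provenance: str
    context: str
    reference: str
    fixity_sha256: str
    access_rights: str


@dataclass
class InformationPackage:
    content_data_object: bytes          # the bits being preserved
    representation_information: str     # what a reader needs to interpret the bits
    pdi: PreservationDescription

    def fixity_ok(self) -> bool:
        """True if the stored digest still matches the content bits."""
        return sha256_hex(self.content_data_object) == self.pdi.fixity_sha256


def make_package(data: bytes, representation_information: str, provenance: str,
                 context: str, reference: str, access_rights: str) -> InformationPackage:
    """Build a package and record the fixity digest at ingest time."""
    pdi = PreservationDescription(
        provenance=provenance,
        context=context,
        reference=reference,
        fixity_sha256=sha256_hex(data),
        access_rights=access_rights,
    )
    return InformationPackage(data, representation_information, pdi)
```

Recording the digest once at ingest and recomputing it later is what lets an archive detect an undocumented change to the Content Data Object; a documented change would update both the bits and the recorded digest.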
Quality Assurance
Data need to be understandable
• Standards
  Seeing standards
  • Research teams
    • Professional standards
    • Team guidelines for data management
    • NEEScomm requirements
  • NEES Sites
    • Certifications
    • Professional standards
    • Local guidelines (naming conventions, etc.)
  • NEES Data Repository
    • OAIS
    • PREMIS
    • Dublin Core
    • Documentation and metadata requirements
• Curation
  • interactive and iterative exchange
  • assessment of technical quality of data and relevant documentation
Access and Sharing
• Time
  • Unprocessed data: within 1 month
  • Corrected data and documentation: within 6 months
  • Data made PUBLIC: within 12 months
• Conditions for access and sharing (let others know that they can use your data)
  • Open Data: data
  • Creative Commons: presentations, reports, pre-prints/post-prints, teaching materials
  • Open Source: software
More on intellectual property considerations: https://nees.org/legal/licensing
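As a rough illustration of the time windows above, the sketch below computes the three deadlines from a single reference date. The slide does not say which event starts the clock, so the `reference_date` parameter is an assumption, as are the function names and the milestone labels.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Windows in months, as listed on the slide; the labels are illustrative.
SHARING_WINDOWS = {
    "unprocessed data": 1,
    "corrected data and documentation": 6,
    "data made public": 12,
}


def sharing_deadlines(reference_date: date) -> dict:
    """Map each milestone to its deadline, counted from an assumed reference date."""
    return {
        milestone: add_months(reference_date, months)
        for milestone, months in SHARING_WINDOWS.items()
    }


if __name__ == "__main__":
    # Hypothetical reference date, chosen only for the example.
    for milestone, deadline in sharing_deadlines(date(2012, 7, 12)).items():
        print(f"{milestone}: {deadline.isoformat()}")
```

The authoritative dates are those stated in the Data Sharing and Archiving Policies; this only shows the arithmetic.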
Data Re-Use
Use of known, tested, and open formats is key to the success of any future attempt to use data.
• Data Use: using research data for the current research purpose/activity to infer new knowledge about the research subject.
• Data Re-use: using research data for a research purpose/activity other than that for which it was intended.
• Data Purposing: making research data available and fit for the current research activity.
• Data Re-purposing: making existing research data available and fit for a future known research activity.
• Supporting Data Re-use: managing existing research data such that it will be available for a future unknown research activity.
Darlington, M. (ed.) (2011a). "ERIM Terminology", version 4. University of Bath, last updated April 12, 2011. http://wiki.bath.ac.uk/display/ERIMterminology/ERIM+Terminology+V4
Ball, A., Darlington, M., Howard, T., McMahon, C., Culley, S. (2012). Visualizing Research Data Records for their Better Management. Journal of Digital Information, Vol. 13, No. 1. http://journals.tdl.org/jodi/article/view/5917
Preservation
• Bit-level preservation: all files will be stored and preserved at the bit level
• Full preservation: required and recommended (supported) formats
Preservation strategies:
• format migration
• format refresh
• emulation
Thank you! Questions? Comments? Standa Pejša - spejsa@purdue.edu