Data Panel: Do you know where your data is? ASLI, 21 January 2010. Steven Worley Bob Dattore National Center for Atmospheric Research. AMS Ad Hoc Committee on Data Stewardship Prospectus, August 2009. Statement of Need # 4
Data Panel: Do you know where your data is? ASLI, 21 January 2010. Steven Worley, Bob Dattore, National Center for Atmospheric Research
AMS Ad Hoc Committee on Data Stewardship Prospectus, August 2009 Statement of Need # 4 “Develop a plan for citing data referenced in publications and preserving data links for the long term”
Committee continued - Problems • Data not traceably cited or even available • Publishers do not have a rigorous process to handle references to data held at data centers • Data providers have not created and adopted a standard reference coding system • DOIs – e.g. ORNL • International DOI Foundation – initiative of German National Library of Science and Technology • CLADDIER (Citation, Location, and Deposition in Discipline & Institutional Repositories) – project @ BADC • Any data reference scheme will fail if the data are not publicly available and are not in a long-term archive
Committee continued - Ramifications • Without a citation that accurately defines the data used, researchers cannot validate a publication or easily advance understanding starting from it • This can slow scientific discovery and weaken the standing of published assertions as fact
Committee continued - Recommendations • Collaboration • Establish a process whereby librarians, publishers, and AMS editorial boards are teamed with data providers, data centers, and organizations already addressing this challenge (e.g. AGU, Oak Ridge National Laboratory) to develop standard schemes for referencing data in publications
Committee continued - Recommendations • Set Policy • Institute a publication policy for data stewardship in the journals by defining recommendations for authors and setting peer review criteria that focus on the adequacy of the data references
Committee continued - Recommendations • Awareness and Recognition • Use AMS statements or guidelines to emphasize the importance of stewardship and establish ways to recognize scientists who produce publicly valued data and follow the guidelines
Challenges @ archive centers, e.g. NCAR • Get organization-wide agreement/approval for alphanumeric coding • More than five data groups serve data • Need coordination with library • Coordinate with publishers • Align with other organizations • Establish data persistence policy / requirements • Minimum time period for data preservation
Challenges @ data centers • Build an organization-wide mapping of citation tags (e.g. DOI) to URL addresses • URLs are fragile and may change over time, but the citation tag must remain immutable
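The tag-to-URL mapping described on this slide can be illustrated with a minimal Python sketch. It is not taken from the talk: the class name `CitationRegistry`, the tag `10.9999/EXAMPLE/ds001`, and the `example.org` URLs are all hypothetical, and a real data center would more likely delegate resolution to an existing DOI or handle service than maintain its own table. The point is only that the tag is written once while the URL behind it can move.

```python
class CitationRegistry:
    """Hypothetical in-memory mapping from immutable citation tags to URLs."""

    def __init__(self):
        self._current = {}  # tag -> URL the tag resolves to today
        self._history = {}  # tag -> every URL the tag has ever pointed at

    def register(self, tag, url):
        """Create a tag. A tag can be registered only once, so it stays immutable."""
        if tag in self._current:
            raise ValueError(f"tag already registered: {tag}")
        self._current[tag] = url
        self._history[tag] = [url]

    def relocate(self, tag, new_url):
        """Point an existing tag at a new URL; the tag itself never changes."""
        if tag not in self._current:
            raise KeyError(f"unknown tag: {tag}")
        self._current[tag] = new_url
        self._history[tag].append(new_url)

    def resolve(self, tag):
        """Return the URL a citation tag currently resolves to."""
        return self._current[tag]

    def history(self, tag):
        """Return a copy of all URLs the tag has pointed at, oldest first."""
        return list(self._history[tag])


if __name__ == "__main__":
    registry = CitationRegistry()
    # Hypothetical tag and URLs, for illustration only.
    registry.register("10.9999/EXAMPLE/ds001", "https://old.example.org/ds001")
    registry.relocate("10.9999/EXAMPLE/ds001", "https://new.example.org/data/ds001")
    print(registry.resolve("10.9999/EXAMPLE/ds001"))
```

A paper that cites the tag keeps working after the server move, because only the registry entry changes.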
Data Evolution • Critical difference between data and traditional publications - data collections have a life cycle • New “improved” versions can be created • Easy case • Corrections small and large are made • Time and space domains can be appended • Metadata grows more comprehensive with usage/feedback, evaluations, publications
Data Evolution • How and when should data citation tags change? Answering the versioning question. • Absolute policy – impossible? • Sensible guidelines @ organizations • Across organizations? • Superseded versions cannot disappear • Once a version is cited it must remain available • Need libraries capable of monitoring citations • Archives need an authoritative opinion before taking action on “out-dated” versions.
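One possible reading of these versioning rules, sketched in Python under stated assumptions: every version gets its own tag, a superseded version stays resolvable and simply points at its successor, and a version that has been cited at least once can never be retired. The names `DatasetVersion`, `VersionedArchive`, and the citation counter are hypothetical illustrations, not a policy proposed in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetVersion:
    tag: str                             # immutable citation tag for this version
    url: str                             # where this version is served
    superseded_by: Optional[str] = None  # tag of the newer version, if any
    citation_count: int = 0              # citations known to the archive


class VersionedArchive:
    """Hypothetical archive in which cited versions can never be retired."""

    def __init__(self):
        self._versions = {}

    def publish(self, tag, url, supersedes=None):
        """Add a version, optionally marking an older version as superseded."""
        if tag in self._versions:
            raise ValueError(f"tag already in use: {tag}")
        if supersedes is not None:
            if supersedes not in self._versions:
                raise KeyError(f"unknown version: {supersedes}")
            self._versions[supersedes].superseded_by = tag
        self._versions[tag] = DatasetVersion(tag, url)

    def record_citation(self, tag):
        """Note that a publication has cited this exact version."""
        self._versions[tag].citation_count += 1

    def latest(self, tag):
        """Follow the superseded_by chain to the newest version's tag."""
        version = self._versions[tag]
        while version.superseded_by is not None:
            version = self._versions[version.superseded_by]
        return version.tag

    def retire(self, tag):
        """Remove a version, but only if it has never been cited."""
        version = self._versions[tag]
        if version.citation_count > 0:
            raise PermissionError(f"{tag} has been cited and must remain available")
        # Re-point any older version at this version's successor (or at nothing).
        for other in self._versions.values():
            if other.superseded_by == tag:
                other.superseded_by = version.superseded_by
        del self._versions[tag]
```

The citation count stands in for the monitoring role the slide assigns to libraries; in practice the archive would need that information from an outside authority before acting on an out-dated version.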
Thoughts • We need a data sharing movement – a three-pronged effort. • Funding agencies make data sharing a reviewable criterion, augmented with follow-up monitoring • Archive centers put immutable tags on long-term datasets • Publishers accept articles only if they have been reviewed as having adequate data citations
Thoughts • Publishers need to accommodate “data papers” – not a new idea. • Benefits • Credit data providers for career advancement • Foundation for monitoring usage of data • Informs users of what is available • Kick-starts the data citation process
References: • SCOR/IODE Workshop on Data Publishing, Oostende, Belgium, 17-19 June 2008. Paris, UNESCO, 23pp. 2008. (IOC Workshop Report No. 207) • Lowry, R. and P. Pissierssens, A New Approach to Data Publication in Ocean Sciences, EOS, Vol. 90, Number 50, 15 December 2009 • Policy on Referencing Data in and Archiving Data for AGU Publications, http://www.agu.org/pubs/authors/policies/data_policy.shtml • Cook, R.B., Citations to Published Data Sets, Fluxletter, Vol. 1, Number 4, December 2008, http://bwc.berkeley.edu/FluxLetter/FluxLetter-Vol1-No4.pdf • How to cite ORNL DAAC products, Citation Style, http://daac.ornl.gov/citation_style.html • ORNL, doi:10.3334/ORNLDAAC/547 • http://www.earth-system-science-data.net/home.html