Data Management for the International Polar Year
World Data Center for Glaciology, Boulder
Facilitating the international exchange of snow and ice data
Mark A. Parsons
IPY Data Policy and Management Sub-committee
IPY Data and Information Service
Electronic Geophysical Year
IPY Discussion Forum, Copenhagen, Denmark, 13 November 2005
[Timeline: IPY1, IPY2, IGY (IPY3), IPY4 (?)]
What will IPY4 bring?
• Will you be able to find all the data relevant to your research and see relationships between data sets?
• Will you be able to merge and integrate different data sets across experiments and disciplines?
• Will you be able to subset, visualize, and transform the data?
• Will you be able to retrieve and understand IPY4 data in 2050?
etc.
Data Management for IPY; Mark A. Parsons, IPY Discussion Forum, 13 November 2005
Organization of IPY Data Management
Data Policy & Management Subcommittee
• scientists
• data managers
• funding agencies
IPY Joint Committee eGY Programme Office Data & Information Service Users Projects Data Centers, Virtual Observatories, etc.
Systems and Innovation
[Chart categories: Succeeded, “Challenged”, Failed]
The Standish Group’s “CHAOS report”: an assessment of 40,000 IT application projects.
Alternate Views of the DIS
DIS? DIS?
Must do:
• Require rigorous data management plans
• Determine archive and identify data management point of contact within project
• Document well and often
• Negotiate roles, responsibilities, and milestones with archive and DIS
• Make data freely available*
• Ensure appropriate data attribution and ownership
• Ensure long-term preservation and access, including non-digital data
Should do:
• Identify relevant historical data and data from other projects and make appropriate arrangements
• Make data “interoperable” through standard formats, transfer mechanisms, and descriptions; build coalitions
• Facilitate model assimilation
• Develop high-level outreach products
Open Questions and Issues
• How interoperable do you want to be? What does “portal” mean to you?
• Integration requires work by the data provider
• How does IPY data fit into current operational systems?
• What about GEOSS: can we be a prototype?
• New data vs. old data
• Standards are essential, but which ones? (ISO19115, OAIS, OGC…)
• Tech trends that can help us (XML (GML), ontologies, portals, etc.)
• What do you think about the data policy?
• Need a solid business model, especially for the long term
We welcome your feedback!
Taco de Bruin, bruin@nioz.nl
Mark A. Parsons, parsonsm@nsidc.org
Data Management Considerations or Themes
• Manage technical innovation
• Systems need people
• Scientists and data managers working together
• Preservation and Access: Two peas in a pod
• The nature of the documentation
• The nature of the data
The People Part
“A striking proportion of project difficulties stem from people in both customer and supplier organisations failing to implement known best practice.”
Source: Oxford University/Computer Weekly survey of public and private sector IT projects (emphasis added)
Service counts.
However, people are much more able to adapt to change, uncertainty, and messy systems.
The People Part: Science and Data Management
• Many have stated the need to involve scientists in data management, but…
• It is also important to involve data managers in conducting science.
• Field experiments:
  • 20% increase in data quality (Parsons et al., 2004)
  • 70% of experiment cost is data collection (Longley et al., 2001)
• Observing systems
Preservation and Access: Two Peas in a Pod
Scientific Data Stewardship:
• “preservation and responsive supply of reliable and comprehensive data, products, and information for use in building new knowledge to…” (USGCRP, 1998)
• “the long-term preservation of the scientific integrity, monitoring and improving the quality, and the extraction of further knowledge from the data” (H. Diamond et al., NOAA/NESDIS, 2003)
Access. What is it?
• Preservation requirements are well defined in the Open Archival Information System (OAIS) Reference Model, but
• There is no similar model for access requirements; eGY could help
• There is not even a common definition of “access” and what restricts it
• Unique access requirements for social science data and non-digital collections (physical samples, photographs, audio, etc.)
Documentation
• Use existing standards, e.g.
  • ISO19115 metadata standard
  • OAIS Reference Model
• Describe uncertainty
• Challenge your assumptions
“We must not … start from any and every accepted opinion, but only from those we have defined — those accepted by our judges or by those whose authority they recognize.” (Aristotle, c. 350 BC)
The Data Itself
Formats:
• Archives and users may have different needs
• Consider four themes (Raymond, 2004)
  • Transparency
  • Interoperability
  • Extensibility
  • Storage or transaction economy
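As a rough illustration of how transparency and storage economy can pull against each other, the sketch below writes the same small table two ways: as self-describing text and as packed binary records. The sample values, column names, and record layout are all hypothetical and chosen only for the example; they do not come from any IPY data set or archive format.

```python
import struct

# Hypothetical sample: (day of year, snow depth in metres) pairs.
readings = [(1, 0.42), (2, 0.45), (3, 0.51), (4, 0.48)]


def to_text(rows):
    """Encode rows as CSV text with a header line.

    Readable in any editor and easy to extend with new columns,
    at the cost of more bytes per value.
    """
    lines = ["day_of_year,snow_depth_m"]
    lines += [f"{day},{depth:.2f}" for day, depth in rows]
    return "\n".join(lines).encode("ascii")


def to_binary(rows):
    """Encode rows as packed little-endian records (uint16 day, float32 depth).

    Compact, but meaningless to a future reader unless the record
    layout is documented somewhere else.
    """
    return b"".join(struct.pack("<Hf", day, depth) for day, depth in rows)


text_bytes = to_text(readings)
binary_bytes = to_binary(readings)

print("text size (bytes):  ", len(text_bytes))
print("binary size (bytes):", len(binary_bytes))
print(text_bytes.decode("ascii"))
```

The text form carries its own column names, which helps a user in 2050; the binary form is smaller, which may matter more to an archive holding very large volumes. Neither choice is right in general, which is one reason archives and users may need different formats for the same data.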
Data Management Principles (bumper stickers)
• Preservation without access is pointless; access without preservation is impossible.
• It’s about DATA, not systems
• Involve scientists in data management & data managers in science
• Think about long-term archiving NOW!
• Document uncertainty!
• Keep things simple & flexible
• Consider the needs of current, future, and unknown users
What’s Next?
• The Data and Information Service should be created soon.
• The Data Sub-Committee needs to consider these themes and principles when developing the IPY data policy.