Data Management Planning. Jake Carlson Purdue University Libraries. Ron Nakao Stanford University Libraries. What will be Covered. An introduction to terms and concepts. An understanding of the purpose of data management planning.
Data Management Planning Jake Carlson Purdue University Libraries Ron Nakao Stanford University Libraries
What will be Covered • An introduction to terms and concepts. • An understanding of the purpose of data management planning. • Coverage of some of the elements of data management planning and how they may relate to each other. • Case studies from Purdue and Stanford.
What is Data Management? “In the context of research and scholarship, "Data Management" refers to the storage, access and preservation of data produced from a given investigation. Data management is practiced through the entire lifecycle of the data…” • Texas A&M, Research Data Management Lib Guide http://guides.library.tamu.edu/DataManagement
What is a DMP? • A formal document. • Describes: • what data will be produced • how each type of data will be managed • how each type of data will be shared • how each type of data will be archived • who will take responsibility for these actions • DMP Resources and Examples: http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/resources.html
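The elements a DMP describes can be treated as a simple checklist per type of data. The following is a minimal sketch, assuming a hypothetical `DatasetPlan` record whose field names are invented here for illustration; it is not a format required by any funder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetPlan:
    """Plan for one type of data in a project (hypothetical field names)."""
    data_type: str          # what data will be produced
    management: str = ""    # how this type of data will be managed
    sharing: str = ""       # how it will be shared
    archiving: str = ""     # how it will be archived
    responsible: str = ""   # who takes responsibility for these actions

    def missing_elements(self) -> List[str]:
        """Return the names of plan elements that are still empty."""
        elements = ("management", "sharing", "archiving", "responsible")
        return [name for name in elements if not getattr(self, name).strip()]


# Example values are placeholders, not taken from a real plan.
survey = DatasetPlan(
    data_type="survey responses",
    management="stored on a backed-up departmental server",
    sharing="public release after de-identification",
)
print(survey.missing_elements())  # ['archiving', 'responsible']
```

A check like this only flags empty elements; whether each answer is adequate still needs review by the researcher and the funder's guidance.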
DMP Requirement (NSF) • Data - samples, physical collections, software, curriculum materials, and other materials; • Standards - for data and metadata formats and content; • Policies for access and sharing – including intellectual property (IP), protection of privacy/confidentiality, security, etc.; • Policies for re-use – including provisions for re-distribution, and the production of derivatives; • Archiving - data, samples, and other research products, and for preservation of access. http://www.nsf.gov/bfa/dias/policy/dmp.jsp
DMP Tool https://dmp.cdlib.org/
Why Manage Data? • Because you have to: • Meet grant requirements • Because you want to: • Increase the visibility of your research • Simplify your life / Save time • Protect yourself http://libraries.mit.edu/guides/subjects/data-management/why.html
Effective Data Management Planning • Is a process, not an event • Probably requires more thought than it is given in developing the grant • Probably requires more than 2 pages • Should be informed by disciplinary and local cultures and environments • Should be driven by goals and objectives • Must be implemented to be successful
Other DMP Elements (ICPSR) • Responsibility - who does what, when? • Audience – identifying the potential secondary users of the data • Selection and retention periods – • what criteria will be used? • how long will data be retained and/or archived? • when will data be transferred to a 3rd party for curation? • Quality Assurance • Ethics & Legal Requirements • Budget & Financial Aspects http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/elements.html
DMP Purpose [Diagram: data lifecycle stages: Proposal Development & DMPs, Project Start-Up, Data Collection & File Creation, Data Analysis, Preparing Data for Sharing, Depositing Data]
Guidance Across the Lifecycle • Preparing Data for Sharing: • Address disclosure risk limitation • Determine file formats to deposit • Contact archive for advice
Case Study on Data Management Planning
Libraries-sponsored research center. • Established in 2006 to focus on issues associated with curating data sets for present and future research use. • Working in partnership with domain scientists and IT personnel to address the real-world data needs of a research community.
Background Research • “Unpacking” the NSF requirements • Review of the content of existing data management plans • Review of existing guides on creating a DMP • Review of the information gathered from our Data Curation Profiles work, and other faculty-librarian collaborations
The Data Curation Profile is not designed to produce a Data Management Plan; however, it could be used as a foundation to develop a more specific tool. IASSIST 2011
Interviews • Working with OVPR, four proposals were selected: • Engineering Education • Agronomy • Physics / Electrical & Computer Engineering • Pharmacy • Interviews are conducted: • Multiple faculty / Multiple interviews • Sponsored Programs personnel and Subject Librarians also attend interviews
Challenges • Metadata & Preservation • Hard for researchers to define, or their understanding may not be fully accurate. • Archive = an old copy and/or a back-up copy • Generally outside researchers’ current practices. • Disciplinary standards or solutions may not be known, or may not exist.
DMP Self-Assessment Questionnaire http://purl.lib.purdue.edu/d2c2/dmp_saq
Guides
PURR http://purr.purdue.edu
nanoHUB http://nanohub.org/
Publishing & Curation [Screenshot: dataset page with tabs Supporting Docs, Versions, Reviews, Questions and sections Abstract, Cite this Work, Tags, Citations]
Stanford Case Study • Stanford Data Management Services • Faculty collaboration example (HCMST) • Stanford Digital Repository (SDR)
Plan • Determine Funder Requirement • DMPTool list • Preparation • Create a Data Management Plan • DMPTool • Decide How to Share • Licensing (CC, ODC) • Other Issues (IP, IRB)
Manage • Organize Your Data • Names, Formats, Metadata, Versioning, Documentation, Knowledge Transfer Plan • Back Up Your Data • Storage, Backup & Recovery Services • Acquire & Analyze Data • Social Science Data, Geographical Data
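One way to make the "Names" and "Versioning" points concrete is to generate file names from a fixed pattern. The sketch below assumes a hypothetical convention of project, description, date, and version; the `build_file_name` helper and the example values are illustrative only and do not represent a Stanford standard.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, created: date,
                    version: int, extension: str) -> str:
    """Build a name like 'hcmst_wave-1-survey_20110601_v02.csv'."""
    def slug(text: str) -> str:
        # Lowercase, then replace runs of non-alphanumerics with one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # Zero-padded versions and ISO-ordered dates sort correctly by name.
    return (f"{slug(project)}_{slug(description)}_"
            f"{created:%Y%m%d}_v{version:02d}.{extension.lstrip('.').lower()}")


# The date and version below are made-up example values.
print(build_file_name("HCMST", "Wave 1 survey", date(2011, 6, 1), 2, ".CSV"))
# hcmst_wave-1-survey_20110601_v02.csv
```

Whatever pattern a project picks, documenting it alongside the data matters more than the specific choice of separators.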
Preserve • Select Data for Archiving • Questions to consider • Assign Metadata • Deposit Data in a Repository • Stanford Digital Repository (SDR) • Subject-Specific Repositories
Case Study • Collaborating with Professor Michael Rosenfeld on Data Management Plan & Its Implementation • DMP (later in Exercise) • “Painless” creation of Metadata • Quick turnaround for public data sharing • <data.stanford.edu> • Long-term Preservation • ICPSR • Stanford Digital Repository (SDR)
<data.stanford.edu> Metadata • Title • Citation: Abstract, Principal Investigator, Funding Agency, Bibliographic Citation, Contact Email • Description: Introduction, Acknowledgements • Methodology: Universe, Unit of Analysis, Type of data collection, Time span, Time of data collection, Geographic coverage, Smallest geographic unit, Sample description, Sample response rate, Weights • Documentation: Document file(s), Web site or document download link(s) • Data Download Link(s): Data file(s) • Notes: Errata, Data Notes • News: News Coverage
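A metadata template like this can be checked mechanically for gaps before a data page goes public. The following is a rough sketch, assuming a hypothetical dictionary layout that covers only three of the sections listed above; it does not describe how the data.stanford.edu site itself stores metadata.

```python
# Subset of the metadata sections from the slide; the layout is an assumption.
METADATA_TEMPLATE = {
    "Citation": ["Abstract", "Principal Investigator", "Funding Agency",
                 "Bibliographic Citation", "Contact Email"],
    "Description": ["Introduction", "Acknowledgements"],
    "Notes": ["Errata", "Data Notes"],
}


def missing_fields(record: dict) -> dict:
    """Map each section to the template fields that the record leaves empty."""
    gaps = {}
    for section, fields in METADATA_TEMPLATE.items():
        filled = record.get(section, {})
        empty = [name for name in fields if not filled.get(name)]
        if empty:
            gaps[section] = empty
    return gaps


# Placeholder record: values are illustrative, not real study metadata.
record = {
    "Citation": {"Abstract": "(placeholder)",
                 "Principal Investigator": "(placeholder)",
                 "Contact Email": ""},
    "Description": {"Introduction": "(placeholder)"},
}
gaps = missing_fields(record)
for section, fields in gaps.items():
    print(section, "->", ", ".join(fields))
```

Reporting gaps by section keeps the feedback close to how the PI fills in the page, section by section.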
Lessons from Case Study • Quick development, enhancement, and data availability (Drupal) • Active PI involvement & metadata creation • Ownership & “freshness” of PI’s data page • Easy referral by PI (customized URL), usage stats, and contact lists provided ongoing value for PI
Stanford Digital Repository (SDR) • The SDR is a service supporting long-term management of scholarly information resources at Stanford. • Deposit in the SDR enables faculty, students, researchers to promote and protect the products of their work. • Librarians use the SDR to preserve and share scholarly collections of enduring value to the larger Stanford community. • Through robust preservation and security measures, the repository maintains appropriate access to deposited content from persistent web links while protecting against data loss and corruption.
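Protection against corruption is commonly supported by fixity checks: a checksum is recorded at deposit and recomputed later. The sketch below illustrates that general technique with SHA-256 from the Python standard library; it is an assumption for illustration and not a description of how the SDR is implemented.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_checksum: str) -> bool:
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum


# Demonstration with a temporary file (hypothetical file name and contents).
with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "example.csv"
    data_file.write_text("id,value\n1,42\n")
    recorded = sha256_of(data_file)            # checksum stored at deposit
    intact_before = is_unchanged(data_file, recorded)
    data_file.write_text("id,value\n1,43\n")   # simulate silent corruption
    intact_after = is_unchanged(data_file, recorded)

print(intact_before, intact_after)  # True False
```

A checksum only detects change; recovering from it still depends on having independent copies of the file.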
Stanford’s Digital Library Infrastructure [Diagram courtesy of Hannah Frost, Services Manager, Stanford Digital Repository]
Thanks! Any Questions?