
April 6, 2017 – NADDI Conference, Cornell University

Capturing Metadata Early In The Research Data Lifecycle. Barry T. Radler, PhD University of Wisconsin-Madison Institute on Aging. April 6, 2017 – NADDI Conference, Cornell University. Overview. Background The Ideal Capture (variable-level) metadata earlier in lifecycle DDI in Theory


Presentation Transcript


  1. Capturing Metadata Early In The Research Data Lifecycle. Barry T. Radler, PhD, University of Wisconsin-Madison Institute on Aging. April 6, 2017 – NADDI Conference, Cornell University

  2. Overview • Background • The Ideal • Capture (variable-level) metadata earlier in lifecycle • DDI in Theory • Survey Metadata Capture in Practice • UW Survey Center DDI-Word instrument template • Conclusions

  3. Background

  4. Background • MIDUS • Longitudinal multi-disciplinary study of health/well-being • Complex amount of data • Wide secondary usage through ICPSR • DDI facilitates wide use • MIDUS DDI Portal – http://midus.colectica.org

  5. The Ideal

  6. Metadata driven data capture: EDDI 2016 presentations • Archivist & Mapper: Simplifying and Modernising Questionnaire Entry - Will Poynter • Questionnaire Generator - Guillaume Duffes • Rich Metadata from the Start - Oliver Hopt • The DASISH Questionnaire Design and Documentation Tool – Functionalities and Examples from the Tool - Benjamin Beuster, Hilde Orten • Question Banks, Reusability, and DDI 3.2 - Dan Smith • Steps towards a Single Point of Access for Survey Questions across Europe: The Euro Question Bank Project - Wolfgang Zenk-Möltgen, Azadeh Mahmoud Hashemi • Document Questionnaires and Datasets with DDI: A Hands-On Introduction with Colectica - Jeremy Iverson, Dan Smith

  7. Capturing Metadata Earlier in Lifecycle “Every activity in the data life cycle should be documented as it occurs from conceptualization to publication.” – DDI Long-term Infrastructure Manifesto (forthcoming) • [Figure: DDI 3 “Lifecycle”]

  8. Leveraging Metadata Earlier in Lifecycle • Capture study and instrument design metadata—once—at time of occurrence or creation • More efficient and easier to capture information about the research workflow at the time of its occurrence rather than after the fact • Metadata capture not realized at time of occurrence or creation leads to information loss • Potentially employ metadata to drive survey administration
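As a rough illustration of the "capture once" and "drive survey administration" bullets, the Python sketch below defines instrument metadata a single time and reuses it both to administer the questions (including a simple skip rule) and to print a codebook. The `Question` class, the `ask_if` rule format, and the variable names are hypothetical; they are not taken from the presentation, from DDI, or from any UWSC tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Question:
    """Hypothetical variable-level metadata, captured once at design time."""
    name: str                      # variable name in the eventual dataset
    text: str                      # question wording
    categories: Dict[int, str]     # response code -> label
    # (earlier variable, required code); None means "always ask"
    ask_if: Optional[Tuple[str, int]] = None


INSTRUMENT: List[Question] = [
    Question("SMOKE", "Do you currently smoke?", {1: "Yes", 2: "No"}),
    Question("CIGS", "How many cigarettes per day?",
             {1: "Fewer than 10", 2: "10 or more"}, ask_if=("SMOKE", 1)),
    Question("HEALTH", "How would you rate your health?",
             {1: "Excellent", 2: "Good", 3: "Poor"}),
]


def administer(instrument: List[Question],
               get_answer: Callable[[Question], int]) -> Dict[str, int]:
    """Use the metadata to decide which questions are asked and to validate answers."""
    responses: Dict[str, int] = {}
    for q in instrument:
        if q.ask_if is not None:
            variable, required_code = q.ask_if
            if responses.get(variable) != required_code:
                continue  # skip rule not satisfied, so the question is not asked
        answer = get_answer(q)
        if answer not in q.categories:
            raise ValueError(f"{answer!r} is not a valid code for {q.name}")
        responses[q.name] = answer
    return responses


def codebook(instrument: List[Question]) -> str:
    """Reuse the same metadata to produce documentation for data users."""
    lines: List[str] = []
    for q in instrument:
        lines.append(f"{q.name}: {q.text}")
        if q.ask_if is not None:
            variable, required_code = q.ask_if
            lines.append(f"  Asked if {variable} = {required_code}")
        for code, label in q.categories.items():
            lines.append(f"  {code} = {label}")
    return "\n".join(lines)


if __name__ == "__main__":
    scripted = {"SMOKE": 2, "CIGS": 1, "HEALTH": 1}
    print(administer(INSTRUMENT, lambda q: scripted[q.name]))
    print(codebook(INSTRUMENT))
```

Because the skip rule lives in the metadata rather than in survey code, the codebook can state who was asked each question without anyone reconstructing that after fielding.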

  9. Increased Efficiency of Metadata Production

  10. Data Documentation Initiative in Theory

  11. The Data Documentation Initiative (DDI) is an international standard for describing the data produced by surveys and other observational methods in the social, behavioral, economic, and health sciences. • DDI is a free standard that can document and manage different stages in the research data lifecycle, such as conceptualization, collection, processing, distribution, discovery, and archiving. • Documenting data with DDI facilitates understanding, interpretation, and use -- by people, software systems, and computer networks.

  12. Advantages of DDI: • Introduces a common communication protocol to research processes • Increases transparency across systems and software • Interoperates with other standards such as DataCite and Dublin Core • A free and open standard (XML) • Advantages of XML: • Is interoperable; not concerned with any particular OS • Widely used data exchange standard • No licenses or usage requirements • Easily transformed into presentation languages such as HTML, PDF or plain text.
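The claim that XML is easily transformed into presentation formats can be sketched in a few lines of Python using only the standard library. The XML fragment and its element names (`instrument`, `question`, `category`) are a simplified, hypothetical stand-in and do not follow the actual DDI schema; the point is only that one source document can be rendered as plain text and as HTML.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified metadata; element names are illustrative
# and are NOT real DDI elements.
SOURCE_XML = """
<instrument name="Example Survey">
  <question id="Q1">
    <text>How would you rate your health?</text>
    <category code="1">Excellent</category>
    <category code="2">Good</category>
    <category code="3">Poor</category>
  </question>
  <question id="Q2">
    <text>Do you currently smoke?</text>
    <category code="1">Yes</category>
    <category code="2">No</category>
  </question>
</instrument>
"""


def load_questions(xml_text):
    """Parse the XML once into a neutral list of dictionaries."""
    root = ET.fromstring(xml_text)
    questions = []
    for q in root.findall("question"):
        questions.append({
            "id": q.get("id"),
            "text": q.findtext("text"),
            "categories": [(c.get("code"), c.text) for c in q.findall("category")],
        })
    return questions


def to_plain_text(questions):
    """Render the questions as a plain-text listing."""
    lines = []
    for q in questions:
        lines.append(f"{q['id']}. {q['text']}")
        for code, label in q["categories"]:
            lines.append(f"  {code} = {label}")
    return "\n".join(lines)


def to_html(questions):
    """Render the same questions as an HTML fragment."""
    parts = []
    for q in questions:
        items = "".join(
            f'<li value="{escape(code)}">{escape(label)}</li>'
            for code, label in q["categories"]
        )
        parts.append(f"<p>{escape(q['id'])}. {escape(q['text'])}</p><ol>{items}</ol>")
    return "\n".join(parts)


if __name__ == "__main__":
    parsed = load_questions(SOURCE_XML)
    print(to_plain_text(parsed))
    print(to_html(parsed))
```

Real DDI documents are namespaced and far richer than this fragment, so production transformations would more likely rely on XSLT or dedicated DDI tooling than on hand-written renderers like these.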

  13. DDI: One Document, Many Uses

  14. Metadata driven research reports • “The Sponsorship on Quality recommended that quality reporting should be streamlined and rationalised across the ESS, by using the existing metadata systems and by creating a ‘once for all purposes’ reporting strategy.”

  15. Metadata Capture in Practice

  16. Challenges to adopting/using DDI • Complexity • DDI 3.2: 1,100 tags • Documentation and training • Low level of researcher buy-in • More appealing to large organizations, official statistics • Need for tools • Lower entry barriers • Utilitarian tools for reuse, not one-off • Organizational resistance to changes in workflow

  17. UWSC: A MS Word template

  18. UWSC experience • Goal: • Documentation standard that produces one source document that can be reused through lifecycle • Create authoring tool that clients are familiar with (Word) • Current CAI: CASES • Computer-Assisted Survey Execution System • DDI2 compliant • Isolated from other lifecycle stages

  19. Word Template

  20. PDF version

  21. Web version

  22. CASES version

  23. UWSC experience • Obstacles: • Describe how an instrument: • Behaves (instrument logic and variable metadata) • Looks (layout, display, graphics) • Especially useful for mixed mode surveys • DDI is limited in documenting display issues for production • Can reference external content (URLs)

  24. Metadata and survey mode “One important finding, which was not part of the original remit of this investigation, is awareness of how much harder it is to include in the study documentation a questionnaire that has been developed for collecting data on an electronic device rather than on paper. HDSS, which moved to electronic data collection using specialist software like CSPro, need to be aware that for documentation purposes they need to develop paper versions of the questionnaire for explanatory purposes, or supply the code and its interpretation (e.g., as screen shots) as part of the documentation package.” Chifundo Kanjala, Jim Todd, David Beckles, Tito Castillo, Gareth Knight, Baltazar Mtenga, Mark Urassa, and Basia Zaba. (2016). Open-access for existing LMIC demographic surveillance data using DDI. IASSIST Quarterly, Summer.

  25. UWSC experience • Obstacles: • Whose metadata is important? • Different types/forms of metadata • Producers • Users

  26. Different actors, different metadata needs • Two stakeholders with competing interests: • The data collector (producer/designer) wants to document the project management processes involved from conceptualization to fielding of final instrument. • The client (user/analyst) wants to document the results produced by the final instrument and any fielding occurrences that can affect the interpretation of those results.

  27. Different actors, different metadata needs From SIMS report: “Only a certain level of detail and only some of the quality concepts are of interest to the general users of European statistics who are mainly interested in the statistical outputs. On the other hand, all detailed quality concepts (up to the lowest level of detail) are of interest to the producers of European statistics who are also interested in the statistical production processes. Some of the concepts are of interest to both groups.”

  28. Conclusions

  29. Capturing metadata early - Conclusions • Capturing metadata early in the research data lifecycle • One DDI document → repurposed for multiple uses • Reduce redundancy and information loss • Technical issues • Across different platforms and systems • Instrument behavior and display across modes of administration • Non-technical issues • Distinct and non-overlapping metadata needs • Within organizations and across different stakeholders • Study-level metadata not as problematic as variable-level? • AAPOR Transparency Initiative

  30. DDI-Word template later in data lifecycle • Study-level Metadata • Objectives, population, sampling, methodology, funding or client identifiers, response rates, disposition codes, quality reports, weighting specs. • Fewer items, changes, display issues • Fewer technical and personnel obstacles • AAPOR Transparency Initiative • Designed to promote methodological disclosure • Develop simple and efficient means for routinely disclosing research methods by identifying common disclosure elements

  31. Special Thanks to UWSC Programmers: • Eric White • Brendan Day

  32. Thank you! • bradler@wisc.edu • This presentation is offered under license CC BY-SA 4.0
