
Data Curation Profiles Workshop (Sort of, not really)

Jake Carlson, Data Services Specialist. Workshop hashtag: #dcptoolkit


Presentation Transcript


  1. Data Curation Profiles Workshop (Sort of, not really) • Jake Carlson, Data Services Specialist • Workshop hashtag: #dcptoolkit

  2. Outline • What are Data Curation Profiles? • Preparation • Interviewing • Building a Profile

  3. The Data Curation Profile Toolkit is a means to determine: • Information about a particular data set and its lifecycle. • What a researcher is doing to manage / curate the data set. • What a researcher would like to do with the data.

  4. Characteristics of the DCP • Tells “the story” of the data • Focused on a specific data set – provides depth not breadth • Interview based • Meant to be “discipline neutral” and widely applicable to different types of data • Modular – allows for flexibility and tailoring to specific situations and uses

  5. Characteristics of the DCP • Represents the researcher’s needs and perspectives • A concise, structured document suitable for sharing and annotation. • A resource for Librarians, Archivists, IT Professionals, Data Managers, and others.

  6. DCP Sections • Information about a Data Set and its Context • Overview of the Research • Focus • Intended Audience • Funding • Data Kinds and Stages • Data Narrative (data lifecycle) • Target Data for Sharing • Use/re-use Value • Contextual Narrative

  7. What is a data set? • The data collected and analyzed for a specific project or problem • Primary = data generated and analyzed to achieve project results • Ancillary = additional data which further adds to the project

  8. Define curation • Curation is the activity of managing and promoting the use of data, starting from the point of creation, to ensure its fitness for contemporary purposes and availability for discovery and re-use. • Archiving is a curation activity which ensures that data is properly selected and stored, can be easily accessed and that its logical and physical integrity is maintained over time. • Preservation is an archiving activity in which specific items of data are maintained over time so that they can still be accessed and understood through succession and obsolescence of technologies. Lord, Macdonald, Lyon & Giaretta (2004) "From data deluge to data curation." Proceedings of the UK e-Science All Hands Meeting 2004, 31st August - 3rd September, Nottingham UK.

  9. The “story” of a research data set • “Data from both studies in this project consist primarily of field data and plant samples. Variables gathered include the yield and overall health of the plot, the physical characteristics of the plant sample, and the amount of selected nutrients present in the sample.” • “The scientist studies real-time traffic signal performance measures in which he measures the movement of traffic, specifically the number of vehicles passing through an intersection and the amount of time they spend at an intersection on a movement-by-movement basis over a 24 hour period.”

  10. Understanding data lifecycle • Researchers talk about their data in different phases, stages or levels • Helpful to understand distinctions because • It is often how they identify their work • Helps in talking about process/methods • Sometimes differences determine what someone is willing/able to share • Only some data may be curated

  11. Data lifecycle General data lifecycle indicating stages or levels of data Humphrey, Charles. “e-Science and the Life Cycle of Research” (2006) Retrieved 4/20/10: http://datalib.library.ualberta.ca/~humphrey/lifecycle-science060308.doc

  12. Data lifecycle

  13. Understand / Assess Data Workflows

  14. More DCP Sections • Information about Needs • Tools • Interoperability • Measuring Impact • Data Management • Preservation • Intellectual Property • Organization and description of data • Ingest • Access • Discovery

  15. Components of the Data Curation Profile Toolkit

  16. The DCP Toolkit The Data Curation Profile Toolkit consists of 4 components: • User Guide • Interviewer’s Manual • Interview Worksheet • DCP Template Photo from: http://www.flickr.com/photos/neilt/2517652/sizes/m/in/photostream/

  17. The User Guide • Describes the rationale for the DCP • Describes the process through which a DCP is generated • Stage 1 – Preparation • Stage 2 – Worksheet & Interviews • Stage 3 – Constructing the Profile • Provides guidance & advice

  18. Interview Worksheet and Manual • Meant to be used in tandem • The Interview Worksheet is given to the researcher to fill out. • The Interviewer’s Manual contains follow up questions for the interviewer to ask once the researcher has filled out a module.

  19. Interviewing • Using the Interview Manual & Worksheet • Read any introductory statement listed in the “Interviewer’s Manual” (if any) • Then have the researcher complete the list of questions for the module in the “Interview Worksheet”. • Review the responses and ask the questions listed in the “Interviewer’s Manual” as appropriate. • Ask any follow up questions you feel are needed. • Move on to the next module.

  20. Data Curation Profile Template • Describes the structure of the Data Curation Profile • Each section or sub-section contains a brief definition of the information that is needed to populate a Data Curation Profile

  21. Connections Between Components • Worksheet • Manual • Template

  22. Worksheet Mod.13 Q2 Worksheet

  23. Manual: Mod 13 Q2

  24. Connections Between Components Section 13.1 of the Completed Data Curation Profile

  25. How to Develop a DCP A Data Curation Profile is developed through 3 stages: • Stage 1 – Preparation • Stage 2 – Interviews • Stage 3 – Constructing the Profile

  26. Stage 1 – Preparing • Schedule the interview. • At time / place convenient for the researcher • Select the data set to be profiled. • This may be negotiated by them if they feel they have a project in particular that they want to discuss, or one that is “easier” to discuss • Be sure to let researcher know that you will be recording the interview.

  27. Stage 1 – Preparing • Sending out the “Interview Worksheet” in advance. • Pros and Cons… • Filling out the worksheet—we have found that people are not reluctant to fill them out, just that they either forget, or can’t find time • Ask for any materials that may help familiarize you with the selected data set. • An article, report, S.O.P., or other documentation

  28. Stage 1 – Preparing • Do your homework - Investigate researchers’ work and use of data • Faculty publications • Faculty’s website • Press release / News Article • Review of awarded grants

  29. Exercise Read the article and ID the data

  30. In-depth: Scheduling • Schedule time & place to meet—best if quiet, room to spread out (some offices are packed!) • Send a gentle reminder to fill out Worksheet, if you sent it beforehand. (If they don’t, try to get them to do so during interview, noting that notes will help in compiling Profile)

  31. Stage 1 – Preparing • Modifications to the DCP • How many of the profile modules do you want to include? • Time • Purpose • Core Modules • Are there additional modules or questions that you want to include? Photo from: http://www.flickr.com/photos/julia_manzerova/2190216162/sizes/m/in/photostream/

  32. Interviewing

  33. Stage 2 – Interviewing • Introduction to the Interview • The need for two interviews • Time required • Coverage Image from: http://www.flickr.com/photos/terryhart/2890904949/sizes/m/in/photostream/

  34. Stage 2 – Interviewing • Using the Interview Manual & Worksheet • Read any introductory statement listed in the “Interviewer’s Manual” (if any) • Then have the researcher complete the list of questions for the module in the “Interview Worksheet”. • Review the responses and ask the questions listed in the “Interviewer’s Manual” as appropriate. • Ask any follow up questions you feel are needed. • Move on to the next module.
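
The per-module procedure on this slide can be read as a simple loop. The following is a minimal sketch only, not part of the DCP Toolkit: the `Module` class, its field names, and `interview_steps` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """One interview module. All field names are hypothetical."""
    name: str
    intro: str = ""  # introductory statement from the Interviewer's Manual, if any
    worksheet_questions: list = field(default_factory=list)
    manual_questions: list = field(default_factory=list)


def interview_steps(modules):
    """Return the ordered steps an interviewer follows, module by module."""
    steps = []
    for m in modules:
        # Step 1: read the introductory statement, if the manual has one.
        if m.intro:
            steps.append(f"[{m.name}] Read introductory statement: {m.intro}")
        # Step 2: the researcher fills out the worksheet questions for the module.
        count = len(m.worksheet_questions)
        steps.append(f"[{m.name}] Researcher completes {count} worksheet question(s)")
        # Step 3: review the responses and ask the manual's questions as appropriate.
        for q in m.manual_questions:
            steps.append(f"[{m.name}] Ask from manual, as appropriate: {q}")
        # Step 4: ask any other follow-ups, then move on to the next module.
        steps.append(f"[{m.name}] Ask any further follow-up questions")
    return steps
```

Calling `interview_steps` with a list of `Module` objects yields a flat checklist that could be printed before an interview. The module contents themselves would come from the Interview Worksheet and Interviewer's Manual.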

  35. Stage 2 – Interviewing • Types of Questions: Worksheet • Free text • Short answer (text) • Selecting from a range of possible responses • Yes/No • Likert Scale Image from: http://www.flickr.com/photos/valeriebb/3006348550/sizes/m/in/photostream/

  36. Stage 2 – Interviewing • Types of Questions: Manual • Explanatory – “Tell me why you selected ‘x’ as your response” • Clarifying – “Could you explain what you mean by ‘x’?” • Probing – “Could you tell me more about ‘x’?” • Relational – “Could you tell me how ‘x’ relates to your earlier response of ‘y’?” • Summarization – “So ‘x’ leads to ‘y’ is that right? Then what happens?”
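
As an illustration of how the five follow-up question types might be reused across modules, here is a small sketch that turns the slide's example phrasings into fill-in templates. The dictionary name, the `follow_up` helper, and the `{x}` / `{y}` placeholder scheme are assumptions made here, not part of the Interviewer's Manual.

```python
# Example phrasings taken from the slide; 'x' and 'y' become named placeholders.
FOLLOW_UP_TEMPLATES = {
    "explanatory": "Tell me why you selected '{x}' as your response.",
    "clarifying": "Could you explain what you mean by '{x}'?",
    "probing": "Could you tell me more about '{x}'?",
    "relational": "Could you tell me how '{x}' relates to your earlier response of '{y}'?",
    "summarization": "So '{x}' leads to '{y}', is that right? Then what happens?",
}


def follow_up(kind, x, y=""):
    """Fill in one follow-up question; y is only used by two-part templates."""
    return FOLLOW_UP_TEMPLATES[kind].format(x=x, y=y)
```

For example, `follow_up("relational", "raw data", "published tables")` produces a relational question linking two earlier responses.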

  37. Interview Tip Sheet

  38. Exercise Conducting the Interview • What was said about sharing the data set by the researcher? • What follow up questions would you ask?

  39. Tools of the Trade

  40. Transcribing the Audio

  41. Transcribing the Interview • Full:

  42. Transcribing the Interview • Indexed

  43. Modules and Sections of DCP

  44. Exercises • Compose a subsection of a Data Curation Profile from a completed interview worksheet and transcript.

  45. Exercise: Building a Profile 2.2 - Intended audiences The information needed to populate this sub-section will likely come from Module 3 – Sharing. This sub-section is meant to identify who the potential audiences for the data (not the research as a whole) are or might be according to the data client. The audience types listed may be specific (“Researchers studying the effects of climate change on plant growth during the Mesozoic era”) or broad (“Climate Change Researchers”) as dictated by the data client. Audience types may be those with whom the data client is currently sharing his or her data or audience types the data client imagines would be interested in this data.

  46. Data lifecycle General data lifecycle indicating stages or levels of data Humphrey, Charles. “e-Science and the Life Cycle of Research” (2006) Retrieved 4/20/10: http://datalib.library.ualberta.ca/~humphrey/lifecycle-science060308.doc

  47. Background: Data lifecycle (cont’d)

  48. In-depth: Data lifecycle (cont’d)

  49. Exercise: Data Lifecycle
