
DataShare: Collaboration Yields Promising Tool

Julia Kochi, UCSF Library; Angela Rizk-Jackson, UCSF CTSI; Perry Willett, CDL. CNI 2013 Meeting, San Antonio, TX.


Presentation Transcript


  1. DataShare: Collaboration Yields Promising Tool • Julia Kochi, UCSF Library • Angela Rizk-Jackson, UCSF CTSI • Perry Willett, CDL • CNI 2013 Meeting, San Antonio, TX

  2. The Background • Julia Kochi, UCSF Library

  3. What is DataShare? • An open data repository for the UCSF researcher • A concept initially envisioned by Michael Weiner, M.D. • A collaboration between UCSF CTSI, UCSF Library, and the California Digital Library

  4. The Problem • Increasing requirements to share data • NIH grants >$500k • Publisher requirements • Unequal availability of national repositories • Campus priorities • FASTR, White House Directive

  5. The Partners • UCSF CTSI • Knowledge of the researcher, access to the data • UCSF Library • Metadata expertise, programming resources • UC3 • Preservation tools, services, and expertise

  6. Technical Infrastructure • Perry Willett, California Digital Library

  7. DataShare Components • Merritt: CDL • EZID: CDL • XTF: CDL, UCSF Library • Ingest tool: UCSF Library

  8. Merritt Repository Service • Built on “micro-services” principles • Content and format agnostic • Has a UI and RESTful APIs to submit and retrieve content, and check statuses • Can serve as either “dark” or “bright” archive • Added public access, data use agreements, asynchronous downloads as part of DataShare project

  9. EZID • Service for creation and management of long-term identifiers • Currently supports ARKs and DOIs; other types in planning stages • Registers DOIs with DataCite • Has a UI and APIs with good documentation
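
Slide 9 mentions EZID's APIs without showing a request. As a rough illustration only, and assuming EZID's documented ANVL request format (one percent-encoded `name: value` pair per line) together with its `datacite.*` element names, a metadata body might be assembled as below. The helper names and the record values are hypothetical, and no network call is made; the EZID API documentation remains the authority on the actual format.

```python
import re


def anvl_escape(text: str) -> str:
    """Percent-encode characters that would break a 'name: value' line."""
    return re.sub(r"[%:\r\n]", lambda m: "%{:02X}".format(ord(m.group(0))), text)


def build_anvl(metadata: dict) -> str:
    """Serialize a metadata dict as ANVL-style lines (assumed EZID format)."""
    return "\n".join(
        f"{anvl_escape(name)}: {anvl_escape(value)}"
        for name, value in metadata.items()
    )


# Hypothetical record; the element names are assumed from EZID's DataCite profile.
record = {
    "datacite.title": "Example dataset: pilot",
    "datacite.creator": "Researcher, Example",
    "datacite.publisher": "Example University",
    "datacite.publicationyear": "2013",
}
body = build_anvl(record)
```

The colon inside the title is encoded so that it cannot be confused with the name/value separator.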

  10. XTF • eXtensible Text Framework • Developed and maintained by CDL • Runs several CDL services: • eScholarship • Online Archive of California • Calisphere • Faceted browsing, full-text search, other desirable features

  11. Ingest tool • Submitting content to a digital repository is hard and costly • An attempt to simplify several aspects: • Digital object creation • Metadata creation • Object submission

  12. Interactions for submission • Ingest Tool: creates metadata, assembles dataset, packages object, submits to Merritt • Merritt: requests DOI, submits metadata to EZID, receives DOI • EZID: registers DOI and metadata with DataCite • XTF: requests ATOM feed for collection, gets ATOM feed, retrieves metadata, indexes metadata
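
The submission interactions on slide 12 can be sketched as a toy, in-memory model. Every class, method, and identifier below is a hypothetical stand-in rather than the real Merritt, EZID, or XTF interface; the sketch only tries to show the order of the hand-offs: the ingest tool packages and submits, Merritt obtains a DOI from EZID, and XTF later harvests a feed and indexes the metadata.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Ezid:
    """Stand-in identifier service that mints made-up DOIs in memory."""
    registered: dict = field(default_factory=dict)
    _seq: count = field(default_factory=lambda: count(1))

    def mint_doi(self, metadata: dict) -> str:
        doi = f"doi:10.0000/example-{next(self._seq)}"  # placeholder, not a real DOI
        self.registered[doi] = dict(metadata)
        return doi


@dataclass
class Merritt:
    """Stand-in repository: stores packages and asks EZID for identifiers."""
    ezid: Ezid
    objects: dict = field(default_factory=dict)

    def submit(self, package: dict) -> str:
        doi = self.ezid.mint_doi(package["metadata"])  # request DOI, send metadata
        self.objects[doi] = package                    # receive DOI, store object
        return doi

    def feed(self) -> list:
        """Stand-in for the per-collection ATOM feed: one entry per object."""
        return [{"id": doi, "metadata": pkg["metadata"]}
                for doi, pkg in self.objects.items()]


@dataclass
class Xtf:
    """Stand-in discovery layer that indexes metadata from the feed."""
    index: dict = field(default_factory=dict)

    def harvest(self, merritt: Merritt) -> None:
        for entry in merritt.feed():
            self.index[entry["id"]] = entry["metadata"]


def ingest(merritt: Merritt, files: list, metadata: dict) -> str:
    """Ingest tool: assemble the dataset, package it, submit to the repository."""
    package = {"files": list(files), "metadata": dict(metadata)}
    return merritt.submit(package)


ezid = Ezid()
merritt = Merritt(ezid)
xtf = Xtf()
doi = ingest(merritt, ["data.csv", "README.txt"], {"title": "Example dataset"})
xtf.harvest(merritt)
```

After `harvest` runs, the dataset is findable in the stand-in index under the identifier that EZID returned to Merritt.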

  13. Process for End Users • Search, browse • Request dataset download • Fill out Data Use Agreement • Receive dataset
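
The end-user steps on slide 13 amount to a gate: the dataset is released only after the data use agreement has been filled out. A minimal sketch of that gate follows, with hypothetical names throughout; it does not describe how Merritt implements the check.

```python
from dataclasses import dataclass


@dataclass
class DownloadRequest:
    """Hypothetical record of one user's request for one dataset."""
    dataset_id: str
    requester: str
    dua_accepted: bool = False


def accept_dua(request: DownloadRequest) -> DownloadRequest:
    """Record that the requester filled out the data use agreement."""
    request.dua_accepted = True
    return request


def release(request: DownloadRequest) -> str:
    """Return the dataset id for delivery, but only once the DUA is on file."""
    if not request.dua_accepted:
        raise PermissionError("data use agreement has not been accepted")
    return request.dataset_id


request = DownloadRequest(dataset_id="example-dataset", requester="user@example.org")
released = release(accept_dua(request))
```

Calling `release` on a request whose agreement is still pending raises `PermissionError` instead of returning the dataset.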

  14. Lessons learned • Partnerships • Many hands make light work • Real users uncover hidden assumptions • Scale • Object size • Number of files • Upload and download

  15. If you build it, will they come? • Angela Rizk-Jackson, UCSF CTSI

  16. What will it take? (Sketch by Juliana Olivera Silva via Flickr)

  17. Providing Incentives: Requirements

  18. Providing Incentives: Visibility • Enhances collaborative opportunities • 69% increase in citation rate for publications associated with shared data (Piwowar, 2007)

  19. Providing Incentives: Credit

  20. Providing Incentives: Preservation & Access

  21. Providing Incentives: Institutional • Support researcher needs • Improved archiving efficiency • Cost savings (UCLA Royce Hall photo courtesy of Adam Fagen via Flickr)

  22. Eliminating Barriers • Time / Effort • Minimal requirements • Specific tools (e.g. ingest) • Integrate into existing workflow • Control • Data Use Agreement • Centralized service • Cultural Paradigm • Outreach • Demonstrate value

  23. Other Collaborators

  24. Lessons Learned • Don’t underestimate technical matters • Separating data & metadata • Standards are not standard • Metadata schema (Dublin Core → DataCite) • Interpretation • Policy issues are ever-present • Data Ownership & Data Use Agreements • Privacy & Consent (Human subjects) • Keep in mind the entire lifecycle: ALL users • Discoverability & interoperability • README File
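
The "Standards are not standard" point on slide 24 can be illustrated with a small crosswalk from Dublin Core to DataCite. This is a sketch under stated assumptions, not the mapping the DataShare team used: the field names are simplified, and the rule of taking the first four-digit number in `date` as the publication year is exactly the kind of interpretation the slide warns about.

```python
import re

# Assumed one-to-one mappings; real crosswalks involve more fields and judgment calls.
DC_TO_DATACITE = {
    "creator": "creator",
    "title": "title",
    "publisher": "publisher",
    "identifier": "identifier",
}


def crosswalk(dc: dict) -> dict:
    """Map a simplified Dublin Core record onto simplified DataCite properties."""
    out = {target: dc[source]
           for source, target in DC_TO_DATACITE.items() if source in dc}
    # Dublin Core 'date' is free-form; DataCite expects a publication year.
    match = re.search(r"\b(\d{4})\b", dc.get("date", ""))
    if match:
        out["publicationYear"] = match.group(1)
    return out


datacite_record = crosswalk({
    "title": "Example dataset",
    "creator": "Researcher, Example",
    "date": "2013-04-08",
})
```

A record whose `date` contains no four-digit year simply comes out without `publicationYear`, which a real workflow would need to flag, since DataCite treats that property as required.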

  25. Next Steps • Outreach • System enhancements • Design overhaul • Ingest mechanism • DUA menu • Policy navigation • Proof-of-concept

  26. Discussion Topics • What incentives have you found useful to encourage adoption of this type of resource? • Are you using data use agreements? Uniform or individualized? • Where do you see institutional data repositories fitting in the larger ecosystem?

  27. More info • DataShare: http://datashare.ucsf.edu • CDL: http://www.cdlib.org • Merritt: https://merritt.cdlib.org • EZID: http://n2t.net/ezid • XTF: http://xtf.cdlib.org • UCSF Library: http://www.library.ucsf.edu/ • UCSF CTSI: http://ctsi.ucsf.edu/ • NCATS – NIH Grant # UL1 TR000004
