
FAIR data in practice: From FAIRy tale to FAIR enough

Learn about implementing FAIR principles in data repositories, challenges faced, self-assessment tools, and future initiatives. Understand metrics, feedback, and importance of FAIRness criteria.

duncant

Presentation Transcript


  1. FAIR data in practice: From FAIRy tale to FAIR enough. Peter Doorn, Eliane Fankhauser, Mustapha Mokrane. Webinar, 11 December 2018. Twitter: @pkdoorn @MokraneMA @DANSKNAW

  2. Who we are: Peter Doorn, Eliane Fankhauser, Mustapha Mokrane

  3. Once upon a time, two years ago... FAIR Data in Trustworthy Data Repositories: Everybody wants to play FAIR, but how do we put the principles into practice? Peter Doorn, Director DANS; Ingrid Dillo, Deputy Director DANS. EUDAT/OpenAIRE webinar, 12-13 December 2016. https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-webinar

  4. From the previous episode... FAIR badging scheme (F, A, I, R): https://www.surveymonkey.com/r/fairdat

  5. What have we done since?
     • Test prototype FAIRdat within DANS, within 4 other repositories, and at Open Science FAIR in Athens
     • Participate in FAIR metrics group: see http://fairmetrics.org/
     • 14 metrics on GitHub: https://github.com/FAIRMetrics/Metrics
     • Wilkinson, M. D. et al. ‘A design framework and exemplar metrics for FAIRness’. Sci. Data 5:180118 doi: 10.1038/sdata.2018.118 (2018)
     • Evaluate DANS archive against FAIR metrics

  6. Testing the FAIRdat prototype: test in 4 repositories, summer 2017; test at Open Science FAIR, Athens 2017; 17 participants; plus tests within DANS

  7. Pros and Cons of FAIRdat prototype
     Pros, positive feedback
     • Simple/easy to use questionnaire
     • Well-documented
     • Useful
     Cons, negative feedback
     • Questionnaire oversimplified?
     • Some requirements of Reusability missing/shifted
     Other observations
     • Variances in FAIR scores across multiple reviewers due to subjectivity
     • Some like starring datasets, others not (should open data score higher than closed data?)
     • Assessing multi-file data sets with different properties
     [Figure: FAIR Metrics: Starring your data (F, A, I, R)]
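
The reviewer-to-reviewer variance noted on this slide could be quantified along the following lines. This is only a sketch: the 1-5 star scale, the reviewer names, the scores, and the helper functions `score_spread` and `most_contested` are all assumptions made for illustration, not results or code from the FAIRdat tests.

```python
from statistics import mean, pstdev

# Hypothetical star ratings (assumed 1-5 scale) from three reviewers
# assessing the same dataset. Invented values, not FAIRdat test data.
reviews = {
    "reviewer_a": {"F": 5, "A": 4, "I": 3, "R": 4},
    "reviewer_b": {"F": 4, "A": 4, "I": 2, "R": 3},
    "reviewer_c": {"F": 5, "A": 2, "I": 3, "R": 5},
}


def score_spread(reviews):
    """Return {principle: (mean score, population std dev)} across reviewers."""
    principles = sorted({p for scores in reviews.values() for p in scores})
    summary = {}
    for principle in principles:
        values = [s[principle] for s in reviews.values() if principle in s]
        summary[principle] = (mean(values), pstdev(values))
    return summary


def most_contested(reviews):
    """Return the principle on which reviewers disagree most."""
    summary = score_spread(reviews)
    return max(summary, key=lambda p: summary[p][1])


if __name__ == "__main__":
    for principle, (avg, spread) in score_spread(reviews).items():
        print(f"{principle}: mean={avg:.2f} spread={spread:.2f}")
    print("Most contested:", most_contested(reviews))
```

A large spread on one principle would point to a question whose wording leaves room for interpretation, which is the kind of subjectivity the feedback describes.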

  8. Other challenges
     Subjectivity in assessment of principles
     • F2 “rich metadata”
     • I1 “broadly applicable language for knowledge representation”
     • R1 “plurality of attributes”
     • R1.2 “detailed provenance”
     • R1.3 “domain relevant community standards”
     • Use of standard vocabularies: how to define?
     Misunderstandings of question/meaning of principle
     Most FAIR metrics can be measured at the level of the repository
     Slide credits: Eleftheria Tsoupra

  9. (Self) assessment of DANS archive on the basis of the FAIR principles (& metrics)
     Delft University: DANS EASY complies with 11 out of 15 principles; for 2 DANS does not comply (I2 & R1.2); for 2 more it is unclear (A2 & R1.3)
     Self assessment:
     Some metrics: FAIRness of DANS archive could be improved
     • E.g.: Machine accessibility; Interoperability requirements; Use of standard vocabularies; Provenance
     Some metrics: we are not sure how to apply them
     • E.g.: PID resolves to landing page (metadata), not to dataset; Dataset may consist of multiple files without standard PID
     Sometimes the FAIR principle itself is not clear
     • E.g.: Principle applies to both data and metadata; What does interoperability mean for images or PDFs? Are some data types intrinsically UNFAIR?
     Some terms are inherently subjective (plurality, richly)
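
The Delft tally on this slide can be reproduced with a few lines. The sketch below lists the 15 FAIR principles (as named on the assessment slides of this deck) and labels each with the outcome reported here; the status labels and the `status` helper are naming assumptions for illustration.

```python
from collections import Counter

# The 15 FAIR principles as listed on the FAIR assessment slides.
PRINCIPLES = [
    "F1", "F2", "F3", "F4",
    "A1", "A1.1", "A1.2", "A2",
    "I1", "I2", "I3",
    "R1", "R1.1", "R1.2", "R1.3",
]

# Outcomes reported for DANS EASY in the Delft University assessment.
NOT_COMPLIANT = {"I2", "R1.2"}
UNCLEAR = {"A2", "R1.3"}


def status(principle):
    """Classify one principle; the label strings are illustrative."""
    if principle in NOT_COMPLIANT:
        return "does not comply"
    if principle in UNCLEAR:
        return "unclear"
    return "complies"


tally = Counter(status(p) for p in PRINCIPLES)

if __name__ == "__main__":
    for label, count in tally.items():
        print(f"{label}: {count} of {len(PRINCIPLES)}")
```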

  10. What are we working on now? A fork in the road ahead
     • Eliane: FAIR enough? A checklist for researchers to evaluate the FAIRness of data(sets)
     • Mustapha: CoreTrustSeal: Enabling FAIR Data Repositories

  11. FAIR enough? A checklist for researchers to evaluate the FAIRness of data(sets) Eliane Fankhauser Project Leader DANS

  12. FAIR assessment tools: An overview
     Checklist “How FAIR are your data?”
     • “A Checklist produced for use at the EUDAT summer school to discuss how FAIR the participant's research data were…”
     FAIR self-assessment tool
     • Provided by ANDS / Nectar / RDS
     • “… designed predominantly for data librarians and IT staff…”
     FAIRdat tool
     • “Using this tool you will be able to score the 'FAIRness' of a dataset.”
     Checklist for Evaluation of Dataset Fitness for Use
     • Provided by RDA Working Group “Assessment of Data Fitness for Use”
     • “This checklist is meant to supplement the CoreTrustSeal Repository Certification process.”
     • Not yet available
     FAIR enough? Checklist to evaluate FAIRness of data(sets)
     • Provided by DANS

  13. FAIR checklist for researchers
     • Short and concise checklist for researchers who are planning to deposit their data
     • Covers different levels of FAIRness (repository, metadata, dataset, files)
     • Embraces two core concepts: FAIR data and trustworthy repository
     • Current state: beta version (Google Forms)

  14. Checklist demonstration

  15. Summary
     • Questions formulated as simply as possible
     • No direct “translation” of FAIR principles
     • Short explanations of terms and concepts
     • Reference to trustworthy repositories and CTS
     • Overall score at the end
     • “Recommendations” for questions answered with “no”
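
The scoring behaviour summarised on this slide (an overall score at the end, plus a recommendation for every question answered with "no") might look roughly like the sketch below. The three questions, their recommendation texts, and the `evaluate` function are hypothetical placeholders; the actual DANS checklist is a Google Form whose questions are not reproduced in this transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    recommendation: str  # shown only when the answer is "no"


# Hypothetical questions, invented for illustration only.
QUESTIONS = [
    Question(
        "Does the dataset have a persistent identifier?",
        "Deposit in a repository that assigns persistent identifiers.",
    ),
    Question(
        "Is a usage licence stated?",
        "Add a clear data usage licence.",
    ),
    Question(
        "Is the repository certified as trustworthy (e.g. CoreTrustSeal)?",
        "Consider depositing in a certified repository.",
    ),
]


def evaluate(answers):
    """Return (share of 'yes' answers, recommendations for 'no' answers)."""
    if len(answers) != len(QUESTIONS):
        raise ValueError("one yes/no answer is required per question")
    score = sum(1 for a in answers if a) / len(QUESTIONS)
    recommendations = [
        q.recommendation for q, a in zip(QUESTIONS, answers) if not a
    ]
    return score, recommendations


if __name__ == "__main__":
    score, recommendations = evaluate([True, False, True])
    print(f"Overall score: {score:.0%}")
    for line in recommendations:
        print("Recommendation:", line)
```

Scoring as a simple share of "yes" answers is the plainest possible choice; a weighted score per level (repository, metadata, dataset, files) would be an equally plausible design.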

  16. The FAIR checklist for researchers online: https://dans.knaw.nl/nl/projecten

  17. FAIR checklist factsheet (draft version)

  18. CoreTrustSeal—Enabling FAIR Data Repositories Mustapha Mokrane, Consultant at DANS, and CoreTrustSeal Board

  19. “Research data will not become nor stay FAIR by magic. We need skilled people, transparent processes, interoperable technologies and collaboration to build, operate and maintain research data infrastructures.” Mari Kleemola, Finnish Social Science Data Archive, Finland CoreTrustSeal Board, Secretary https://tietoarkistoblogi.blogspot.com/2018/11/being-trustworthy-and-fair.html

  20. FAIR GUIDING PRINCIPLES • Focus: Enable discovery and reuse of data • Process: Data management & stewardship

  21. FAIR RESEARCH DATA LIFECYCLE

  22. FAIR RESEARCH DATA LIFECYCLE Research Data Repositories

  23. FAIR GUIDING PRINCIPLES: The components of a FAIR Ecosystem; A model for FAIR Digital Objects. Source: Turning FAIR data into reality, Final report and Action Plan from the European Commission Expert Group on FAIR Data, https://doi.org/10.2777/54599

  24. FAIR ASSESSMENT: FINDABILITY
     (META)DATA
     F1. (meta)data are assigned a globally unique and persistent identifier
     F2. data are described with rich metadata
     F3. metadata clearly and explicitly include the identifier of the data it describes
     DATA REPOSITORY
     F4. (meta)data are registered or indexed in a searchable resource
     • TECHNOLOGIES • PROCEDURES • EXPERTISE • PEOPLE

  25. FAIR ASSESSMENT: ACCESSIBILITY DATA REPOSITORY A1. (meta)data are retrievable by their identifier using a standardized communications protocol A1.1 the protocol is open, free, and universally implementable A1.2 the protocol allows for an authentication and authorization procedure, where necessary A2. metadata are accessible, even when the data are no longer available (META)DATA
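
Principle A1 can be illustrated with a DOI and plain HTTPS: the identifier is turned into a URL on the public doi.org resolver, and an `Accept` header asks for machine-readable metadata instead of the landing page. The sketch below only builds the request and does not send it. The function name `build_metadata_request` is hypothetical, and whether a given registration agency answers this media type for a particular DOI is an assumption; the DOI is the one cited for the FAIR metrics paper earlier in this deck.

```python
from urllib.parse import quote
from urllib.request import Request

# CSL JSON is one media type commonly used for DOI content negotiation;
# support for it depends on the DOI's registration agency (assumption).
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi, media_type=CSL_JSON):
    """Build (but do not send) an HTTPS GET request for a DOI's metadata."""
    url = "https://doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": media_type}, method="GET")


request = build_metadata_request("10.1038/sdata.2018.118")

if __name__ == "__main__":
    print(request.get_method(), request.full_url)
    print("Accept:", request.get_header("Accept"))
    # Sending it would be: urllib.request.urlopen(request), network permitting.
```

HTTPS is open, free, and universally implementable (A1.1), and the same request can carry credentials where access is restricted (A1.2).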

  26. FAIR ASSESSMENT: INTEROPERABILITY
     (META)DATA
     I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
     I2. (meta)data use vocabularies that follow FAIR principles
     I3. (meta)data include qualified references to other (meta)data
     DATA REPOSITORY
     • TECHNICAL INFRASTRUCTURE • PROCEDURES • EXPERTISE • PEOPLE

  27. FAIR ASSESSMENT: REUSABILITY
     (META)DATA
     R1. (meta)data are richly described with a plurality of accurate and relevant attributes
     R1.1. (meta)data are released with a clear and accessible data usage license
     R1.2. (meta)data are associated with detailed provenance
     DATA REPOSITORY
     R1.3. (meta)data meet domain-relevant community standards
     • TECHNICAL INFRASTRUCTURE • PROCEDURES • EXPERTISE • PEOPLE

  28. CORETRUSTSEAL ASSESSMENT
     • Organizational Infrastructure
     • Digital Object Management
     • Technology
     CoreTrustSeal Data Repositories Requirements: https://www.coretrustseal.org/why-certification/requirements/

  29. CORETRUSTSEAL-FAIR ALIGNMENT
     F (CoreTrustSeal requirement: R13)
     • Offer persistent identifiers [F1 and F3]
     • Recommended data citations [F1]
     • Searchable metadata catalogue to appropriate standards [F2, F3]
     • Search facilities, inclusion in disciplinary or generic registries of resources [F4]
     A (CoreTrustSeal requirements: R4, R10, R13, R15, R16)
     • Facilitate machine harvesting of the metadata [A1]
     • Uses international and/or community standards [A1.1]
     • Searchable metadata catalogue to appropriate standards [A1 and A1.1]
     • Technical infrastructure: protection of facility, data, products, services, users [A1.2]
     • Data managed in compliance with discipline and ethical norms [A1.2]
     • Responsibility for long-term preservation [A2]
     I (CoreTrustSeal requirements: R14, R11)
     • Metadata required when the data are provided [I1]
     • Formats used by the Designated Community [I1]
     • Measures and plans for the possible evolution and migration of formats [I2]
     • Ensure understandability of the data [I2]
     • Ability to comment on, and/or rate data and metadata [I3]
     • Provide citations to related works or links to citation indices [I3]
     R (CoreTrustSeal requirements: R2, R7, R8, R11)
     • Integrity and authenticity of the data [R1]
     • Documentation of the completeness of the data and metadata [R1]
     • Links to metadata and to other datasets [R1]
     • Provenance data and related audit trails [R1.2]
     • Maintains licenses covering data access and use and monitors compliance [R1.1]
     • Defined data and metadata: ensure relevance and understandability for users [R1.3]
     • Technical data and metadata quality and assessment of adherence to schema [R1.3]
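
The alignment on this slide can be treated as a small lookup table. The sketch below assumes that the requirement numbers printed after each group of bullets (R13 after F; R4, R10, R13, R15, R16 after A; R14, R11 after I; R2, R7, R8, R11 after R) are the CoreTrustSeal requirements aligned with that FAIR letter; that grouping is read off the flattened slide text and is not confirmed elsewhere in the deck. The function names are illustrative.

```python
# CoreTrustSeal requirement numbers per FAIR letter, as grouped on the slide.
# The grouping is an assumption based on the order of the slide text.
CTS_BY_FAIR = {
    "F": ["R13"],
    "A": ["R4", "R10", "R13", "R15", "R16"],
    "I": ["R14", "R11"],
    "R": ["R2", "R7", "R8", "R11"],
}


def fair_by_cts(mapping):
    """Invert the table: CoreTrustSeal requirement -> FAIR letters it supports."""
    inverted = {}
    for letter, requirements in mapping.items():
        for requirement in requirements:
            inverted.setdefault(requirement, []).append(letter)
    return inverted


def shared_requirements(mapping):
    """Return requirements that are aligned with more than one FAIR letter."""
    return {
        requirement: letters
        for requirement, letters in fair_by_cts(mapping).items()
        if len(letters) > 1
    }


if __name__ == "__main__":
    for requirement, letters in shared_requirements(CTS_BY_FAIR).items():
        print(requirement, "supports", " and ".join(letters))
```

Under that reading, R13 and R11 each appear under two FAIR letters, which fits the deck's point that repository-level requirements underpin several principles at once.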

  30. FAIR ECOSYSTEM
     Rec. 20: Deposit in Trusted Digital Repositories
     Research data should be made available by means of Trusted Digital Repositories, and where possible in those with a mission and expertise to support a specific discipline or interdisciplinary research community.
     Rec. 9: Develop assessment frameworks to certify FAIR services
     Data services must be encouraged and supported to obtain certification, as frameworks to assess FAIR services emerge. Existing community-endorsed methods to assess data services, in particular CoreTrustSeal (CTS) for trusted digital repositories, should be used as a starting point to develop assessment frameworks for FAIR services. Repositories that steward data for a substantial period of time should be encouraged and supported to achieve CTS certification.
     Source: Turning FAIR data into reality, Final report and Action Plan from the European Commission Expert Group on FAIR Data, doi.org/10.2777/54599

  31. TAKE HOME MESSAGES
     • FAIR Principles apply to more than (meta)data
     • FAIR data assessments must include infrastructure
     • FAIR data live in Trustworthy Data Repositories
     • CoreTrustSeal Requirements are FAIR aligned

  32. To be continued... Work on FAIR at DANS is in full swing. In 2019 we plan to continue this work in a European project to formulate FAIR rules of participation for the EOSC. Subjects to work on include:
     • Strengthen certification of repositories for FAIR data
     • FAIR data policies
     • Assessment of FAIRness of data and metadata within certified repositories, focusing on those metrics that vary within such repositories
     • FAIR software and services
     • Training on FAIR data (management), both for students within regular academic curricula and for others

  33. Thank you for listening
     • Peter Doorn: @pkdoorn, Peter.Doorn@dans.knaw.nl
     • Eliane Fankhauser: Eliane.Fankhauser@dans.knaw.nl
     • Mustapha Mokrane: @MokraneMA, Mustapha.Mokrane@dans.knaw.nl
     • DANS: @DANSKNAW, www.dans.knaw.nl
