IUPAC Chemistry Terminology and Metadata Initiatives for FAIR Data Publishing Workshop (April 2019)

This workshop focuses on associating chemistry metadata with DOIs and aims to develop guidelines for publishing FAIR chemical data. It will discuss the value proposition and coordinate stakeholder engagement for ongoing collaboration.

soriano

Presentation Transcript


  1. Chemistry Metadata Initiatives • Metadata terms – IUPAC Terminologies • Associating Chemistry Metadata with DOIs • FAIR Chemistry Data Publishing Workshop • (A Footnote on File Formats) • April 2019

  2. IUPAC Chemical Terminology
     • Blue Book: Nomenclature of Organic Chemistry
     • Red Book: Nomenclature of Inorganic Chemistry
     • White Book: Biochemical Nomenclature
     • Orange Book: Analytical Terminology
     • Purple Book: Compendium of Polymer Terminology and Nomenclature
     • Silver Book: Compendium of Terminology and Nomenclature of Properties in Clinical Laboratory Sciences
     • Green Book: Quantities, Units and Symbols in Physical Chemistry
     Plus many more terms defined as formal recommendations in Pure and Applied Chemistry

  3. Digital Chemical Terminology (https://goldbook.iupac.org)
     • More than 7000 terms with authoritative definitions, spanning the whole range of chemistry, each with a DOI
     • Source documents include the IUPAC Color Books and recommendations published in Pure and Applied Chemistry
     • Currently undergoing stabilization and development to provide a foundation for future application

  4. Rebooting the Gold Book (https://dev.goldbook.iupac.org): a project undertaken by Stuart Chalk, University of North Florida, to update the current Gold Book platform and provide a machine API.

  5. Exploiting the DataCite Metadata schema: key metadata items identified for an NMR experiment. Can these be captured as DataCite metadata items? Based on work undertaken by Henry Rzepa, Imperial College UK, and co-workers. See https://doi.org/c3k6 for further details.

  6. Exploiting the DataCite Metadata schema
     <subjects>
       <subject subjectScheme="inchi" schemeURI="http://www.inchi-trust.org/">InChI=1S/C24H32NOSi.C7H4ClN2O4.C6H15N.H2O/c1-5-6-19-25-20-13-18-23(25)24(26-27(2,3)4,21-14-9-7-10-15-21)22-16-11-8-12-17-22;8-4-5-1-2-6(9(11)12)3-7(5)10(13)14;1-5(2)7-6(3)4;/h5-12,14-17,19,23H,13,18,20H2,1-4H3;1-4H;5-7H,1-4H3;1H2/t23-;;;/m0.../s1</subject>
       <subject subjectScheme="inchikey" schemeURI="http://www.inchi-trust.org/">YAQIMGRXQRJPNV-AQUVTFJZSA-N</subject>
       <subject subjectScheme="NMR_nucleus" schemeURI="https://doi.org/10.14469/hpc/4739">1H</subject>
       <subject subjectScheme="NMR_experiment" schemeURI="https://doi.org/10.14469/hpc/4739">1D</subject>
       <subject subjectScheme="NMR_pulse-sequence" schemeURI="https://doi.org/10.14469/hpc/4739">zg30</subject>
       <subject subjectScheme="NMR_field" schemeURI="https://doi.org/10.14469/hpc/4739">800</subject>
       <subject subjectScheme="NMR_solvent" schemeURI="https://doi.org/10.14469/hpc/4739">CDCl3</subject>
       <subject subjectScheme="NMR_temperature" schemeURI="https://doi.org/10.14469/hpc/4739">298</subject>
     </subjects>
     Based on work undertaken by Henry Rzepa, Imperial College UK, and co-workers
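
The subject entries on this slide pair a `subjectScheme` with a value, so they can be read back into a simple lookup table. The following is a minimal sketch in Python using the standard library. It assumes a bare `<subjects>` fragment without XML namespaces (a full DataCite record declares a namespace, which would need to be handled), and the function name `subjects_by_scheme` is hypothetical. The long InChI entry is left out only to keep the sample short.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <subjects> fragment from the slide.
# Assumption: no XML namespace, unlike a complete DataCite record.
SUBJECTS_XML = """
<subjects>
  <subject subjectScheme="inchikey" schemeURI="http://www.inchi-trust.org/">YAQIMGRXQRJPNV-AQUVTFJZSA-N</subject>
  <subject subjectScheme="NMR_nucleus" schemeURI="https://doi.org/10.14469/hpc/4739">1H</subject>
  <subject subjectScheme="NMR_experiment" schemeURI="https://doi.org/10.14469/hpc/4739">1D</subject>
  <subject subjectScheme="NMR_pulse-sequence" schemeURI="https://doi.org/10.14469/hpc/4739">zg30</subject>
  <subject subjectScheme="NMR_field" schemeURI="https://doi.org/10.14469/hpc/4739">800</subject>
  <subject subjectScheme="NMR_solvent" schemeURI="https://doi.org/10.14469/hpc/4739">CDCl3</subject>
  <subject subjectScheme="NMR_temperature" schemeURI="https://doi.org/10.14469/hpc/4739">298</subject>
</subjects>
"""


def subjects_by_scheme(xml_text):
    """Map each subjectScheme attribute to the text of its <subject> element."""
    root = ET.fromstring(xml_text)
    result = {}
    for subject in root.iter("subject"):
        scheme = subject.get("subjectScheme")
        result[scheme] = (subject.text or "").strip()
    return result


metadata = subjects_by_scheme(SUBJECTS_XML)
print(metadata["NMR_nucleus"], metadata["NMR_field"], metadata["NMR_solvent"])
```

A dictionary keyed by scheme is enough here because each scheme occurs once. A record that repeats a scheme would need a list of values per key instead.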

  8. Exploiting the DataCite Metadata schema • Search based on NMR subject fields • Search based on InChI. Based on work undertaken by Henry Rzepa, Imperial College UK, and co-workers
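
A search on subject fields of this kind could be expressed as a query against the DataCite REST API. The sketch below only builds the request URL and does not contact the service. The `https://api.datacite.org/dois` endpoint with its `query` and `page[size]` parameters is the public DataCite API, but the field names `subjects.subjectScheme` and `subjects.subject` inside the query string are assumptions about how the subject metadata is indexed, and the helper names are hypothetical. It is not necessarily how the searches shown on the slide were run.

```python
from urllib.parse import urlencode

DATACITE_DOIS = "https://api.datacite.org/dois"


def subject_query(scheme, value):
    """Build a query string matching a subject scheme and value.

    Assumption: the index exposes `subjects.subjectScheme` and
    `subjects.subject` as searchable fields.
    """
    return f'subjects.subjectScheme:"{scheme}" AND subjects.subject:"{value}"'


def search_url(scheme, value, page_size=25):
    """Return a URL-encoded DataCite search URL for one scheme/value pair."""
    params = {"query": subject_query(scheme, value), "page[size]": page_size}
    return f"{DATACITE_DOIS}?{urlencode(params)}"


# Search based on an NMR subject field
nmr_url = search_url("NMR_nucleus", "1H")
# Search based on the InChIKey from the example record
inchikey_url = search_url("inchikey", "YAQIMGRXQRJPNV-AQUVTFJZSA-N")
print(nmr_url)
print(inchikey_url)
```

Because the two conditions are combined with `AND` over a list of subjects, a record could match on the scheme of one subject and the value of another, so results from such a query may need a second filtering pass on the client side.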

  9. WORKSHOP GOALS:
     • Workflow: develop a digital data publishing model across stakeholders
     • Guidelines: formulate consistent guidelines for publishing FAIR chemical data for common data types
     • Value Proposition: review re-use cases for chemical characterization data
     • Coalition: initiate a process for ongoing coordination and stakeholder engagement
     Publishers • Databases • Repositories • Software Developers • Researchers • Librarians • Standards Organisations • Data Initiatives

  10. Metadata Checklist for Spectra
     • InChI – indicates the chemical substance being studied.
     • Instrument Metadata – What is data, and what is metadata? What should be in the data file? A core set of fields has been identified.
     • Bibliographic Metadata – likely to be common across different data types and domains.
     Is there scope for a cross-domain effort to formulate FAIR metadata extensions and guidance for domain-specific data files?

  11. The NMReDATA Initiative (nmredata.org), 2018. A new machine-readable format that combines structure data and NMR data in an SDfile-like format. NMReDATA is based on the SDfile, a de facto standard representation for chemical structures, extended to include NMR data and metadata. Pupier, M., Nuzillard, J.‐M., Wist, J., et al. Magn. Reson. Chem. 2018, 56, 703–715. DOI: 10.1002/mrc.4737. A cross-community initiative in search of buy-in from instrument vendors and suppliers of software and infrastructure.
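
Because NMReDATA builds on the SDfile layout, its NMR fields can be read with the same logic as any SDfile data item: a header line of the form `> <NAME>`, one or more value lines, a blank line, and `$$$$` closing the record. The sketch below is a minimal reader under those assumptions, not the reference implementation. The function name `read_sd_tags` is hypothetical, the tag names and values in the sample are illustrative only, the molfile block is omitted, and treating a trailing backslash as an end-of-line marker is an assumption about the NMReDATA convention.

```python
import re

# Header of an SDfile data item, e.g. "> <NMREDATA_SOLVENT>"
TAG_HEADER = re.compile(r"^>\s*<([^>]+)>")


def read_sd_tags(record_text):
    """Return {tag name: value} for the data items of one SDfile record."""
    tags = {}
    current = None
    for line in record_text.splitlines():
        if line.strip() == "$$$$":  # end of the record
            break
        match = TAG_HEADER.match(line)
        if match:
            current = match.group(1)
            tags[current] = []
        elif current is not None:
            if line.strip() == "":
                current = None  # a blank line ends the current data item
            else:
                # Assumption: a trailing backslash marks the end of a line
                tags[current].append(line.rstrip().rstrip("\\"))
    return {name: "\n".join(lines) for name, lines in tags.items()}


# Illustrative data items only (hypothetical values, molfile block omitted)
SAMPLE_TAGS = """\
> <NMREDATA_VERSION>
1.1\\

> <NMREDATA_SOLVENT>
CDCl3\\

> <NMREDATA_TEMPERATURE>
298 K\\

$$$$
"""

tags = read_sd_tags(SAMPLE_TAGS)
print(tags)
```

Lines before the first `> <NAME>` header are ignored, so the same function can be given a full record that starts with the structure block.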

  12. Crystallographic Data and Metadata: Crystallographic Information File (CIF)
     • A standard format for archive and exchange of crystallographic data
     • Derived model
     • Processed data (structure factors)
     • Metadata about raw data (imgCIF)
     • (Meta)data items semantically defined by CIF dictionaries
     • Crystallisation details
     • Instrument details
     • Software packages and parameters
     • Quality metrics
     • Publication details
     Instruments • Software • Publishers • Researchers
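
In a CIF, each (meta)data item is a data name beginning with an underscore followed by its value, which is what lets dictionaries define the items semantically. The sketch below reads only the simplest case, one-line `_name value` pairs, and deliberately skips `loop_` tables and multi-line text fields, so it is an illustration of the idea and not a CIF parser. The function name `read_cif_items` is hypothetical and the values in the sample are invented for the example.

```python
def read_cif_items(cif_text):
    """Collect simple one-line `_name value` pairs from CIF text."""
    items = {}
    for raw in cif_text.splitlines():
        line = raw.strip()
        if not line.startswith("_"):
            continue  # skips data_ headers, comments, loop_ keywords and loop rows
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # a data name with no value on the same line (e.g. a loop column)
        name, value = parts
        items[name] = value.strip().strip("'\"")  # drop surrounding quotes
    return items


# Hypothetical values, for illustration only
SAMPLE_CIF = """\
data_example
_cell_length_a                    10.123
_diffrn_radiation_wavelength      0.71073
_diffrn_measurement_device_type   'hypothetical diffractometer'
_refine_ls_R_factor_gt            0.0345
"""

items = read_cif_items(SAMPLE_CIF)
print(items["_cell_length_a"], items["_diffrn_measurement_device_type"])
```

Instrument details and quality metrics sit side by side in the same flat namespace here, which is why the dictionary definitions, and not the file layout, carry the meaning of each item.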