This workshop focuses on associating chemistry metadata with DOIs and aims to develop guidelines for publishing FAIR chemical data. It will discuss the value proposition and coordinate stakeholder engagement for ongoing collaboration.
Chemistry Metadata Initiatives
• Metadata terms – IUPAC Terminologies
• Associating Chemistry Metadata with DOIs
• FAIR Chemistry Data Publishing Workshop
• (A Footnote on File Formats)
April 2019
IUPAC Chemical Terminology
• Blue Book: Nomenclature of Organic Chemistry
• Red Book: Nomenclature of Inorganic Chemistry
• White Book: Biochemical Nomenclature
• Orange Book: Analytical Terminology
• Purple Book: Compendium of Polymer Terminology and Nomenclature
• Silver Book: Compendium of Terminology and Nomenclature of Properties in Clinical Laboratory Sciences
• Green Book: Quantities, Units and Symbols in Physical Chemistry
Plus many more terms defined as formal recommendations in Pure and Applied Chemistry.
Digital Chemical Terminology
• More than 7000 terms with authoritative definitions, spanning the whole range of chemistry, each with a DOI
• Source documents include the IUPAC Color Books and recommendations published in Pure and Applied Chemistry
• Currently undergoing stabilization and development to provide a foundation for future applications
https://goldbook.iupac.org
Rebooting the Gold Book
https://dev.goldbook.iupac.org
A project undertaken by Stuart Chalk, University of North Florida, to update the current Gold Book platform and provide a machine-accessible API.
Exploiting the DataCite Metadata schema
• Key metadata items have been identified for an NMR experiment.
• Can these be captured as DataCite metadata items?
Based on work undertaken by Henry Rzepa, Imperial College UK, and co-workers. See https://doi.org/c3k6 for further details.
Exploiting the DataCite Metadata schema
<subjects>
  <subject subjectScheme="inchi" schemeURI="http://www.inchi-trust.org/">InChI=1S/C24H32NOSi.C7H4ClN2O4.C6H15N.H2O/c1-5-6-19-25-20-13-18-23(25)24(26-27(2,3)4,21-14-9-7-10-15-21)22-16-11-8-12-17-22;8-4-5-1-2-6(9(11)12)3-7(5)10(13)14;1-5(2)7-6(3)4;/h5-12,14-17,19,23H,13,18,20H2,1-4H3;1-4H;5-7H,1-4H3;1H2/t23-;;;/m0.../s1</subject>
  <subject subjectScheme="inchikey" schemeURI="http://www.inchi-trust.org/">YAQIMGRXQRJPNV-AQUVTFJZSA-N</subject>
  <subject subjectScheme="NMR_nucleus" schemeURI="https://doi.org/10.14469/hpc/4739">1H</subject>
  <subject subjectScheme="NMR_experiment" schemeURI="https://doi.org/10.14469/hpc/4739">1D</subject>
  <subject subjectScheme="NMR_pulse-sequence" schemeURI="https://doi.org/10.14469/hpc/4739">zg30</subject>
  <subject subjectScheme="NMR_field" schemeURI="https://doi.org/10.14469/hpc/4739">800</subject>
  <subject subjectScheme="NMR_solvent" schemeURI="https://doi.org/10.14469/hpc/4739">CDCl3</subject>
  <subject subjectScheme="NMR_temperature" schemeURI="https://doi.org/10.14469/hpc/4739">298</subject>
</subjects>
Based on work undertaken by Henry Rzepa, Imperial College UK, and co-workers.
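The subject entries on this slide can be generated and read back programmatically. The following is a minimal sketch, not the authors' tooling: it uses only Python's standard library, the function names `build_subjects` and `read_subjects` are hypothetical, and the field values are copied from the slide.

```python
import xml.etree.ElementTree as ET

# schemeURI used for the NMR subject schemes on the slide
SCHEME_URI = "https://doi.org/10.14469/hpc/4739"


def build_subjects(nmr_fields):
    """Build a <subjects> element with one <subject> per NMR field.

    Hypothetical helper: keys are subjectScheme names, values are the
    text content of each <subject>.
    """
    subjects = ET.Element("subjects")
    for scheme, value in nmr_fields.items():
        subject = ET.SubElement(
            subjects, "subject", subjectScheme=scheme, schemeURI=SCHEME_URI
        )
        subject.text = value
    return subjects


def read_subjects(subjects):
    """Map each subjectScheme to its value."""
    return {
        s.get("subjectScheme"): (s.text or "").strip()
        for s in subjects.iter("subject")
    }


fields = {
    "NMR_nucleus": "1H",
    "NMR_experiment": "1D",
    "NMR_pulse-sequence": "zg30",
    "NMR_field": "800",
    "NMR_solvent": "CDCl3",
    "NMR_temperature": "298",
}

xml_text = ET.tostring(build_subjects(fields), encoding="unicode")
round_trip = read_subjects(ET.fromstring(xml_text))
print(xml_text)
```

Because each NMR parameter is an ordinary `<subject>` element, no extension of the schema is needed; the scheme name carries the meaning of the value.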
Exploiting the DataCite Metadata schema
• Search based on NMR subject fields
• Search based on InChI
Based on work undertaken by Henry Rzepa, Imperial College UK, and co-workers.
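A search of this kind could be expressed as a query over the subject fields. The sketch below only builds a query URL and makes no network call. The endpoint `https://api.datacite.org/dois`, the `query` parameter, and the field names `subjects.subjectScheme` and `subjects.subject` are assumptions that should be checked against the DataCite API documentation; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed search endpoint; verify against the DataCite REST API documentation.
API_BASE = "https://api.datacite.org/dois"


def subject_query(criteria):
    """Combine scheme/value pairs into one query string (hypothetical helper).

    The field names subjects.subjectScheme and subjects.subject are
    assumptions about how the search index exposes subject metadata.
    """
    clauses = [
        f'(subjects.subjectScheme:"{scheme}" AND subjects.subject:"{value}")'
        for scheme, value in criteria.items()
    ]
    return " AND ".join(clauses)


def search_url(criteria):
    """Return a URL-encoded search URL for the given criteria."""
    return API_BASE + "?" + urlencode({"query": subject_query(criteria)})


# Search on NMR subject fields (values taken from the slide's example record)
nmr_url = search_url({"NMR_nucleus": "1H", "NMR_solvent": "CDCl3"})

# Search on the InChIKey from the same record
inchikey_url = search_url({"inchikey": "YAQIMGRXQRJPNV-AQUVTFJZSA-N"})

print(nmr_url)
print(inchikey_url)
```

Whether a scheme and its value are matched within the same subject element, rather than anywhere in the record, depends on how the search index is configured, so results from such a query would need checking.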
WORKSHOP GOALS:
• Workflow: develop a digital data publishing model across stakeholders
• Guidelines: formulate consistent guidelines for publishing FAIR chemical data for common data types
• Value Proposition: review re-use cases for chemical characterization data
• Coalition: initiate a process for ongoing coordination and stakeholder engagement
Stakeholders: Publishers • Databases • Repositories • Software Developers • Researchers • Librarians • Standards Organisations • Data Initiatives
Metadata Checklist for Spectra
• InChI – Indicates the chemical substance being studied.
• Instrument Metadata – What is data, and what is metadata? What should be in the data file? A core set of fields has been identified.
• Bibliographic Metadata – Likely to be common across different data types and domains.
Is there scope for a cross-domain effort to formulate FAIR metadata extensions and guidance for domain-specific data files?
The NMReDATA Initiative
nmredata.org
• New machine-readable format: NMReDATA combines structure data and NMR data in an SDfile-like format.
• The format is based on SDfile, a de facto standard representation for chemical structures, extended to include NMR data and metadata.
• A cross-community initiative in search of buy-in from instrument vendors and suppliers of software and infrastructure.
Pupier, M., Nuzillard, J.-M., Wist, J., et al. Magn. Reson. Chem. 2018, 56, 703–715. DOI: 10.1002/mrc.4737
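Since the format builds on SDfile, NMR metadata can sit in the tagged data items that follow the structure block. The sketch below reads such items from a single record using the general SDfile conventions (a `> <TAG>` header line, value lines, a blank line, and `$$$$` closing the record). It is a simplified illustration, not a conforming NMReDATA reader: the structure block is omitted, the function name is hypothetical, and the tag names and values in the sample are illustrative.

```python
import re

# A data-item header looks like "> <TAG_NAME>"
TAG_LINE = re.compile(r"^>\s*<([^>]+)>")


def parse_sd_data_items(record):
    """Collect the tagged data items of one SDfile record into a dict."""
    items = {}
    current = None
    for line in record.splitlines():
        match = TAG_LINE.match(line)
        if match:
            current = match.group(1)
            items[current] = []
        elif line.strip() == "$$$$":
            break  # end of record
        elif current is not None:
            if line.strip() == "":
                current = None  # a blank line closes the data item
            else:
                items[current].append(line)
    return {tag: "\n".join(lines) for tag, lines in items.items()}


# Illustrative record fragment; tag names and values are assumptions.
SAMPLE = """\
> <NMREDATA_VERSION>
1.1

> <NMREDATA_SOLVENT>
CDCl3

> <NMREDATA_TEMPERATURE>
298 K

$$$$
"""

metadata = parse_sd_data_items(SAMPLE)
print(metadata)
```

Keeping the metadata in the same file as the structure means a single download carries both the compound and the conditions under which the spectrum was recorded.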
Crystallographic Data and Metadata
Crystallographic Information File (CIF)
• A standard format for archive and exchange of crystallographic data
  • Derived model
  • Processed data (structure factors)
  • Metadata about raw data (imgCIF)
• (Meta)data items semantically defined by CIF dictionaries
  • Crystallisation details
  • Instrument details
  • Software packages and parameters
  • Quality metrics
  • Publication details
Instruments • Software • Publishers • Researchers
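Because each CIF data item is a named tag with a value, simple metadata can be extracted with very little code. The sketch below handles only single-line tag and value pairs. It skips `loop_` tables and multi-line text fields, so it is an illustration and not a full CIF parser. The function name is hypothetical, the data names follow the style of the core CIF dictionary, and the values are invented.

```python
import shlex


def parse_cif_pairs(text):
    """Extract single-line '_tag value' pairs from CIF text.

    Simplified sketch: loop_ tables and multi-line text fields are
    ignored, and quoted values are unquoted with shlex.
    """
    items = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("data_"):
            continue  # skip blanks, comments and the data block header
        if line.startswith("_"):
            parts = shlex.split(line)
            if len(parts) >= 2:
                items[parts[0]] = " ".join(parts[1:])
    return items


# Invented example values for illustration only.
SAMPLE_CIF = """\
data_example
# instrument and software details
_cell_length_a                  10.234
_cell_length_b                  11.456
_diffrn_ambient_temperature     150
_computing_structure_solution   'SHELXT'
"""

cif_items = parse_cif_pairs(SAMPLE_CIF)
print(cif_items)
```

The CIF dictionaries give each data name its definition, so a value extracted this way can be interpreted without consulting the depositing author.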