Knowledge Exchange CRIS-OAR interoperability project: publication metadata
Knowledge Exchange is an international co-operative effort that supports the use and development of e-infrastructures for higher education and research. Partners are:
• Denmark’s Electronic Research Library (DEFF)
• German Research Foundation (DFG)
• Joint Information Systems Committee (JISC) in the UK
• SURF foundation in the Netherlands
Motivation: Enable broad collaboration in the information management of research publications
• Current Research Information Systems (CRIS)
  • a label for research management systems of various types, dealing with many aspects of research activities
  • contain metadata on research publications
• Open Access Repositories (OAR)
  • a label for open research output archives aiming at preservation and dissemination of publications etc.
  • contain metadata on research publications
• They share the challenge of achieving full metadata coverage for the publications within their scope
Motivation: Enable broad collaboration in the information management of research publications
• If CRIS and OAR could easily exchange metadata about publications, they could support each other
• But CRIS and OAR have grown out of different communities and have developed rather different approaches to publication metadata
• If a university has a CRIS and an OAR, a publication must generally be registered twice to comply with both systems’ requirements
• Both CRIS and OAR strive to be complete in their coverage of publications; both would benefit from collaboration, as would the authors/researchers
Motivation: Enable broad collaboration in the information management of research publications
• CRIS use a variety of formats: some use CERIF (or variants thereof) and some use various local or national formats
• In many disciplines, publications are of global interest and are often the result of international collaboration. They are often of interest to more than one CRIS
• CRIS with different formats would benefit from an easy and precise mechanism to exchange publication metadata
Motivation: Enable broad collaboration in the information management of research publications
• OAR use a variety of formats: some use Dublin Core (or variants thereof), some use library formats such as MARC and MODS, and some use various local or national formats
• In many disciplines, publications are of global interest and are often the result of international collaboration. They are often of interest to more than one OAR
• OAR with different formats would benefit from an easy and precise mechanism to exchange publication metadata
Aim and purpose
• To increase the metadata interoperability
  • between CRIS and OAR systems
  • and thus also
    • between CRIS and CRIS with different formats
    • between OAR and OAR with different formats
• by defining and proposing
  • a metadata exchange format for publications
  • a set of common vocabularies for key elements
Project participants Project manager Project director
Building new bridges in the old world
Not designing new (and better) worlds
Each metadata island knows well what it is doing. Good reasons govern its choice of format and vocabulary.
Building new bridges in the old world
Not designing new (and better) worlds
We (simply) build a bridge that will enable these islands to communicate without changing their language and lifestyle. That will allow them to exchange publication metadata without studying and understanding the particularities of the other party.
Each metadata island knows well what it is doing. Good reasons govern its choice of format and vocabulary.
Challenges stemming from different missions of formats
• The different nature (and tasks) of
  • CRIS formats
  • Repository formats
• The granularity challenge
The different nature of CRIS and repository formats Typical CRIS main entities and their relations (many triples & many detailed fields)
The different nature of CRIS and repository formats
Simple Dublin Core
15 fields in a single flat structure
Aimed at the description of some sort of “document”
May be enhanced to provide more granularity and specificity, but mostly isn’t
Bridging publications metadata
• CRIS formats are characterized by their
  • broader view on research information, depicting research results as well as the actors and various environmental factors in their own right
  • (often) high level of detail and specificity in describing the various entities (very granular and precise)
  • ability to handle the dynamics of time, since everything except the research publications themselves changes over time, including the relations between entities
Bridging publications metadata
• OAR (DC) formats are characterized by their
  • narrow view, depicting research results only (generally publications)
  • (mostly) low level of detail and specificity in describing the various aspects (less granular)
  • absence of need to handle the dynamics of time, as they deal with research publications tied to a specific point in time
Bridging publications metadata
Implode the relational/network nature of the CRIS formats to a single structure that is adequate for describing publications.
Design the field/element hierarchy so that highly granular as well as less granular metadata may be represented without loss of information.
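As a rough illustration of what imploding a relational structure could look like, the Python sketch below folds separate person, publication, and link records into one publication-centred record. All entity names, field names, identifiers, and sample values are hypothetical; they are not taken from CERIF or from the project's exchange format.

```python
# Hypothetical CRIS-style data: separate entities plus link records.
persons = {
    "p1": {"family_name": "Hansen", "given_name": "Anna"},
    "p2": {"family_name": "Smith", "given_name": "John"},
}
publications = {
    "pub1": {"title": "An example article", "date": "2010"},
}
# Each link ties a person to a publication with a role (one triple per row).
person_publication = [
    ("p1", "pub1", "Author"),
    ("p2", "pub1", "Editor"),
]


def implode(pub_id):
    """Fold the linked entities into one self-contained publication record."""
    record = dict(publications[pub_id])  # copy so the source is not mutated
    record["persons"] = [
        {**persons[person_id], "role": role}
        for person_id, linked_pub, role in person_publication
        if linked_pub == pub_id
    ]
    return record


flat = implode("pub1")
```

The person details stay granular inside the single record, so a receiving system can use them in full or reduce them to a plain name string.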
Project approach
[Diagram: “Metadata exchange format and vocabulary” shown together with the formats CERIF, METIS, DRIVER, DC, ePrints default, DDF-MXD, NARCIS and MODS]
Project approach
1. Analyze metadata practices of CRIS and OAR
  • Looking at formats in actual use at KE partners
  • Chart entities and granularities, similarities, differences
Project approach
2. Define entities/elements/attributes to be exchanged
  • Respecting differences in granularity
  • So that metadata may be exported without loss of information
  • So that the format may be used by very granular environments as well as less granular
3. Define/propose common exchange vocabulary
  • For the identified key concepts/entities
4. Define/propose common exchange syntax
  • Handle differences in granularity
Some potential use cases
• CRIS → OAR
• OAR → CRIS
• CRIS → CRIS
• OAR → OAR
• CRIS/OAR → OpenAIRE (EU Open Access pilot)
• Publisher → CRIS/OAR
• Subject repository → CRIS/OAR (institutional)
Basic idea evolved
• To carry both the highest granularity (CRIS) and the lowest level (OAR?)
The DC elements are used as a baseline. • Title • Creator • Subject • Description • Publisher • Contributor • Date • Type • Format • Identifier • Source • Language • Relation • Coverage • Rights
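One way to picture a DC baseline that still carries finer detail: each element holds a plain value that a low-granularity system can use directly, plus optional sub-fields for systems that want more. The Python sketch below is only an assumption about how such an element might be modelled, not the project's actual syntax; the `Element` class, the `detail` sub-field names, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field

# The 15 elements of simple Dublin Core, used here as the baseline names.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


@dataclass
class Element:
    name: str    # must be one of DC_ELEMENTS
    value: str   # plain value, enough for a flat DC record
    detail: dict = field(default_factory=dict)  # optional granular sub-fields

    def __post_init__(self):
        if self.name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {self.name}")


def to_simple_dc(elements):
    """Drop the granular detail and keep only (name, value) pairs."""
    return [(e.name, e.value) for e in elements]


record = [
    Element("title", "An example article"),
    Element("creator", "Hansen, Anna",
            {"family_name": "Hansen", "given_name": "Anna"}),
]
simple = to_simple_dc(record)
```

A granular system would read `detail`, while a less granular one would read only `value`, so neither side has to understand the other's internal model.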
Main entities of interest • The publication is in focus and other entities are in relation to the publication
Vocabularies
• Person
  • Role
    • Description: the role of the person in relation to the publication.
    • Terms: Author • Primary Author • Corresponding Author • Editor • Publisher • Translator • Illustrator • Inventor • Supervisor
Publication types
• Publication
  • Type
    • Description: the format provides a comprehensive list of publication types, based on an analysis of the formats examined in the project. A mapping between the different systems and formats in the analysis can be found on a web page.
    • Mapping between common vocabularies can be found at: http://weekschild.uci.ru.nl/KE/?select=all
    • The formats analysed: CERIF2008, MODS/DIDL, DRIVER_DC, DDF-MXD, EPrints, METIS, PURE
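A controlled vocabulary like the role terms on this slide can be checked mechanically on import. The Python sketch below encodes the nine listed terms as an enumeration; the `Role` class and the `parse_role` helper are hypothetical names, and rejecting unknown terms outright is an assumption rather than documented project behaviour.

```python
from enum import Enum


class Role(Enum):
    AUTHOR = "Author"
    PRIMARY_AUTHOR = "Primary Author"
    CORRESPONDING_AUTHOR = "Corresponding Author"
    EDITOR = "Editor"
    PUBLISHER = "Publisher"
    TRANSLATOR = "Translator"
    ILLUSTRATOR = "Illustrator"
    INVENTOR = "Inventor"
    SUPERVISOR = "Supervisor"


def parse_role(term):
    """Return the Role for a vocabulary term; raise ValueError if unknown."""
    return Role(term.strip())
```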
Publication types (terms) • Journal Letter • Journal comment • Journal review article • Journal book review • Book • Book chapter • Book preface • Conference paper • Conference abstract • Conference poster • Conference talk • Thesis Doctoral • Thesis PhD • Thesis Master • Working paper, preprint • Report • Report chapter • Lecture Notes • Lecture • Memorandum • Net publication • Patent • Software • Data set • Newspaper article • Radio/TV broadcast • Exhibition catalogue • Student report • Other
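A system joining the exchange would need a mapping from its own type codes to these common terms. The Python sketch below shows the idea with a handful of the terms listed on this slide; the local type codes are invented, and falling back to “Other” for unmapped types is an assumption, not a rule stated by the project.

```python
# A few terms from the common publication-type vocabulary on this slide.
COMMON_TYPES = {"Book", "Book chapter", "Conference paper", "Report", "Other"}

# Hypothetical local type codes of one system, mapped to common terms.
LOCAL_TO_COMMON = {
    "monograph": "Book",
    "inbook": "Book chapter",
    "proceedings-paper": "Conference paper",
    "tech-report": "Report",
}


def to_common_type(local_type):
    """Map a local type to a common term, falling back to 'Other'."""
    return LOCAL_TO_COMMON.get(local_type, "Other")
```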
Vocabularies - Versions
• Version
  • Description: This element and its vocabulary express the version of the document, e.g. the draft or the published version. The terms are based on the VERSIONS toolkit excluding the term “updated”.
  • Important! Different versions should be self-contained and constitute individual records. This mirrors best practice for repositories but is not always the case for CRIS.
  • Terms:
    • Draft, i.e. working paper
    • Submitted, i.e. preprint
    • Accepted, i.e. postprint
    • Published, i.e. publisher edition
    • Updated, i.e. reprint
  • VERSIONS project: http://www2.lse.ac.uk/library/versions/
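The requirement that each version be its own self-contained record can be sketched as a small export step. In the Python example below, a CRIS entry that bundles several versions of one work is split into one record per version. The entry layout, field names, and identifiers are hypothetical; the version terms are copied from the list on this slide.

```python
# Version terms as listed on the slide.
VERSION_TERMS = ("Draft", "Submitted", "Accepted", "Published", "Updated")

# Hypothetical CRIS entry that bundles several versions of one work.
cris_entry = {
    "title": "An example article",
    "versions": [
        {"version": "Accepted", "identifier": "local:123-accepted"},
        {"version": "Published", "identifier": "local:123-published"},
    ],
}


def split_versions(entry):
    """Yield one self-contained record per version of the work."""
    shared = {k: v for k, v in entry.items() if k != "versions"}
    for version in entry["versions"]:
        if version["version"] not in VERSION_TERMS:
            raise ValueError(f"unknown version term: {version['version']}")
        # Copy the shared metadata into every record so each stands alone.
        yield {**shared, **version}


records = list(split_versions(cris_entry))
```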
The challenges for interoperability • Discussion!