Introduction to DataCite
Adam Farquhar, PhD
Head of Digital Library Technology, The British Library
President, DataCite
June 2010
The British Library
• Exists for everyone who wants to do research – for academic, personal, and commercial purposes.
• Covers all subject areas – sciences, technology, medicine, arts, humanities, social sciences…
• Receives a copy of every item published in the UK.
• Holds over 150 million items, with 3 million items added each year.
• Used by over 16,000 people each day (on site and online).
Data and the Digital Landscape
• Seismic measurements taken by a geologist.
• Genetic data collected by a medical researcher.
• A survey of public opinions collected by a sociologist.
Data: The Foundation of Research
• Data is a crucial component of the scholarly record.
• Re-acquisition may be impossible.
• Datasets are essential to the British Library’s mission to advance the world’s knowledge.
Widening Gap
[Diagram labels: Articles; Underlying data]
• No effective way to link between datasets and articles
• No widely used method to identify datasets
• No widely used method to cite datasets
As a result…
Datasets are
• Difficult to discover
• Difficult to access
• Being lost
Datasets – First Class Citizens?
• Data is difficult to manage after project funding ceases
• Informal networks provide the primary means of sharing
• Only 21% use a national or international facility
• Datasets are not included in impact analysis
• Good luck finding it or getting permission to use it (your discipline may vary)
Source: UKRDS Study
DataCite – An Award-Winning Global Consortium
DataCite aims to:
• Establish easier access to scientific research data
• Increase acceptance of research data
• Support archiving of data for verification and re-use
DataCite – Supporting the Research Community
DataCite:
• Supports researchers by enabling them to locate, identify, and cite research datasets with confidence
• Supports data centres by providing persistent identifiers for datasets, workflows and standards for data publication
• Supports publishers by enabling research articles to be linked to the underlying data
DataCite uses DOIs for Data
DataCite : Data Centres :: CrossRef : Publishers
• URLs are not persistent (e.g. Wren JD: URL decay in MEDLINE - a 4-year follow-up study. Bioinformatics. 2008, Jun 1;24(11):1381-5).
• Digital Object Identifiers (DOIs) offer a solution
  • Most widely used identifier for scientific articles
  • Researchers, authors, publishers know how to use them
  • Put datasets on the same playing field as articles
• Dataset
  • Yancheva et al (2007). Analyses on sediment of Lake Maar. PANGAEA.
  • doi:10.1594/PANGAEA.587840
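To make the dataset citation above concrete, here is a minimal Python sketch of how a DOI string can be split into its prefix and suffix, turned into a resolver link, and placed in a citation. It is an illustration, not DataCite tooling: the function names are hypothetical, the public proxy at https://doi.org/ is assumed as the resolver, and the prefix pattern is a simplification of the full DOI syntax.

```python
import re
from typing import NamedTuple
from urllib.parse import quote

# Assumption: the public DOI proxy is used as the resolver.
RESOLVER = "https://doi.org/"


class Doi(NamedTuple):
    prefix: str  # registrant prefix, e.g. "10.1594"
    suffix: str  # item name chosen by the registrant, e.g. "PANGAEA.587840"


def parse_doi(text: str) -> Doi:
    """Split a string such as 'doi:10.1594/PANGAEA.587840' into prefix and suffix."""
    # Drop a leading "doi:" label or resolver URL if one is present.
    cleaned = re.sub(r"^(doi:|https?://(dx\.)?doi\.org/)", "", text.strip(),
                     flags=re.IGNORECASE)
    prefix, sep, suffix = cleaned.partition("/")
    # Simplified check: "10." followed by a 4-9 digit registrant code.
    if not sep or not suffix or not re.fullmatch(r"10\.\d{4,9}", prefix):
        raise ValueError(f"not a DOI: {text!r}")
    return Doi(prefix, suffix)


def resolver_url(text: str) -> str:
    """Build a persistent link that redirects to the dataset's current location."""
    doi = parse_doi(text)
    return f"{RESOLVER}{doi.prefix}/{quote(doi.suffix, safe='/')}"


def cite(creators: str, year: int, title: str, publisher: str, doi_text: str) -> str:
    """Format a citation in the 'Creator (Year). Title. Publisher. doi:...' shape."""
    doi = parse_doi(doi_text)
    return f"{creators} ({year}). {title}. {publisher}. doi:{doi.prefix}/{doi.suffix}"


if __name__ == "__main__":
    print(resolver_url("doi:10.1594/PANGAEA.587840"))
    print(cite("Yancheva et al", 2007, "Analyses on sediment of Lake Maar",
               "PANGAEA", "doi:10.1594/PANGAEA.587840"))
```

The point of the design is that the citation carries only the identifier; the location of the data can change without the citation going stale, because the resolver keeps the mapping up to date.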
Membership
• From Canada to Australia
• Currently twelve members across nine countries
• Over 800,000 records registered with DOI names so far
Rapid Progress Builds on Foundational Work
[Timeline markers: 05, 03.09, 12.09, 06.10, 12.10]
• TIB begins to issue DOIs for datasets
• Paris Memorandum
• Production services with Data Centres
• Shared technical infrastructure
• Integrated services with key partners
• DataCite Association founded in London
• 7 members
• 12 members
• All members assigned DOIs
• Over 800,000 items registered
• Pilot projects with Data Centres
DataCite – Roles and Responsibilities
• The DataCite registration agency
  • Maintains the resolution infrastructure
  • Maintains a searchable database of metadata
  • Manages identifiers over the long term
  • Establishes and shares best practice
• Publishing agents (data centres, research institutes, publishers) are responsible for
  • Quality assurance
  • Content storage and access
  • Creating the identifier
  • Creating and updating metadata
DataCite Structure
[Diagram: International DOI Foundation; Global Handle System; DataCite; Member Institutions; Data Centres; Associate Stakeholder. Relationship labels: Member, Carries, Works with …]
Strengths and Weaknesses of DOI
• DOIs have some strong advantages
  • Accepted by researchers and scientists
  • Mature infrastructure
  • Put datasets on the same playing field as articles
• But perceived as
  • Expensive
    • The current IDF business model favours larger registration agencies
  • Publisher oriented
    • The largest registration agency is the publisher-oriented CrossRef
The Cost of Visibility
• DOI Assignment: €0.01 – €1
• Management: €50 – €500 (approx 1% of data creation cost)
  • Storage
  • Quality Assurance
  • Metadata Collection
• Production: €5,000 – €5,000,000
Rapidly Growing Ecosystem
• Microsoft works with CDL to embed DataCite into an Excel plug-in
• UK National Sound Archive assigns DataCite DOIs to archival recordings
• Dryad integrates DataCite DOIs into publisher workflows for supplementary material and datasets in the US
• ANDS integrates DataCite DOIs into dataset services
• Thieme Publishing Group uses DataCite DOIs to link articles and primary research data (at FIZ)
• Active discussions with key research information service providers and data centres
What Next?
• Require clear, unambiguous citations for datasets
• Integrate links to datasets into delivery platforms
• Integrate into workflows for researchers, data centres, and publishers
• Collaborate to understand roles and responsibilities among publishers, data centres, and libraries
• Improve attribution and credit for data producers
• Roll out services
DataCite supports researchers by enabling them to locate, identify, and cite research datasets with confidence.
We welcome your comments, questions, and ideas!
Contact: www.datacite.org
adam.farquhar {@} bl.uk
jan.brase {@} tib.uni-hannover.de