Linking Data from ScienceDirect Articles

Linking to & from Data from & to ScienceDirect Articles. Presented by: IJsbrand Jan Aalbersberg. Hannover, DataCite Meeting. Date: June 8, 2010.


Presentation Transcript


  1. Linking Data from ScienceDirect Articles Presented by: IJsbrand Jan Aalbersberg Hannover, DataCite Meeting Date: June 8, 2010

  2. Linking to & from Data from & to ScienceDirect Articles Presented by: IJsbrand Jan Aalbersberg Hannover, DataCite Meeting Date: June 8, 2010

  3. Linking Data in ScienceDirect
     • The Past
       • Supplementary data
       • Entity links to databases
     • The Present
       • Some considerations
       • PANGAEA-type linking
     • A Future
       • Getting even closer connected

  4. The Past (supplementary data)
     • Raw research data delivered as supplementary data
     • Available for a limited number of data set types / formats
     • Data distributed over multiple articles and publishers
     • Format frozen in time – not maintained for preservation
     • Only available for smaller data sets (at most a few tens of MB)
     • Limited access due to use of existing publishing platforms
     • Data and article remain nicely coupled / packaged
     • Supplementary data always being peer-reviewed

  5. The Past (entity linking – manual)
     • Authors manually identify (and tag) entities that are mentioned in articles and for which associated data is present (or registered) in databases such as GenBank, MINT, UniProt, PDB, CCDC, ...
     • Very accurate and unambiguous
     • However, requires author effort
     • Publisher takes care of actual linking
     • Reciprocal linking usually taken care of

  6. The Past/Present (entity linking – automatic)
     • Sometimes automatically (e.g., NextBio and Reflect)
     • Easily extendable to new / other entities
     • Works retrospectively on older content
     • Does create recall / precision errors

  7. The Present (some considerations)
     • STM, “Brussels Declaration”, June 2006:
     • “... believe that, as a general principle, data sets, raw data outputs of research, and sets or subsets of that data should wherever possible be made freely accessible ...”
     • Data sets should be freely accessible – at publisher?
     • Scientists prefer independent data repositories
     • Need for single domain-specific coordination
     • Huge costs for maintenance and preservation
     • Proper deposit mechanism needed
     • Through publisher? Extra overhead vs. ease of use
     • Enforcing deposit prior to publication
     • If community-supported, surely a possibility
     • Data set standardization is needed for optimal use

  8. The Present (more considerations)
     • Scientist needs the combination of formal publication record and the raw data sets
     • To get optimal interoperability, close collaboration between publisher and data set repositories is needed
     • Publisher should “enable and support” raw data sets
     • Submission: enforce if supported by community
     • Discoverability: interconnect article with data sets
     • Reciprocal linking at deepest level possible
     • PANGAEA-type linking
     • Data feeds from publisher to repositories?
     • Managing a large number of data set repositories?
     • DataCite as single discussion partner

  9. The Present (PANGAEA linking)
     • Author submits article to publisher
     • Author submits data set to repository
     • At article publication, repository links article DOI to associated data set DOI, creating actual connection
     • User sees link to ScienceDirect from PANGAEA
     • User sees link to PANGAEA from ScienceDirect
     [Diagram labels: SD Article, SD Server, articles, USER, PANGAEA Server, data + associations, link]

  10. PANGAEA links to ScienceDirect

  11. ScienceDirect links to PANGAEA

  12. A Future (tighter interoperability)
     • Not just a link to / from data and journal article
     • But provide integrated experience for scientist
     • Single page (environment) with data and article
     [Diagram labels: SD Article, SD Server, articles, USER, Supplementary Data, Server, data sets]

  13. A Future (tighter interoperability)
     • Not just a link to / from data and journal article
     • But provide integrated experience for scientist
     • Single page (environment) with data and article
     • Some users prefer it the other way around; so also offer:
     [Diagram labels: Data Set, Data Set Server, data sets, USER, Article, Server, articles]

  14. A Future (inline supplementary data)

  15. A Future (inline supplementary data)
     • Structures submitted as supplementary data files (MOL files)
     • Displayed inline through Reaxys application / service

  16. Linking to & from Data from & to ScienceDirect Articles Presented by: IJsbrand Jan Aalbersberg Hannover, DataCite Meeting Date: June 8, 2010

  17. Creating the best User Experience by integrating Data with Articles Presented by: IJsbrand Jan Aalbersberg Hannover, DataCite Meeting Date: June 8, 2010

  18. Creating the best User Experience by integrating Data with Articles requires close collaboration between data set repositories and publishers Presented by: IJsbrand Jan Aalbersberg Hannover, DataCite Meeting Date: June 8, 2010
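
The PANGAEA-type linking on slide 9 (the repository associates an article DOI with a data set DOI, and each side then shows a link to the other) can be illustrated with a small reciprocal lookup. This is only a sketch under stated assumptions, not how ScienceDirect or PANGAEA implement it: the `LinkRegistry` class, its method names, and both DOIs are hypothetical. The only real element is the `https://doi.org/` resolver prefix.

```python
from collections import defaultdict

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver prefix


def normalize(doi):
    """DOIs are case-insensitive, so compare them in lower case."""
    return doi.strip().lower()


def doi_url(doi):
    """Build a resolvable link for a DOI."""
    return DOI_RESOLVER + normalize(doi)


class LinkRegistry:
    """Hypothetical store of article <-> data set associations.

    Stands in for the "data + associations" held by the repository.
    """

    def __init__(self):
        self._datasets_by_article = defaultdict(set)
        self._articles_by_dataset = defaultdict(set)

    def associate(self, article_doi, dataset_doi):
        """Record the connection once; both directions become visible."""
        article = normalize(article_doi)
        dataset = normalize(dataset_doi)
        self._datasets_by_article[article].add(dataset)
        self._articles_by_dataset[dataset].add(article)

    def datasets_for(self, article_doi):
        """Links the article page would show (article -> data sets)."""
        return sorted(self._datasets_by_article.get(normalize(article_doi), ()))

    def articles_for(self, dataset_doi):
        """Links the data set page would show (data set -> articles)."""
        return sorted(self._articles_by_dataset.get(normalize(dataset_doi), ()))


# Made-up DOIs, for illustration only.
ARTICLE = "10.1016/j.example.2010.01.001"
DATASET = "10.1594/PANGAEA.000000"

registry = LinkRegistry()
registry.associate(ARTICLE, DATASET)

for doi in registry.datasets_for(ARTICLE):
    print("article page links to", doi_url(doi))
for doi in registry.articles_for(DATASET):
    print("data set page links to", doi_url(doi))
```

Recording the association once and indexing it in both directions is what makes the linking reciprocal: neither page needs to know about the other until the connection is registered at publication time.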