PANGAEA Archiving and Publication of Scholarly Data for the Long Tail of Science

PANGAEA Archiving and Publication of Scholarly Data for the Long Tail of Science. Michael Diepenbroek. What is PANGAEA? Information system for long-term archiving and publication of data from earth & environmental sciences (since 1993)


Presentation Transcript


  1. PANGAEA Archiving and Publication of Scholarly Data for the Long Tail of Science. Michael Diepenbroek

  2. What is PANGAEA? • Information system for long-term archiving and publication of data from earth & environmental sciences (since 1993) • Accredited by the "World Meteorological Organisation" (WMO) as "World Radiation Monitoring Center" (WRMC) (since 2007) • Accredited by the "International Council for Science" (ICSU) as World Data Center "Publisher for Earth & Environmental Science" (World Data Center) (since 2001)

  3. PANGAEA - contents • Integral part of science • More than 160 European to international projects since 1995 (www.pangaea.de/projects) • Highly heterogeneous & dynamic • Multidisciplinary • Total number of datasets: ~350,000 • Data volume: <2 PB • Increase: ~5% per year

  4. PANGAEA - technical architecture [Diagram labels: Harddisk + tape (silo) • Sybase IQ • Sybase ASE • Editorial System • RDB warehouse • Curators • Webserver • Middleware • Ticket System • PANGAEA search engine • IQ interface • Various services • Users]

  5. PANGAEA - interoperability • Portals: CARBOOCEAN, EUR-OCEANS, IODP - SEDIS, ICSU WDS portal, ESONET/EMSO • Broker function: GBIF, OBIS • Sensor webs: ESONET/EMSO, Statoil • Conform to global standards: ISO19xxx, OGC, W3C, OAI

  6. PANGAEA – interoperability [Diagram. Headings: data management & long-term archiving • Frontends / portals • catalogues • protocols. Labels: WS (SOAP/WSDL) • Elsevier, Scopus … • marshaller • PANGAEA web frontend • Index • PANGAEA • Geoserver (OGC) • gml, kml • INSPIRE • XSLT • GEOSS • OGC CSW • IODP • RDB • ISO19115 • ICSU WDS • harvester • Thomson Reuters • Dublin Core • EUR-OCEANS • DIF • OAI-PMH • CARBOOCEAN • PubMed • OpenAire • Darwin Core • OCLC • DIGIR • Google • STD-DOI • DOI registration • OBIS • ISO690 • GBIF • DOI registry • DataCite]
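
Slide 6 lists OAI-PMH among the protocols through which harvesters collect metadata. As a rough illustration of what a harvester's paging logic involves, the Python sketch below builds `ListRecords` request URLs and reads the `resumptionToken` from a response. The verb and argument names come from the OAI-PMH 2.0 specification; the endpoint URL, the function names, and the sample response are placeholders chosen for this example, not PANGAEA's actual service, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"  # namespace of OAI-PMH 2.0 responses

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumptionToken of a response, or None if the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{{{OAI_NS}}}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


# Made-up response fragment showing only the element the sketch reads.
SAMPLE_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><resumptionToken>page2</resumptionToken></ListRecords>"
    "</OAI-PMH>"
)

first_url = list_records_url(BASE_URL)
follow_up_url = list_records_url(BASE_URL, resumption_token=next_token(SAMPLE_RESPONSE))
print(first_url)
print(follow_up_url)
```

A real harvester would fetch each URL, store the returned records, and repeat until `next_token` returns `None`.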

  7. PANGAEA – Dissemination of Data & Metadata

  8. The Long Tail of Data [Diagram labels: Professionally managed & published data • Large scale monitoring & computed data & disciplinary data centers • Unmanaged open access data • Unmanaged & non-public data • Data from individual scientists, labs, or smaller projects. Axis labels: Fitness of use • Total volume of scientific data]

  9. Publishing data with PANGAEA • Citable & persistent (DOI) • CC-BY License • Quality data • QA/QC -> review procedures • Efficient usage • (Meta)data & interoperability standards (machine-readable) • FITNESS OF USE! • OECD principles and guidelines for access to research data (2007) [Figure labels: file formats (XLSX, TXT, DOC, XML, NetCDF, PDF, GRIB, CSV, XLS, …) and data sets]
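
Slide 9 names DOIs as the basis for citable, persistent data. The following is a minimal Python sketch, assuming DOIs arrive in the mixed forms seen in practice (a `doi:` prefix or a resolver URL), of how a client might normalize one and build a resolver link. The function names and the citation layout are illustrative only and are not a PANGAEA or DataCite format; the data set values passed to `format_citation` are hypothetical, while the article DOI is the one quoted on slide 12.

```python
import re

# Loose shape check: "10.", a 4 to 9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Strip a leading 'doi:' or resolver URL and validate the basic shape."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return doi


def doi_url(raw):
    """Return the resolver URL for a DOI given in any of the accepted forms."""
    return "https://doi.org/" + normalize_doi(raw)


def format_citation(creators, year, title, publisher, raw_doi):
    """Join citation parts into one line (illustrative layout only)."""
    return f"{'; '.join(creators)} ({year}): {title}. {publisher}, {doi_url(raw_doi)}"


# DOI as written on slide 12.
print(doi_url("doi:10.1371/journal.pone.0000308"))

# Hypothetical data set record; every value here is a placeholder.
print(format_citation(["Doe, J"], 2014, "Example data set",
                      "Example Archive", "doi:10.5555/example.1"))
```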

  10. Data publication - citability [Figure labels: time • Article • Data]

  11. Publishing workflow - synchronized

  12. Impact on citation rates: 35% to 69% more citations! Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308 (courtesy of Jon Sears, AGU)

  13. Collaboration between data centers & science journals • linking editorial workflows • linking services

  14. Data Publishing – Cross-referencing

  15. Data Publishing – Cross-referencing

  16. Linking infrastructure [Diagram labels: Data archives • Catalogues • Bibliometrics • Publishers • …]

  17. ICSU WDS perspective [Diagram labels: Web of Knowledge • Google Scholar • Scopus • Catalogues • Certified Data Archives • Registries • Journals • Bibliometric Services • ICSU WDS • Crossref • DataCite • ORCID • CrossData • Thomson Reuters Citation Indexes]

  18. WDS Certification & accreditation • Trustworthiness of WDS data holders and service providers • Evaluation criteria: based on a compilation of international standards and best practices • Certification authority: WDS Scientific Committee • 2014/03: 75 members

  19. WDS/RDA WGs and IGs • Publishing workflows • Publishing Services • Incentives (Bibliometrics) • Trusted repositories & services • Cost compensation models [Diagram labels: e-Infrastructures • Fitness of use • Scientific research projects • Total volume of scientific data]

  20. Some conclusions • Publishing data benefits providers and has a significant impact on data quality. • "Fitness of use" is an important aspect of data quality and a prerequisite for integrating data from different sources. • Certification is key for evaluating the quality of services and data. • Scalable services are needed to embed data publications into the current scholarly publishing system.
