Linking Data to Open Access Publications EGU, 23 April 2012, Najla Rettberg, OpenAIRE, University of Göttingen
In 12 Minutes… • OpenAIRE – Publications and Data • Demonstrators for Enhanced Publications • Use Case Scenarios • Services for Users EGU, April 23 2012
OpenAIRE – Second Phase • Open Access, participatory infrastructure for scientific information, linking publications, datasets, funding • Disseminates OA/RDM information in Europe • Opens its content (search, browse, stats), also to third-party service providers • Capitalizes on the OpenAIRE infrastructure, built for the Open Access pilot, FP7-funded articles (measuring the impact of EC SC39)
Portal: Search, Access, Deposit
Past, present and OpenAIREplus (diagram labels) • Driver Guidelines • OpenAIRE Guidelines v1.0 • OpenAIRE Guidelines v2.0 • OpenAIRE+ Guidelines for Data Providers • Publication repositories network: institutional & thematic • 311 validated repositories • 5,600,000 OA publications • FP7 publications • National funding publications • EC project metadata • National project metadata • Dataset repositories • Metadata on data sets
Covering ‘European Knowledge’ • Open Data Infrastructures • OA Publication Infrastructure • ESFRI, EU-wide infrastructures
A ‘Static’ publication (slide from Jens Klump)
Enhanced Publications (EPs) • Compound information objects: represent the aggregation of distinct information objects through meaningful relationships • Example of SURF EPs: textual publications enhanced with links to datasets • OpenAIREplus provides EP services: • Management: creation and curation • Visualization, browsing, querying • Import: OAI-PMH/ORE harvesting of EPs from external providers • Export: OAI-PMH/ORE publishing of EPs, Linked Data representation
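As a rough illustration of the "compound information object" idea, an EP can be modelled as a core publication plus typed links to other objects. This is a minimal sketch, not the OpenAIREplus data model: the class names, the `kind` vocabulary and the relation name `isSupportedBy` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InfoObject:
    """One part of an EP, e.g. a publication or a dataset."""
    identifier: str
    kind: str   # hypothetical vocabulary: "publication" or "dataset"
    title: str


@dataclass
class EnhancedPublication:
    """A core publication plus typed links to related objects."""
    core: InfoObject
    relations: list = field(default_factory=list)  # (relation, InfoObject) pairs

    def link(self, relation: str, target: InfoObject) -> None:
        """Attach a related object under a named relationship."""
        self.relations.append((relation, target))

    def related(self, kind: str) -> list:
        """Return all linked objects of the given kind."""
        return [target for _, target in self.relations if target.kind == kind]


# Illustrative objects only; identifiers and titles are made up.
paper = InfoObject("pub-1", "publication", "Example article")
data = InfoObject("data-1", "dataset", "Example measurements")
ep = EnhancedPublication(core=paper)
ep.link("isSupportedBy", data)
print([obj.identifier for obj in ep.related("dataset")])
```

Keeping the relation name on each link is what makes the relationships "meaningful": the same pair of objects can be connected in different ways.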
‘Information in Context’
Cross-discipline approach • Attempt at a generic workflow • No one-size-fits-all for data • Use different data types, PIs, policies, access levels, standards • Look at research-driven disciplines, different communities • Incremental, based on prototypes • “…any roadmap for OA infrastructure must address this natural tension between diversity and infrastructure” C. Meier zu Verl & W. Horstmann (Eds.), 2011. Studies on Subject-Specific Requirements for Open Access Infrastructure.
Subject-specific pilots • Learning lessons from interoperation of data infrastructures • Interoperability pilots between OpenAIREplus and subject-specific infrastructures • In the Life Sciences • In the Social Sciences • Exploitation in modelling and implementation for OpenAIRE data model • Relationship entities: projects, publications, datasets
The Challenges • Aggregation and discovery of resources • Representation of diverse disciplines in a ‘generic’ infrastructure • Access restrictions/reuse policies • User-friendly way for researchers to link research results with project information • Machine-readable (Linked Open Data)
Two disciplines… • SSH – DANS/EASY • Produce handmade EPs at file level • Experienced in data modelling and research work (Veteran tapes) • Life Sciences – EMBL-EBI • Text-mine abstracts/full texts • Link bio-entities to database • Enriched information could be transferred to generic infrastructure
Demonstrator • Data model: generalised • Extract citation info for datasets, e.g. from UniProt and full text • Derive persistent identifiers from URLs (URNs and PMC IDs) • Transfer of linked entities: community services and OpenAIRE infrastructure
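The step of deriving persistent identifiers from URLs could look roughly like the sketch below. It assumes that PMC IDs appear in a URL as `PMC` followed by digits and that URNs appear as `urn:<namespace>:<string>`; the function name, the patterns and the example URLs are illustrative and are not taken from the demonstrator itself.

```python
import re
from urllib.parse import unquote

# Assumed patterns: a PMC ID is "PMC" plus digits; a URN is "urn:<nid>:<nss>".
PMC_RE = re.compile(r"PMC\d+")
URN_RE = re.compile(r"\burn:[a-z0-9][a-z0-9-]{0,31}:[^\s?#]+", re.IGNORECASE)


def derive_pid(url: str):
    """Return (scheme, identifier) found in a URL, or None if nothing matches."""
    decoded = unquote(url)  # handle percent-encoded colons such as %3A
    match = PMC_RE.search(decoded)
    if match:
        return ("pmcid", match.group(0))
    match = URN_RE.search(decoded)
    if match:
        return ("urn", match.group(0))
    return None


# Made-up example URLs, used only to exercise the function.
print(derive_pid("https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1234567/"))
print(derive_pid("https://resolver.example.org/urn:nbn:de:0000-12345"))
print(derive_pid("https://example.org/paper.pdf"))
```

A real pipeline would also need to validate the extracted identifier against its resolver, which this sketch does not attempt.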
Use Cases • Import EP created in DANS or SURF • Proof of services interoperability • Manual composition of EP in OpenAIRE • Proof of tools: editor, discovery of research data in OpenAIRE • Automatic generation of EP by extracting citation information (or mining), auto-linking • Proof that rich metadata can be represented in a user-friendly way • Possible Linked Open Data compliance
Use Cases • Reuse and enrichment: annotations added by users to datasets or publications • An EP is used by a researcher in a publication • Adequate documentation • Test legal framework • Study into licensing of publications and data • Analyse requirements of legal protection of research data • Legal prototype of restraints
Research Scenario 1 • You are an EC-project researcher • OA publication • Dataset with a DOI • Generate the link in OpenAIRE • Researcher completes data output with paper • No data repository • Submit dataset to OpenAIRE ‘orphan’ repository
Research Scenario 2 • You search for ‘mouse genome literature’ in OpenAIRE • Find a citation for a publication • Funding details of project • Related data, say a protein link to GenBank • Create your own links to this
Service activities • For publication providers: OpenAIRE’s Guidelines for repository managers • Metadata: (DC) and Protocols: (OAI etc.) • For data providers: accessing (metadata of) datasets from providers while minimizing effort to comply • Metadata: indications on minimal metadata about datasets (e.g., identifiers, date of creation, title, URLs) and best practices for interlinking datasets and publications • Access protocols: no requirements for adopting precise protocols (e.g., OAI, FTP) or ID/URL frameworks (e.g., OpenURL, DOI) to comply
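To make the "minimal metadata" idea concrete, a data provider could check each dataset record for the fields the slide lists before exposing it. The sketch below is an assumption-laden illustration: the field names and the record layout are hypothetical and do not come from the OpenAIREplus guidelines.

```python
# Hypothetical field names for the minimal metadata the slide lists
# (identifier, date of creation, title, URL).
MINIMAL_FIELDS = ("identifier", "date_created", "title", "url")


def missing_fields(record: dict) -> list:
    """Return the minimal fields that are absent or blank in a record."""
    return [name for name in MINIMAL_FIELDS
            if not str(record.get(name) or "").strip()]


# Made-up records used only to exercise the check.
complete = {
    "identifier": "data-1",
    "date_created": "2012-04-23",
    "title": "Example dataset",
    "url": "https://repository.example.org/data-1",
}
partial = {"title": "Example dataset", "url": "  "}

print(missing_fields(complete))
print(missing_fields(partial))
```

Reporting which fields are missing, instead of rejecting the record outright, fits the stated aim of minimizing the effort needed to comply.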
Service activities: Users • Registered end-users (e.g., EC personnel, project coordinators, researchers, authors) • Search, browse and access statistics • Deposit files and metadata of publications and datasets into the Orphan Repository • Ingest (claim) metadata into the information space • Create EP by combining datasets from different communities • Reuse of datasets as secondary data (with respect to IPR)
Service activities: Users • Content provider managers (e.g., dataset and publication repository managers) • Registration and validation (OpenAIREplus guidelines) of publication and dataset repositories • Data curators (administrative tasks) • Collect and aggregate publications, project data and dataset metadata • Third-party application developers • Bulk-fetch content from the (curated) information space
The Future… • “Forget PDFs, imagine an ideal publication where you click on tables to get through to raw data, where you can contribute and discuss some aspects and later update or correct parts of a paper in subsequent versions. The latter is similar to Wikipedia, actually.” • PhD student, UGOE
Thank you… • najla.rettberg@gwdg.de • @openaire_eu
Linking: Publication to Database
Author-supplied supplementary info: TIFF, MOV • PLoS: O’Toole, Greenan, Lange, Srayko, Müller-Reichert
Research Impact • OpenAIRE lays the foundations for measuring research impact per publication, researcher, project, institution, country, …
Data Management Issues • Good data practices • Data policies, standards • Drivers for deposit? What’s in it for researchers? • Work with publishers, DOIs • Where do researchers deposit data? Figshare?
Potential issues: unstructured data with different kinds of media files • Persistent IDs: resolvable and managed by the originator of the resource • Preservation: responsibility lies with the trusted repositories
Demonstrators • Demonstrators for Enhanced Publications • Explore how links are managed between publications and research data in Life Sciences and SSH • How data can be mutually complemented and exchanged in generic infrastructures • Example: how a publication ‘reported’ in OpenAIRE is enriched via UKPMC with links to databases • Report: “Connection Data and Publications through e-Infrastructure”