Software Attribution
Can we improve the reusability and sustainability of scientific software?
http://dx.doi.org/10.6084/m9.figshare.942289
NSF SI2 PIs Meeting, 24-25 February 2014
Neil Chue Hong (@npch), Software Sustainability Institute
ORCID: 0000-0002-8876-7606 | N.ChueHong@software.ac.uk

The Research Cycle
Research is a continuous cycle. When we publish we are contributing to the body of knowledge.
[Diagram: research cycle. Stage labels: Create, Test, Interpret, Revise, Publish. Research outputs: Data, Paper, Software.]

Research/Reuse/Reward Cycle
Reuse is also a cycle. We build our research on the work of others. Reward mechanisms should encourage reuse.
[Diagram: linked Research and Reuse cycles. Labels: Create, Test, Interpret, Revise, Publish, Index, Identify, Cite, Reward.]

The current process
[Diagram steps: Start research; Write software; Use software; Produce results; Publish research paper (which mentions software and data); Release data; Release software]
This process is simple but does not reward production or reuse of good software and data. It also has a long contribution cycle.

A better process?
[Diagram steps: Start research; Identify existing software; Write software; Adapt/extend software; Use software; Produce results; Release software; Release data; Publish software paper; Publish data paper; Publish research paper (which references software and data papers)]
Software and data papers are needed as proxies for rewarding reuse. But this process enables a shorter contribution cycle for data and software.

Boundary
• What do we choose to identify:
  • Workflow?
  • Software that runs workflow?
  • Software referenced by workflow?
  • Software dependencies?
• What’s the minimum citable part?

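One way to make the boundary question concrete is to treat a workflow and the software it touches as a dependency graph, then compare what each boundary choice would pull in. The following is a minimal sketch under stated assumptions: the graph and every name in it (`my-workflow`, `workflow-engine`, and so on) are hypothetical placeholders, not examples from the talk.

```python
from collections import deque

# Hypothetical dependency graph: each item maps to the things it uses directly.
# All names are illustrative placeholders, not real packages.
DEPENDS_ON = {
    "my-workflow": ["workflow-engine", "analysis-tool"],
    "workflow-engine": ["scheduler-lib"],
    "analysis-tool": ["numerics-lib", "io-lib"],
    "scheduler-lib": [],
    "numerics-lib": [],
    "io-lib": [],
}


def direct_references(item, graph):
    """Narrow boundary: only the software the item references directly."""
    return set(graph.get(item, []))


def transitive_dependencies(item, graph):
    """Wide boundary: everything reachable from the item (breadth-first)."""
    seen = set()
    queue = deque(graph.get(item, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also protects against cycles
        seen.add(current)
        queue.extend(graph.get(current, []))
    return seen


if __name__ == "__main__":
    print(sorted(direct_references("my-workflow", DEPENDS_ON)))
    print(sorted(transitive_dependencies("my-workflow", DEPENDS_ON)))
```

The gap between the two sets is one way to see the cost of each choice: the narrow boundary is easy to cite but hides the libraries underneath, while the wide boundary credits everything but may be impractical to list in a paper.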
Granularity
Function, Algorithm, Program, Library / Suite / Package, …

Versioning
• Why do we version?
  • To indicate a change
  • To allow sharing
  • To confer special status
[Diagram: version tree. Public versions: v1, v2, v3. Personal versions: v1, v2, v2a, v3, v3a.]

Authorship
• Which authors have had what impact on each version of the software?
• Who had the largest contribution to the scientific results in a paper?
• http://beyond-impact.org/?p=175
[Figure: OGSA-DAI project statistics from Ohloh]

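The first authorship question can be approximated, very crudely, by tallying each author's share of changes per version. The sketch below assumes a hypothetical commit log; the authors, versions and line counts are invented for illustration and are not the OGSA-DAI statistics. Lines changed is only a proxy, and it says nothing about who contributed most to the scientific results.

```python
from collections import Counter, defaultdict

# Hypothetical commit records: (version, author, lines_changed).
# These values are invented for illustration only.
COMMITS = [
    ("v1", "alice", 400),
    ("v1", "bob", 100),
    ("v2", "bob", 300),
    ("v2", "carol", 100),
    ("v2", "alice", 100),
]


def contribution_shares(commits):
    """Return {version: {author: fraction of lines changed in that version}}."""
    totals = defaultdict(Counter)
    for version, author, lines in commits:
        totals[version][author] += lines

    shares = {}
    for version, per_author in totals.items():
        version_total = sum(per_author.values())
        if version_total == 0:
            # No measurable change in this version; report zero for everyone.
            shares[version] = {author: 0.0 for author in per_author}
        else:
            shares[version] = {
                author: count / version_total
                for author, count in per_author.items()
            }
    return shares


if __name__ == "__main__":
    for version, per_author in contribution_shares(COMMITS).items():
        print(version, per_author)
```

Because the shares differ between versions, a citation that names a specific version could in principle credit a different set of people than a citation of the project as a whole.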
Software Journals
http://openresearchsoftware.metajnl.com

Peer review of software?
• Can the aspects of peer review be decoupled?
  • Novelty and acceptability
  • Validity and quality
• Accurate metadata helps sustainability
  • But excessive metadata requirements are a barrier
• Essentially, for reuse and sustainability
  • Where is it? Who wrote it? How do I run it?
  • How do I find out more?
• Software Papers: Improving the reusability and sustainability of scientific software
  • http://dx.doi.org/10.6084/m9.figshare.795303

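The four questions on this slide can be read as a minimal metadata record. The following is a sketch only: the class name `SoftwareRecord`, its field names and the example values are assumptions made for illustration, not a schema proposed in the talk or used by any journal.

```python
from dataclasses import dataclass, fields


@dataclass
class SoftwareRecord:
    """Hypothetical minimal description of a piece of research software."""

    location: str = ""          # Where is it?
    authors: tuple = ()         # Who wrote it?
    run_instructions: str = ""  # How do I run it?
    more_info: str = ""         # How do I find out more?


def missing_fields(record):
    """List the questions the record leaves unanswered, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


if __name__ == "__main__":
    # Illustrative values; example.org is a placeholder domain.
    record = SoftwareRecord(
        location="https://example.org/my-code",
        authors=("A. Researcher",),
    )
    print(missing_fields(record))
```

Keeping the record to four fields reflects the slide's point that excessive metadata requirements are a barrier: a check like `missing_fields` is cheap enough to run at submission time without asking authors for more than a reuser needs.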
Anatomy of a software meta-paper
[Figure: annotated meta-paper. Section labels: Introduction, Implementation + Usage, Screenshots, Quality Control, Metadata (labelled twice), Reuse, References.]

F1000Research Web Tool
Other journals you can publish software in: http://bit.ly/softwarejournals

Code as a Research Object
• What if you could assign DOIs to code easily?
• Could we make software more reusable?
• http://mozillascience.org/code-as-a-research-object-a-new-project/
• https://github.com/mozillascience/code-research-object

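Once an object has a DOI, turning it into a citable reference is mostly string handling. The sketch below assumes a simple, hypothetical citation layout (it is not a format prescribed by the talk or by any journal) and uses this presentation's own DOI as the example input. The helper names `normalise_doi` and `format_citation` are illustrative.

```python
# Common ways a DOI is written; the bare form starts with the "10." prefix.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(value):
    """Strip a resolver URL or 'doi:' prefix and return the bare DOI."""
    text = value.strip()
    lowered = text.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            text = text[len(prefix):]
            break
    if not text.startswith("10.") or "/" not in text:
        raise ValueError(f"not a DOI: {value!r}")
    return text


def format_citation(authors, year, title, doi, version=None):
    """Build a citation string in a hypothetical author-year layout."""
    version_part = f" (version {version})" if version else ""
    return (
        f"{authors} ({year}). {title}{version_part}. "
        f"https://doi.org/{normalise_doi(doi)}"
    )


if __name__ == "__main__":
    print(
        format_citation(
            "Chue Hong, N.",
            2014,
            "Software Attribution",
            "http://dx.doi.org/10.6084/m9.figshare.942289",
        )
    )
```

The optional `version` argument is there because a citation of code usually needs to point at the specific version that produced a result, not just at the project.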
I can get credit for everything
• Automatically generated from GitHub repository
• Starring as a means of recommendation
• Forking analogous to citing for software
• … but not necessarily reward

Career Paths in UK
[Figure: UK STEM graduate career paths. Labels: PhD students; Early Career Research; Permanent Research Staff; Professor; Non-university Research (industry, government etc.); Careers outside academic sector.]
Source: The Scientific Century, Royal Society, 2010 (revised to reflect first stage clarification from “What Do PhD’s Do?” study)

Where we are now
• We must describe and cite software, otherwise we cannot benefit from and reward reuse and refinement
• Software papers are a citation mechanism that works with existing infrastructure and norms
• Direct citation of code + metadata might be better
• But we still need to fix the reward mechanism for non-traditional research outputs
• And this is entirely in our hands as scientists

Further Information
• Software Papers: Improving the reusability and sustainability of scientific software
  • http://dx.doi.org/10.6084/m9.figshare.795303
• Journals in which you can publish software:
  • http://bit.ly/softwarejournals
• Journal of Open Research Software
  • http://openresearchsoftware.metajnl.com/
• Discussion: what is the minimum metadata required to describe a code object for scientific reuse?
  • https://github.com/mozillascience/code-research-object/issues
• Contribute: Code as a research object:
  • https://github.com/mozillascience/code-research-object
• The DOI for this presentation: 10.6084/m9.figshare.942289
• The Software Sustainability Institute is a collaboration between the universities of Edinburgh, Manchester, Oxford and Southampton. Supported by EPSRC Grant EP/H043160/1.