Dealing with software: the research data issues
http://dx.doi.org/10.6084/m9.figshare.1150298
26 August 2014, Dealing with Data Conference
Neil Chue Hong (@npch), Software Sustainability Institute
ORCID: 0000-0002-8876-7606 | N.ChueHong@software.ac.uk
Project funding from / Supported by [logos not reproduced]
Where indicated slides licensed under
The Research Cycle
Research is a continuous cycle. When we publish we are contributing to the body of knowledge.
[Diagram labels: Create, Test, Interpret, Revise, Publish. Research Outputs: Paper, Data, Software]
Research/Reuse/Reward Cycle
Reuse is also a cycle. We build our research on the work of others. Reward mechanisms should encourage reuse.
[Diagram labels: Research, Reuse, Create, Test, Interpret, Revise, Publish, Index, Identify, Cite, Reward]
The current process
[Diagram labels: Start research; Write software; Use software; Produce results; Publish research paper (which mentions software and data); Release data; Release software]
This process is simple but does not reward production or reuse of good software and data. It also has a long contribution cycle.
Differing roles, different repositories
[Diagram labels: backup, sharing, archiving; Timescales, Policy, Licensing, Ingest, Metadata, Assurance]
Versioning
• Why do we version?
  • To indicate a change
  • To allow sharing
  • To confer special status
Version control systems make this easy, and the concepts of a person and an output are there but not unique.
[Diagram labels: Public v1, Public v2, Public v3; Personal v1, Personal v2, Personal v2a, Personal v2a, Personal v3, Personal v3a]
Granularity
• What do we define?
• Useful units of reuse
[Diagram labels: Function, Algorithm, Program, Library / Suite / Package, …]
Boundary
• What do we choose to identify:
  • Workflow?
  • Software that runs workflow?
  • Software referenced by workflow?
  • Software dependencies?
• What’s the minimum citable part?
Authorship
• Which authors have had what impact on each version of the software?
• Who had the largest contribution to the scientific results in a paper?
• Can micro-attribution work? Can track author, but not contribution?
  • http://beyond-impact.org/?p=175
• Why do we identify?
  • To measure
  • To restrict
  • To communicate
  • To include
[Figure: OGSA-DAI project statistics from Ohloh]
Code as a Research Object
• What if you could assign DOIs to code easily?
• Could we make software more reusable?
• http://mozillascience.org/code-as-a-research-object-a-new-project/
• https://guides.github.com/activities/citable-code/
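To make the DOI idea concrete, here is a minimal sketch of how a DOI assigned to one specific release could be turned into a citation string. It is not taken from the talk or from either linked project: the `SoftwareRelease` class, the `format_citation` function, the citation layout, and the example values (including the placeholder DOI) are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftwareRelease:
    """Hypothetical minimal metadata for citing one released version of software."""
    authors: tuple
    title: str
    version: str
    year: int
    doi: str  # assumed to be minted by a repository when the release is archived


def format_citation(release: SoftwareRelease) -> str:
    """Return a readable citation naming the exact version and its DOI."""
    authors = ", ".join(release.authors)
    return (
        f"{authors} ({release.year}). {release.title} "
        f"(version {release.version}) [Software]. "
        f"https://doi.org/{release.doi}"
    )


# Illustrative values only; the DOI below is a placeholder, not a real identifier.
example = SoftwareRelease(
    authors=("A. Researcher", "B. Developer"),
    title="example-analysis-tool",
    version="1.2.0",
    year=2014,
    doi="10.0000/example.12345",
)
print(format_citation(example))
```

Tying the identifier to a version rather than to the project as a whole is one possible answer to the versioning and granularity questions raised earlier; other choices are equally defensible.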
A better process?
[Diagram labels: Start research; Write software; Adapt/extend software; Use software; Produce results; Publish research paper (which references software and data papers); Identify existing software; Release software; Release data; Publish software paper; Publish data paper]
Software and data papers are needed as proxies for rewarding reuse. But this process enables a shorter contribution cycle for data and software.
One-click challenge
• “One-click” archiving of a significant version of software in a code repository to a suitable institutional repository
• “Suitable” repository:
  • Clear access / deposit / preservation policy
  • Adherence to standards
  • Ability to easily “transfer” in / out
  • Allows use of appropriate licenses for code
  • Sustainability of hosting organisation
  • Ability to monitor, check integrity
  • Provides permanent unique identifiers
• Proposing a hackday to make this happen
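The "monitor, check integrity" requirement can be sketched as a checksum manifest written at deposit time and re-checked later. This is an illustrative assumption about one way a deposit tool might work, not a description of any existing repository's API; the function names `build_manifest` and `verify_manifest` and the manifest layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large release archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(release_dir: Path, version: str) -> dict:
    """Record a SHA-256 checksum for every file in one version of the software."""
    files = {
        path.relative_to(release_dir).as_posix(): sha256_of_file(path)
        for path in sorted(release_dir.rglob("*"))
        if path.is_file()
    }
    return {"version": version, "files": files}


def verify_manifest(release_dir: Path, manifest: dict) -> list:
    """Return the relative paths that are missing or whose contents have changed."""
    problems = []
    for relative_path, expected in manifest["files"].items():
        target = release_dir / relative_path
        if not target.is_file() or sha256_of_file(target) != expected:
            problems.append(relative_path)
    return problems
```

A deposit tool could store the manifest alongside the archived version and run `verify_manifest` periodically; an empty list means every recorded file is still present and unchanged.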
Summary
• Software is an important output of the research cycle, and should be rewarded
• Repositories play an important role in the research cycle, including for software
• But software has specific issues with regard to research data management
• Tooling is needed to lower barriers to deposit
Further information
• This presentation:
  • Slides: http://dx.doi.org/10.6084/m9.figshare.1150298
  • Abstract: http://dx.doi.org/10.6084/m9.figshare.1150299
• Where does it go from here: the place of software in digital repositories
  • http://www.research.ed.ac.uk/portal/en/publications/where-does-it-go-from-here-the-place-of-software-in-digital-repositories(ab6130c6-aee6-4972-9256-8ea0eb1862c9).html
• Software Papers: improving the reusability and sustainability of scientific software
  • http://dx.doi.org/10.6084/m9.figshare.795303
• Software Sustainability Institute
  • http://www.software.ac.uk/
Supported by EPSRC Grant EP/H043160/1