Explore the evolving realm of research data management with insights on data sharing practices, storage methods, and incentives for researchers to share data. Discover key findings and trends shaping the digital curation landscape.
Changing data landscape Sarah Jones Digital Curation Centre, Glasgow sarah.jones@glasgow.ac.uk Twitter: @sjDCC HKUST RDM services workshop, 19-20 March 2019, Hong Kong
What is the Digital Curation Centre? “a centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community” www.dcc.ac.uk
What is happening at the coal face? Image CC-SA-ND by Bill Dickinson www.flickr.com/photos/skynoir/8270436894
How many researchers make data open?
• 79% of researchers have made data openly available. The State of Open Data 2017, Digital Science (2300 respondents worldwide)
• Only 1 in 10 provides their research data as open data for the public. Researchers and their data (2015), eInfrastructures Austria (3026 Austrian respondents)
• 68% of researchers already share data or expect to do so in future. Jisc DAF studies (2016) (1185 UK respondents)
• 64% agree that they are willing to share their data. Open Data: the researcher perspective (2017), Elsevier (1162 respondents worldwide)
Chart values without recoverable labels: 29%, 32%, 18%, 21%
How do researchers share data?
• Less than 15% publish data in a repository. (Elsevier: Open Data - the researcher perspective)
• “When asked where they have published data, most commonly respondents had done so as an appendix to an article (just over 30%) with a data repository close behind (just under 30%) and 20% having published in a data journal.” (Digital Science: The State of Open Data)
• Over half only allow access on request. 54% share data by using external storage devices or email. (eInfra Austria: Researchers and their data)
• Of 13 methods stated, the top 4 options for currently sharing data were emailing data files (65%), cloud services e.g. Dropbox, Google Drive (59%), portable storage (35%) and supplementary data (20%). Formal repository (public / institutional): c.12%. (Jisc DAF studies)
Opinions on sharing “While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. ... using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines.” “Data sharing was more readily discussed by early career researchers.” Incremental project http://eprints.gla.ac.uk/54623/3/54623.pdf
Why do researchers share data? “For more than half of the researchers, the most attractive incentives for sharing their data were increased visibility and impact, new cooperation opportunities, recognition in professional circles, as well as their contributions being regarded as scientific output.” eInfra Austria: researchers and their data Digital Science: The State of Open Data Jisc DAF studies
Data storage and loss
• 17% of respondents had lost data
• More than one-third had experienced data loss. Strong preference to store on business/private computer, external hard drive & USB
• 36% had experienced loss and 83% of this was due to physical storage media
Sources: eInfra Austria: Researchers and their data; Digital Science: The State of Open Data; Jisc DAF studies
Sharing of microarray data • Increase from c.5-35% in under a decade • Best-practice guidelines for sharing microarray data are fairly mature • Two centralized databases have emerged • Unusually strong data sharing requirements in some journals Piwowar, H. (2011) Who Shares? Who Doesn't? Factors Associated with Openly Archiving Raw Research Data. PLOS One https://doi.org/10.1371/journal.pone.0018657
Awareness of OS & initiatives European Commission (OSPP) Open Science Policy Platform. (2017) Providing researchers with the skills and competencies they need to practise Open Science. Report of the Working Group on Education and Skills under Open Science, doi: 10.2777/121253
OS advocacy: Greater impact. Mandates. Better science. €
RDM issues: How does this even help me or my career? Deadlines, paperwork, PRESSURE. Too big to email… Dropbox? Not enough storage.
Language is a barrier…
Respondents mentioned 40 terms in the European Commission DMP template which were unclear to them.
“Researchers are not familiar with the following terms/phrases: Metadata, standards for metadata/data, ontologies, mapping with ontologies, interoperability, ... All the ICT jargon”
“With the help from Swedish National Data Service we could clarify many questions. Without this help we would not be able to finish the DMP.”
Grootveld et al. (2018). OpenAIRE and FAIR Data Expert Group survey about Horizon 2020 template for Data Management Plans http://doi.org/10.5281/zenodo.1120245
Drivers for change Image CC-BY-SA-ND by David D Wang https://www.flickr.com/photos/30326117@N08/3475108362
Why are research data important to unis? “If an institution spent A$10 million on data, what would be the return? The answer is: more publications; an increased citation count; more grants; greater profile; and more collaboration.” Dr Ross Wilkinson, ANDS www.ariadne.ac.uk/issue72/oar-2013-rpt
Research data: institutional crown jewels? http://www.flickr.com/photos/lifes__too_short__to__drink__cheap__wine/4754234186
Data driven discovery Citizen science projects & public engagement Old weather project models climate change: Data for research, not from research
Sharing leads to breakthroughs ...and increases the speed of discovery
“It was unbelievable. It’s not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.” Dr John Trojanowski, University of Pennsylvania www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0
Validation of results “It was a mistake in a spreadsheet that could have been easily overlooked: a few rows left out of an equation to average the values in a column. The spreadsheet was used to draw the conclusion of an influential 2010 economics paper: that public debt of more than 90% of GDP slows down growth. This conclusion was later cited by the International Monetary Fund and the UK Treasury to justify programmes of austerity that have arguably led to riots, poverty and lost jobs.” www.guardian.co.uk/politics/2013/apr/18/uncovered-error-george-osborne-austerity
Cut down on academic fraud • Stapel – 55 publications – “fictitious data” www.nature.com/news/2011/111101/full/479015a.html
Benefits for you: sharing data increases citations!
• Want evidence?
• Piwowar, Vision – 9% (microarray data)
• Drachen, Dorch, et al – 25-40% (astronomy)
• Gleditsch, et al – doubling to trebling (international relations)
• Open Data Citation Advantage: http://sparceurope.org/open-data-citation-advantage
Increased use and economic benefit
The case of NASA Landsat satellite imagery of the Earth’s surface:
Up to 2008
• Sold through the US Geological Survey for US$600 per scene
• Sales of 19,000 scenes per year
• Annual revenue of $11.4 million
Since 2009
• Freely available over the internet
• Google Earth now uses the images
• Transmission of 2,100,000 scenes per year
• Estimated to have created value for the environmental management industry of $935 million, with direct benefit of more than $100 million per year to the US economy
• Has stimulated the development of applications from a large number of companies worldwide
http://earthobservatory.nasa.gov/IOTD/view.php?id=83394&src=ve
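The Landsat figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are made up for this example, and the only inputs are the price per scene and the scene counts quoted on the slide.

```python
# Figures quoted on the Landsat slide; all names here are illustrative.
PRICE_PER_SCENE_USD = 600          # sale price per scene while imagery was sold
SCENES_SOLD_PER_YEAR = 19_000      # scenes sold per year
SCENES_FREE_PER_YEAR = 2_100_000   # scenes transmitted per year once free


def annual_revenue(price_usd: int, scenes: int) -> int:
    """Revenue from selling `scenes` scenes at `price_usd` each."""
    return price_usd * scenes


def usage_multiplier(free_scenes: int, sold_scenes: int) -> float:
    """How many times more scenes were distributed once access was free."""
    return free_scenes / sold_scenes


revenue = annual_revenue(PRICE_PER_SCENE_USD, SCENES_SOLD_PER_YEAR)
growth = usage_multiplier(SCENES_FREE_PER_YEAR, SCENES_SOLD_PER_YEAR)

print(f"Annual revenue from sales: ${revenue:,}")
print(f"Scenes distributed per year grew about {growth:.1f}-fold")
```

The revenue works out to the $11.4 million quoted on the slide, and distribution rose by a factor of roughly 110 once the imagery was free.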
Research data policy changes Image CC-BY-NC-SA by Tom Magllery www.flickr.com/photos/lwr/13442910354
Data policy trends • Proliferation of policies • Make the landscape easier for researchers to navigate • More harmonisation needed • Clarifications needed when requirements conflict • Emphasis on data sharing more than RDM. Increasingly ‘open’ and ‘FAIR’ rhetoric • Research data policies often ‘aspirational’ and high-level • Need for more group guidelines and practical procedures • More researcher input when developing services & infrastructure
Move towards openness Slide from Giulia Ajmone Marsan, Directorate for Science, Technology and Innovation, OECD
Science as an open enterprise “Much of the remarkable growth of scientific understanding in recent centuries is due to open practices; open communication and deliberation sit at the heart of scientific practice.” Royal Society report calls for ‘intelligent openness’ whereby data are accessible, intelligible, assessable and usable. https://royalsociety.org/policy/projects/science-public-enterprise/Report
G8UK - Endorses OA Open Data Charter Policy Paper 18 June 2013 “To the greatest extent and with the fewest constraints possible publicly funded scientific research data should be open, while at the same time respecting concerns in relation to privacy, safety, security and commercial interests, whilst acknowledging the legitimate concerns of private partners.” G8 Science Ministers Statement- (June 2013)
Science Europe policy harmonisation • Voluntary alignment of RDM policies among funders in Europe • Core Requirements for DMPs • Criteria to select Trusted Repositories • Published a framework to support research communities in setting up protocols for the collection and management of data within specified disciplinary domains • Hope Domain Data Protocols (DDPs) will become ‘DMP template’ for a given domain https://www.scienceeurope.org/policy/policy-areas/research-data
Ultimately funders expect:
• Data management plans
• Timely release of data: once patents are filed or on (acceptance for) publication
• Open data sharing: minimal or no restrictions if possible
• FAIR data: documented and reusable
• Preservation of data: typically 5-10+ years if of long-term value
Increasing harmonisation & coordination Image CC-BY-SA-ND by Fabrice Denis Photography https://www.flickr.com/photos/fabricedenisphotography/36062765374
Global Open Science Commons data commons
European Open Science Cloud • Key Messages http://doi.org/10.2777/1524
Proposed EOSC governance structure
[Diagram] Roles: advise on the implementation; steer the implementation; contribute to the implementation.
Bodies: Governance Board (MS/AC delegates and the European Commission); Executive Board, with Working Groups (European stakeholder organisations and individual experts); Stakeholders Forum (users, service providers, public sector, industry, SMEs, etc.); Extended Coalition of Doers (EU-funded projects, nationally-funded projects and initiatives, other projects and initiatives).
Relationship labels: orients, reviews, endorses; proposes, monitors, reports; interact; supports; supports/coordinates with the EOSCSecretariat.eu Coordination and Support Action.
What is the RDA? International member based organisation with more than 7,900 members globally representing 137 countries RDA is building the social and technical bridges that enable open sharing of data Vision: researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society
The university dimension Image CC-BY-SA by Dawn Manser www.flickr.com/photos/dawnmanser/3532598208
Annual RDM survey issued by DCC
• 60 UK Higher Education Institutions responded to DCC survey 2015, of 132 invited
• Research-active institutions well represented
[Chart] Research income as % of total, by income range percentiles (split into 3 groups across all 161 HEIs): 77%, 20%, 3%
Briefing and links to data: http://www.dcc.ac.uk/survey2015
Who has what in place?
[Chart] % indicating ‘rolling out’ or ‘embedding’
Service areas, in slide order: Policy and strategy; Business planning; Data Mgmt Planning *; Data cataloguing; Managing active data; Data preservation; Governing access & reuse; Skills training & consultancy
Values, in slide order (pairing with service areas not recoverable): 87%, 13%, 50%, 38%, 40%, 18%, 22%, 63%
* referred to ‘access & storage systems’ in survey
Components of RDM services www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services
Thanks for listening! DCC guidance, tools and case studies: www.dcc.ac.uk/resources Follow us on twitter: @digitalcuration and #ukdcc