Sharing Research Data in Hong Kong (position paper)

This position paper discusses the importance of sharing research data, what to share, how to share, who to share with, and the starting point for data sharing. It also explores the Berlin Declaration and the Hong Kong data sharing policy for RGC funded data.

Presentation Transcript


  1. Sharing Research Data in Hong Kong (position paper). Professor John Bacon-Shone, Associate Director, Knowledge Exchange, The University of Hong Kong. Forum on Research Data Sharing, June 28, 2010

  2. Key Starting Questions • Why share? • What to share? • How to share? • Who to share with? • What to use as the starting point?

  3. Why share? I believe the papers by our distinguished visitors have provided ample evidence that data sharing is a good idea, regardless of your disciplinary area, funding agency or location.

  4. What to share? The starting point is clearly data funded by RGC/UGC, but arguably this should apply to all data collection/generation funded from public funds, including government data. While some private funders may require restrictions to protect intellectual property rights, others may be happy to share data for the greater common good, if encouraged and educated.

  5. How to share? This is a much more difficult question, as it depends on the nature of the data – how big is the dataset, are there confidentiality or intellectual property concerns, is there already a suitable mechanism in place for sharing data of this type?

  6. Who to share with? Potential audiences include: • Other academics • Government • NGOs • Private companies • Whole community. There is also the question of whether to restrict access to those who have registered their identity and agreed to follow certain rules, whether to require any payment/subscription and whether to apply geographic restrictions.

  7. What to use as the starting point? HKU, along with many European research and academic agencies, has signed what is known as the Berlin Declaration (http://oa.mpg.de/openaccess-berlin/berlindeclaration.html), which encourages open access to data and publications. NIH, the UK research councils and NSF all have policy documents for data sharing, which provide plenty of information about the details of a suitable data sharing policy that covers disciplinary differences.

  8. Berlin Declaration Establishing open access as a worthwhile procedure ideally requires the active commitment of each and every individual producer of scientific knowledge and holder of cultural heritage. Open access contributions include original scientific research results, raw data and metadata, source materials, digital representations of pictorial and graphical materials and scholarly multimedia material.

  9. Berlin Open Access (I) Open access contributions must satisfy two conditions: 1: The author(s) and right holder(s) of such contributions grant(s) to all users a free, irrevocable, worldwide, right of access to, and a license to copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship (community standards, will continue to provide the mechanism for enforcement of proper attribution and responsible use of the published work, as they do now), as well as the right to make small numbers of printed copies for their personal use.

  10. Berlin Open Access (II) 2: A complete version of the work and all supplemental materials, including a copy of the permission as stated above, in an appropriate standard electronic format is deposited (and thus published) in at least one online repository using suitable technical standards (such as the Open Archive definitions) that is supported and maintained by an academic institution, scholarly society, government agency, or other well-established organization that seeks to enable open access, unrestricted distribution, interoperability, and long-term archiving.

  11. Hong Kong data sharing policy for RGC funded data • Timing of sharing • Acknowledgements • Access restrictions • Archiving requirements • Archiving Infrastructure needed • Previously collected datasets • Disciplinary differences • Funding

  12. Timing of sharing This needs to balance the right of researchers to publish findings from the data they collect against the need to ensure that data is archived and shared in a timely manner. The policy should state specific deadlines that must be met before RGC records a project as complete. While the sharing deadline might be delayed, the archiving deadline cannot, or data will be lost.

  13. Acknowledgements All research findings based on the shared data should fully acknowledge the original collector and funder of the data collection.

  14. Access restrictions • The two key reasons for access restrictions are to protect the confidentiality of participants in the data collection and to address intellectual property concerns. • Confidentiality may require data users to sign a contract obliging them to protect it. • For data collected (as opposed to purchased) using public funds, there should be no restrictions on non-commercial use, but commercial use should be subject to license and the intellectual property rules of the funding agencies.

  15. Archiving requirements • While physical archiving is now cheap for most projects (with the notable exception of some projects that generate very large datasets), that does not mean that it is done properly. • Proper archiving with all the necessary meta-information is essential to enable informed data sharing. • To ensure complete meta-information on each dataset, PIs must be involved, as only they know the exact nature and provenance of their collected data. • The physical location of the archive is not important, but ensuring that the data will be safely protected against technological change is essential.

  16. Archiving Infrastructure needed (I) • The physical location is not the issue; what is critical is unified access, indexing and archiving protection across locations. As long as a Hong Kong search tool provides unified access to wherever the data is stored, what matters is flexibility and long-term protection. • Some international projects already have the necessary infrastructure for data archiving at international locations and some local universities already have the physical infrastructure, but the software mechanisms for linking the meta-information and controlling access as needed are not in place in Hong Kong.

  17. Archiving Infrastructure needed (II) • It is pointless to require or even encourage archiving without all the necessary infrastructure being put in place. • European social science data archives already have multi-lingual software infrastructure that could potentially be re-used in Hong Kong. • The university libraries could be tasked with identifying the physical infrastructure needed, while a multi-disciplinary taskforce could identify the best solution for the software infrastructure needed.

  18. Previously collected datasets In addition to setting a new policy that requires sharing for future datasets, it is essential to review past projects and identify valuable datasets for sharing so they are not lost forever.

  19. Disciplinary differences While data sharing raises somewhat different issues in different disciplines, there is no need for policies that differ by discipline. Instead, we should take advantage of our unified Research Grants Council to agree on policies that address all the different concerns.

  20. Funding • RGC has always agreed in principle to fund data archiving costs; however, the absence of infrastructure and a reluctance to fund third-party archiving have meant no progress to date. Fortunately, the costs of the physical infrastructure have continued to decrease, which leaves the significant cost of the software infrastructure for meta-information, access controls and indexing, a cost that can be shared across institutions and disciplines. The costs are likely to be less than 1% of data collection costs, which is minimal relative to the benefits shown elsewhere. • UGC should work with government to address the issue of archiving government datasets to ensure that they are not lost, but are properly shared for the benefit of the whole community. This could be handled as a knowledge transfer initiative.

  21. Thanks!
