Small science research and the data sharing strata. ASIS&T Data Research Access and Preservation Summit -Legal and Social Implications of Shared Collections- Phoenix, AZ April 10, 2010 Melissa Cragin Center For Informatics Research in Science and Scholarship GSLIS – UIUC.
What are the social contexts under which research communities assemble to share and manage data? • Small Science • Investigating Data Communities • Data Curation Profiles • Implications for Data Management Systems
20-80 Rule: The small are big! Heidorn, P. B. (2008).
Small science in flux • Traditional features • single PI (often) • often dependent on graduate students • ad hoc data management systems • idiosyncratic sharing practices • “success” dependent on using one’s own data • But… • may be producing all digital data • may be working at community level • may be conducting “data-driven” science • may be producing very large data sets • Reference data sets needed even in small, specialized data communities. See, for example, Borgman, Wallis, & Enyedi (2007); Cragin et al. (in press).
Data Communities and Collections • institutional repository across fields • scholar created primary source collections • disciplinary resource • local, cross-departmental • geographically based, cross-disciplinary resource • national cultural heritage collection aggregation • national data cyberinfrastructure paradigm
Data Communities: While the conceptualization of ‘data community’ is still in a developmental phase, we have clearly identified at least three kinds: 1) sub-disciplines focused on particular kinds of data that support specific measurements or analyses; 2) scientists focused on a specific research problem (often interdisciplinary in nature); 3) researchers working to develop and use a shared, community-level data collection (i.e., a “Resource Collection,” NSB, 2005).
Data Curation Profiles – Data Conservancy • The Data Curation Profile supports the ongoing development of system and policy • Assists in facilitating workflows and services for the data types central to a data community or sub-discipline • Provides crucial information; can serve as documentation for collection policy, including selection, appraisal, and retention guidelines • A template is currently in draft form and will be tested and revised in two stages of “use” tests: • Production • Application
Salient Features • Description of Data • Required Contextual Information • Applicable Standards • Links to formal, area-specific metadata, ontologies, etc. • Intellectual Property & Access “Rules” • Data owners • Terms of use • Attribution • Anticipated User Support • Re-appraisal Schedule • Data provenance • Version control • Format migration • Workflow for Ingest & Maintenance of this “kind” of data
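One way to picture the profile template is as a structured record with one field per salient feature. The following is only a minimal sketch under stated assumptions, not the project's actual template: the class names, field names, flat layout, and the `missing_sections` helper are all hypothetical, chosen to mirror the bullet labels on this slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AccessRules:
    """Intellectual property and access "rules" (hypothetical field names)."""
    data_owners: List[str] = field(default_factory=list)
    terms_of_use: str = ""
    attribution: str = ""


@dataclass
class DataCurationProfile:
    """Illustrative record; one field per feature listed on the slide."""
    description_of_data: str
    required_contextual_information: str = ""
    applicable_standards: List[str] = field(default_factory=list)
    metadata_links: List[str] = field(default_factory=list)
    access_rules: AccessRules = field(default_factory=AccessRules)
    anticipated_user_support: str = ""
    reappraisal_schedule: str = ""
    data_provenance: str = ""
    version_control: str = ""
    format_migration: str = ""
    ingest_and_maintenance_workflow: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that have not been filled in yet."""
        missing = [
            name
            for name, value in vars(self).items()
            if name != "access_rules" and not value
        ]
        # Assumption: access rules count as filled once an owner is named.
        if not self.access_rules.data_owners:
            missing.append("access_rules")
        return missing


# Hypothetical example values, for illustration only.
profile = DataCurationProfile(description_of_data="Soil moisture sensor readings")
profile.access_rules.data_owners.append("Principal investigator")
print(profile.missing_sections())
```

A record like this could make it easier to see at a glance which parts of a draft profile still need input from the data producers.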
Data Curation Profiles Project • research data management / metadata workflow • policies for archiving and access • system requirements for managing data in a repository • librarians’ roles and skill sets to support archiving and sharing • Disciplines: Biochemistry, Biology, Civil Engineering, Electrical Engineering, Food Sciences, Earth and Atmospheric Sciences, Soil Science, Anthropology, Geology, Plant Sciences, Kinesiology, Speech and Hearing • Purdue University Libraries, D. Scott Brandt, PI • IMLS # LG-06-070032-07
Sharing Practices • Distinguishing private exchange from open sharing • exchange: sharing amongst collaborators is a primary concern, often with significant barriers • (more) open access: limited by the need for control and by the reward system • Sharing with wider “publics” is conditioned by both data management pressures and personal experience • the “known person – cost” algorithm • incidents of misuse • What is most easily or willingly shared is not always the data that has the most re-use value
Distinguishing Public from Private Sharing • “Supplying” data – functional privacy • targeted transfer of data to and from current collaborators or close colleagues • distribution on request • Evaluation • Data exposure – “publication” • general dissemination activity • makes data accessible to the wider public
Implications for concerns with misuse Misuse incidents experienced by the scientists in this study – • influenced their views on the appeal of data sharing • decreased their willingness to share • increased cynicism about data sharing initiatives • had a real impact on their behavior - misappropriation - lack of enforcement of institutional agreements
Thank You cragin@illinois.edu