Transition or Transform? Repositioning the Library for the Petabyte Era Dr Liz Lyon, Director, UKOLN, University of Bath, UK Associate Director, UK Digital Curation Centre ARL / CNI Forum, Washington DC, October 2008. UKOLN is supported by:.
This work is licensed under a Creative Commons Attribution-ShareAlike 2.0 Licence.
Perspectives
• Disciplines and Practice: a diverse landscape
• Institutions and Assets: emerging initiatives
• People and Skills: building capacity
Robot librarian @ CERN tends 5PB data (Nature July 2008)
Immersive case studies
www.dcc.ac.uk/scarp/
• Disciplinary factors in curating Architectural Research (Colin Neilson)
• Curating Brain Images in a Psychiatric Research Group (Angus Whyte)
• Curating MST radar data at STFC (Esther Conway)
• Roles and reusability of video data in social research (Angus Whyte)
Neuro-imaging Case Study
Division of Psychiatry, Univ. of Edinburgh
• 9Tb MRI images + demographic data shared in Neurogrid/NeuroPsygrid and multi-centre studies
• 10 year longitudinal studies
• Data is continually re-analysed: new analytic techniques add value to older data
• Data integration: multiple scanner image normalisation, shared terminology needed
Report due October 2008
http://www.flickr.com/photos/30435752@N08/2892112112/
http://www.flickr.com/photos/macronin47/85006920/
• Interdisciplinary team roles: clinicians, imaging researchers, psychologists, scanner engineers, sys-admin
• Heedful interaction: weekly team meetings provide a human infrastructure for curation, based on trust
• Data-sharing as a form of trade or gift exchange: “give to get” rather than “give away” – implications for research funder data access policies: research funders should “think global & act local”
• Ethics, confidentiality, privacy issues: “skull-stripping” software & anonymisation, potential for prediction of psychiatric disorders
• DRAMBORA risk assessment, mitigation steps identified; data policy + core metadata set, data documentation, phased development
eCrystals Curation & Preservation Study
Working with the Digital Curation Centre
Examined four main areas
• Audit and certification (TRAC, DRAMBORA, NESTOR, ISO International repository audit and certification BOF Group)
• The Open Archival Information System (OAIS) and Representation Information (RI)
• eBank-UK application profile and preservation metadata
• ePrints.org repository platform
http://www.ukoln.ac.uk/projects/ebank-uk/curation/eBank3-WP4-Report%20(Revised).pdf
Recommendations
eCrystals Federation: Preservation & sustainability
Recommendations
• Data repositories
• Use DRAMBORA Interactive Vs 1.0 for self-assessment
• Add PREMIS preservation metadata
• Collect eCrystals representation information
• Examine repository platform conformance to OAIS Reference Model
• Survey partner preservation policies
Digital Curation Centre partnership
Scaling Up Report
Interviews & analysis of a discipline: crystallography
Findings:
• Diverse lab practice
• Laboratory Information Management Systems (LIMS) & proprietary formats
• Data policy should reflect lab practice & institutional model
• Data quality criteria/validation
• “Prior publication” problem
• We need scalable assignment of “terms” for data discovery
• No discipline preservation model
Recommendations (7), commentary
May 2008
UKOLN and University of Southampton
Practice challenges?
• Understanding the risks, awareness
• Community consensus, advocacy
• Data management plans
• Appraisal: selection criteria
• Data documentation: metadata, schema, RepInfo, semantics
• Data formats: applying standards
• Instrumentation: proprietary formats
• Data provenance: authenticity
• Data citation & versions: persistent IDs
• Data validation and reproducibility
• Data access: embargo policy
• Data linking: text, images, software
***Open Science Experiment*** Blogging results data With thanks to Simon Coles, Univ Southampton
1. Transition or Transform? the Library
• Remote research support or integrated team science?
• Passive observation or proactive participation?
• Is your library fully embedded in research practice?
• How do you acquire a deeper understanding of disciplinary data curation approaches?
• Models of engagement?
• Immersive case studies
• Joint R&D projects
• New service offerings
• Role extension (Faculty / subject / liaison librarians)
• Secondments
Library supports human infrastructure for curation
Institutions and Assets http://www.flickr.com/photos/mintchocicecream/7491707/
Shared Research Data Service Feasibility Study
• HEFCE award £255K to SERCO
• Objectives:
• Develop understanding of UK’s current and future research data service needs
• Work with other UK stakeholders to identify priorities for action
• Develop a number of scenarios/options for the shared service from “do nothing” to a managed national service
• Develop a detailed business plan for the preferred option(s)
• Include assessment of costs and benefits in options appraisal
• Indicate both scale of investment required & an estimate of likely ROI
• Present outline governance and management proposals for the preferred option(s)
• 4 case study “volunteers”: Bristol, Leeds, Leicester, Oxford
• Report January 2009, Interim Report published July 2008
State-of-the-Nation Analysis
• Research funder policies
• Data centres and facilities
• International comparators
Options analysis and appraisal
• Baseline for Costs
• Stakeholder analysis, Success criteria
Emerging survey themes: Advocacy, Co-ordination and information, Coherence, Data Depository, Skills and training, Seeding the Data Commons
October 2008: Development of Models
University of Oxford case study
• 37 interviews with researchers + Workshop
• Report published July 2008
http://www.flickr.com/photos/philipdunn/2424950499/
Background A recommendation to JISC: “JISC should develop a Data Audit Framework to enable all universities and colleges to carry out an audit of departmental data collections, awareness, policies and practice for data curation and preservation” Liz Lyon, Dealing with Data: Roles, Rights, Responsibilities and Relationships, (2007)
Data Audit Framework
Launch: 1st October 2008
http://www.data-audit.eu/
Benefits:
• Prioritisation of resources
• Capacity development and planning
• Efficiency savings – move data to more cost-effective storage
• Manage risks associated with data loss
• Realise value through improved access & re-use
Positioned as a self-audit tool
Scale: departments, institutions
Methodology http://www.data-audit.eu/DAF_Methodology.pdf
Detailed assessment
• ID
• Data creator(s)
• Title
• Description
• Subject
• Date
• Purpose
• Source
• Updating frequency
• Type
• Format
• Rights and restrictions
• Usage frequency
• Relation
• Back-up and archiving policy
• Management to date
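The element list above can be read as a simple record schema for one audited dataset. The following is a minimal Python sketch under that assumption: the class name `DatasetAuditRecord`, the snake_case field names, the helper `missing_elements`, and the sample values are all illustrative and are not taken from the Data Audit Framework documentation.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class DatasetAuditRecord:
    """One inventory entry; fields mirror the slide's element list (names are assumed)."""
    identifier: str                      # ID
    creators: List[str]                  # Data creator(s)
    title: str                           # Title
    description: str = ""                # Description
    subject: str = ""                    # Subject
    date: str = ""                       # Date
    purpose: str = ""                    # Purpose
    source: str = ""                     # Source
    updating_frequency: str = ""         # Updating frequency
    data_type: str = ""                  # Type
    file_format: str = ""                # Format
    rights_and_restrictions: str = ""    # Rights and restrictions
    usage_frequency: str = ""            # Usage frequency
    relation: str = ""                   # Relation
    backup_and_archiving_policy: str = ""  # Back-up and archiving policy
    management_to_date: str = ""         # Management to date


def missing_elements(record: DatasetAuditRecord) -> List[str]:
    """Return the names of elements left empty, in schema order."""
    return [name for name, value in asdict(record).items() if value in ("", [])]


# Hypothetical sample values, for illustration only.
example = DatasetAuditRecord(
    identifier="GEO-001",
    creators=["A. Researcher"],
    title="Example field survey",
    description="Sample measurements from a field campaign",
    subject="Geosciences",
    date="2008",
    file_format="CSV",
)

print(missing_elements(example))
```

A check like `missing_elements` could help an auditor see at a glance which elements still need to be collected in a follow-up interview.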
School of GeoSciences pilot audit
• 80 academics, 70 research fellows, 130 PhD students
• Annual research grant and contract income of £4-6M
• Staff contribute to >1 of five Research Groups
• Involvement in inter-University Research Consortia and Research Centres
• 15Tb data on main server
• Audit led by Information Services staff
• Interviews with 35 Faculty staff
• Create Inventory of 25 datasets and classify them
• Assess most significant assets in detail, collect basic set of data elements based on Dublin Core
• Draft Report and Recommendations to the School of GeoSciences and to Information Services
GeoSciences pilot: lessons learned
• Time needed is longer than initially anticipated (for interviews etc.) but still manageable: plan well in advance
• Inventory doesn't have to be comprehensive but could be a representative sample
• Little documentation/knowledge of what exists: “a nightmare”
• There are no standards in creating & managing data assets
• Define the scope and granularity carefully
• Ensure appropriate timing (avoid exams, field trips, Boards…)
• Get support from senior management (VP level)
• Collect as much information as possible in interviews/surveys
• Variable openness of staff and their data
GeoSciences pilot: some outcomes
• Preliminary but positive
• Requirement for institution-wide data policy and guidelines
• Requirement for researcher training
• IPR issues associated with data ownership: individual or institution?
• Requirement for training for auditors
• Scaling up audits: 6 further data audits in process (including Physics, Biol Sci., Education, History, Classics & Archaeology, Biomedical Sciences)
2. Transition or Transform? Librarians
There are lots of opportunities for action
• Leadership by senior managers
• Data policy development - with PVC / VP research
• Storage infrastructure provision – with IT Director
• Faculty audit co-ordination (DAF tool)
• Advocacy, awareness-raising workshops, training
• Data literacy programmes
• Curation Lifecycle management
• Inform data management plans
• Data documentation best practice
• Repository assessment (DRAMBORA tool)
• Deliver new integrated support services
Background Recommendations to JISC: “A study is needed to examine the role and career development of data scientists and the associated supply of specialist data curation skills to the research community”. “JISC should fund a study to assess the value and potential of extending data handling curation and preservation skills within the undergraduate and postgraduate curriculum”. Liz Lyon, Dealing with Data: Roles, Rights, Responsibilities and Relationships, (2007)
“The role of the Library in data-intensive research is important and a strategic repositioning of the Library with respect to research support is now appropriate.”
“there are…not enough specialised data librarians yet”
“Recommendation: The research library community in the UK should work with universities and research institutes to define properly and to formalise the role of data librarians, and to develop a curriculum that ensures a suitable supply of librarians skilled in data handling.”
CILIP Update June 2008 “Accidental” data librarians Only 5 in the UK???
Research Data Forum
• Bringing diverse communities together
• Data centre managers, IR managers, librarians, funders & policy makers
• Aims & Objectives:
• Facilitate co-operation between organisations and individuals
• Exchange experience and best practice
• November 2008, Manchester UK http://www.dcc.ac.uk/data-forum/
• 2nd joint DCC – RIN event
DCC Digital Curation 101
• Curation “summer” school
• 6-10 October 2008 @ NeSC
• Lectures + hands-on
• Target participants: bench scientists, LIS professionals
• Focussed around the DCC Curation Lifecycle Model
• Conceptualise, Create and/or Receive, Appraise & select, Ingest & Store, Preserve, Access, Use & Re-Use, Data Management Plan
Open Science
3. Transition or Transform? the Role
• Multidisciplinary teams, multidisciplinary people
• Domain + ICT + library + archiving knowledge
• New roles: “data librarians”, “data scientists”
• Skills shortage: capacity building needed
• What core data skills are required?
• Not in LIS school curriculum? Radical change!
• Recruit different people to the LIS team
• Re-brand the LIS career
From Librarianship to Informatics
Thank you. Slides will be available at: http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html