Continuing Access to Research Data: The New Digital Curation Centre • Peter Burnhill, Director (Phase One) • Funded by: JISC and the e-Science Core Programme
An Overview • Personal Provenance • Digital Curation Centre • what it is (and is not) • who’s involved • Curating the Future • We’re all Curators Now
‘Demand-side’ Verbs from Virtual Library • Discover information object of interest: e.g. a reference in an A&I database, or cited at the foot of an article • Locate service on information object: e.g. a service giving electronic access to the full text of the article, or one’s own library having the volume on a shelf nearby • Request use of service: via payment of money or (better still) privilege of membership; involves authorisation and authentication • Access (service on) object of interest: e.g. online access (and print-out), personal visit or document delivery • Source: MODELS workshops, UKOLN/JISC eLib Programme, 1994ish
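The four demand-side verbs can be read as a simple pipeline: discover, locate, request, access. The sketch below is only an illustration of that reading, not anything specified by the MODELS workshops. The catalogue, the services and the membership flag are all hypothetical, and authentication and authorisation are reduced to a single boolean.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical data for illustration only; none of these names come from the slides.
@dataclass(frozen=True)
class Service:
    name: str
    object_id: str
    members_only: bool

CATALOGUE = {"article-42": "A cited journal article"}
SERVICES = [
    Service("publisher full text", "article-42", members_only=True),
    Service("library shelf copy", "article-42", members_only=False),
]

def discover(query: str) -> list[str]:
    """Discover: identifiers of information objects whose title matches the query."""
    return [oid for oid, title in CATALOGUE.items() if query.lower() in title.lower()]

def locate(object_id: str) -> list[Service]:
    """Locate: services that offer the discovered object."""
    return [s for s in SERVICES if s.object_id == object_id]

def request(service: Service, is_member: bool) -> bool:
    """Request: authorisation, reduced here to a membership check."""
    return is_member or not service.members_only

def access(service: Service, authorised: bool) -> str:
    """Access: use the service, but only if the request was granted."""
    if not authorised:
        raise PermissionError(f"not authorised for {service.name}")
    return f"{service.object_id} via {service.name}"

def demand_side(query: str, is_member: bool) -> list[str]:
    """Run the four verbs in order and collect every route to the object."""
    routes = []
    for object_id in discover(query):
        for service in locate(object_id):
            granted = request(service, is_member)
            if granted:
                routes.append(access(service, granted))
    return routes
```

Keeping each verb as its own function mirrors the point of the model: discovery, location, authorisation and delivery can each be supplied by a different service.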
UK Digital Curation Centre • identified in Report commissioned by JISC Committee for Support of Research (Lord & Macdonald, May 2003) • Twin drivers • Digital Preservation: ePublishing (DPC) & eLearning • Continuing Access: e-Science, ‘data deluge’ & Research Council policies • Call to set up DCC in JISC Circular 6/03, June 2003 • Ambitious & demanding remit • Joint funding by JISC and e-Science Core Programme • Funding for outreach, services & development • Funding for research programme • Task entrusted to Consortium of four partners • award made Feb/March 2004
Overall Aim • ‘continuing quality improvement in data curation & digital preservation’ • Initial focus: data as evidential base for scholarly conclusions; role of data archiving & preservation as keys to reproducibility and reuse • Wider context & remit: worlds of scholarly communication & eLearning
Objectives • vibrant research programme • addressing the wider issues of digital curation • Collaborative Associates Network of Data Organisations • outreach for strong links across existing community of practice • engagement with curators (individuals & organisations) • service definition and delivery • to evaluate tools, methods, standards and policies • a repository of tools and technical information • ‘virtuous circle’ • expertise, experience & requirement feed into the DCC research programme
What the DCC is not ... … a national digital repository … an attempt to teach grandmothers to suck eggs … just another advisory service
DCC Consortium Partners • Four Consortium partner institutions: • University of Edinburgh (lead partner) • University of Glasgow (HATII) • University of Bath (UKOLN) • CCLRC (Rutherford and Daresbury Laboratories) • Prior links via National eScience Centre (NeSC), jointly managed by Universities of Edinburgh & Glasgow
Some Names & Responsibilities • Them with titles … • Peter Burnhill, Director (Phase One), with Robin Rice, Phase One Project Co-ordinator: EDINA & Data Library, University of Edinburgh • Peter Buneman, Research Director (& PI on EPSRC grant): Informatics, University of Edinburgh • Liz Lyon, Associate Director (Community Support & Outreach): UKOLN, University of Bath • Seamus Ross, Associate Director (Service Definition & Delivery): HATII, University of Glasgow • David Giaretta, Associate Director (Development): CCLRC • Two significant & well known ‘Ex Portfolio’ names • Malcolm Atkinson, Director, NeSC • Chris Rusbridge, Director, Information Services, University of Glasgow
What needs to be done • Respond to policy imperatives • twin aims: excellence in research & excellence in service • international respect & national leadership • meeting the needs of e-Science • impact now and into the future • manage complexity, risk and sustainability • Bridge across communities • universities & research institutes • scientific data tradition & document tradition • different disciplinary perspectives • engaging the information & computing sciences • Develop a collaborative model • Associates Network of Data Organisations
[Diagram: the DCC and its Collaborative Associates Network of Data Organisations at the centre (UofE, UofG, UKOLN, CCLRC, with NeSC), surrounded by named associate organisations (e.g. NASA, NARA, ESA, OCLC, AHDS, DPC, EBI, IBM, Microsoft) grouped under labels including International Collaborations, Research Institutes, HEIs & FE, Standards Bodies and Research Councils]
Developing the collaborative model • curation organisations (eg DPC) • communities of practice: users • community support & outreach • Collaborative Associates Network of Data Organisations • management & co-ordination • services • research • research collaborators • development • testbeds & tools • industry • standards bodies
Digital Curation (1): Terminology • Digital curation: actions needed to maintain and utilise digital data & research results over their entire life-cycle, for current and future generations of users • alongside which is Archiving: appraisal & retention/disposal; logical & physical integrity: authenticity/security • and Digital Preservation: long-run technological/legal accessibility & usability • Data curation in science: maintenance of a body of trusted data to represent the current state of knowledge in an area of research
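One common way to check the ‘logical & physical integrity’ named under Archiving is a fixity check: record a cryptographic digest when an item is ingested and recompute it later. The sketch below is a minimal illustration under that assumption and is not a DCC tool. The function names and the in-memory `holdings` and `manifest` dictionaries are hypothetical, and SHA-256 is simply one reasonable choice of digest.

```python
from __future__ import annotations
import hashlib

def fixity(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, recorded: str) -> bool:
    """True if the content still matches the digest recorded at ingest."""
    return fixity(data) == recorded

def audit(holdings: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return the names of items that are missing or whose digest has changed."""
    return sorted(
        name
        for name, recorded in manifest.items()
        if name not in holdings or not verify(holdings[name], recorded)
    )
```

A digest of this kind only detects change. Authenticity and security, which the slide lists alongside integrity, also depend on who is allowed to update the recorded digests.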
Digital Curation (2) • Digital Curation = Data Curation * Digital Preservation • two organising themes: data as evidence; archival responsibility • mix of traditions and of activities • shared concern for current and future scholarship • what’s different about the digital, about data • maintenance of body of trusted data
Digital Curation (3) • Digital Curation = Data Curation * Digital Preservation • Data Curation • Data in use (huge, distributed, for long periods) • Adding value (eg annotation) • Combination and re-combination (provenance) • Digital Preservation • Future technological/legal accessibility & usability • Significance of ‘designated community’ (OAIS)
Curation in action • Astronomy • Integrating and analysing distributed data (AstroGrid) • publishing multi-TB sky surveys (SuperCOSMOS & WFCAM) • interoperability standards (IVO Alliance) • BioInformatics • data publishing: generic tools for XML export (EBI Biomart) • annotation tools for massive data sets (Pubmed, VOTable) • archiving tools for dynamic data sets (biological DBs) • Environmental sciences • spatio-temporal annotation (OS Mastermap/ Mouse Atlas) • Document management • Tools for capture & normalisation (Xena) • Repository certification (RLG Task Force)
Digital Preservation Issues • Supporting ingest, management and dissemination • Registries: file formats, metadata, peripheral devices • Tracking and testing tools and standards • ingest, repository management, data exchange, ontologies, interoperability, metadata • Using OAIS as reference against which to test new models and architectures • Research topics • Repositories: repository models, registries • Long-term viability of metadata • Preservation strategies for emerging digital formats • Invest to Save, Report and recommendations of the NSF-DELOS Working Group on Digital Archiving and Preservation (2003) • http://delos-noe.iei.pi.cnr.it/
Research & Development • Research • Annotation, Data integration and publication • Appraisal and long-term preservation • Socio-economic & legal context • rights, responsibilities and viability • Performance and Optimisation • Development into Services • Standards & Testbeds • File Formats • Registry of Metadata Standards • Further topics: Evolution of structure, Ontologies, Emulation
Research Agenda • Aims • evidence & curation as integrative activities • usability & automation • novel & visible research • deliverables/testbeds • Hot Topics • annotation & provenance • universal interest, wide subject, eg referencing • data publishing • metadata, Grid services, integration, security, optimisation • archiving and appraisal • process automation at ingest, curating change, scalability • socio-economic and legal • organisational dynamics, rights/responsibilities • Reach out & listen - virtuous circle
Development • Turns Research into ‘Products for Research’ that our communities can use with confidence • tracking and testing tools and standards • that are correct, usable, reliable, well documented, e.g. for ingest, repository management, data exchange, ontologies • working with tool developers wherever possible • developing testbeds & interworking with other testbeds • aim to gain leverage • Formats • working with other projects worldwide • using generic tools and techniques • to develop strategies for emerging digital formats • Metadata standards • long-term viability of metadata • Registries underpin this work to provide basis of Advisory Service
Setting up the DCC • Funding from the JISC began on 1 March 2004 • EPSRC research funding begins on 1 September 2004: expect to harvest ‘early crop’ from extant research • Phase One set-up from now until launch of Centre in October 2004 • face2face meetings: 20/21 March & 24/25 June • drawing up programme of deliverables • re-deploying & recruiting staff • aim to have appointed full-time director in time for launch
Early ‘deliverables’ • Website at www.dcc.ac.uk: visit to learn of updates & progress, especially of such ‘work in progress’ as • draft of ‘DCC Approach to Digital Curation’ (Dr David Giaretta, DCC Associate Director, CCLRC) • launch of e-journal (Dr Liz Lyon, DCC Associate Director, UKOLN) • Digital Curation ‘Manual’ (Dr Seamus Ross, DCC Associate Director, HATII) • and presentations like this to help build the Associates Network (Leona Carpenter) • Helpdesk at digitalcuration@dcc.ac.uk: contact us with offers of collaboration