UK Digital Curation Centre: An Introduction • Dr Liz Lyon, Associate Director Outreach • IACMST MED Forum, November 2005 • This work is licensed under a Creative Commons License, Attribution-ShareAlike 2.0
Repositories and digital curation • Data preservation: static; for later use? • Data curation: dynamic; in use now (and the future)? • Curation defined: “maintaining and adding value to a trusted body of digital information for current and future use”
Assuring permanent access to the records of science & the humanities? • Long term access to primary data • Increasing data volumes from eScience and Grid-enabled / cyberinfrastructure applications • Changing research paradigm: data-driven science, “big science” • Observational data, simulations, large-scale experimentation • Multi-media resources, statistical data, surveys, geo-spatial data……
Facilitate “post-processing” and knowledge extraction • Enable the acquisition of newly-derived information and knowledge • Run complex algorithms over primary datasets • Mining (data, text, structures) • Modelling (economic, climate, mathematical, biological) • Analysis (statistical, lexical, pattern matching, gene) • Presentation (visualisation, rendering)
Provide additional functionality beyond digital preservation processes: adding value • Annotations • Gene and protein sequences • e-Lab books (Smart Tea Project in chemistry)
The scholarly knowledge cycle: linking research data to publications (eBank UK Project, http://www.ukoln.ac.uk/projects/ebank-uk/) [diagram]
Components:
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Research & e-Science workflows
• Data analysis, transformation, mining, modelling
• Data curation: databases & databanks
• Repositories: institutional, e-prints, subject, data, learning objects
• Peer-reviewed publications: journals, conference proceedings
• Aggregator services: national, commercial
• Presentation services: subject, media-specific, data, commercial portals
Labelled links: deposit / self-archiving; validation; publication; linking; harvesting metadata; searching, harvesting, embedding; resource discovery, linking, embedding
Note on diagram: emerging policy on open access to data
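The "harvesting metadata" link between repositories and aggregator services can be sketched in a few lines. The slide does not name a harvesting protocol, so this sketch assumes OAI-PMH with Dublin Core records; the repository address and the sample record are made up for illustration, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Dublin Core element namespace used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)

def extract_titles(xml_text):
    """Return every dc:title found in a harvested response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]

# Hypothetical response, trimmed to one record (not real repository output).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical base URL; a real aggregator would fetch this address.
    print(list_records_url("https://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

An aggregator would repeat the request for each repository it covers and index the returned metadata, leaving the data itself in the source repository.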
Issues: generic data models, metadata schema & terminology • Validation against other schema • Complex digital objects and packaging options • METS • MPEG 21 DIDL • Terminologies • Domain: marine? • Inter-disciplinary e.g. wider environment, bio…. • Metadata and vocabularies • Meaningful resource discovery?
Ontologies for discovery in an interdisciplinary world • Transform the ‘list’ into an ‘ontology’ • Embed ontology into the deposition process • Aggregators use keywords for linking with the broader literature • Researchers use keyword ontology in search and discovery services • Formal vs informal “folksonomies” • Web 2.0???
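As a rough illustration of what turning the "list" into an "ontology" buys a discovery service, the sketch below gives a flat keyword list a broader/narrower structure and uses it to widen a search. The terms, the records and the function names are all hypothetical; a real service would more likely draw on a published vocabulary.

```python
# Hypothetical broader -> narrower relations added on top of a flat keyword list.
NARROWER = {
    "environment": ["marine", "atmosphere"],
    "marine": ["oceanography", "marine biology"],
}

# Hypothetical deposited records, each tagged with keywords at deposit time.
RECORDS = [
    {"id": "r1", "keywords": ["oceanography"]},
    {"id": "r2", "keywords": ["atmosphere"]},
    {"id": "r3", "keywords": ["marine biology", "statistics"]},
]

def expand(term):
    """Return the term followed by all of its narrower terms (depth-first)."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            # Reverse so that children are visited in their listed order.
            stack.extend(reversed(NARROWER.get(current, [])))
    return seen

def search(term, records=RECORDS):
    """Find records tagged with the term or with any narrower term."""
    wanted = set(expand(term))
    return [r["id"] for r in records if wanted & set(r["keywords"])]

if __name__ == "__main__":
    print(search("marine"))       # matches via narrower terms
    print(search("environment"))  # matches across the whole hierarchy
```

With only a flat list, a search for "marine" would miss records tagged "oceanography"; the hierarchy is what lets a cross-disciplinary query reach them.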
Issues: Persistent identifiers for data (image) citation • Use cases: depositor, author, service provider, reader, publisher, ? • Schemes: DOI, Handle, ARK, PURL • Global identification: express as http URIs • Added value services: CrossRef, resolution service, integration (Globus), look-up service, ? • Degree of trust or persistence • Costs • Future potential: political, ? • Domain identifiers: e.g. International Chemical Identifier (InChI) codes
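The point about expressing identifiers as http URIs can be made concrete with a small sketch. It assumes the public resolvers in use today (doi.org, hdl.handle.net and n2t.net), which the slide does not mention; the Handle and ARK values below are made up, and a PURL is passed through unchanged because it is already an HTTP URL.

```python
from urllib.parse import quote

# Assumed resolver prefixes; the slide names the schemes, not the resolvers.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
}

def to_http_uri(scheme, identifier):
    """Express a persistent identifier as a resolvable HTTP URI."""
    scheme = scheme.lower()
    if scheme == "purl":
        # A PURL is itself an HTTP URL, so nothing needs to be added.
        return identifier
    if scheme not in RESOLVERS:
        raise ValueError("unknown identifier scheme: %s" % scheme)
    # Percent-encode unsafe characters but keep '/' and ':' readable.
    return RESOLVERS[scheme] + quote(identifier, safe="/:")

if __name__ == "__main__":
    print(to_http_uri("doi", "10.1000/182"))
    print(to_http_uri("handle", "12345/abc"))        # hypothetical handle
    print(to_http_uri("ark", "ark:/12345/fk1234"))   # hypothetical ARK
```

Keeping the resolver prefix in one table means a citation can store only the bare identifier, and the link can be rebuilt if a resolver address changes.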
Issues: Integration into (marine) research workflows • R4L (Repository for the Laboratory) Project, JISC-funded: automated data capture from instrumentation, registration of results • Smart Tea: electronic laboratory notebook + annotations • Publishers?? • Research assessment (RAE) process?
UK Digital Curation Centre • Delivering services • Development activities • Research agenda • Outreach Programme • http://www.dcc.ac.uk/
DCC people (some of them…) Management & Co-ordination Director Chris Rusbridge (University of Edinburgh) Community Support & Outreach Led by Dr Liz Lyon (UKOLN, University of Bath) Service Definition & Delivery Led by Professor Seamus Ross (HATII, University of Glasgow) Development Led by Dr David Giaretta (Astronomical Software & Services, CCLRC) Research Led by Professor Peter Buneman (Informatics, University of Edinburgh)
User requirements analysis: some sound bites…
• R&D issues: Annotation services, Ontology development, Automating metadata creation, Tools and toolkits, Data Format Description Language, Identifiers, Registries, Economic and cost-benefits studies
• Advisory services: “Ask-a-Curator”, FAQs, reports, briefings, awareness-raising materials, best practice guidance, Storage media, “Like Erpanet”, advise Government, Research Councils, funding bodies
• Professional development: Short courses, conferences, seminars, workshops, secondments to DCC and to working repository services
• Outreach: Leadership for the future, case studies, sharing solutions, collaboration with other partners, international peers, industry links
• Taxonomy of “Users”
Advisory services • Responses to queries—from legal to technical guidance HELPDESK@dcc.ac.uk • FAQs constructed • Some useful resources…..
Digital Curation Manual • A world class resource • Constructed from topic-specific chapters • written by international experts • editorial board comprising leading researchers and practitioners • 45 initial topics including • Metadata, Appraisal and Selection; Costs; Freedom of Information; Interoperability; the OAIS Reference Model; Preservation Strategies; and Open Source • Briefing Papers aimed at senior managers
Workshops and Information Days • 2005 Workshop Programme • Persistent identifiers • Institutional repositories • Cost models • Preservation of medical databases • Information Days at Bath, Aberystwyth, London, Glasgow, Belfast (1st December)…..???
DCC: Development • “DCC Approach to Digital Curation” based on the Reference Model for an Open Archival Information System (OAIS), ISO standard 14721 • Monitoring international standards • Development of a Representation Information (RI) registry/repository (DCC-RR) • Recommendations for tools and methods for generating Representation Information • Creating test-beds for digital curation tools • Development info: see http://dev.dcc.ac.uk for details of the Wiki and email list, open to all
Trusted digital repositories • Audit Checklist for Certification • Draft Report August 2005 • Research Libraries Group RLG-NARA Taskforce • Defined criteria under 4 categories • Organisation • Functions, processes & procedures • Designated community & usability • Technologies & technical infrastructure
The database picture Curated data: classified, cleaned, annotated, integrated, cross-linked Source data
International Journal of Digital Curation: www.ijdc.net • Peer-review Editorial Board • Peter Buneman, Editor (research) • Richard Waller, Production Editor • Papers for submission are very welcome! • 1st issue soon….
DCC Conferences • 1st International DCC Conference, Bath, Sept • Keynote speakers • Clifford Lynch CNI • Graham Cameron EBI • Presentations available • PV 2005 Edinburgh NOW • 2nd DCC Conf Nov 2006
Associates Network • Goals • Develop understanding, share best practice, advance research, promote recognition, develop consensus • Membership • International groups, national bodies, industry partners, funders, research groups, HEIs, FEIs, individuals…… • Benefits • Early access to R&D outputs, advisory services, training, input to definition and design, community participation • Discussion Forum: www.dcc.ac.uk • Please join us!