Digital Curation Centre: a centre of expertise in data curation and preservation
Digital Curation Centre: tools and services under development
David Giaretta, Associate Director (Development)
Funders:
Organisation (diagram labels, by institution): curation organisations, e.g. DPC; communities of practice: users; UKOLN; Collaborative Associates Network of Data Organisations; U of Edinburgh; U of Glasgow; U of Edinburgh; research collaborators; CCLRC; testbeds & tools; Industry; standards bodies
Organisation (diagram labels, by function): curation organisations, e.g. DPC; communities of practice: users; community support & outreach; Collaborative Associates Network of Data Organisations; service definition & delivery; management & admin support; research collaborators; research; development; co-ordination; testbeds & tools; Industry; standards bodies
Overview
• Developing tools and services that will be needed in the short to medium term
• Integrating tools from many sources
• The tools will become new DCC services and will also be usable separately by other projects
• Strongly OAIS based
• Support automated processing and interoperability
Representation Information vs Format
• Format = Structure
• Format alone omits important information, e.g.
  • Language, terminology
  • Encryption
• More than just the Format is needed to have any chance of using the information
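The gap between Format and Representation Information can be illustrated with a small record type. This is only a sketch: the class name, the field names, and the example values are hypothetical and are not taken from any DCC schema. It simply lists what a format-only description leaves out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RepresentationInfo:
    """Hypothetical record; field names are illustrative, not a DCC schema."""
    structure: str                                   # the "format": how the bits are laid out
    language: Optional[str] = None                   # natural language of the content
    terminology: dict = field(default_factory=dict)  # meaning of names and terms used
    encryption: Optional[str] = None                 # e.g. "none", or the scheme applied

    def gaps(self):
        """List what is still unknown beyond the structure."""
        missing = []
        if self.language is None:
            missing.append("language")
        if not self.terminology:
            missing.append("terminology")
        if self.encryption is None:
            missing.append("encryption")
        return missing


# A format-only description versus a fuller one (example values are made up).
format_only = RepresentationInfo(structure="comma-separated text")
fuller = RepresentationInfo(
    structure="comma-separated text",
    language="en",
    terminology={"T": "temperature in kelvin"},
    encryption="none",
)
print(format_only.gaps())
print(fuller.gaps())
```

The format-only record reports all three items as missing, which is the point of the slide: knowing the structure does not by itself make the information usable.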
Layered Model from OAIS
More easily applicable to science data
Representation Information - High Level View Example of use of Representation Information Labelling
Registry/Repository
• Interface and protocols: JAXR “standard”
• freebXML implementation
• Many access methods
  • URL
  • Web Services
  • API
  • etc.
• Findability
  • Persistent IDs
  • What can we rely on?
• Labels (to support automated processing)
• Initial service this summer
• Hope to work with PRONOM 4 & GDFR
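How labels and persistent IDs might support automated processing can be sketched with an in-memory stand-in for the registry. Everything here is an assumption made for illustration: the identifiers, the `needs` field, and the `resolve` function are hypothetical, and the sketch does not use JAXR or freebXML. It follows the OAIS idea that Representation Information may itself need further Representation Information.

```python
# Hypothetical registry contents keyed by persistent ID (IDs are made up).
REGISTRY = {
    "ri:table-structure": {
        "describes": "layout of a tabular data file",
        "needs": ["ri:ascii"],
    },
    "ri:ascii": {
        "describes": "ASCII character encoding",
        "needs": [],
    },
}


def resolve(label, registry, seen=None):
    """Collect every registry ID reachable from a label, depth first."""
    seen = [] if seen is None else seen
    if label in seen:
        return seen  # already visited; also guards against cycles
    if label not in registry:
        raise KeyError(f"no Representation Information registered for {label!r}")
    seen.append(label)
    for dependency in registry[label]["needs"]:
        resolve(dependency, registry, seen)
    return seen


# A data object carries only a label; software follows it into the registry.
data_object = {"bytes": b"1,2,3\n", "label": "ri:table-structure"}
print(resolve(data_object["label"], REGISTRY))
```

A real service would sit behind the access methods listed above (URL, Web Services, API); the dictionary only stands in for whatever the registry returns for a given persistent ID.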
Registry/Repository
• Trusted repository of Representation Information
• Authenticity of information
• Access control
• Certificates/digests (can they be trusted over the long term?)
• Extensibility
• Distributed
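The digest idea can be sketched with Python's standard `hashlib`. This is a minimal illustration, not the DCC's mechanism: the function names are hypothetical, and recording the algorithm name next to the digest is an assumption about how one might keep the check replaceable if an algorithm stops being trustworthy.

```python
import hashlib
import hmac


def make_fixity(data: bytes, algorithm: str = "sha256") -> dict:
    """Record a digest together with the algorithm that produced it."""
    return {
        "algorithm": algorithm,
        "digest": hashlib.new(algorithm, data).hexdigest(),
    }


def is_unchanged(data: bytes, fixity: dict) -> bool:
    """Recompute the digest with the recorded algorithm and compare."""
    current = hashlib.new(fixity["algorithm"], data).hexdigest()
    return hmac.compare_digest(current, fixity["digest"])


original = b"representation information payload"
record = make_fixity(original)
print(is_unchanged(original, record))
print(is_unchanged(original + b" (edited)", record))
```

A digest only shows that the bits are unchanged since it was recorded; whether the digest or a certificate remains trustworthy over decades is the open question the slide raises.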
Certification
• RLG task force preparing draft standard
• Based on OAIS (plus TDR)
• Expect this to become an ISO standard
• Tool:
  • Checklist and reports
  • …
• Awaiting release of draft (in May)
Archival Information Package
• METS
• XFDU Packaging
• Expect tools to be available by end of year
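As a rough illustration of METS-style packaging, the sketch below builds a skeleton document with a file section and a structural map using Python's standard `xml.etree.ElementTree`. The function name, file ID, and path are hypothetical, the output has not been validated against the METS schema, and it is not the DCC packaging tool; XFDU is not covered.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def build_package(files):
    """Build a skeleton METS tree from a mapping of file ID -> location."""
    root = ET.Element(f"{{{METS}}}mets")
    file_sec = ET.SubElement(root, f"{{{METS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS}}}fileGrp")
    struct_map = ET.SubElement(root, f"{{{METS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS}}}div")
    for file_id, href in files.items():
        # Each file is listed once in fileSec and referenced from structMap.
        file_el = ET.SubElement(file_grp, f"{{{METS}}}file", {"ID": file_id})
        ET.SubElement(
            file_el,
            f"{{{METS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK}}}href": href},
        )
        ET.SubElement(div, f"{{{METS}}}fptr", {"FILEID": file_id})
    return root


package = build_package({"F1": "data/observation.dat"})  # made-up file entry
print(ET.tostring(package, encoding="unicode"))
```

A full Archival Information Package would also carry Representation Information and Preservation Description Information, which this skeleton leaves out.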
Preservation Description Information
• Will be working with PREMIS on tools
DCC Development Roadmap for next 6-12 months
• Registry
  • Complete Phase 1
  • Include links to TNA/PRONOM
  • Hand over to Services group
  • Start Phase 2: aim for “Trusted Repository” status
• Representation Information
  • Data descriptions of science data using EAST (http://east.cnes.fr) & others
  • Import other Structure description tools and Data Dictionary tools
  • Develop mapping to data object level
  • Work with other projects, e.g. Emulation, Processing
• Certification
  • Draft certification
  • Checklist
  • Proposed standard
• Additional Tools
  • Metadata extraction tool set
  • Ingest tool (based on PAIMAS standard)
  • Testbeds, e.g. large scale data management tools
Summary
• Developing and integrating OAIS based tools
• Reviewing other related tools
• See http://www.dcc.ac.uk
• A Development Web site (http://dev.dcc.rl.ac.uk) with a Wiki and an associated open email list has also been set up
• Aim to encourage the widest possible collaboration with other projects
• In the medium to long term, expect tools from DCC Research activities, e.g. Annotation