A CIDOC CRM-compatible metadata model for digital preservation. Information Systems and Databases Laboratory, Department of Informatics, Athens University of Economics and Business. Panos Constantopoulos and Vicky Dritsou.
Structure of the presentation • Introduction to Digital Preservation • Metadata • Existing proposals • A conceptual preservation metadata model • Properties of the model • Model concepts • Schema • The complete model • Conclusion • Further research CIDOC CRM Workshop
Introduction to Digital Preservation (1/2) • Two types of perils for digital content exist • Physical: physical destruction of file systems, corruption of digital media, fire, earthquake • Technological: obsolete systems, non-compatible systems, software and formats • Physical perils are more straightforward to confront • By saving multiple copies of digital content: • On different media • At different geographic locations • Technological hazards require a more complex policy to be applied • By following the appropriate preservation strategy
Introduction to Digital Preservation (2/2) • Digital preservation strategies for technological hazards • Information migration • Technology emulation • Technology preservation • Backwards compatibility • Reliance on standards • Encapsulation • Transformation to non-digital form • Digital archeology • Most strategies require some information to be collected and stored • This is achieved by using metadata
Metadata • Defined as “data about data” or otherwise “information about information” • Metadata properties • Not necessarily digital • Not autonomous • Digital information needs to pre-exist • Supplementary • Dynamic character • Metadata types • Descriptive • Structural • Administrative • Preservation metadata • Contains elements from all three types • But which metadata should we choose?
Existing proposals • Several approaches exist • We have studied five widely known ones: • Dublin Core (DC) • Open Archival Information System (OAIS) • CURL Exemplars in Digital Archives (CEDARS) • Pittsburgh Project (PP) • National Library of Australia (NLA) • Discussion • None contains inter-related concepts; all are element lists • DC: access-oriented, inadequate for preservation • OAIS, CEDARS: very detailed, difficult to use • PP: detailed, necessary/optional elements, use instructions • NLA: structured elements, object types
A conceptual preservation metadata model • A parsimonious metadata set derived from • comparison of the aforementioned proposals • CIDOC CRM • Metadata elements • Title • Identifier • Subject • Language • Type • Format • Technical Equipment • Relations • Information Carrier • Activity • Right • Actor • Effect • History
Properties of the model • Each element forms a concept • Contains relationships among concepts • Results in a conceptual model • Compatible with CIDOC CRM • A small number of new concepts • Can serve as an application ontology • A guide for preservation • Independent of preservation strategy • Elements contain all the information required by each strategy • Further details can be added by extending the concepts
Model concepts (1/3) • Main concept: Digital Object • Subclass of E73 Information Object • Has attributes: Title, Subject, Type, Size, Identifier, Language, Digital Content • Identifiers may be global or local • Global identifiers must be unique • Digital Content allows separation of content from descriptive/administrative aspects • Stored in an Information Carrier • Digital Objects can consist of other digital objects (Complex Objects) • Type: image, text, sound, multimedia, … • Each object type can be formatted in one of a number of specific formats
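The Digital Object concept described on this slide can be sketched as a small data structure. This is only an illustrative sketch, not the authors' schema: the class names mirror the slide's concepts, but the field names (`fmt`, `is_global`, `parts`), the `Registry` class and the example identifiers are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class Identifier:
    value: str
    is_global: bool = False  # assumption: a flag distinguishes global from local


@dataclass
class InformationCarrier:
    name: str  # e.g. a disk or tape label (hypothetical)


@dataclass
class DigitalObject:
    title: str
    object_type: str  # image, text, sound, multimedia, ...
    fmt: str          # one specific format of that type (hypothetical field name)
    identifiers: List[Identifier] = field(default_factory=list)
    subject: Optional[str] = None
    language: Optional[str] = None
    size: Optional[int] = None
    digital_content: bytes = b""  # kept apart from the descriptive attributes
    carrier: Optional[InformationCarrier] = None
    parts: List["DigitalObject"] = field(default_factory=list)

    def is_complex(self) -> bool:
        """A Complex Object consists of other digital objects."""
        return bool(self.parts)


class Registry:
    """Hypothetical helper that enforces uniqueness of global identifiers."""

    def __init__(self) -> None:
        self._global_ids: Set[str] = set()
        self.objects: List[DigitalObject] = []

    def register(self, obj: DigitalObject) -> None:
        new_ids = [i.value for i in obj.identifiers if i.is_global]
        duplicated = len(new_ids) != len(set(new_ids))
        clashes = [v for v in new_ids if v in self._global_ids]
        if duplicated or clashes:
            raise ValueError("global identifiers must be unique")
        self._global_ids.update(new_ids)
        self.objects.append(obj)


# Usage: a complex object made of one image page.
page = DigitalObject("Page 1", "image", "TIFF",
                     identifiers=[Identifier("urn:example:page-1", True)])
book = DigitalObject("Scanned book", "multimedia", "mixed",
                     identifiers=[Identifier("urn:example:book", True)],
                     parts=[page])
registry = Registry()
registry.register(page)
registry.register(book)
```

Local identifiers are deliberately not checked, since the slide only requires global ones to be unique.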
Schema (1/3)
Model concepts (2/3) • Activities have digital objects as input and output, are carried out by Actors and are subject to Rights • Activity types: • Creation • Deletion • Modification • Alteration • Copy • Read • In all of them, except for Read and Deletion, we assume that the output is a new object • We keep the sequence of performed Activities by assigning the appropriate attribute • Effects can be used as a space-saving device when versions need not be kept
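The activity rules on this slide (every activity except Read and Deletion yields a new object, and the order of activities is recorded) could be sketched as follows. The `ActivityLog` class, the `sequence` attribute, and the use of plain strings for actors, rights and object identifiers are assumptions made for illustration; the slides do not specify how the sequence attribute is represented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ActivityType(Enum):
    CREATION = "creation"
    DELETION = "deletion"
    MODIFICATION = "modification"
    ALTERATION = "alteration"
    COPY = "copy"
    READ = "read"


# Per the slide, only these two types do not produce a new object.
NO_OUTPUT = {ActivityType.READ, ActivityType.DELETION}


@dataclass
class Activity:
    kind: ActivityType
    actor: str                # who carried it out
    right: str                # the right the activity is subject to
    sequence: int             # position in the order of performed activities
    input_id: Optional[str]   # identifier of the input object, if any
    output_id: Optional[str]  # identifier of the new object, if any


class ActivityLog:
    """Hypothetical container that assigns sequence numbers."""

    def __init__(self) -> None:
        self.activities: List[Activity] = []

    def record(self, kind: ActivityType, actor: str, right: str,
               input_id: Optional[str] = None,
               new_id: Optional[str] = None) -> Activity:
        if kind in NO_OUTPUT and new_id is not None:
            raise ValueError(f"{kind.value} does not produce a new object")
        if kind not in NO_OUTPUT and new_id is None:
            raise ValueError(f"{kind.value} must produce a new object")
        activity = Activity(kind, actor, right,
                            len(self.activities) + 1, input_id, new_id)
        self.activities.append(activity)
        return activity

    def history(self, object_id: str) -> List[Activity]:
        """Chain of activities that led to object_id, oldest first."""
        chain: List[Activity] = []
        seen = set()
        current: Optional[str] = object_id
        while current is not None and current not in seen:
            seen.add(current)
            producer = next((a for a in self.activities
                             if a.output_id == current), None)
            if producer is None:
                break
            chain.append(producer)
            current = producer.input_id
        chain.reverse()
        return chain


# Usage: create an object, modify it, then read the new version.
log = ActivityLog()
log.record(ActivityType.CREATION, "archivist", "owner", new_id="obj-1")
log.record(ActivityType.MODIFICATION, "archivist", "owner",
           input_id="obj-1", new_id="obj-2")
log.record(ActivityType.READ, "visitor", "public access", input_id="obj-2")
```

Walking back through the recorded inputs and outputs is one way the History element of the model could be derived from the stored Activities.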
Schema (2/3)
Model concepts (3/3) • Activities require the appropriate Technical Equipment to be performed • Software • Hardware • These are all specializations of E71 Man-Made Thing • The software needed depends on the Type and Format of the object • Information carrier also requires Technical Equipment • For reading the object • For writing the object
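The dependency of Technical Equipment on an object's Type and Format, and on its Information Carrier, can be illustrated with a simple lookup. This is a sketch under stated assumptions: the catalogue entries (viewer names, carrier names, drive names) are invented examples, and the function `equipment_for` is hypothetical rather than part of the model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TechnicalEquipment:
    name: str


@dataclass(frozen=True)
class Software(TechnicalEquipment):
    pass


@dataclass(frozen=True)
class Hardware(TechnicalEquipment):
    pass


# Hypothetical catalogue: software is selected by (type, format).
SOFTWARE_FOR: Dict[Tuple[str, str], Software] = {
    ("image", "TIFF"): Software("image viewer"),
    ("text", "PDF"): Software("PDF reader"),
}

# Hypothetical catalogue: carrier equipment is selected by (carrier, operation).
CARRIER_EQUIPMENT: Dict[Tuple[str, str], Hardware] = {
    ("CD-ROM", "read"): Hardware("CD-ROM drive"),
    ("CD-ROM", "write"): Hardware("CD writer"),
}


def equipment_for(object_type: str, fmt: str,
                  carrier: str, operation: str) -> List[TechnicalEquipment]:
    """Return the software and hardware needed to read or write an object."""
    if operation not in ("read", "write"):
        raise ValueError("operation must be 'read' or 'write'")
    software = SOFTWARE_FOR.get((object_type, fmt))
    if software is None:
        raise LookupError(f"no software recorded for {object_type}/{fmt}")
    hardware = CARRIER_EQUIPMENT.get((carrier, operation))
    if hardware is None:
        raise LookupError(f"no equipment recorded to {operation} {carrier}")
    return [software, hardware]


# Usage: what is needed to read a TIFF image stored on a CD-ROM?
needed = equipment_for("image", "TIFF", "CD-ROM", "read")
```

A missing catalogue entry raises an error here; in a preservation setting such a gap would itself be worth recording, since it signals that the object may no longer be accessible.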
Schema (3/3)
The complete model
Conclusion • Metadata elements drawn from existing metadata sets • Conceptual model for digital preservation • Previous works included only lists of metadata elements • Extensible as needed • Compatible with CIDOC CRM • Digital objects treated as • digital surrogates of non-digital objects • cultural objects in their own right
Further research • Historical processes: • interpretation • CIDOC CRM domain of application • Preservation processes: • decision and production processes • prescription and monitoring • Explore differences in modelling requirements
Thank you for your attention!