A methodology for Sharing Archival Descriptive Metadata in a Distributed Environment
Outline • The Nature of Archives • Network of Digital Archives • Digital Libraries Technologies and Digital Archives • Encoded Archival Description Metadata Format • Nested Sets Methodology • Conclusions
Archives • Archives keep the context and the network of relationships. • Archives have a hierarchical structure: the archival bond. • Archival descriptions need to be able to express and maintain hierarchical structure and relationships.
Archival Descriptions • The International Council on Archives has developed a general standard for archival description, the General International Standard Archival Description, ISAD(G). • Archival descriptions produced according to the ISAD(G) standard take the form of a tree which represents the relationships among more general and more specific archive units going from the root to the leaves of the tree. [Figure: tree of archival units with the levels Fonds, Sub-fonds, Series, and Items] Reference: International Council on Archives. ISAD(G): General International Standard Archival Description, 2nd edition. Ottawa: International Council on Archives, 1999.
Archival Descriptive Metadata • Archival descriptive metadata should meet the following three main requisites: • Context: archival descriptive metadata have to retain information about the context of a given record. • Hierarchy: archival descriptive metadata have to reflect the archive organization which is described in a multi-leveled fashion. • Variable Granularity: archival descriptive metadata have to facilitate access to the requested items.
Network of Digital Archives • Heterogeneity issues. • Archives have a fixed tree structure. • Archives must preserve their autonomy and independence. • Difficulties in exchanging archival information embedded in a tree hierarchy. [Figure: network connecting Archive A, Archive B, Archive C, Archive D, and Archive E]
Trees mapped into Sets • Archive descriptions assume a tree structure. • It is difficult to share trees between archives and to access a precise element of the tree without accessing the whole hierarchy. [Figure: archival tree with a Fonds root, three Sub-fonds, and six Series]
Nested Sets Model • Sets allow elements to be accessed at variable granularity. • Through nested sets it is possible to express hierarchy and retain context information. • An organization of nested sets is flexible and well suited to a distributed environment. [Figure: the same hierarchy drawn as nested sets, with the Fonds set containing three Sub-fonds sets, which in turn contain six Series]
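The idea of replacing a tree with nested sets can be illustrated with a small sketch. It is only one possible reading of the model, not the presenter's implementation: the tree, the node names, and the helper functions `nested_sets` and `context_of` are hypothetical. Each node is mapped to the set of itself and its descendants, so hierarchy becomes set inclusion and the context of a unit is the list of sets that contain it.

```python
from typing import Dict, List, Set

# Hypothetical archive tree: each node maps to the list of its children.
TREE: Dict[str, List[str]] = {
    "fonds": ["sub-fonds-1", "sub-fonds-2"],
    "sub-fonds-1": ["series-1", "series-2"],
    "sub-fonds-2": ["series-3"],
    "series-1": [],
    "series-2": [],
    "series-3": [],
}


def nested_sets(tree: Dict[str, List[str]], root: str) -> Dict[str, Set[str]]:
    """Map every node to the set made of the node itself and all its descendants."""
    sets: Dict[str, Set[str]] = {}

    def visit(node: str) -> Set[str]:
        members = {node}
        for child in tree[node]:
            members |= visit(child)  # a child's set is always a subset of its parent's
        sets[node] = members
        return members

    visit(root)
    return sets


def context_of(node: str, sets: Dict[str, Set[str]]) -> List[str]:
    """Return the sets that contain `node`, from the most general to the most specific."""
    containing = [name for name, members in sets.items()
                  if node in members and name != node]
    return sorted(containing, key=lambda name: len(sets[name]), reverse=True)


SETS = nested_sets(TREE, "fonds")
```

With this representation a single sub-fonds can be accessed through its own set without traversing the whole tree, while `context_of` recovers the chain of enclosing units.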
Digital Libraries • Digital Library Systems (DLSs) are the technology of choice for managing the information resources of different kinds of organizations. • The need for interoperability among different systems is a compelling issue. • DELOS Reference Model. • Europeana, the European digital library, museum and archive, is a 2-year project that will give users direct access to some 2 million digital objects. This figure is taken from the Europeana leaflet available at: http://www.europeana.eu
OAI-PMH • The Open Archives Initiative promotes interoperability through the Protocol for Metadata Harvesting (OAI-PMH). • The Dublin Core metadata format is the lowest common denominator in OAI-PMH. • OAI-PMH is the de facto standard for metadata exchange. It is based on the distinction between two main components: Data Providers, which expose metadata, and Service Providers, which harvest it.
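As a rough illustration of how a Service Provider addresses a Data Provider, the sketch below builds an OAI-PMH `ListRecords` request URL. The verb and the `metadataPrefix`, `set`, and `from` arguments are defined by the protocol; the base URL, the set name, and the helper `list_records_url` are assumptions made for the example.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical Data Provider endpoint, used only for illustration.
BASE_URL = "https://archive.example.org/oai"


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None,
                     from_date: Optional[str] = None) -> str:
    """Build a ListRecords request; oai_dc is the Dublin Core format every repository supports."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # restrict the harvest to one set
    if from_date is not None:
        params["from"] = from_date    # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


FULL_HARVEST = list_records_url(BASE_URL)
PARTIAL_HARVEST = list_records_url(BASE_URL, set_spec="fonds:subfonds1",
                                   from_date="2009-01-01")
```

The request is a plain HTTP GET, which is part of why the protocol is easy to adopt across otherwise independent systems.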
OAI Sets • OAI sets enable logical data partitioning by defining groups of records. • OAI sets are defined by three main components: • setSpec • setName • setDescription • The organization of OAI sets may be flat or hierarchical. • Harvesting procedures: incremental and selective harvesting. • Harvesting from a set which has subsets will cause the repository to return the metadata in the specified set and, recursively, in all its subsets.
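The recursive behaviour of selective harvesting can be sketched as follows. In OAI-PMH a hierarchical setSpec is a colon-separated path from the root set to the set itself; the record identifiers, the set names, and the functions `in_set` and `harvest` below are hypothetical and only mimic what a repository would do on its side.

```python
from typing import Dict, List

# Hypothetical repository content: record identifier -> setSpecs it belongs to.
RECORDS: Dict[str, List[str]] = {
    "oai:example:1": ["fonds"],
    "oai:example:2": ["fonds:subfonds1"],
    "oai:example:3": ["fonds:subfonds1:series1"],
    "oai:example:4": ["fonds:subfonds10"],
    "oai:example:5": ["fonds:subfonds2:series3"],
}


def in_set(record_spec: str, requested: str) -> bool:
    """True if record_spec is the requested set or one of its (transitive) subsets."""
    return record_spec == requested or record_spec.startswith(requested + ":")


def harvest(records: Dict[str, List[str]], requested: str) -> List[str]:
    """Return the identifiers a selective harvest of `requested` would cover."""
    return sorted(identifier for identifier, specs in records.items()
                  if any(in_set(spec, requested) for spec in specs))


SUBFONDS1 = harvest(RECORDS, "fonds:subfonds1")
```

Comparing against `requested + ":"` rather than the bare prefix keeps a sibling such as `fonds:subfonds10` out of a harvest of `fonds:subfonds1`.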
Digital Libraries and Digital Archives • The use of OAI-PMH is not widespread in the archival context. • Dublin Core metadata format seems to flatten out the archive structure. • EAD: Encoded Archival Description. • EAD is a standard defined by The Library of Congress in partnership with the Society of American Archivists. • EAD reflects and emphasizes ISAD(G).
EAD Structure and Puzzles
    <ead>
      <eadheader> [...] </eadheader>
      <archdesc level="fonds">
        [...]
        <did> [...] </did>
        <dsc>
          [...]
          <c01> [...] </c01>
          <c01>
            [...]
            <c02> [...] </c02>
          </c01>
        </dsc>
      </archdesc>
    </ead>
• Automatic processing: there are several degrees of freedom in tagging practice. • Levels: the level of description needs to be inferred by navigating the upper components. • Size: sharing and searching archival descriptions might be made difficult by the large size of EAD files and their deep hierarchical structure. • User needs: users are often interested in item-level information, which is typically buried very deeply in the hierarchy and difficult to reach. • Archival metadata requirements: EAD complies with both the context and hierarchy requirements, but it disregards the variable granularity requirement.
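The point about levels can be made concrete with a short sketch that walks an EAD-like file and reconstructs, for every component, its depth and the titles of the components above it. The sample document, its titles, and the function `walk` are assumptions made for illustration; the sample uses no XML namespace and only the `archdesc`, `did`, `unittitle`, `dsc`, `c01`, and `c02` elements.

```python
import re
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple

# Hypothetical, heavily simplified EAD file.
EAD_SAMPLE = """
<ead>
  <eadheader/>
  <archdesc level="fonds">
    <did><unittitle>Example fonds</unittitle></did>
    <dsc>
      <c01 level="series"><did><unittitle>Series A</unittitle></did></c01>
      <c01 level="series">
        <did><unittitle>Series B</unittitle></did>
        <c02 level="item"><did><unittitle>Letter 1</unittitle></did></c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""

# archdesc and the numbered components c01, c02, ... carry the descriptions.
COMPONENT = re.compile(r"^(archdesc|c\d\d)$")


def walk(element: ET.Element,
         path: Tuple[str, ...] = ()) -> Iterator[Tuple[int, Tuple[str, ...], str]]:
    """Yield (depth, titles of the enclosing components, own title) per component."""
    if COMPONENT.match(element.tag):
        title = element.findtext("did/unittitle", default="")
        yield len(path), path, title
        path = path + (title,)  # descendants inherit this component as context
    for child in element:
        yield from walk(child, path)


COMPONENTS = list(walk(ET.fromstring(EAD_SAMPLE)))
```

Reaching the single item "Letter 1" requires parsing the whole file and carrying the ancestor path along, which is the access pattern the variable granularity requirement tries to avoid.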
Benefits of the Nested Sets Methodology • The methodology addresses the shortcomings of EAD when it is used in a distributed environment and when variable granularity access to the resources is required. • EAD items are mapped into distinct DC metadata records, which are shareable and natively supported by OAI-PMH. • Context and hierarchy are expressed in a straightforward manner by exploiting native functionalities of OAI-PMH and leveraging the role of OAI sets. • This approach keeps archival metadata independent of the original EAD file, without losing any context information. • This approach can also be applied independently of the EAD standard; indeed, we can create archival description metadata from scratch by exploiting OAI sets and DC records.
Nested Sets Methodology Internal nodes are mapped into sets.
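A minimal sketch of this mapping, under stated assumptions: internal nodes become OAI sets whose setSpec encodes the path from the root, while leaves become Dublin Core style records that reference the set of their parent. The treatment of leaves, the `Node` class, the node names, and the function `map_tree` are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical archival unit; names are assumed to contain no colon."""
    name: str
    children: List["Node"] = field(default_factory=list)


def map_tree(node: Node, parent_spec: Optional[str] = None
             ) -> Tuple[List[Dict[str, str]], List[Dict[str, Optional[str]]]]:
    """Map internal nodes to OAI sets and leaves to records placed in their parent's set."""
    sets: List[Dict[str, str]] = []
    records: List[Dict[str, Optional[str]]] = []
    if node.children:
        # Internal node: its setSpec is the colon-separated path from the root.
        spec = node.name if parent_spec is None else f"{parent_spec}:{node.name}"
        sets.append({"setSpec": spec, "setName": node.name})
        for child in node.children:
            child_sets, child_records = map_tree(child, spec)
            sets += child_sets
            records += child_records
    else:
        # Leaf (assumption): a DC-like record that keeps its context via the setSpec.
        records.append({"title": node.name, "setSpec": parent_spec})
    return sets, records


ARCHIVE = Node("fonds", [
    Node("subfonds1", [Node("item1"), Node("item2")]),
    Node("item3"),
])
OAI_SETS, DC_RECORDS = map_tree(ARCHIVE)
```

Because each record carries the full path in its setSpec, a harvester can request a single branch and still know where every record sits in the archive.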
Conclusions • We defined the requirements which must be satisfied in order to obtain shareable metadata and to retain all the fundamental characteristics of archival resources. • We presented a methodology for creating shareable archival descriptive metadata which exploits the synergy between OAI-PMH and DC. This methodology allows archival descriptions to be shared in a distributed environment. • EAD metadata can be mapped into our methodology without losing information. • The methodology can be applied backwards, generating a new EAD file whose structure differs slightly from the original one but which carries the same informational content.
Conclusions Thank you! Questions? Gianmaria Silvello Department of Information Engineering University of Padova silvello@dei.unipd.it