A Logical Framework for Metadata Interoperability. 16th August 2007. The Advanced Digital Library Seminar 2007, Guilin, China.
Contents • 1. Metadata interoperability in libraries • 2. Metadata interoperability • 3. Two interoperability prototypes • 4. Consistency issues • 5. Models and interoperability • 6. Semantics and syntax of metadata • 7. Interoperability across models • 8. Dublin Core application profile • 9. Discussion: library metadata practices
1. Metadata interoperability in libraries Metadata interoperability is increasingly accepted as something that libraries have to deal with.
New challenges for libraries • A new information environment • Variety of resources • Distributed locations • Heterogeneous structures
New challenges for libraries • New library services – putting everything together
New challenges for libraries • New library system architecture
WorldCat • WorldCat achieves interoperability across geographically distributed library catalogues by using a single metadata system, MARC21. • Centralized architecture.
Z39.50 • Z39.50 achieves interoperability with a decentralized architecture.
2. Metadata interoperability When we talk about interoperability in relation to metadata, we are generally talking about search interoperability, or the ability to perform a search over a diverse set of metadata records to obtain meaningful results. (Caplan, 2003) Caplan, P. (2003). Metadata Fundamentals for All Librarians. American Library Association.
Levels of metadata interoperability • Schema level – focused on interoperability between elements of the schemas. • Record level – integrate and convert metadata records. • Repository level – harvested or integrated records from varying sources, mapping value strings. (Chan & Zeng, 2006) Chan, L. M., & Zeng, M. L. (2006). Metadata Interoperability and Standardization – A Study of Methodology Part I. D-Lib Magazine, 12(6).
Machine-understandable • When does “metadata interoperability” become a problem? • Human understandable • Machine understandable • The core issue of metadata interoperability is how to make the relationships among metadata systems machine-understandable.
Machine-understandable [Figure: metadata stands between the real world, where it is human understandable, and the machine world, where it must be machine understandable.]
3. Two interoperability prototypes • There are two basic mechanisms for achieving interoperability between metadata systems: mapping and integrating. • Mapping is the process of matching an original metadata system to a target metadata system. It assigns a binary relationship between a pair of members of different metadata systems. • Integrating is the process of combining different metadata systems into a single metadata system, at both the record level and the element level, so that they work together.
Mapping and integrating • Mapping: crosswalk, switching schema • Integrating: metadata packages (Warwick Framework), element reusing (application profile)
Crosswalk [Figure: each element of Metadata A (Element 1, Element 2, Element 3, ... Element n) is matched to an element of Metadata B.]
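A crosswalk of this kind can be sketched as a lookup table from source elements to target elements. The Python below is only an illustration under stated assumptions: the title/245 pair is the one named later in this talk, while the other pairs and the names `DC_TO_MARC` and `crosswalk` are hypothetical and do not reproduce any published crosswalk.

```python
# Illustrative crosswalk: each source element maps to one target element.
# Only "title" -> "245" comes from the slides; the other pairs are assumed.
DC_TO_MARC = {
    "title": "245",
    "creator": "720",   # assumed target field
    "subject": "653",   # assumed target field
}


def crosswalk(record, mapping):
    """Rename the fields of `record` via `mapping`.

    Returns (converted, unmapped) so that elements without a defined
    binary relationship are reported instead of silently dropped.
    """
    converted, unmapped = {}, {}
    for element, value in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped[element] = value
        else:
            converted[target] = value
    return converted, unmapped


dc_record = {"title": "Arithmetic", "creator": "Sandburg, Carl", "format": "text"}
marc_fields, leftovers = crosswalk(dc_record, DC_TO_MARC)
print(marc_fields)
print(leftovers)
```

Returning the unmapped elements separately makes the information loss of a one-way mapping visible, which is the usual practical problem with crosswalks.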
Switching schema [Figure: GEM, MARC-XML, Dublin Core, MARC, EAD and ONIX are each mapped to a central Interoperable Core.] Godby, C. J., Smith, D., & Childress, E. (2003). Two paths to interoperable metadata. Retrieved 31 July, 2005, from http://www.oclc.org/research/publications/archive/2003/godby-dc2003.pdf
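The advantage of a switching schema can be shown with a small sketch: each schema is mapped only to the hub, and any conversion goes source, hub, target. The code is a simplified illustration, not the method of Godby et al.; the element names, the `core:` labels and the function names are all hypothetical, and each mapping is assumed to be one-to-one.

```python
# Each schema only knows how to map to the hub ("core") vocabulary.
# All element names below are hypothetical placeholders.
TO_CORE = {
    "dc": {"title": "core:title", "creator": "core:agent"},
    "onix": {"TitleText": "core:title", "PersonName": "core:agent"},
}


def invert(mapping):
    """Build the hub-to-schema direction (assumes a one-to-one mapping)."""
    return {core: local for local, core in mapping.items()}


def convert(record, source, target):
    """Convert a record via the hub: source -> core -> target."""
    to_core = TO_CORE[source]
    from_core = invert(TO_CORE[target])
    out = {}
    for element, value in record.items():
        core = to_core.get(element)
        if core in from_core:  # skip what either side cannot express
            out[from_core[core]] = value
    return out


def mappings_needed(n):
    """Directed mappings for n schemas: pairwise crosswalks vs. a hub."""
    return {"pairwise": n * (n - 1), "hub": 2 * n}


onix_record = convert({"title": "Arithmetic", "creator": "Carl Sandburg"}, "dc", "onix")
print(onix_record)
print(mappings_needed(6))
```

For six schemas around one hub, the count is 12 mappings to and from the hub against 30 directed pairwise crosswalks, and the gap widens as schemas are added.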
Metadata packages: Warwick Framework • Metadata module • Container • Packages: metadata set, indirect, container Lagoze, C. (1996). The Warwick Framework [Electronic Version]. D-Lib Magazine. Retrieved 8 July, 2007 from http://www.dlib.org/dlib/july96/lagoze/07lagoze.html.
Application profiles • The principle of the application profile approach is the reuse of existing metadata schemas. • Application profiles consist of data elements drawn from one or more namespace schemas combined together by implementors and optimised for a particular local application. (Heery & Patel, 2000) Heery, R., & Patel, M. (2000). Application profiles: mixing and matching metadata schemas [Electronic Version]. Ariadne. Retrieved July 8, 2007 from http://www.ariadne.ac.uk/issue25/app-profiles/intro.html
Government of Canada metadata project: an example of application profiles Devey, M. & Côté, M. (2006). The development and use of metadata application profiles: the Government of Canada experience. The Serials Librarian, 51(2)
4. Consistency issues • “Two communities may agree about the meaning of the term title or creator or identifier, but until they have a shared convention for identifying and encoding values, they cannot easily exchange their metadata.” (Duval, Hodgins, Sutton, & Weibel, 2002) Duval, E., Hodgins, W., Sutton, S., & Weibel, S. L. (2002). Metadata Principles and Practicalities [Electronic Version]. D-Lib Magazine, 8. Retrieved July 10, 2007 from http://www.dlib.org/dlib/april02/weibel/04weibel.html.
Barriers to interoperability • Semantic inconsistency, e.g. dc.Title vs. MARC.245, dc.Creator vs. EAD.Author • Syntax inconformity, e.g. ISO 2709 vs. XML • Representation inconformity, e.g. “Hillmann, Diane I.” vs. “Hillmann, D. I.” • Vocabulary inconformity, e.g. LCSH vs. MeSH
5. Models and interoperability • A metadata model is an abstract construct that represents metadata by a set of components and a set of logical relationships between them. • A metadata model may help us to gain a better understanding of metadata and its relationship with the real world, as well as the relationships within an encoding system that is machine-understandable. • A metadata model facilitates the development of better interoperability between metadata systems.
Models and interoperability [Figure: a metadata model gives a better understanding of both the real world and the machine world (i.e. computer), and so supports interoperability between metadata systems.]
DC Abstract Model (DCAM) • The Dublin Core Abstract Model is a model for encoding metadata. It acts as a grammar for Dublin Core. • DCAM is intended to be independent of any specific encoding syntax, such as RDF, and to be a universal meta-syntax model for metadata encoding.
Principles of DC Abstract Model (DCAM) • One-to-one principle: a description describes exactly one resource. • Dumb-down principle: an element refinement must stay consistent with the broader element it narrows, so that the refinement can be ignored without making the value wrong. • Appropriate values: well-typed values ensure that the usage of metadata elements in a particular context is well guided.
One-to-one principle: based on the FRBR model [Figure: a book is described by three separate descriptions (A, B and C), corresponding to FRBR Group 1 (work), Group 2 (people, corporate bodies) and Group 3 (subject). Labels in the figure: Title, ISBN, Subject, Publisher, Author, Name, Address, Terms, Scheme.]
DCAM resources model • Any object that the metadata is intended to describe is a resource. • A resource is described by property/value pairs. • Each value is either a literal value or a non-literal value. [Figure: a resource has properties; each property has a value, which is either literal or non-literal.]
Property/value pair with different formats • Property/value pairs are the core of the DCAM. They are the minimal semantic units of metadata.
DCAM description set model • Property/value pairs are represented by a statement which is made up of a property URI/value surrogate pair. • A set of statements, which describes a resource, is a description. • A set of descriptions is a description set. [Figure: a description set contains descriptions; a description contains statements; a statement pairs a property URI with a value surrogate.]
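The nesting of description set, description and statement can be sketched with a few data classes. This is a simplified reading of the model for illustration, not the DCAM specification itself: the class and field names are assumptions, the value surrogate is reduced to a literal or a URI, and the `example.org` URIs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValueSurrogate:
    """Stands in for a value: a literal string or a non-literal (URI) value."""
    literal: Optional[str] = None
    value_uri: Optional[str] = None

    def __post_init__(self):
        # Exactly one of the two forms must be given.
        if (self.literal is None) == (self.value_uri is None):
            raise ValueError("give either a literal or a value URI")


@dataclass
class Statement:
    """One property/value pair: a property URI plus a value surrogate."""
    property_uri: str
    value: ValueSurrogate


@dataclass
class Description:
    """Statements about exactly one resource (the one-to-one principle)."""
    resource_uri: str
    statements: List[Statement] = field(default_factory=list)


@dataclass
class DescriptionSet:
    descriptions: List[Description] = field(default_factory=list)


DC = "http://purl.org/dc/terms/"
book = Description("http://example.org/book/1", [
    Statement(DC + "title", ValueSurrogate(literal="Arithmetic")),
    Statement(DC + "creator", ValueSurrogate(value_uri="http://example.org/person/1")),
])
record = DescriptionSet([book])
print(len(record.descriptions[0].statements))
```

Because nothing here depends on MARC, XML or RDF, the same objects could in principle be serialized into any of those syntaxes, which is the point of a syntax-independent model.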
Description set model [Figure: description set, description, statement, property/value pair.]
DCAM vocabulary model • A vocabulary is a set of terms. The terms denote the semantics of both the property and the value. • In DCAM, a vocabulary may be one of three things: a value vocabulary, a property vocabulary, or a class. • A property vocabulary includes a metadata element set and its definitions. • A value vocabulary includes vocabulary encoding schemes, e.g. LCSH, DDC, and syntax encoding schemes, which fix how a value string is written, e.g. whether a date is written 2007-1-12, 1-12-2007 or 12-1-2007.
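Why a syntax encoding scheme matters can be shown with the three date strings above. In this sketch the scheme labels and the function name are hypothetical; the parsing uses Python's standard `datetime.strptime`.

```python
from datetime import date, datetime

# Hypothetical scheme labels mapped to strptime patterns.
SCHEMES = {
    "year-month-day": "%Y-%m-%d",
    "month-day-year": "%m-%d-%Y",
    "day-month-year": "%d-%m-%Y",
}


def parse_date(value: str, scheme: str) -> date:
    """Interpret a value string according to a named syntax encoding scheme."""
    return datetime.strptime(value, SCHEMES[scheme]).date()


# Three different strings, one date, once the scheme is known.
same_day = [
    parse_date("2007-1-12", "year-month-day"),
    parse_date("1-12-2007", "month-day-year"),
    parse_date("12-1-2007", "day-month-year"),
]
print(same_day)

# The same string read under the wrong scheme becomes a different date.
misread = parse_date("1-12-2007", "day-month-year")
print(misread)
```

Without a declared scheme, a receiving system has no way to tell 12 January from 1 December, so the value cannot be exchanged reliably even when both sides agree on what "date" means.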
DCAM: a summary [Figure labels: Domain Model, Syntax Model.]
6. Semantics and syntax of metadata • The semantics of metadata concerns the machine-understandable meanings of the metadata. • All machine-understandable meanings are inherited from human-understandable meanings, so the human-understandable meaning space of the metadata is also a significant part of the metadata model. • The vocabulary model of metadata is the human-understandable semantic space that defines and identifies the attributes and content of the resource. It is also the mechanism that carries meanings between humans and machines.
Domain models • The meanings of the two functional components, property and value, are based on a general view of the resource which the metadata intends to describe. • A property denotes an attribute of the resource; the definition of a property therefore relies on how the resource is structured into a set of attributes. • The value is extracted from the resource under guidelines for selecting, formulating and presenting values. • The general view of the resource can be expressed by domain models, such as FRBR in the bibliographic domain and CIDOC CRM in the cultural heritage domain.
Syntax of metadata • The syntax of metadata concerns machine-processable formats. • Metadata should be encoded in a machine-readable format (e.g. MARC, XML, RDF) that follows an encoding standard. These standards are designed to provide a common way to express metadata so that computer applications can read and process it easily. • The framework for encoding metadata in a particular kind of format is the syntax of the metadata.
7. Interoperability across models • Metadata interoperability may occur at each layer of the metadata models. [Figure: the layers of two metadata systems set side by side, from semantics to syntax: domain model, vocabulary model (property model and value model), resource model, description set model, syntax model.]
Semantic interoperability • Interoperability at the semantic layer establishes agreement between semantic models, such as metadata element sets, encoding schemes and, at the top level, domain models. It must be considered carefully when pursuing interoperability at the schema level. • Terminology harmonization • Element mapping • Common ontology
Terminology harmonization • The simplest way to achieve semantic interoperability is to extract a common set of terminology from the metadata standards, which is called harmonization (Pierre & LaPlant, 1998). • However, terminology alone cannot carry the full meaning needed for semantic interoperability. Sometimes the same term is defined or implemented differently in different standards. Pierre, M. S., & LaPlant, W. P. (1998). Issues in Crosswalking Content Metadata Standards [Electronic Version]. Retrieved July 8, 2007 from http://www.niso.org/press/whitepapers/crsswalk.html.
Interoperability between domain models • To achieve perfect semantic interoperability, the properties in each metadata system should be defined under the same conceptual framework. This common conceptual framework is a common ontology. • An ontology is a formal, explicit description of concepts. It consists of classes, slots and facets; together with a set of individual instances of those concepts, it constitutes a knowledge base for metadata interoperability. • A common ontology can adequately represent the semantics of metadata in a knowledge representation language such as OWL, and so support a common understanding of metadata across different domains.
ABC model: an example of a common ontology • The ABC ontology, developed within the Harmony International Digital Library Project, provides a common conceptual model to facilitate interoperability between metadata ontologies from different domains. (Lagoze & Hunter, 2001) Lagoze, C., & Hunter, J. (2001). The ABC Ontology and Model [Electronic Version]. Journal of Digital Information, 2. Retrieved July 2007 from http://journals.tdl.org/jodi/issue/view/10.
Interoperability between domain models • Ontology mapping between ABC and CRM (Doerr, Hunter, & Lagoze, 2003) Doerr, M., Hunter, J., & Lagoze, C. (2003). Towards a Core Ontology for Information Integration [Electronic Version]. Journal of Digital Information, 4. Retrieved July, 2007 from http://journals.tdl.org/jodi/article/view/jodi-109/91.
Interoperability between domain models • Extending ABC with MPEG-7 (Hunter, 2003) Hunter, J. (2003). Enhancing the semantic interoperability of multimedia through a core ontology. IEEE Transactions on Circuits and Systems for Video Technology, 13(1), 49-58.
Syntax interoperability • Interoperability at the syntax layer ensures format conformity. Syntax or format inconformity often arises during metadata record conversion, at the record level of metadata interoperability. For example, when we convert a MARC record into MARCXML, we must follow the MARCXML syntax exactly, e.g. for the field
24510 |a Arithmetic / |c Carl Sandburg.
Conforming:
<datafield tag="245" ind1="1" ind2="0"> <subfield code="a">Arithmetic /</subfield> <subfield code="c">Carl Sandburg.</subfield> </datafield>
Not conforming (subfields left in MARC notation):
<datafield tag="245" ind1="1" ind2="0"> a Arithmetic / |c Carl Sandburg. </datafield>
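The conversion of that one field can be sketched with Python's standard `xml.etree.ElementTree`. This is a minimal illustration, not a MARC parser: the function name is hypothetical, the input is assumed to be the display form used on the slide (tag, two indicator characters, then subfields introduced by `|`), and control fields, blank indicators, repeated delimiters in the data and the MARCXML namespace are all ignored.

```python
import xml.etree.ElementTree as ET


def field_to_marcxml(line: str, delimiter: str = "|") -> ET.Element:
    """Convert one data field such as '24510 |a Arithmetic / |c Carl Sandburg.'

    Assumes: 3-character tag, 2 indicator characters, then subfields that
    each start with the delimiter followed by a 1-character code.
    """
    head, *chunks = line.split(delimiter)
    head = head.strip()
    tag, ind1, ind2 = head[:3], head[3], head[4]
    datafield = ET.Element("datafield", {"tag": tag, "ind1": ind1, "ind2": ind2})
    for chunk in chunks:
        code, text = chunk[0], chunk[1:].strip()
        subfield = ET.SubElement(datafield, "subfield", {"code": code})
        subfield.text = text  # subfield data is copied unchanged
    return datafield


element = field_to_marcxml("24510 |a Arithmetic / |c Carl Sandburg.")
print(ET.tostring(element, encoding="unicode"))
```

Building the element tree and serializing it, rather than pasting strings between tags, is what guarantees that every subfield ends up in its own well-formed `subfield` element.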
Syntax interoperability • Work on metadata interoperability at the syntax level now focuses on developing syntax-independent data models, which are equally applicable in a variety of syntax contexts. Under such models, metadata property/value pairs and their statements can be encoded in different encoding frameworks, such as RDF, HTML <META> and SHOE.
URI and interoperability • In DCAM, a value in a property/value pair can be another resource, identified by a URI. This means that one resource can be described in terms of another resource. The principle provides a framework that links different metadata description sets together. [Figure: in Metadata A, a property of Resource A has a URI value that identifies Resource B, which is described in Metadata B.]
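Linking description sets through URIs can be sketched as a lookup across two independent collections. Everything in this example is hypothetical, including the URIs, the property names and the `resolve` function; it only illustrates the idea that a URI-valued property in one metadata system can lead to a description held in another.

```python
# Two independent description sets, keyed by resource URI.
# All URIs and property names are hypothetical.
metadata_a = {
    "http://example.org/book/1": {
        "title": "Arithmetic",
        "creator": {"uri": "http://example.org/person/1"},  # non-literal value
    },
}
metadata_b = {
    "http://example.org/person/1": {"name": "Carl Sandburg"},
}


def resolve(description, *description_sets):
    """Replace each URI-valued property with the linked description, if known."""
    resolved = {}
    for prop, value in description.items():
        if isinstance(value, dict) and "uri" in value:
            for ds in description_sets:
                if value["uri"] in ds:
                    value = ds[value["uri"]]
                    break
        resolved[prop] = value
    return resolved


book = resolve(metadata_a["http://example.org/book/1"], metadata_b)
print(book["creator"]["name"])
```

A URI that no known description set can resolve is left in place, so the link stays usable if a further description set becomes available later.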
8. Dublin Core application profile • The CEN MMI-DC Workshop, a research group of the European Committee for Standardization, developed application profile guidelines that explain how to create a customized or adapted metadata schema for a particular application. A Dublin Core application profile (DCAP) is based on Dublin Core. It allows people to extend Dublin Core by drawing elements from other existing metadata schemas or by creating new elements to meet particular needs. • DCAP follows DCAM, and in particular observes the Principle of Appropriate Identification and the Principle of Readability. (CEN MMI-DC Workshop, 2003) CEN MMI-DC Workshop. (2003). Dublin Core Application Profile Guidelines. Retrieved July, 2007, from ftp://ftp.cenorm.be/PUBLIC/CWAs/e-Europe/MMI-DC/cwa14855-00-2003-Nov.pdf
Dublin Core application profile (CEN MMI-DC Workshop, 2003)
9. Discussion: library metadata practices • The information organizing paradigm in the library community is shifting from a single metadata schema to multiple metadata schemas. • The application profile becomes a cornerstone of information organization in the library community. • Working for both users and machines – the new library professional.