1. Developments in Metadata Interoperability: Museums and Localisation Industry Paradigms
Dr. George N. Kordelis
2. Metadata interoperability issue
Museums metadata paradigm
RDA, working on the “global metadata standard”
Localisation paradigm (as a business metadata example)
The XLIFF standard and its potential application as a localisation metadata standard for all other formats
Summary
Overview
3. Various metadata schemes and element sets
Some are well known and well documented
Others are less known and used only in “special” cases
Similar or the same content is described by different metadata standards
No canonical metadata record for an object
Varied syntaxes for encoding metadata
This situation leads to:
A very rich and diverse metadata ecology!
Problems in a networked environment
Metadata interoperability issue (I) “The problem space”
4. In a networked environment:
Interaction between systems during harvesting and searching
Integrating different types of metadata even for local information management (i.e. inside a library or a museum’s LAN)
Interoperability
"Interoperability is the ability of multiple systems with different hardware and software platforms, data structures, and interfaces to exchange data with minimal loss of content and functionality" [NISO, 2004].
"Interoperability is the ability of two or more systems or components to exchange information and use the exchanged information without special effort on either system" [CC:DA, 2000].
Metadata interoperability issue (II) “The problem space”
5. Schema level – Efforts are focused on the elements of the schemata, being independent of any application.
Derivation (e.g. MARC → MARCXML, MARCLite, MODS) - Application Profiles - Crosswalks (absolute and relative) - Switching-across - Metadata Registry
Record level – Efforts are intended to integrate populated metadata records through the mapping of the elements according to the semantic meanings of these elements.
Conversion of Metadata Records (e.g. MARC ↔ MODS) - Data Reuse and Integration (e.g. Resource Description Framework, RDF)
Repository level – With harvested or integrated records from varying sources, efforts at this level focus on mapping value strings associated with particular elements (e.g., terms associated with subject or format elements). The results enable cross-collection searching.
Metadata Repository Based on the Open Archives Initiative (OAI) Protocol - Metadata Repository Supporting Multiple Formats Without Record Conversion - Aggregation
Metadata interoperability issue (III) “Solutions”
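As a rough illustration of the repository-level approach, harvesting over the OAI protocol starts with a plain HTTP request such as the one built below. This is a minimal sketch: the base URL and the set name are hypothetical, while `ListRecords`, `metadataPrefix` and the `oai_dc` (unqualified Dublin Core) prefix come from OAI-PMH itself.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint and set name, for illustration only.
url = list_records_url("https://museum.example.org/oai", set_spec="paintings")
print(url)
```

A harvester would send this request, parse the returned XML records, and follow any resumption token the repository supplies; those steps are left out here.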
6. Descriptive or Content metadata in Museum
Museum Collections Management/Documentation Standards
CHIN Data Dictionaries
SPECTRUM (Standard ProcEdures for CollecTions Recording Used in Museums)
CIDOC Guidelines for Museum Object Information: The CIDOC Information Categories
Collections Description Standards
Collection-level description
RSLP Standard for Collection-level description (based on DC)
Description of Art Collections and/or Visual Resources
Categories for the Description of Works of Art (CDWA)
VRA Core Categories
Méthode d'inventaire informatique des objets beaux-arts et arts décoratifs
RLG REACH Element Set
Le catalogage des estampes
Description of Architecture, Archaeological Sites/Monuments
A Guide to the Description of Architectural Drawings
CIDOC International Core Data Standard for Archaeological Sites and Monuments
CIDOC International Core Data Standard for Archaeological Objects
MIDAS (Monument Inventory Data Standard)
Méthode d'inventaire informatique - Archéologie.
Description of Ethnological/Anthropological collections
CIDOC International Core Data Standard for Ethnology/Ethnography
Content Standard for Anthropological Metadata
Méthode d'inventaire informatique - Ethnologie.
La documentation des ensembles
Handbook of Standards Documenting African Collections / Manuel de normes : documentation des collections africaines.
Description for Object Identification and Security
Object ID: Protecting Cultural Objects in the Global Information Society
Museums metadata paradigm (I)
7. General Metadata Standards for Resource Discovery
Dublin Core, The Dublin Core Metadata Element Set
Museums use a discipline-specific standard (CHIN Data Dictionaries or SPECTRUM) to document and manage their collections, and extract a subset of their collection records that maps to the Dublin Core elements.
Darwin Core
Darwin Core (DwC) is a "profile describing the minimum set of standards for search and retrieval of natural history collections and observation databases".
Multimedia Metadata Standards (NISO Z39.87-2002 Technical Metadata for Digital Still Images, DIG35 Specification, MPEG-7, Video Development Initiative (ViDe) User's Guide: Dublin Core Application Profile for Digital Video)
Metadata Standards for Digital Preservation [RLG Preservation Metadata Elements, Metadata for Long-Term Preservation, Metadata Encoding and Transmission Standard (METS)…]
Intellectual Property Rights and Electronic Commerce Standards [INDECS (Interoperability of Data in E-Commerce Systems), MPEG-21, Digital Object Identifier (DOI)…]
Museums metadata paradigm (II)
8. Museums and the Network: Why
Museums that want to convert their data from one format to another (for example, moving data into a new collections management system)
Museums that want to exchange data with another organization using a different metadata standard
Several museums that wish to collaborate to create a collective or distributed resource that allows seamless searching by users
Museums using internally more than one standard to meet their various needs for documentation, management, security, and access
Implemented Solutions: Crosswalk of Metadata Element Sets
The Getty Research Institute "Crosswalk of Metadata Element Sets for Art, Architecture, and Cultural Heritage Information and Online Resources".
CHIN Humanities Data Dictionary
Museums metadata crosswalks
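A crosswalk is, at its core, a table that maps elements of one scheme onto elements of another. The sketch below shows the idea with a hypothetical set of local museum field names mapped to Dublin Core elements; the local names are invented for illustration and are not taken from the Getty crosswalk or the CHIN dictionary.

```python
# Hypothetical local field names mapped to Dublin Core element names.
CROSSWALK = {
    "object_name": "title",
    "maker": "creator",
    "production_date": "date",
    "material": "format",
    "accession_number": "identifier",
}


def to_dublin_core(record, crosswalk=CROSSWALK):
    """Map a local record to Dublin Core; return (dc_record, unmapped_fields)."""
    dc = {}
    unmapped = []
    for field, value in record.items():
        element = crosswalk.get(field)
        if element is None:
            # No equivalent element: this content is lost in the conversion.
            unmapped.append(field)
        else:
            # Dublin Core elements are repeatable, so collect values in lists.
            dc.setdefault(element, []).append(value)
    return dc, unmapped


local_record = {
    "object_name": "Amphora",
    "maker": "Unknown",
    "accession_number": "1998.12.3",
    "storage_location": "Room 4, shelf B",
}
dc_record, lost = to_dublin_core(local_record)
print(dc_record)
print("Not mapped:", lost)
```

The list of unmapped fields makes visible the loss of content that interoperability efforts try to keep to a minimum.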
9. The Joint Steering Committee for Revision of AACR (JSC) is working towards a new standard, RDA: Resource Description and Access, scheduled for release in early 2009.
RDA is a new standard for resource description and access, designed for the digital environment
A flexible framework for describing all resources - analog and digital
Data readily adaptable to new and emerging database structures
Data compatible with existing records in online library catalogues
Globalizable and Localizable content standard covering all media
Independent of technical communication formats
Aimed at everybody who needs to find, identify, select, obtain, use, manage and organize information
RDA and other standards
RDA/ONIX framework for resource categorization
RDA/MARC21 mapping
RDA/Dublin Core mapping
RDA: Resource Description and Access
10. OASIS-XLIFF (Organization for the Advancement of Structured Information Standards - XML Localization Interchange File Format) has emerged as a standard interchange file format for localization-related data and metadata.
Localization challenges:
Insufficient interoperability between tools.
Lack of support for overall localization workflow.
Localization tools developers and users need to deal with various formats.
Large number of proprietary intermediate formats.
Localisation paradigm (as a business metadata example): XML Localization Interchange File Format
11. XLIFF Advantages – Localization Customer, Tools Vendor, Service Provider
Single format for adjunct processing (e.g. quality control in terms of spell checking).
Less dependency on vendors that are able to work with special formats.
Tighter control on what goes to localization (Pre-filtering of what to translate or not).
Controlled information flow (author/developer notes, item properties, etc.).
All advantages of XML-based processing (e.g. ID-based leveraging)
Focus on development of core functionality rather than treatment of source formats.
Open and standard solution for proprietary formats.
Global implementation of utilities (e.g. one spell checker for both RTF and HTML).
Localisation paradigm (as a business metadata example): XML Localization Interchange File Format
12. An XLIFF document can capture anything needed for a localization project:
Localizable objects (e.g. text strings) in source and target languages.
Supplementary information (e.g. glossaries, or material to recreate the original format).
Administrative information (e.g. workflow data).
Custom data (e.g. initialization information for tools).
The High Level View
13. The XLIFF Document
An XLIFF document is designed to store the extracted data related to localization.
Each given source container (e.g. a file, a database table, and so forth) corresponds to a <file> element in XLIFF.
Each XLIFF document can include several <file> elements.
A whole localization project can possibly be stored in a single XLIFF document.
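The container structure described above can be sketched with Python's standard `xml.etree.ElementTree`. The slides do not name an XLIFF version, so the 1.2 namespace used here is an assumption, and the container names and strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: XLIFF 1.2 namespace; the slides do not state a version.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)


def q(tag):
    """Return the namespace-qualified form of an XLIFF tag name."""
    return f"{{{XLIFF_NS}}}{tag}"


def build_xliff(containers, source_lang, target_lang):
    """Create one <file> element per source container, all in one document."""
    root = ET.Element(q("xliff"), {"version": "1.2"})
    for name, strings in containers.items():
        file_el = ET.SubElement(root, q("file"), {
            "original": name,
            "source-language": source_lang,
            "target-language": target_lang,
            "datatype": "plaintext",
        })
        body = ET.SubElement(file_el, q("body"))
        for unit_id, text in strings.items():
            unit = ET.SubElement(body, q("trans-unit"), {"id": unit_id})
            ET.SubElement(unit, q("source")).text = text
    return root


# Two hypothetical source containers stored in a single XLIFF document.
doc = build_xliff(
    {"menu.rc": {"m1": "File"}, "help.txt": {"h1": "Press F1 for help."}},
    source_lang="en",
    target_lang="fr",
)
print(ET.tostring(doc, encoding="unicode"))
```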
14. Bilingual Model
Each <file> element is designed to store one source language and one target language.
The rationale is that the translation for every target language is done by different people most of the time.
However, the languages in <alt-trans> elements can differ from the target language. For example, proposed matches may be in European Portuguese when translating into Brazilian Portuguese.
The TC envisioned the XLIFF file as a container of localisable data handed off to a translator and returned by the same translator; hence the bilingual model. Note, however, that the bilingual model is strongly suggested but is not actually a requirement of the XLIFF spec. The source and target language pair limits the contents of a given <file> to two languages, but nothing stops an implementer from placing multiple <file> elements in the same XLIFF document, each with a different language pair (e.g. en + es, fr + de, ja + ko), thus providing multilingual support by a workaround. Doing so is not prohibited, but it is assumed that tool publishers will support only bilingual content within an individual XLIFF document. It is therefore strongly recommended that implementations limit the contents of each XLIFF document to a single source-target language pair.
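The recommendation can be checked mechanically. The sketch below parses a small hand-written sample (assuming XLIFF 1.2; the file name and strings are hypothetical), collects the language pair of every <file>, and lists the languages found on <alt-trans> targets, which may legitimately differ.

```python
import xml.etree.ElementTree as ET

# Assumption: XLIFF 1.2 namespace; the slides do not state a version.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="ui.properties" source-language="en"
        target-language="pt-BR" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Save</source>
        <target>Salvar</target>
        <alt-trans>
          <target xml:lang="pt-PT">Guardar</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def language_pairs(root):
    """Return the set of (source, target) language pairs over all <file>s."""
    return {
        (f.get("source-language"), f.get("target-language"))
        for f in root.iterfind("x:file", NS)
    }


def alt_trans_languages(root):
    """Return the languages declared on <alt-trans> target proposals."""
    return {
        t.get(XML_LANG)
        for t in root.iterfind(".//x:alt-trans/x:target", NS)
        if t.get(XML_LANG) is not None
    }


root = ET.fromstring(SAMPLE)
pairs = language_pairs(root)
print("Language pairs:", pairs)
print("Follows the bilingual recommendation:", len(pairs) == 1)
print("alt-trans languages:", alt_trans_languages(root))
```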
15. Localizable Objects
XLIFF allows not only text string localization but also localization of other object types such as graphics.
Supplementary information can be represented in a generic way through inline codes (e.g. formatting of text).
Relationship between objects can be captured (e.g. all items in a menu).
16. An XLIFF Snippet… A simple menu represented as XLIFF
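The snippet from the original slide is not reproduced here. As a stand-in, the following sketch generates a comparable fragment: a menu expressed as an XLIFF <group> whose items are <trans-unit> elements, which is one way the relationship between objects can be captured. The ids and labels are hypothetical, the `restype` values assume XLIFF 1.x, and the namespace and the enclosing <file> and <body> elements are omitted for brevity.

```python
import xml.etree.ElementTree as ET


def menu_group(menu_id, items):
    """Build a <group> for one menu, with one <trans-unit> per menu item."""
    group = ET.Element("group", {"id": menu_id, "restype": "menu"})
    for item_id, label in items:
        unit = ET.SubElement(
            group, "trans-unit", {"id": item_id, "restype": "menuitem"}
        )
        ET.SubElement(unit, "source").text = label
    return group


# Hypothetical "File" menu with three items.
menu = menu_group(
    "file_menu",
    [("open", "Open"), ("save", "Save"), ("exit", "Exit")],
)
print(ET.tostring(menu, encoding="unicode"))
```

Because the items sit inside one group, a tool can present them to the translator together rather than as unrelated strings.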
20. References