Applying Metadata to Multimedia. Eric Stiles. Based on: Multimedia Data Management Using Metadata to Integrate and Apply Digital Media. February 13, 2003. Multimedia Data Management Using Metadata to Integrate and Apply Digital Media. Susanne Boll, Wolfgang Klas, Amit Sheth 1994.
Multimedia Data Management Using Metadata to Integrate and Apply Digital Media • Susanne Boll, Wolfgang Klas, Amit Sheth • 1994 • Short presentation of the book of the same name. • Covers different views on how metadata can be modeled, classified, extracted, managed, and applied to support convenient handling of digital media. • Also reviews the various standards for handling metadata.
Definition • The term "meta" comes from a Greek word that denotes something of a higher or more fundamental nature. Metadata, then, is data about other data. • The term refers to any data used to aid the identification, description and location of networked electronic resources
The Need for Multimedia Metadata • Makes it possible to capture and handle large amounts of content data in an efficient system • Metadata is easier to query with traditional methods than raw digital data. That said, ontologies play a more important role here than with textual information.
Metadata Classification • Content-independent • Content-dependent • Direct content-based • Content-descriptive • Domain-independent • Domain-specific
Another Classification • Gilliland-Swetland (1998) distinguishes between 4 types of metadata: • Administrative • Metadata used in managing and administering information resources • Descriptive • Metadata used to describe or identify information resources • Preservation • Metadata related to the preservation management of information resources • Technical • Metadata related to how a system functions or how metadata behaves
Issues To Deal With • Different query paradigm • Inadequate processing techniques • Lacking efficiency • Semantics of multimedia data • Media specific metadata • Maintenance of metadata • update as raw data changes • changes as semantic knowledge changes (scalable)
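The maintenance issue above (metadata must be updated as the raw data changes) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each metadata record stores a digest of the raw data it was derived from; the names `MetadataRecord`, `content_digest`, and `is_stale` are hypothetical and do not come from the book.

```python
import hashlib
from dataclasses import dataclass


def content_digest(raw: bytes) -> str:
    """Return a SHA-256 hex digest of the raw media bytes."""
    return hashlib.sha256(raw).hexdigest()


@dataclass
class MetadataRecord:
    # Hypothetical record layout, assumed for this sketch only.
    description: str     # content-descriptive metadata
    source_digest: str   # digest of the raw data the description was derived from


def is_stale(record: MetadataRecord, raw: bytes) -> bool:
    """True when the raw data no longer matches the digest stored with the metadata."""
    return record.source_digest != content_digest(raw)


original = b"frame-data-v1"
record = MetadataRecord("sunset over a lake", content_digest(original))

unchanged = is_stale(record, original)          # raw data untouched
changed = is_stale(record, b"frame-data-v2")    # raw data edited after annotation
```

A digest only detects that the raw data changed; it says nothing about changes in semantic knowledge, which would need a separate versioning scheme.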
Typical Questions To Ask • What are the characteristics of the media type considered and the domain of the desired application? • What are typical examples of metadata in the context of the media type? • What are the specific terms for the ontology? • What are the storage techniques of the media? • What are the best/existing standards of the media?
Image • Must understand image digitization • PixLabs The Science of Image Understanding
Video • Desire to select all or part of a video • Metadata (abstract) information can change depending on location in time of the video • MPEG-7 uses XML Schema as the language of choice for content description. The adoption of XML Schema allows MPEG-7 applications to leverage a large body of existing tools, APIs and server technology built around World Wide Web Consortium-based XML standards.
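The point that video metadata depends on the location in time can be sketched with time-stamped segments. This is an illustrative model only, not MPEG-7; the `Segment` class, the helper functions, and the sample labels are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # Hypothetical annotation covering the half-open interval [start, end).
    start: float   # seconds, inclusive
    end: float     # seconds, exclusive
    label: str


def labels_at(segments: list[Segment], t: float) -> list[str]:
    """Return the labels of every segment that covers time t."""
    return [s.label for s in segments if s.start <= t < s.end]


def select(segments: list[Segment], start: float, end: float) -> list[Segment]:
    """Return the segments that overlap the requested part [start, end) of the video."""
    return [s for s in segments if s.start < end and start < s.end]


segments = [
    Segment(0, 12, "opening credits"),
    Segment(12, 95, "interview"),
    Segment(60, 95, "background music"),
]

at_70s = labels_at(segments, 70)     # two overlapping annotations apply here
intro = select(segments, 0, 12)      # only the first segment overlaps [0, 12)
```

Overlapping segments are allowed on purpose, since several descriptions can hold for the same stretch of video.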
Audio • Not only do we have time to deal with but also speech recognition. • Metadatamap CDDB sites • Muse • AMG • etc • SpeechBot: a Speech Recognition based Audio Indexing System for the Web
Geographic Information • Issues with integrating heterogeneous databases. • Federal Geographic Data Committee • ANZLIC • Minnesota Geographic Initiative
Digital Libraries • Infrastructure development to work with various environments • Stanford Digital Library Multimedia Architecture
AMICO Metadata Conversion (flowchart) • Tape Read • "Raw" metadata files: catdata (8 files), tiffmetadata (23 files), thumbmeta (52,689 files) • Merge • Consolidated metadata files: 1 catdata, 1 tiffmetadata, 1 thumbmeta • Convert to XML • 3 XML files: 1 catdata, 1 tiffmetadata, 1 thumbmeta • Split by museums • 1 XML file per museum • Split by file size • Multiple XML files per museum • Split by machines • Multiple museum XML files per machine • eXcelon Dump&Load Utility • eXcelon Data Server (three shown)
Metadata Generation • Explicit metadata • Raw analysis • Semi-automatic augmentation • Comprehensive background • Implicit metadata • Time-stamping images • SGML Editor generates information while editing document type
Standards • Multimedia • ISO 11179 • Metadata Interchange Specification V 1.1 • Meta Content Format (MCF) • Dublin Core Metadata Element Set • FGDC • HL7 • STEP - ANSI (American National Standards Institute)
Standardization Projects • ESA Prototype International Directory • Environmental Data Registry • Basic Semantic Repository • Data Documentation Initiative • Government Information Locator • MÆNAD
More Initiatives • MARC • Dublin Core • Encoded Archival Description (EAD) and the Text Encoding Initiative (TEI) • General International Standard for Archival Description (ISAD(G)) • Visual Resources Association (VRA) Data Standard Committee • Computerized Interchange of Museum Information (CIMI)
Interoperability • Mappings first list • Mappings second list
Related / Group Bodies • NASA/Science Office of Standards and Technology • Metadata Coalition • Global Change Data and Information Systems • X3L8 • X3T2
Characteristics of Dublin Core • Simplicity • Semantic Interoperability • International Consensus • Extensibility • Metadata Modularity on the Web
Dublin Core • Initiated in 1995 at OCLC in Dublin, Ohio • A simple set of 15 information elements that authors and publishers can use to describe a wide variety of information resources on the Web, including images, for the purpose of simple cross-disciplinary resource discovery
Dublin Core Elements • Title – the name given to the resource by the creator or publisher • Creator or author – the person or organization primarily responsible for creating the intellectual content of the resource • Subject and keywords – keywords or phrases that describe the content of the resource. Use of controlled vocabularies and formal classification schema is encouraged • Description – textual description of the content of the resource • Publisher – entity responsible for making the resource available • Contributor – anyone else who made significant intellectual contributions to the resource, e.g., editor • Date – date the resource was made available in its present form (YYYY-MM-DD) • Type – e.g., poem, annual report, photograph • Format – data format identifying necessary software and hardware to display or use the resource
Dublin Core Elements • Identifier – unique identifying name or number, e.g., URL or ISBN • Source – unique identifier of source material from which this resource was derived • Language – language(s) of the intellectual content of the resource • Relation – relationship of this resource to other resources, e.g., pages of a book, photographs in a series • Coverage – spatial and/or temporal coverage of the resource, e.g., span dates, geographic coordinates • Rights – rights management statement or link to other specification of intellectual property and other rights
Principles of DC Element Set • Extensibility • Core set can be extended by using more specialized metadata • Optionality • All elements are optional • Repeatability • All elements are repeatable
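The optionality and repeatability principles can be sketched as a small data structure. This is a minimal illustration, assuming lower-case element names taken from the two element slides; the `DublinCoreRecord` class is hypothetical, and extensibility (more specialized metadata) is deliberately not modeled.

```python
from collections import defaultdict

# The 15 elements listed on the "Dublin Core Elements" slides.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


class DublinCoreRecord:
    """Hypothetical container: every element is optional and repeatable."""

    def __init__(self) -> None:
        self._values: dict[str, list[str]] = defaultdict(list)

    def add(self, element: str, value: str) -> None:
        """Append a value; calling this twice for one element repeats the element."""
        if element not in DC_ELEMENTS:
            raise KeyError(f"not a core element: {element}")
        self._values[element].append(value)

    def get(self, element: str) -> list[str]:
        """Return all values for an element; an empty list means it was omitted."""
        return list(self._values.get(element, []))


record = DublinCoreRecord()
record.add("title", "Harbor at dawn")
record.add("creator", "First Photographer")    # repeatability: two creators
record.add("creator", "Second Photographer")

creators = record.get("creator")
rights = record.get("rights")                  # optionality: never set
```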
Using Dublin Core on the Web • Authors or publishers of Web resources embed Dublin Core metadata elements as META tags before the body of the document, either by hand or using a tool that generates Dublin Core • META tags must then be collected together into a Web index that can be accessed by a search engine. This is often done using a robot
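A tool that generates Dublin Core META tags, as mentioned above, might look roughly like the following sketch. The function name `to_meta_tags` and the dictionary layout (element name mapped to a list of values) are assumptions for illustration; the `DC.` prefix follows the convention used in this deck's own example.

```python
from html import escape


def to_meta_tags(record: dict[str, list[str]]) -> str:
    """Render each value as one META tag, with the element name prefixed by 'DC.'."""
    lines = []
    for element, values in record.items():
        for value in values:
            # Escape &, <, > and quotes so the value is safe inside an attribute.
            content = escape(value, quote=True)
            lines.append(f'<META NAME="DC.{element}" CONTENT="{content}">')
    return "\n".join(lines)


tags = to_meta_tags({"title": ["A & B"], "date": ["2000"]})
```

The resulting lines would be pasted into the HEAD of the document, before the body.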
Dublin Core Example
<HTML>
<HEAD>
<META NAME="DC.title" CONTENT="Enduring Paradigm, New Opportunities: The Value of the Archival Perspective in the Digital Environment">
<META NAME="DC.creator" CONTENT="Gilliland-Swetland, Anne J." SCHEME="LCNA">
<META NAME="DC.publisher" CONTENT="Council on Library and Information Resources">
<META NAME="DC.date" CONTENT="2000">
<META NAME="DC.type" CONTENT="Technical Report">
<META NAME="DC.identifier" CONTENT="http://www.clir.org/pubs/abstract/pub89abst.html">
<META NAME="DC.source" CONTENT="ISBN 1-887334-74-2">
</HEAD>
<BODY>
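The collection step performed by a robot can be sketched with Python's standard `html.parser` module. The class name `DublinCoreHarvester` and the sample page are assumptions for illustration; a real indexing robot would also fetch pages and store the results in a Web index.

```python
from html.parser import HTMLParser


class DublinCoreHarvester(HTMLParser):
    """Collect META tags whose NAME starts with 'DC.' into a dictionary of lists."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: dict[str, list[str]] = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling this.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            element = name[3:].lower()
            self.elements.setdefault(element, []).append(attributes.get("content") or "")


page = """<HTML><HEAD>
<META NAME="DC.date" CONTENT="2000">
<META NAME="DC.type" CONTENT="Technical Report">
<META NAME="keywords" CONTENT="ignored">
</HEAD><BODY></BODY></HTML>"""

harvester = DublinCoreHarvester()
harvester.feed(page)
harvested = harvester.elements
```

Values are kept in lists so that repeated elements survive harvesting.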
Lawsuits Over Metadata • Playboy!