An Introduction to Metadata and (some) Metadata Standards Making Sense of Metadata, Society of Archivists EAD/Data Exchange SIG London, Thursday 17 November 2005 Pete Johnston Research Officer, UKOLN, University of Bath. UKOLN is supported by:. www.bath.ac.uk.
An Introduction to Metadata and (some) Metadata Standards • Metadata in action: an example • What is metadata? • Some metadata standards • Current issues, challenges
Albums • 76:14 • Accelerator • I'm Ready • Yellow Kid • Artists • (Smog) • A Tribe Called Quest • The Low End Theory • Genres • Acid • Ambient • 0 54 • 12 18 • Playlists • Ambient Selection • Electro Selection • I Think About You • I Think About You (Geiger mix)
Now Playing Herbst Barbara Morgenstern & Robert Lippok Seasons 1 of 4 00:01:38
Simple metadata describing each mp3 file • Track title • Artist name • Album title • Sequence on album • Genre • Length • Sequence in playlist • Used to find, select, organise, access files
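A minimal sketch of the per-track metadata listed above and how a player might use it to find, select and organise files. The `Track` class, its field names, the helper functions and all sample values are invented for illustration; they are not taken from any real player.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    album: str
    album_seq: int  # sequence on album
    genre: str
    length_s: int   # length in seconds


# Invented sample values, for illustration only.
library = [
    Track("Track B", "Artist X", "Album One", 2, "Ambient", 240),
    Track("Track A", "Artist X", "Album One", 1, "Ambient", 200),
    Track("Track C", "Artist Y", "Album Two", 1, "Acid", 310),
]


def select(tracks: List[Track], **criteria) -> List[Track]:
    """Return the tracks whose fields equal every given criterion."""
    return [
        t for t in tracks
        if all(getattr(t, field) == value for field, value in criteria.items())
    ]


def album_order(tracks: List[Track], album: str) -> List[Track]:
    """Return one album's tracks in their album sequence."""
    return sorted(select(tracks, album=album), key=lambda t: t.album_seq)


titles = [t.title for t in album_order(library, "Album One")]
```

The same few fields support both discovery (`select` by genre or artist) and organisation (`album_order`), which is the point the slide makes.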
"Similar" "Fans" "Tags"
[Diagram: flow of mp3 files (♫) and metadata (M) between services. Labels: MP3 Shop 1, MP3 Shop 2, Transfer, MBTagger, Gracenote CDDB, MusicBrainz, Player, Audioscrobbler, Last.fm, (Other)]
The mp3 example • Track metadata obtained from network services • supplied by users • Metadata embedded in mp3 file (ID3) • Extracted/indexed by desktop mp3 player, portable mp3 player • discovery, management • Used in "play" metadata posted to network services • basis for statistics, recommendation services, "collaborative filtering"
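To make "metadata embedded in mp3 file (ID3)" concrete, here is a sketch that reads the oldest and simplest variant, ID3v1, which occupies the last 128 bytes of the file. This is an assumption about which ID3 version is in play: ID3v2 tags sit at the start of the file and use a different, frame-based layout that this sketch does not handle. The function names are invented, and `build_id3v1` exists only to produce test data.

```python
from typing import Dict, Optional


def parse_id3v1(data: bytes) -> Optional[Dict[str, str]]:
    """Extract title, artist, album and year from a trailing ID3v1 tag."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":  # ID3v1 tags start with the marker "TAG"
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width, padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }


def build_id3v1(title: str, artist: str, album: str, year: str) -> bytes:
    """Build a 128-byte ID3v1 tag (helper for the example only)."""
    def pad(value: str, width: int) -> bytes:
        return value.encode("latin-1")[:width].ljust(width, b"\x00")

    # marker + title + artist + album + year + comment + genre byte
    return (b"TAG" + pad(title, 30) + pad(artist, 30) + pad(album, 30)
            + pad(year, 4) + b"\x00" * 30 + b"\xff")


# Placeholder bytes stand in for the audio data.
fake_mp3 = b"\x00" * 16 + build_id3v1(
    "Herbst", "Barbara Morgenstern & Robert Lippok", "Seasons", "")
tag = parse_id3v1(fake_mp3)
```

A player would index the returned dictionary for discovery and management; a file without the `TAG` marker simply yields `None`.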
The mp3 example • Metadata about different types of resources • Tracks, albums, artists, "plays", people…. • Metadata obtained from various sources • Created by different agents • Metadata moving between different applications/services • Metadata supporting multiple functions • Effective (re)use of metadata • minimal user effort • "making (meta)data work harder" (Lorcan Dempsey)
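One way "play" metadata might be made to work harder is a simple co-occurrence count: which other tracks were played by the people who played a given track. This is only a toy sketch of the idea behind collaborative filtering, not a description of how Last.fm or any other service works; the user names, track identifiers and function name are all invented.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Invented "play" metadata: user -> set of track identifiers played.
plays: Dict[str, Set[str]] = {
    "user_a": {"t1", "t2", "t3"},
    "user_b": {"t1", "t2"},
    "user_c": {"t2", "t4"},
}


def also_played(plays: Dict[str, Set[str]], track: str) -> List[Tuple[str, int]]:
    """Rank other tracks by how many listeners of `track` also played them."""
    counts: Counter = Counter()
    for tracks in plays.values():
        if track in tracks:
            counts.update(tracks - {track})
    return counts.most_common()


suggestions = also_played(plays, "t1")
```

No user is asked for anything extra here: the recommendation is derived entirely from metadata that was already being collected for another purpose.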
What is metadata? Some simple definitions • ‘Structured data about data’. • Dublin Core Metadata Initiative FAQ, 2005 • http://dublincore.org/resources/faq/ • ‘Machine-understandable information about Web resources or other things’. • Tim Berners-Lee, W3C, 1997 • http://www.w3.org/DesignIssues/Metadata
"Web resources or other things" • Metadata might be "about"… anything! • HTML documents • digital images • databases • books • museum objects • archival records • metadata records • Web sites • collections • services • physical places • people • organisations • “works” • formats • concepts • events
What is metadata? Towards a "functional" view • Data associated with objects which relieves their potential users of having to have full advance knowledge of their existence or characteristics. • Lorcan Dempsey & Rachel Heery, "Metadata: a current view of practice and issues", 1998 • http://www.ukoln.ac.uk/metadata/publications/jdmetadata/
What is metadata? Towards a "functional" view • Structured data about resources that can be used to help support a wide range of operations. • Michael Day, "Metadata in a Nutshell", 2001 • http://www.ukoln.ac.uk/metadata/publications/nutshell/
What might metadata "say"? What is this called? What is this about? Who made this? When was this made? Where do I get (a copy of) this? When does this expire? What format does this use? Who is this intended for? What does this cost? Can I copy this? Can I modify this? What are the component parts of this? What else refers to this? What did "users" think of this? (etc!)
What operations/functions? • resource disclosure & discovery • resource retrieval, use • resource management, including preservation • verification of authenticity • intellectual property rights management • commerce • content-rating • authentication and authorisation • personalisation and localisation of services • (etc!)
What operations/functions? • Different functions : different metadata • Metadata (and metadata standards) sometimes classified according to function • Descriptive: primarily for discovery, retrieval • Administrative: primarily for management • Structural: relationships between component parts of resources • Contextual: relationships between resources • No “one size fits all solution”!
Where is metadata? Metadata embedded in resource • [Diagram: Resource1 containing Creator = J Smith, Date = 2001-11-05, Title = Report] • e.g. ID3 metadata in MP3; meta elements in HTML docs; TEI header; summary properties in word processor docs; IPTC, EXIF data in image formats • Can resource support embedding of metadata? • Does metadata creator have write access to resource? • Can metadata consumer extract embedded metadata? • What happens when resource deleted? • Metadata about aggregates of resources? • Metadata about people, places, concepts?
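The question "can metadata consumer extract embedded metadata?" can be illustrated for the HTML case with the standard library's `html.parser`. The sample document and the `DC.`-prefixed names in it are assumptions made for the example (they follow a common convention for Dublin Core in HTML), and `MetaExtractor` is an invented class name.

```python
from html.parser import HTMLParser
from typing import Dict, List


class MetaExtractor(HTMLParser):
    """Collect name/content pairs from <meta> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.metadata: Dict[str, List[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            # A name may repeat (e.g. several creators), so keep a list.
            self.metadata.setdefault(name, []).append(content)


# Invented sample document reusing the values from the slide.
document = """<html><head><title>Report</title>
<meta name="DC.creator" content="J Smith">
<meta name="DC.date" content="2001-11-05">
<meta name="DC.title" content="Report">
</head><body><p>Body text</p></body></html>"""

extractor = MetaExtractor()
extractor.feed(document)
embedded = extractor.metadata
```

Extraction only works while the resource exists and is readable, which is one of the limitations of embedding that the slide raises.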
Where is metadata? Metadata record as separate object: record identifier embedded in resource • [Diagram: Metadata rec 1 (Creator = J Smith, Date = 2001-11-05, Title = Report); Resource1 containing Metadata rec = 1] • e.g. link rel="meta" elements in HTML docs • Metadata record may be remote from resource • Can resource support embedding of link? • Does metadata creator have write access to resource? • Can metadata consumer extract link to metadata record? • What happens when resource deleted? • Metadata about aggregates of resources? • Metadata about people, places, concepts?
Where is metadata? Metadata record as separate object: resource identifier in metadata record • [Diagram: Metadata rec 1 (Doc = 1, Creator = J Smith, Date = 2001-11-05, Title = Report); Resource1] • e.g. (lots!) • Metadata record may be remote from resource • Does not require embedding of metadata or link • Does not require metadata creator to have write access to resource • Metadata record created independently of resource, possibly multiple records • Metadata consumer uses metadata records independently of resource • Metadata record may persist after resource deleted • Metadata record can describe anything (with identifier…)
Metadata as managed resource • Metadata • may be used independently of resource • may grow/change independently of resource • may be used in different subsets, multiple formats • may be the subject of metadata! • requires management • Metadata typically stored in some form of database, repository • Exposed/exported as required
Metadata as managed resource
Doc | Creator | Date | Title
1 | J Smith | 2001-11-05 | Report
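A sketch of the "stored in a database, exposed/exported as required" idea, using an in-memory SQLite table with the same columns as the slide. The table name, the `export_record` function and the choice of JSON as an export format are assumptions for illustration.

```python
import json
import sqlite3
from typing import List, Optional

FIELDS = ("doc", "creator", "date", "title")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (doc TEXT, creator TEXT, date TEXT, title TEXT)")
conn.execute(
    "INSERT INTO metadata VALUES (?, ?, ?, ?)",
    ("1", "J Smith", "2001-11-05", "Report"),
)


def export_record(conn: sqlite3.Connection, doc: str,
                  fields: List[str]) -> Optional[str]:
    """Export a chosen subset of one record's fields as JSON."""
    unknown = set(fields) - set(FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Column names are checked against FIELDS above; the value is bound.
    columns = ", ".join(fields)
    row = conn.execute(
        f"SELECT {columns} FROM metadata WHERE doc = ?", (doc,)
    ).fetchone()
    if row is None:
        return None
    return json.dumps(dict(zip(fields, row)), sort_keys=True)


subset = export_record(conn, "1", ["creator", "title"])
```

The record is identified by the resource identifier in the `doc` column, so it can be queried, updated or exported in different subsets without touching the resource itself.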
Who/what creates metadata? • Information professionals ("cataloguers") • Resource creators • Resource managers • Resource distributors/publishers • Indexing/abstracting services (and similar) • Resource users • Software applications • Probably others I've forgotten…
User-created metadata • Growing interest in user-created metadata • user annotation, ratings, comments, "reviews" • e.g. Amazon, OCLC OpenWorldCat • "tagging", folksonomy • e.g. Flickr, del.icio.us • Capture user perceptions of resources • Capture user knowledge of resources • Questions of authority, accuracy, trust, etc
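A small sketch of how user tagging might be aggregated into a per-resource tag profile. It does not reflect how Flickr or del.icio.us actually process tags; the sample data, the normalisation rule (lower-casing and collapsing whitespace) and the function names are assumptions.

```python
from collections import Counter
from typing import Iterable, Tuple

# Invented (user, resource, tag) triples.
taggings = [
    ("user1", "photo-1", "London"),
    ("user2", "photo-1", "london "),
    ("user2", "photo-1", "archive"),
    ("user3", "photo-2", "archive"),
    ("user1", "photo-1", "London"),  # repeated tagging by the same user
]


def normalise(tag: str) -> str:
    """Lower-case a tag and collapse surrounding/internal whitespace."""
    return " ".join(tag.lower().split())


def tag_counts(taggings: Iterable[Tuple[str, str, str]], resource: str) -> Counter:
    """Count how many distinct users applied each tag to a resource."""
    seen = {
        (user, normalise(tag))
        for user, res, tag in taggings
        if res == resource
    }
    return Counter(tag for _, tag in seen)


profile = tag_counts(taggings, "photo-1")
```

Counting distinct users rather than raw taggings is one crude response to the questions of authority and trust: a tag applied by many people carries more weight than one repeated by a single user.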
Application-captured/generated metadata • Human metadata creation costs time/effort/money • "experts" cost even more! • Software applications can obtain metadata from • operating system, Web server etc • size, MIME types etc • resource itself • email headers etc • metadata created by authoring applications (e.g. MS Word) • automated analysis of resource content (e.g. citation analysis, keyword extraction, automated classification) • usage records, transaction logs • e.g. people who bought/used/played this also bought these • "joining up" metadata from different sources
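As a sketch of metadata that software can capture at no human cost, the following reads size and modification time from the operating system and guesses a MIME type from the file name. The `file_metadata` function and its output keys are invented; the temporary file exists only to make the example self-contained.

```python
import mimetypes
import os
import tempfile
from datetime import datetime, timezone
from typing import Dict, Optional, Union


def file_metadata(path: str) -> Dict[str, Union[str, int, None]]:
    """Gather metadata the operating system already holds about a file."""
    info = os.stat(path)
    mime_type: Optional[str] = mimetypes.guess_type(path)[0]  # from extension
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": modified.isoformat(),
        "mime_type": mime_type,
    }


with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "report.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("hello")
    captured = file_metadata(sample)
```

Metadata captured this way is cheap but shallow: it says nothing about what the resource is about, which is where content analysis or human description comes in.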
Metadata standards • Typically defined by "resource management communities" • Different traditions, perspectives, functional requirements • Typically comprise • A "conceptual model" (sometimes not explicit) • A set of named components ("terms", "elements" etc) and documentation on their meaning and use • A specification of how to represent a metadata instance in a digital format (binding)
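The distinction between a set of named terms and a binding can be sketched by serialising one description in two ways. The XML binding below borrows three element names and the namespace URI of the Dublin Core element set; the `record` wrapper element, the plain-text format and the function names are assumptions made for the example, not part of any standard.

```python
import xml.etree.ElementTree as ET
from typing import Dict

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# One description: term names mapped to values (values from the slides).
description = {"creator": "J Smith", "date": "2001-11-05", "title": "Report"}


def to_xml(record: Dict[str, str]) -> str:
    """Bind the description to XML using namespaced elements."""
    root = ET.Element("record")  # hypothetical wrapper element
    for term, value in record.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{term}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


def to_text(record: Dict[str, str]) -> str:
    """Bind the same description to simple 'term = value' lines."""
    return "\n".join(f"{term} = {value}" for term, value in record.items())


xml_binding = to_xml(description)
text_binding = to_text(description)
```

Both outputs carry the same terms and values; only the representation differs, which is why a standard usually specifies the binding separately from the meaning of its terms.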
Bibliographic Metadata standards • Machine-Readable Catalogue (MARC) • primary library cataloguing standard • supports discovery and management of library resources • maintained by Library of Congress • Metadata Object Description Schema (MODS) • represents subset of MARC • XML Schema • maintained by Library of Congress • ONIX • information provided by publishers to retailers • some use of ONIX to enhance library catalogue records • maintained by EDItEUR/Book Industry Communication
Archival/Records Management Metadata standards • ISAD(G) • not in itself machine-processable? • but used as basis of database schemas in e.g. CALM • Encoded Archival Description (EAD) • metadata about archival records (and aggregations of records) • may include some metadata about organisations, individuals • Encoded Archival Context (EAC) • metadata about organisations, individuals • Records Management Metadata e.g. • National Archives ERMS Metadata Standard
Museum Metadata standards • SPECTRUM • Museum documentation standard • Describes • Procedures • Information requirements ("units of information") • Metadata about objects, events, agents etc • CIMI XML Schema for SPECTRUM • Maintained by mda
Image Metadata standards • VRA Core • "works of visual culture as well as the images that document them" • Image as visual representation of Work • maintained by Visual Resources Association • NISO Data Dictionary of Technical Metadata for Digital Still Images • To facilitate technical interoperability, also management, curation/preservation • Encoded/serialised using MIX XML Schema
Government Metadata standards • UK e-Government Metadata Standard • based on Dublin Core • also incorporates components from NA ERMS • specifies constraints on values e.g. Integrated Public Sector Vocabulary • primarily to support resource discovery, retrieval/access, some records management • eGMS v3.0 provides large set of terms • in practice, deployed in subsets
Learning Metadata standards • IEEE Learning Object Metadata (LOM) • To support the disclosure/discovery and use/reuse of "learning objects" • UK LOM Core as "application profile" of LOM • IMS Specifications • Learner Information Profile (people) • Learning Design (learning activities etc) • Enterprise (groups/classes etc) • Resource List Interoperability (reading lists etc) • etc!
Multimedia Metadata standards • MPEG-7 • to describe the content of audio-video streams • "making audio-visual material as searchable as text" • designed to be incorporated into the production process • create metadata at various stages • extensible through the use of a Description Definition Language (DDL) • metadata may be embedded in resource or located separately
Metadata standards & interoperability • Standardisation (mainly) within communities/domains… • … but on the Web • resources/metadata moving between/across "communities" • services operating on metadata from multiple "communities"
Metadata standards & interoperability • How to minimise costly, complex, lossy mappings/translations? • The "railroad gauge dilemma" • (Stuart Weibel, "Border Crossings", D-Lib, Jul 2005) • How to maximise effective reuse of existing metadata? • How to realise aspirations to extensibility, modularity? • Does the W3C's Resource Description Framework (RDF) offer a solution?
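Why mappings between standards tend to be lossy can be shown with a toy crosswalk. The source field names, the target term names and the mapping itself are invented for illustration and do not represent any published crosswalk; the sample record values are also made up.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from a music-player schema to a generic schema.
CROSSWALK = {
    "track_title": "title",
    "artist_name": "creator",
    "genre": "subject",
}


def crosswalk(record: Dict[str, str],
              mapping: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Translate a record, returning the result and the fields that were lost."""
    mapped: Dict[str, str] = {}
    dropped: List[str] = []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            dropped.append(field)  # no equivalent term in the target schema
        else:
            mapped[target] = value
    return mapped, sorted(dropped)


source = {
    "track_title": "Track A",
    "artist_name": "Artist X",
    "genre": "Ambient",
    "album_sequence": "1",
    "length": "00:03:20",
}
translated, lost = crosswalk(source, CROSSWALK)
```

Reporting the dropped fields makes the cost of the translation visible: anything the target schema has no term for simply disappears, and cannot be recovered by mapping back.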
Summary • Metadata is used almost everywhere • Metadata enables people and software applications to do things • Not only about "discovery" • Different functions require different metadata • Metadata creation is potentially costly • Clarify functional requirements • Exploit existing sources • Many metadata standards established/emerging • But challenges remain in working across standards, using standards in combination