Learn about metadata, structured data that is encountered in everyday life, and its use in various contexts such as libraries, museums, and online shopping sites. Explore different metadata formats like MARC, ONIX, Dublin Core, and RSLP Collection Description.
First steps in metadata
Ann Chapman, Policy and Advice team, UKOLN
What is metadata? • Structured data about something • Encountered every day • bus & rail timetables • phone directories • Internet shopping sites (e.g. Amazon) • ingredient lists on food items • calendars (public holidays, religious festivals) • event (e.g. seminar, workshop) programme
More about metadata • Structured data about resources • Library catalogues • Abstracting and indexing services • Archival finding aids • Museum documentation • Collection description • Community information • Carriers • Formats (e.g. MARC) • Markup languages (e.g. HTML, SGML, XML)
Markup languages • SGML = Standard Generalised Markup Language - controls document formatting for publication • XML = Extensible Markup Language - “next generation” SGML • HTML = HyperText Markup Language - an SGML application, controls display of web pages. All use tags (usually paired) to structure text into elements, e.g. headings, paragraphs, lists: <title> </title> <p> </p> <li> </li>
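As a rough illustration of how paired tags divide text into elements, the sketch below uses Python's standard html.parser module to list each element with its text. The class name ElementLister, the function list_elements and the sample markup are invented for this example and are not part of the original slides.

```python
from html.parser import HTMLParser


class ElementLister(HTMLParser):
    """Collect (tag, text) pairs for each element that directly contains text."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open tags
        self.elements = []    # (tag, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open tag.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_tags:
            self.elements.append((self.open_tags[-1], text))


def list_elements(markup):
    parser = ElementLister()
    parser.feed(markup)
    parser.close()
    return parser.elements


# Hypothetical sample: a heading, a paragraph and a list item.
sample = "<h1>First steps in metadata</h1><p>Structured data about something</p><li>MARC</li>"
print(list_elements(sample))
# [('h1', 'First steps in metadata'), ('p', 'Structured data about something'), ('li', 'MARC')]
```

The opening tag names the element, the closing tag ends it, and everything in between is the element's content.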
Overview • MARC • ONIX • Dublin Core & application profiles • RSLP Collection Description • MARC 21 Community Information • Other metadata types
MARC Formats • MAchine-Readable Cataloguing records • Library of Congress, 1960s • Now widespread use in many countries • Catalogue once, use record many times • Holdings can be attached • 1960s: books, serials, maps, music scores • 2006: any physical or digital resource
MARC - structure • Structured format and carrier • Numeric and alpha tags • Fixed fields • Leader, 001-008, 010-099 • Variable fields • 100, 110, 111, 245, 260, etc.
MARC - elements • 1XX Main entry • 2XX Title, Statement of Responsibility, edition, publication • 3XX Physical description • 4XX Series information • 5XX Notes • 6XX Subject access • 7XX Added entries (alternative titles, multiple authors, etc.) • 8XX Added entries for series • 9XX References and local use fields
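The first digit of a MARC tag identifies the block it belongs to, so the list above can be expressed as a simple lookup. This is only a sketch: the names MARC_BLOCKS and marc_block are made up for illustration, and the label used for 0XX tags is taken from the "fixed fields" wording of the structure slide rather than from the list above.

```python
# Block labels keyed by the first digit of the tag, following the slide's list.
MARC_BLOCKS = {
    "1": "Main entry",
    "2": "Title, statement of responsibility, edition, publication",
    "3": "Physical description",
    "4": "Series information",
    "5": "Notes",
    "6": "Subject access",
    "7": "Added entries",
    "8": "Added entries for series",
    "9": "References and local use fields",
}


def marc_block(tag):
    """Return the block description for a three-digit MARC tag such as '245'."""
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError(f"not a three-digit MARC tag: {tag!r}")
    if tag[0] == "0":
        # Assumption: label borrowed from the structure slide (001-008, 010-099).
        return "Fixed fields"
    return MARC_BLOCKS[tag[0]]


print(marc_block("245"))  # Title, statement of responsibility, edition, publication
print(marc_block("650"))  # Subject access
```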
MARC 21 record
020 $a 0761952926
082 $a 338.9 $2 21
100 $a Nederveen Pieterse, Jan P.
245 $a Development theory: $b deconstruction.
260 $a London: $b Sage, $c 2001
300 $a xii, 195p. $c 25cm $e cased
440 $a Theory, culture and society
650 $a Economic development
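Each field in the display above is a three-digit tag followed by subfields, and each subfield starts with "$" and a one-character code. A minimal sketch of splitting one such display line, assuming the value text never contains a "$" itself, might look like this. The function name parse_field is hypothetical, and this handles only the human-readable display form shown on the slide, not the binary MARC exchange format.

```python
def parse_field(line):
    """Split a display line like '260 $a London: $b Sage, $c 2001'
    into its tag and a list of (subfield code, value) pairs."""
    tag, _, rest = line.strip().partition(" ")
    subfields = []
    # The text before the first '$' is empty, so skip the first chunk.
    for chunk in rest.split("$")[1:]:
        code, _, value = chunk.strip().partition(" ")
        subfields.append((code, value.strip()))
    return tag, subfields


print(parse_field("245 $a Development theory: $b deconstruction."))
# ('245', [('a', 'Development theory:'), ('b', 'deconstruction.')])
print(parse_field("260 $a London: $b Sage, $c 2001"))
# ('260', [('a', 'London:'), ('b', 'Sage,'), ('c', '2001')])
```

Note that the punctuation at the end of each subfield (the colon and comma) is part of the stored data, which is why it survives the split.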
ONIX Formats • Primary use • Publishers to Internet booksellers • Rich product information • 3 Formats for product information metadata • Books, Serials, Licensing Terms • ONIX for Books in use: • First version 1999 • Current version release 2.0 (2001) • Carrier – XML • Elements – XML reference name and tag
ONIX - elements • Message header • Product record • identifiers, author, title, edition, language, subject, audience, descriptions, publisher, dates • territorial rights, dimensions, suppliers, availability, promotions • Main series and sub-series records
ONIX for Books - record
<ISBN> 0123456789 </ISBN>
<DistinctiveTitle> Alice in Wonderland </DistinctiveTitle>
<Contributor>
  <ContributorRole> Author </ContributorRole>
  <PersonNameInverted> Carroll, Lewis </PersonNameInverted>
</Contributor>
<Publisher> Collins </Publisher>
<PublicationDate> 2000 </PublicationDate>
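Because the carrier is XML, a record like this can be read with any XML parser once every opening tag has its matching closing tag. The sketch below uses Python's standard xml.etree.ElementTree. It assumes the slide's simplified elements are wrapped in a single Product element; the wrapper, the variable ONIX_SNIPPET and the function summarise_product are illustrative choices, and a real ONIX message has a header and a richer structure than shown here.

```python
import xml.etree.ElementTree as ET

# Simplified record from the slide, wrapped in an assumed <Product> root.
ONIX_SNIPPET = """
<Product>
  <ISBN>0123456789</ISBN>
  <DistinctiveTitle>Alice in Wonderland</DistinctiveTitle>
  <Contributor>
    <ContributorRole>Author</ContributorRole>
    <PersonNameInverted>Carroll, Lewis</PersonNameInverted>
  </Contributor>
  <Publisher>Collins</Publisher>
  <PublicationDate>2000</PublicationDate>
</Product>
"""


def summarise_product(xml_text):
    """Pull a few elements out of a simplified product record."""
    product = ET.fromstring(xml_text)
    return {
        "isbn": product.findtext("ISBN"),
        "title": product.findtext("DistinctiveTitle"),
        # A product may repeat <Contributor>, so collect all of them.
        "contributors": [
            (c.findtext("ContributorRole"), c.findtext("PersonNameInverted"))
            for c in product.findall("Contributor")
        ],
        "publisher": product.findtext("Publisher"),
        "year": product.findtext("PublicationDate"),
    }


summary = summarise_product(ONIX_SNIPPET)
print(summary["title"])         # Alice in Wonderland
print(summary["contributors"])  # [('Author', 'Carroll, Lewis')]
```

An unclosed tag would make ET.fromstring raise a ParseError, which is a quick way to check that a record is well-formed.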
Dublin Core - structure • Simple resource discovery • DCMES – Dublin Core Metadata Element set • HTML the most common ‘carrier’ • Comprises 15 elements with • Element qualifiers • Element encoding schemes • Optional/mandatory elements • Application profiles
Dublin Core - elements • Title • Creator • Subject • Description • Publisher • Contributor • Date • Resource Type • Format • Resource identifier • Source • Language • Relation • Coverage • Rights
Dublin Core - record
<title> Alice in Wonderland </title>
<creator> Lewis Carroll </creator>
<subject><LCSH> Fiction </LCSH></subject>
<publisher> Project Gutenberg </publisher>
<date> 2000 </date>
<format> ASCII file via FTP </format>
<identifier> http://promo.net/pg/… </identifier>
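Since HTML is described as the most common carrier, the same record is often embedded in a web page as meta elements whose names carry a "DC." prefix, with a link element declaring the prefix. The sketch below generates such tags from a plain dictionary. The function name dc_meta_tags and the sample record are invented for this example, and it covers only simple, unqualified elements.

```python
from html import escape


def dc_meta_tags(record):
    """Render a dict of Dublin Core element names and values as HTML head tags."""
    # Declare what the "DC." prefix refers to.
    tags = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for name, value in record.items():
        # escape() protects quotes and angle brackets inside attribute values.
        tags.append(f'<meta name="DC.{name}" content="{escape(value)}">')
    return tags


# Sample values taken from the record on the slide.
record = {
    "title": "Alice in Wonderland",
    "creator": "Lewis Carroll",
    "publisher": "Project Gutenberg",
    "date": "2000",
}
for tag in dc_meta_tags(record):
    print(tag)
# <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
# <meta name="DC.title" content="Alice in Wonderland">
# <meta name="DC.creator" content="Lewis Carroll">
# <meta name="DC.publisher" content="Project Gutenberg">
# <meta name="DC.date" content="2000">
```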
RSLP Collection Description • Schema developed May 2000 for RSLP programme • MS Access database for RSLP – summer 2001 • Web-based implementations: Revealweb, Cornucopia, Backstage, PADDI, MASC25, SCONE, Cecilia, RASCAL • Based on same model: SCONE • General attributes • Subject • Dates • Associated agents • External relationships
Coll. Desc. - elements • General: title, identifier, description, strength, physical characteristics, language, type, access control, accrual status, legal status, custodial history, note, location • Subject: concept, object, name, place, time • Dates: accumulation, contents • Agents: creator, owner • Relationships: sub & super-collections, catalogues and descriptions, associated collections and publications
Coll. Desc. - record
Title: Pitman Collection
Strength: Shorthand – national significance
Phys.Char.: printed texts and manuscripts
Lang: English, Spanish, Esperanto, …
Access: Written request to the Librarian, University of Bath
Accrual: passive, deposit
Location: The Library, University of Bath, Bath
Subject: shorthand, Sir Isaac Pitman, phonetic alphabets
Owner: Pitman Publishing Co.
Catalogue: University of Bath Library OPAC
MARC 21 Community Information • Same principles as MARC 21 Bibliographic • Leader • Individual / organization / program / event / other • Fixed fields • 001-008, 010-099 fixed fields • 007 disability facilities • 008 special aspects • Variable fields
M 21 Comm. Inf. – elements • 1XX Name • 2XX Title and Address • 3XX Physical description • 4XX Series (for events) • 5XX Notes • 6XX Subject access • 7XX Added entries • 8XX Other variable fields
M 21 Comm. Inf. – record
110 $a CILIP
245 $a CILIP HQ
247 $a LA HQ $f 19?? – 2002
270 $a Ridgmount St, London WC1E 7AE $k 020 7255 0505 $m info@cilip.org.uk $r 9am to 6pm
311 $a Ewart Room $d seats 50 $g £100 per day
312 $a Overhead projector $f £10 per day
581 $a Library + Information Update
856 $a http://www.cilip.org.uk
Other metadata formats • IEEE LOM – learning object metadata • EAD – Encoded Archival Description • Theatre Information Group DTD – performance data
Metadata – fit for purpose • MARC 21 Bibliographic – libraries • ONIX – book trade and libraries • Dublin Core – Internet • EAD – archives • Collection description – archives, libraries, museums • M21 Community Information – primarily libraries
Contact details Ann Chapman a.d.chapman@ukoln.ac.uk UKOLN University of Bath, Bath BA2 7AY www.ukoln.ac.uk