Demystifying metadata. Ann Chapman UKOLN University of Bath.
Demystifying metadata • Ann Chapman, UKOLN, University of Bath • UKOLN is funded by Resource: The Council for Museums, Archives and Libraries, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.
What is metadata? • Structured data about resources • Library catalogues • Abstracting and indexing services • Archival finding aids • Museum documentation • Community information • Carriers: MARC, HTML, SGML, XML
Markup languages • SGML - Standard Generalised Markup Language • - controls document formatting for publication • XML - Extensible Markup Language • - “next generation” SGML • HTML - HyperText Markup Language • - SGML application, controls display of web pages • Tags (usually paired) structure text into elements, e.g. headings, paragraphs, lists, etc. • <title> </title> <p> </p> <li> </li>
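To illustrate how paired tags divide text into elements, here is a minimal Python sketch using the standard library's `html.parser`. The `ElementCollector` class and the sample fragment are hypothetical, made up for this illustration; the fragment uses a heading, a paragraph, and a list item as on the slide.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collect (tag, text) pairs for each paired element in a fragment."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of [tag, text so far] for open elements
        self.elements = []   # finished (tag, text) pairs, in closing order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append([tag, ""])

    def handle_data(self, data):
        # Text belongs to the innermost element that is still open.
        if self.open_tags:
            self.open_tags[-1][1] += data

    def handle_endtag(self, tag):
        # Only close an element when the end tag matches the open one.
        if self.open_tags and self.open_tags[-1][0] == tag:
            name, text = self.open_tags.pop()
            self.elements.append((name, text.strip()))


# Hypothetical sample fragment (not from the slides).
SAMPLE = (
    "<h1>Demystifying metadata</h1>"
    "<p>Structured data about resources</p>"
    "<li>Library catalogues</li>"
)

collector = ElementCollector()
collector.feed(SAMPLE)
collector.close()
elements = collector.elements

for tag, text in elements:
    print(f"{tag}: {text}")
```

Each opening tag starts an element and the matching closing tag ends it, which is what lets software pick out the heading, the paragraph, and the list item separately.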
MARC - structure • Structured format • Numeric and alpha tags • Fixed fields • Leader, 001-008, 010-099 • Variable fields
MARC – elements • 1XX Main entry • 2XX Title, statement of responsibility (SR), edition, publication • 3XX Physical description • 4XX Series • 5XX Notes • 6XX Subject access • 7XX Added entries • 8XX Added entries for series • 9XX References and local fields
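The tag blocks above amount to a lookup on the first digit of a three-digit tag. A minimal Python sketch, assuming the groupings listed on these slides; the function name `marc_block` and the label used for the 0XX range are illustrative choices, not part of any MARC library.

```python
# Block labels copied from the slide; keys are the first digit of the tag.
MARC_BLOCKS = {
    "0": "Fixed fields (001-008, 010-099)",
    "1": "Main entry",
    "2": "Title, SR, edition, publication",
    "3": "Physical description",
    "4": "Series",
    "5": "Notes",
    "6": "Subject access",
    "7": "Added entries",
    "8": "Added entries for series",
    "9": "References and local fields",
}


def marc_block(tag: str) -> str:
    """Return the block label for a three-digit MARC tag such as '245'."""
    if len(tag) != 3 or not (tag.isascii() and tag.isdigit()):
        raise ValueError(f"not a three-digit MARC tag: {tag!r}")
    return MARC_BLOCKS[tag[0]]


for example in ("100", "245", "650"):
    print(example, "->", marc_block(example))
```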
ONIX - structure • Carrier - XML • Primary use • publishers to Internet booksellers • rich product information • In use • first version 1999 • current version Release 2.0 (2001) • Elements – XML reference name and tag
ONIX - elements • Message header • Product record • identifiers, author, title, edition, language, subject, audience, descriptions, publisher, dates • territorial rights, dimensions, suppliers, availability, promotions • Main series and sub series records
ONIX record • <ISBN> 0123456789 </ISBN> • <DistinctiveTitle> Alice in Wonderland </DistinctiveTitle> • <Contributor> • <ContributorRole> Author </ContributorRole> • <PersonNameInverted> Carroll, Lewis </PersonNameInverted> • </Contributor> • <PublisherName> Collins </PublisherName> • <PublicationDate> 2000 </PublicationDate>
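Because ONIX is carried in XML, a record like this can be read with any XML parser. The sketch below uses Python's `xml.etree.ElementTree`. The enclosing `<Product>` element is an assumption added so the fragment has a single root, and the `summary` dictionary and its keys are illustrative names only.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in an assumed <Product> root element.
ONIX_PRODUCT = """
<Product>
  <ISBN>0123456789</ISBN>
  <DistinctiveTitle>Alice in Wonderland</DistinctiveTitle>
  <Contributor>
    <ContributorRole>Author</ContributorRole>
    <PersonNameInverted>Carroll, Lewis</PersonNameInverted>
  </Contributor>
  <PublisherName>Collins</PublisherName>
  <PublicationDate>2000</PublicationDate>
</Product>
"""

product = ET.fromstring(ONIX_PRODUCT)

# Pull the descriptive elements into a plain dictionary.
summary = {
    "isbn": product.findtext("ISBN"),
    "title": product.findtext("DistinctiveTitle"),
    "publisher": product.findtext("PublisherName"),
    "date": product.findtext("PublicationDate"),
    # A product may carry several contributors, so collect them as a list.
    "contributors": [
        (c.findtext("ContributorRole"), c.findtext("PersonNameInverted"))
        for c in product.findall("Contributor")
    ],
}

print(summary)
```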
Dublin Core - structure • Simple resource discovery • DCMES – Dublin Core Metadata Element Set • HTML the most common ‘carrier’ • Comprises 15 elements with • element qualifiers • element encoding schemes • optional/mandatory elements • Application profiles
Dublin Core - elements • Title • Creator • Subject • Description • Publisher • Contributor • Date • Resource Type • Format • Resource Identifier • Source • Language • Relation • Coverage • Rights
Dublin Core - record • <Title> Alice in Wonderland </Title> • <Creator> Lewis Carroll </Creator> • <Subject> <LCSH> Fiction </LCSH> </Subject> • <Publisher> Project Gutenberg </Publisher> • <Date> 2000 </Date> • <Format> ASCII file via FTP </Format> • <Identifier> http://promo.net/pg/….. </Identifier>
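Since HTML is described as the most common carrier for Dublin Core, the record above could be embedded in a web page as `<meta>` tags. The Python sketch below follows the widely used `DC.element` naming convention with an optional `scheme` attribute for encoding schemes; the helper name `dc_meta_tags` is hypothetical, and the identifier is left out because its URL is truncated on the slide.

```python
from html import escape

# (element, value, encoding scheme or None), taken from the slide's record.
RECORD = [
    ("title", "Alice in Wonderland", None),
    ("creator", "Lewis Carroll", None),
    ("subject", "Fiction", "LCSH"),
    ("publisher", "Project Gutenberg", None),
    ("date", "2000", None),
    ("format", "ASCII file via FTP", None),
]


def dc_meta_tags(record):
    """Render Dublin Core statements as HTML <meta> tags."""
    tags = []
    for element, value, scheme in record:
        scheme_attr = f' scheme="{escape(scheme, quote=True)}"' if scheme else ""
        content = escape(value, quote=True)
        tags.append(f'<meta name="DC.{element}"{scheme_attr} content="{content}">')
    return tags


meta_tags = dc_meta_tags(RECORD)
print("\n".join(meta_tags))
```

The tags would sit inside the `<head>` of the page that the metadata describes.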
Encoded Archival Description • EAD • 1993 project to develop standard for machine-readable finding aids; Version 1 released 1998 • SGML (and XML compliant) • Hierarchical structure of archives • repository, management group, fonds, series, file, item • Possible to embed MARC elements
EAD - structure • <ead> • <eadheader> • </eadheader> • <frontmatter> [optional] • </frontmatter> • <archdesc> • <did> • </did> • </archdesc> • </ead>
EAD - elements • <eadheader> [id + bibliographic inf. for finding aid] • <archdesc> [data on a body of archival materials] <did> [container, physical description, physical location, repository, date and title of unit] <admininfo> [biography, scope, access, arrangement] <controlaccess> [name, place, genre, subject, title] • </archdesc>
EAD record - <eadheader> • <ead> • <eadheader> • <eadid> LKX-3042 </eadid> • <filedesc> <titlestmt> <titleproper> Pitman Shorthand Collection Catalogue </titleproper> <author> Ann Chapman </author> </titlestmt> <publicationstmt> <date> 1990 </date> <publisher> Bath University Library </publisher> </publicationstmt> • </filedesc> </eadheader>
EAD record - <archdesc> • <archdesc level="collection"> • <did> <abstract> A collection of materials in and about shorthand collected by Sir Isaac Pitman and James Pitman </abstract> </did> • <controlaccess> <subject encodinganalog="MARC650"> Shorthand </subject> • </controlaccess> • </archdesc> • </ead>
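Put together, the header and description slides form one hierarchical document. The Python sketch below parses that document with `xml.etree.ElementTree` and reads values from both parts. It assumes that the word "collection" on the slide stands for a `level="collection"` attribute on `<archdesc>`; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The two slide fragments joined into a single well-formed document.
EAD_RECORD = """<ead>
  <eadheader>
    <eadid>LKX-3042</eadid>
    <filedesc>
      <titlestmt>
        <titleproper>Pitman Shorthand Collection Catalogue</titleproper>
        <author>Ann Chapman</author>
      </titlestmt>
      <publicationstmt>
        <date>1990</date>
        <publisher>Bath University Library</publisher>
      </publicationstmt>
    </filedesc>
  </eadheader>
  <archdesc level="collection">
    <did>
      <abstract>A collection of materials in and about shorthand
        collected by Sir Isaac Pitman and James Pitman</abstract>
    </did>
    <controlaccess>
      <subject encodinganalog="MARC650">Shorthand</subject>
    </controlaccess>
  </archdesc>
</ead>"""

root = ET.fromstring(EAD_RECORD)

# Header: information about the finding aid itself.
finding_aid_id = root.findtext("eadheader/eadid")
finding_aid_title = root.findtext("eadheader/filedesc/titlestmt/titleproper")

# Description: information about the archival materials.
level = root.find("archdesc").get("level")

# Access points, each with the MARC field it corresponds to.
access_points = [
    (el.tag, el.get("encodinganalog"), el.text.strip())
    for el in root.find("archdesc/controlaccess")
]

print(finding_aid_id, finding_aid_title, level, access_points)
```

The `encodinganalog` value is what ties the EAD subject term back to its MARC equivalent, which is how MARC elements can be embedded in a finding aid.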
Collection Description • Schema developed May 2000 • Access version for RSLP – summer 2001 • Web version for Reveal – spring 2002 • General attributes • Subject • Dates • Associated agents • External relationships
Coll.Desc. - elements • General: title, identifier, description, strength, physical characteristics, language, type, access control, accrual status, legal status, custodial history, note, location • Subject: concept, object, name, place, time • Dates: accumulation, contents • Agents: creator, owner • Relationships: sub/super collections, catalogues and descriptions, associated collections and publications
Coll. Desc. - record • Title: Pitman Collection • Strength: Shorthand – national collection • Phys. Desc: Printed texts and manuscripts • Lang: English, Spanish, Esperanto, …… • Access: Written request to the Librarian, Bath Univ. • Accrual: passive, deposit • Location: The Library, Bath University, Bath • Subject: Shorthand, Sir Isaac Pitman • Owner: Pitman Publishing Co. • Catalogue: Bath University OPAC
M21 Community Information • Same principles as MARC Bibliographic • Leader: individual/organization/program/event/other • Fixed fields: 001-008, 010-099 • 007 disability facilities • 008 special aspects • Variable fields
M21 Comm. Inf. - elements • 1XX Name • 2XX Title and Address • 3XX Physical description • 4XX Series (for events) • 5XX Notes • 6XX Subject access • 7XX Added entries • 8XX Other variable fields
M21 Comm. Inf. - record • 110 $a CILIP • 245 $a CILIP HQ • 247 $a LA HQ $f 19?? - 2002 • 270 $a 7 Ridgmount St, London, WC1E 7AE $k 020 7255 0505 $m info@cilip.org.uk $r 9am to 6pm • 311 $a Ewart Room $d seats 50 $g £100 per day • 312 $a Overhead projector $f £10 per day • 581 $a Library + Information Update • 856 $a http://www.cilip.org.uk
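The record above is shown in a display form: a three-digit tag followed by `$`-prefixed subfields. A minimal Python sketch of splitting that display form into tags and subfields follows. It handles only this slide notation, not the MARC exchange format, and assumes that `$` never occurs inside a value; `parse_fields` is a hypothetical helper name.

```python
import re

# The slide's record, one field per line.
RECORD = """\
110 $a CILIP
245 $a CILIP HQ
247 $a LA HQ $f 19?? - 2002
270 $a 7 Ridgmount St, London, WC1E 7AE $k 020 7255 0505 $m info@cilip.org.uk $r 9am to 6pm
311 $a Ewart Room $d seats 50 $g £100 per day
312 $a Overhead projector $f £10 per day
581 $a Library + Information Update
856 $a http://www.cilip.org.uk
"""

# A subfield is "$", a one-character code, then text up to the next "$".
SUBFIELD = re.compile(r"\$(\w)\s*([^$]*)")


def parse_fields(text):
    """Return a list of (tag, [(subfield code, value), ...]) tuples."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, _, rest = line.partition(" ")
        subfields = [(code, value.strip()) for code, value in SUBFIELD.findall(rest)]
        fields.append((tag, subfields))
    return fields


fields = parse_fields(RECORD)
for tag, subfields in fields:
    print(tag, subfields)
```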
Metadata – fit for purpose • MARC Bibliographic • ONIX • Dublin Core • EAD • Collection description • M21 Community Information