How to express MARC in XML
ELAG 2004. Workshop 10 Report
Participants:
• Liv Aasa Holm, JBI-HIO, Norway;
• Christer Larsson, The Royal Library, LIBRIS Department, Sweden;
• Dan Matei, CIMEC - Institute for Cultural Memory, Romania;
• Anne Munkebyaune, BIBSYS, Norway;
• Mona-Lise Pedersen, BIBSYS, Norway;
• Nils Pharo, Oslo University College, Norway.
Why XML?
• XML is (really) useful? vs. XML is (just) fashionable? Useful!
  • more flexible syntax, i.e. it has more “expressive power”;
  • it allows more (and finer) syntactic constraints;
  • it allows (unrestricted) hierarchies in a record;
  • it is here to stay (?);
  • a lot of tools are available.
Problems with MARCs
• not too flexible (too flexible! [Ole]);
• only 2 (or 3?) hierarchical levels;
• some tags express two things: the nature of the related entity and the kind of relationship with the record;
• 1:1 principle not observed (i.e. the records are not “normalized”), but they are “self-contained”!
Aim: to devise an XML-based bibliographic format
• Approach A: “mechanically” express MARC in XML. Already done (by LC): MARCXML, see www.loc.gov/standards/marcxml/
• Approach B: consider the synthesis of authority, bibliographic and holdings MARCs in a new format (i.e. a “bibliographic MARC-up language”?); let’s code-name it MARCX (not Karl..., not ... Brothers!).
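Approach A can be illustrated with a minimal sketch in Python. It uses the element names of LC's MARCXML "slim" schema (record, leader, controlfield, datafield, subfield); the function name `marc_to_xml`, the tuple-based input layout, the placeholder leader and the control number are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Namespace of the MARCXML "slim" schema published by LC.
MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("", MARC_NS)


def marc_to_xml(leader, control_fields, data_fields):
    """Mechanically wrap MARC fields in MARCXML elements.

    control_fields: list of (tag, value)
    data_fields:    list of (tag, ind1, ind2, [(subfield_code, value), ...])
    This input layout is hypothetical, chosen only for the sketch.
    """
    record = ET.Element(f"{{{MARC_NS}}}record")
    ET.SubElement(record, f"{{{MARC_NS}}}leader").text = leader
    for tag, value in control_fields:
        field = ET.SubElement(record, f"{{{MARC_NS}}}controlfield", tag=tag)
        field.text = value
    for tag, ind1, ind2, subfields in data_fields:
        field = ET.SubElement(
            record, f"{{{MARC_NS}}}datafield", tag=tag, ind1=ind1, ind2=ind2
        )
        for code, value in subfields:
            ET.SubElement(field, f"{{{MARC_NS}}}subfield", code=code).text = value
    return record


# Placeholder values, not taken from any real catalogue record.
record = marc_to_xml(
    leader="00000nam a2200000 a 4500",
    control_fields=[("001", "example-0001")],
    data_fields=[("245", "1", "0", [("a", "Romeo and Juliet")])],
)
marcxml = ET.tostring(record, encoding="unicode")
print(marcxml)
```

The mapping is one-to-one: tags, indicators and subfield codes survive as attributes, so none of the MARC problems listed earlier are addressed. That is the motivation for Approach B.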
Use cases:
• A. internal (database) format;
• B. transportation (serialization) format: “transport scenarios”:
  • to/from union catalogues: “normalized” files, i.e. records of instances of “base” entities + records for their relationships;
  • for presentation (i.e. display): un-normalized, self-contained (MARC-like) records;
  • for ... something else (?): records with “FRBR families” of bibliographic objects, e.g. works with their expressions.
“Integrating” framework: FRBR
• schema (and/or DTD) including types for:
  • works;
  • expressions;
  • manifestations;
  • items;
  • persons;
  • ...
  • concepts;
  • subject headings;
  • relationships.
Relationships: identifiers
• need for unique identifiers within a file;
• need for globally unique identifiers:
  • need for large amounts of unique identifiers, i.e. automatic generation;
  • options: URIs, GUIDs [Globally Unique Identifiers].
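A minimal sketch of automatic identifier generation, assuming Python's standard `uuid` module for the GUID option. Serializing the GUID as a `urn:uuid:` URI is one possible way to cover the URI option as well; the helper names and the local prefix scheme are hypothetical.

```python
import uuid


def new_global_id():
    """Return a randomly generated GUID written as a URI (urn:uuid:...)."""
    return uuid.uuid4().urn


def new_local_ids(prefix, count):
    """Return identifiers that are unique only within one file.

    The prefix-plus-counter scheme is an assumption for this sketch.
    """
    return [f"{prefix}{n}" for n in range(1, count + 1)]


global_ids = [new_global_id() for _ in range(3)]
local_ids = new_local_ids("w", 3)
print(global_ids)
print(local_ids)
```

Random GUIDs need no central registry, which suits the "large amounts" requirement; local identifiers stay short but are meaningful only inside the file that carries them.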
Relationships: options
• reified (Topic Maps like):
    <relation type="sometype">
      <source>id-s</source>
      <target>id-t</target>
    </relation>
• within source:
    <instance ....>
      ...
      <relation type="sometype">id-t</relation>
      ...
    </instance>
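The two options carry the same information, so one can be derived from the other. The sketch below moves "within source" relations out into reified ones. It assumes each `<instance>` carries its identifier in an `id` attribute, which the slide does not specify; the function name `reify` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def reify(instances):
    """Detach <relation> children from each instance and return them
    as standalone reified <relation> elements with <source>/<target>."""
    reified = []
    for instance in instances:
        # Copy the list first, because the instance is modified in the loop.
        for rel in list(instance.findall("relation")):
            standalone = ET.Element("relation", type=rel.get("type"))
            # Assumption: the source identifier is the instance's "id" attribute.
            ET.SubElement(standalone, "source").text = instance.get("id")
            ET.SubElement(standalone, "target").text = (rel.text or "").strip()
            reified.append(standalone)
            instance.remove(rel)
    return reified


work = ET.fromstring(
    '<instance id="id-s"><relation type="sometype">id-t</relation></instance>'
)
relations = reify([work])
print(ET.tostring(relations[0], encoding="unicode"))
```

Reified relations fit the "normalized" transport scenario, where relationships travel as records of their own; the within-source form fits self-contained records.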
Relationships: the “type problem”
• the type as attribute:
    <relation type="author">id</relation>
  vs.
• the type as element:
    <relation>
      <type>author</type>
      <target>id</target>
    </relation>
• which is more convenient for “ontology controlled” types?
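From a consumer's point of view the two encodings are equivalent. The following sketch reads either form and checks the type against a controlled list; the `ALLOWED_TYPES` vocabulary and the function name are invented for illustration and stand in for whatever an ontology would supply.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary; a real one would come from an ontology.
ALLOWED_TYPES = {"author", "translator"}


def read_relation(elem):
    """Return (type, target) for either encoding of a <relation>."""
    rel_type = elem.get("type")
    if rel_type is not None:
        # Type as attribute: the element text is the target identifier.
        target = (elem.text or "").strip()
    else:
        # Type as element: both values are child elements.
        rel_type = (elem.findtext("type") or "").strip()
        target = (elem.findtext("target") or "").strip()
    if rel_type not in ALLOWED_TYPES:
        raise ValueError(f"uncontrolled relation type: {rel_type!r}")
    return rel_type, target


as_attribute = ET.fromstring('<relation type="author">id</relation>')
as_element = ET.fromstring(
    "<relation><type>author</type><target>id</target></relation>"
)
print(read_relation(as_attribute))
print(read_relation(as_element))
```

Here the check is done in application code for both forms. Whether a schema language can enforce the same constraint more easily on attributes or on elements is the open question the slide raises.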
Types/elements: inner structure
• to conserve the MARC blocks? No!
• to re-group data elements by their nature, e.g. ‘title’ and ‘notes on title’;
• to use as many hierarchical levels as necessary (but not more).
Types/elements: general pattern
    <identifiers> ... </identifiers>
    <description> ... </description>
    [<relations> ... </relations>]
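The pattern reads as: identifiers, then description, then an optional relations block. A small sketch of that check, assuming the three blocks are direct children of the entity element and appear in this order (the `<work>` wrapper and the function name are hypothetical):

```python
import xml.etree.ElementTree as ET


def follows_general_pattern(entity):
    """True if the children are identifiers, description and,
    optionally, relations, in that order."""
    names = [child.tag for child in entity]
    return names in (
        ["identifiers", "description"],
        ["identifiers", "description", "relations"],
    )


with_relations = ET.fromstring(
    "<work><identifiers/><description/><relations/></work>"
)
without_relations = ET.fromstring("<work><identifiers/><description/></work>")
wrong_order = ET.fromstring("<work><description/><identifiers/></work>")

print(follows_general_pattern(with_relations))
print(follows_general_pattern(without_relations))
print(follows_general_pattern(wrong_order))
```

In practice such a constraint would be stated in the schema or DTD rather than in code; the function only spells out what the square brackets in the pattern mean.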
“Language independence” (1)
• for multilingual records
• element: “localized text” (ltext), with attributes:
  • language;
  • script;
  • transliteration standard.
• e.g.
    <ltext lang="en" script="latin"> What the hell is going on? </ltext>
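A minimal sketch of building such an element in Python. The attribute names `lang` and `script` follow the example on the slide; the attribute name `transliteration` is an assumption, since the slide names the attribute's purpose but not its spelling.

```python
import xml.etree.ElementTree as ET


def make_ltext(text, lang, script=None, transliteration=None):
    """Build a "localized text" element; optional attributes are
    written only when a value is supplied."""
    attributes = {"lang": lang}
    if script is not None:
        attributes["script"] = script
    if transliteration is not None:
        # Hypothetical attribute name for the transliteration standard.
        attributes["transliteration"] = transliteration
    element = ET.Element("ltext", attributes)
    element.text = text
    return element


example = make_ltext("What the hell is going on?", lang="en", script="latin")
print(ET.tostring(example, encoding="unicode"))
```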
“Language independence” (2)
• cataloguing rule: areas in the language of the material:
    <bibliographicDescription lang="en">
      <title type="proper"> Romeo and Juliet </title>
      <title type="parallel" lang="fr"> Romeo et Juliette </title>
      <title type="translated" lang="de"> Romeo und Julietta </title>
    </bibliographicDescription>
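The example implies that a title without its own `lang` takes the language of the enclosing description. A sketch of that reading, under the assumption that inheritance works this way (the slide does not state the rule explicitly); the function name is hypothetical:

```python
import xml.etree.ElementTree as ET

SOURCE = """
<bibliographicDescription lang="en">
  <title type="proper"> Romeo and Juliet </title>
  <title type="parallel" lang="fr"> Romeo et Juliette </title>
  <title type="translated" lang="de"> Romeo und Julietta </title>
</bibliographicDescription>
"""


def titles_with_language(description):
    """Return (type, effective language, text) for each title.

    A title's own lang attribute wins; otherwise the language of the
    enclosing description is assumed to apply.
    """
    default_lang = description.get("lang")
    return [
        (title.get("type"), title.get("lang", default_lang), (title.text or "").strip())
        for title in description.findall("title")
    ]


titles = titles_with_language(ET.fromstring(SOURCE))
for entry in titles:
    print(entry)
```

Stating the language once on the description and overriding it only where needed keeps records compact, in line with the "less redundancy" goal.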
Conclusions?
• To tag or not to tag? To tag!
• In MARCX:
  • finer (and more controllable) granularity;
  • less redundancy;
  • more compact records;
  • more human-readable records;
  • lots of ready-made tools.
• Another lingua franca?
• “INTERMARC” redivivus!