Cataloguing the Romanian Cultural Heritage, or yet another schema for heritage assets
Dan Matei (CIMEC)
ELAG 2006. Matei: PML
A new schema: why?
• (CIMEC) the difficulty of managing many databases with overlapping content;
• limitations of the MARC formats and of the non-standard (but too simple) museum metadata;
• new insights via FRBR and CRM;
• the need for a data model for the (future) Romanian Shared Catalogue.
Why is MARC not enough?
ELAG 2004 (Trondheim), WS 10:
• the "1 to 1 principle" is not observed (i.e. metadata about several resources sit in the same record);
• it is not flexible enough: it is almost flat, allowing only 2 (let's say 3) hierarchical levels, with no good control of the granularity of data;
• some tags (e.g. those for the headings) express two different things: a) the nature of the related resource, b) the kind of relationship;
• it does not (naturally) allow multilingual data within a record (for the fields with values in the language of cataloguing).
Functions of the catalogue
FRBR: the Frankfurt Principles (2003) [the Buenos Aires version (2004)] wording:
... to enable a user:
• to find bibliographic resources in a collection (real or virtual) as the result of a search using attributes or relationships of the resources:
    • to locate a single resource
    • to locate sets of resources
• to identify a bibliographic resource or agent;
• to select a bibliographic resource that is appropriate to the user's needs;
• to acquire or obtain access to an item described;
• to navigate a catalogue.
(Extra) functional requirements for the shared catalogues
• FR1: language neutrality, i.e. the textual elements can be expressed in several languages, and the language of an element can be detected automatically;
• FR2: traceability of changes, i.e. modifications can be tracked, dated and attributed (and thus reversed);
• FR3: opinion neutrality, i.e. different opinions can coexist in the metadata: elements can have alternative values, with clearly assigned intellectual responsibilities.
PML = Panizzi Markup Language
Sir Anthony Panizzi (1797-1879)
• chief librarian of the British Museum (1856 – 1867);
• the famous 91 cataloguing rules (1839).
Other XML-based formats
• MARCXML (LC) – 2003;
• MODS: Metadata Object Description Schema (LC) – 2005;
• MADS: Metadata Authority Description Schema (LC) – 2005;
• BiblioML (French Ministry of Culture) – 1999;
• rdfs:frbr (Stefan Gradmann) – 2005.
PML: "design principles"
• P1: a data model based on FRBR & CRM, i.e. accommodating both library and museum resources;
• P2: to observe the three (extra) functional requirements for the shared catalogues;
• P3: to enhance the (lexicographic and chronological) browsing of the access points;
• P4: to make the simple easy and the complex possible (corollary: to accommodate a scalable granularity of data);
• P5: descriptions may include "elements not required for the stated objectives" (i.e. only half of Svenonius' "Principle of sufficiency and necessity").
Two (contradictory) ambitions
• a) to have specific elements for the frequent resources (e.g. books, articles, paintings, coins), but also generic ones for the many less frequent types of resources (e.g. artifact); a new element is imposed by the specific mixture of the resource's properties;
• b) to come up with an elegant language (i.e. with economy of means).
a) is much easier than b)!
PML: outline (the catalog)
<catalog ...>
  <records>
    <book guid="g1"...>
    <coin guid="g2" ...>
    ...
  </records>
  <cataloguers>
    <cataloguer guid="g3"...>
    ...
  </cataloguers>
  <archive>
    <replacedElement replacedBy="g4"...>
    ...
  </archive>
</catalog>
PML: outline (the vocabulary)
<vocabulary>
  <vocabularyClass name="languages">
    <term canonical="English">
      <version languageRef="Romanian">engleză</version>
      <version languageRef="English">English</version>
    </term>
    <term canonical="Romanian">
      <version languageRef="Romanian">română</version>
      <version languageRef="English">Romanian</version>
    </term>
    ......
  </vocabularyClass>
  .....
</vocabulary>
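The vocabulary outline suggests how FR1 (language neutrality) could work in practice: a record stores only the canonical term, and the display label is looked up per language. The following is a minimal sketch of such a lookup in Python, not part of PML itself. The function name `display_term` and the fallback to the canonical form when no version exists are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The vocabulary fragment from the slide, with the elided parts left out.
VOCABULARY_XML = """
<vocabulary>
  <vocabularyClass name="languages">
    <term canonical="English">
      <version languageRef="Romanian">engleză</version>
      <version languageRef="English">English</version>
    </term>
    <term canonical="Romanian">
      <version languageRef="Romanian">română</version>
      <version languageRef="English">Romanian</version>
    </term>
  </vocabularyClass>
</vocabulary>
"""


def display_term(root, class_name, canonical, language):
    """Return the label of a canonical term in the requested language.

    Hypothetical helper: falls back to the canonical form when the term
    or the requested language version is missing (an assumed policy).
    """
    for vocabulary_class in root.iter("vocabularyClass"):
        if vocabulary_class.get("name") != class_name:
            continue
        for term in vocabulary_class.iter("term"):
            if term.get("canonical") != canonical:
                continue
            for version in term.iter("version"):
                if version.get("languageRef") == language:
                    return version.text
    return canonical


root = ET.fromstring(VOCABULARY_XML)
print(display_term(root, "languages", "Romanian", "Romanian"))
print(display_term(root, "languages", "English", "Romanian"))
```

With this arrangement an element such as `<languageRef>Romanian</languageRef>` in a record stays unchanged whatever the interface language is; only the lookup result differs.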
PML: a sample
<catalog>
  <records>
    <book cataloguerId="dm" guid="b001" timestamp="2006-03-11">
      <titlePage>
        <responsibility>Gellu Naum</responsibility><br/>
        <title>Zenobia</title><br/>
        <publisher>Humanitas</publisher><br/>
        <publishingPlace>Bucureşti</publishingPlace>
      </titlePage>
      <publication>
        <place><statement>Bucureşti</statement></place>
      </publication>
      <ISBN-10><number>973-50-0324-4</number></ISBN-10>
      <language><languageRef>Romanian</languageRef></language>
      <responsibility main="true" doubtful="false">
        <targetId>p1</targetId><typeRef>author</typeRef>
      </responsibility>
    </book>
PML: a sample (cont.)
    <person guid="p1" timestamp="2006" cataloguerId="dm">
      <appelation>
        <signature>
          <name typeRef="real name" guid="gn">
            <segment guid="g" classRef="first name">Gellu</segment>
            <segment guid="n" classRef="last name">Naum</segment>
          </name>
          <version languageRef="English">
            <qualifier>
              <segment>Romanian poet and playwright</segment>
            </qualifier>
          </version>
          <dates guid="t" type="life">
            <segment guid="y">1915</segment>-2001
          </dates>
        </signature>
        <indexEntry>
          <alphaKey1 ref="n"/><alphaKey2 ref="g"/><dateKey2 ref="y"/>
        </indexEntry>
      </appelation>
    </person>
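In the person sample, the `indexEntry` holds no text of its own: each key points, via `ref`, at the `guid` of a segment in the signature. A rough Python sketch of how such references might be resolved into sortable key values follows. It assumes that these short guids are unique within one record, which the slides do not state, and `resolve_index_entry` is a hypothetical helper name. The multilingual qualifier is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# The person record from the slide (the English qualifier is left out).
PERSON_XML = """
<person guid="p1" timestamp="2006" cataloguerId="dm">
  <appelation>
    <signature>
      <name typeRef="real name" guid="gn">
        <segment guid="g" classRef="first name">Gellu</segment>
        <segment guid="n" classRef="last name">Naum</segment>
      </name>
      <dates guid="t" type="life">
        <segment guid="y">1915</segment>-2001
      </dates>
    </signature>
    <indexEntry>
      <alphaKey1 ref="n"/><alphaKey2 ref="g"/><dateKey2 ref="y"/>
    </indexEntry>
  </appelation>
</person>
"""


def resolve_index_entry(person):
    """Map each index key name to the text of the segment it references.

    Assumption: guids are unique within the record, so one dictionary
    per record is enough to follow the ref attributes.
    """
    by_guid = {el.get("guid"): el for el in person.iter() if el.get("guid")}
    keys = {}
    for key in person.find("./appelation/indexEntry"):
        target = by_guid[key.get("ref")]
        keys[key.tag] = (target.text or "").strip()
    return keys


person = ET.fromstring(PERSON_XML)
index_keys = resolve_index_entry(person)
print(index_keys)
```

Because the keys are references, correcting a name segment would change the index key as well, without a second copy of the text to maintain.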
PML: updates
<note guid="111">
  <version languageRef="French">
    <update cataloguerId="dm" timestamp="2006-04-21">
      <deleted>ancien</deleted>
      <inserted>nouveau</inserted>
    </update>en francais
  </version>
  <version languageRef="English">
    in English
  </version>
</note>
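The `update` element is what makes FR2 (traceability of changes) possible: the old and the new wording sit side by side, dated and attributed. One possible reading of the sample, sketched below in Python, is that the current text takes the `inserted` value and the earlier text takes the `deleted` value. That interpretation, the helper name `render`, and the joining of fragments with a single space are all assumptions, since the slides do not define the rendering rules.

```python
import xml.etree.ElementTree as ET

# The note from the slide, written without layout whitespace.
NOTE_XML = (
    '<note guid="111">'
    '<version languageRef="French">'
    '<update cataloguerId="dm" timestamp="2006-04-21">'
    "<deleted>ancien</deleted><inserted>nouveau</inserted>"
    "</update>en francais"
    "</version>"
    '<version languageRef="English">in English</version>'
    "</note>"
)


def render(version, use="inserted"):
    """Flatten a version element to plain text.

    use="inserted" gives the current wording, use="deleted" the wording
    before the update (assumed semantics). Fragments are joined with one
    space, which is an assumption about whitespace handling.
    """
    parts = [version.text or ""]
    for child in version:
        if child.tag == "update":
            parts.append(child.findtext(use, default=""))
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return " ".join(p.strip() for p in parts if p.strip())


note = ET.fromstring(NOTE_XML)
french = note.find("version[@languageRef='French']")
english = note.find("version[@languageRef='English']")
print(render(french))             # current wording
print(render(french, "deleted"))  # wording before the update
print(render(english))
```

Reversing a change then amounts to rendering with `use="deleted"`, and the `cataloguerId` and `timestamp` attributes identify who made the change and when.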
Abstractions
Items
The index: problems (1)
The index: problems (2)
The index: problems (3)
The index: keys
• alphaKey1
• alphaKey1Type
• numKey1
• dateKey1
• dateKey1Precision
• alphaKey2
• numKey2
• dateKey2
• dateKey2Precision
The index: keys types/ranks
The index: date precision
PML-based database: a suggestion
Tables:
• resources:
    • guid,
    • XML document;
• relations:
    • guid,
    • sourceId,
    • targetId,
    • XML document;
• index:
    • keys
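The three tables suggested above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under several assumptions: the column types, the `resourceId` column linking an index row to its resource, and the table name `index_entries` (chosen because `INDEX` is an SQL keyword) are not in the slides. The key columns follow the list of index keys given earlier.

```python
import sqlite3

# Hypothetical schema: only the table and key names come from the slides.
SCHEMA = """
CREATE TABLE resources (
    guid TEXT PRIMARY KEY,
    xml  TEXT NOT NULL              -- the PML document of the resource
);
CREATE TABLE relations (
    guid     TEXT PRIMARY KEY,
    sourceId TEXT NOT NULL REFERENCES resources(guid),
    targetId TEXT NOT NULL REFERENCES resources(guid),
    xml      TEXT NOT NULL          -- the PML document of the relation
);
CREATE TABLE index_entries (
    resourceId TEXT NOT NULL REFERENCES resources(guid),  -- assumed link
    alphaKey1 TEXT, alphaKey1Type TEXT,
    numKey1 REAL, dateKey1 TEXT, dateKey1Precision TEXT,
    alphaKey2 TEXT, numKey2 REAL,
    dateKey2 TEXT, dateKey2Precision TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Stub documents standing in for the book and person of the sample.
conn.execute("INSERT INTO resources VALUES (?, ?)", ("b001", "<book guid='b001'/>"))
conn.execute("INSERT INTO resources VALUES (?, ?)", ("p1", "<person guid='p1'/>"))
conn.execute(
    "INSERT INTO relations VALUES (?, ?, ?, ?)",
    ("r1", "b001", "p1", "<responsibility><typeRef>author</typeRef></responsibility>"),
)
conn.execute(
    "INSERT INTO index_entries (resourceId, alphaKey1, alphaKey2, dateKey2) "
    "VALUES (?, ?, ?, ?)",
    ("p1", "Naum", "Gellu", "1915"),
)

# Browse the index by its first key, then follow relations back to the
# resources that point at the matching entry.
rows = conn.execute(
    """
    SELECT r.sourceId
    FROM index_entries AS i
    JOIN relations AS r ON r.targetId = i.resourceId
    WHERE i.alphaKey1 = ?
    ORDER BY i.alphaKey1, i.alphaKey2
    """,
    ("Naum",),
).fetchall()
print(rows)
```

Keeping the XML documents opaque and putting only the keys in relational columns means that browsing and sorting need no XML parsing, while the full description stays in one place.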
Doubts and open problems
• how to handle multiple views (interfaces)?
• how to handle an "original", i.e. an object which is work, expression, manifestation and item at once (e.g. Mona Lisa)?
• how to handle a concept which is also a UDC class (e.g. 'hysteria')?
• is it a sound approach?