1. Multilingual Subject Access to Catalogues of National Libraries (MSAC) Czech Republic’s collaboration with Croatia, Latvia, Lithuania, Macedonia, Slovakia, Slovenia Marie Balíková
National Library of the Czech Republic
Marie.Balikova@nkp.cz
2. introduction
the aim is to provide users with an authorized indexing and retrieval tool for multilingual subject searching in the online environment
the initiative complies with the main goals currently defined by IFLA for the activity of the Indexing and Classification Section:
Changing Roles of Subject Access Tools (Berlin)
Implementation and Adaptation of Global Tools for Subject Access to Local Needs (Buenos Aires)
Cataloguing and Subject Tools for Global Access: International Partnerships (Oslo)
3. CZENAS - MSAC
Czech National Subject Authority File/CZENAS
cooperative venture of three large libraries in Czechia:
National Library of the Czech Republic
Moravian Library in Brno
Research Library in Olomouc
Multilingual Subject Access to Catalogues of National Libraries/MSAC
joint initiative of seven national libraries:
National and University Library, Zagreb, Croatia
National Library of the Czech Republic, Prague
National Library of Latvia, Riga
Martynas Mažvydas National Library of Lithuania, Vilnius
National and University Library St. Kliment Ohridski, Skopje, Macedonia
Slovak National Library in Martin, Slovakia
National and University Library, Ljubljana, Slovenia
4. factors affecting subject indexing
the standardization of the subject retrieval process and of indexing and classification tools, which
minimizes duplication of work in sharing information
supports the shared cataloguing process at national and international level
the possibility of interoperability among different indexing and classification schemes, which consists in
intellectual mapping between terms in different controlled vocabularies
using a switching language as an intermediary for moving among equivalent terms in different vocabularies, above all multilingual ones
the possibility of increasing precision and recall through the Z39.50 protocol and its profiles, and of applying authority control wherever possible in all databases searched, introducing the same subject search criteria in both remote and local databases
5. multilingualism in the online environment is a complex issue
users may want
to search a multilingual collection by using queries in one language or
to retrieve documents in a number of specific languages
to use an interface in the language of their choice
solution: the users are provided with the language support they need
possible limits:
technologies
language skills of the staff
financial means
Therefore, there have been only a few attempts to create a multilingual subject access tool or to integrate already existing library systems in the area of multilingual subject access
6. subject analysis process in online environment
to prefer a post-coordinated indexing system
to simplify application syntax in subject headings strings
to support conceptual compatibility of indexing formulas/preferred terms used in various indexing languages
to support harmonisation between various indexing languages
to support mapping between verbal terms and equivalent notations of classification scheme
to improve subject access for OPACs and for Web resources
7. UDC classification system in on-line environment
can enhance subject access, because it
provides context to search terms
covers all subjects
improves subject access to large databases using sophisticated methods
enables language independent notations to be linked to search terms of various verbal languages
enables other languages to be joined later without the need to classify the resources again
could serve as a switching language, a mapping mediator that ensures convertibility between information languages
supports very detailed expressions of complex subjects using a variety of common and special auxiliaries, specific symbols and punctuation
is more flexible than other universal classification schemes
indicates entities which occur in more than one domain (class)
8. examples
Heading water
UDC 546.212 (inorganic chemistry)
UDC 556-032.2 (hydrology)
UDC 628.1.03 (water management)
Heading incest
UDC 316.835.2 (sociology)
UDC 343.542.5 (criminal law)
UDC 616.89-008.442.38 (psychiatry)
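The two example headings can be sketched as a small lookup in which the UDC notation supplies the context for an otherwise ambiguous search term. The table name `HEADING_TO_UDC` and the function `notations_for` are hypothetical and used only for illustration; the data are the notations listed on this slide.

```python
# Data copied from the two example headings on this slide.
HEADING_TO_UDC = {
    "water": [
        ("546.212", "inorganic chemistry"),
        ("556-032.2", "hydrology"),
        ("628.1.03", "water management"),
    ],
    "incest": [
        ("316.835.2", "sociology"),
        ("343.542.5", "criminal law"),
        ("616.89-008.442.38", "psychiatry"),
    ],
}


def notations_for(heading, domain=None):
    """Return the UDC notations for a heading, optionally limited to one domain."""
    entries = HEADING_TO_UDC.get(heading.lower(), [])
    return [udc for udc, field in entries if domain is None or field == domain]


# All contexts of the heading, then only the hydrology context.
print(notations_for("water"))
print(notations_for("water", domain="hydrology"))
```

Choosing a domain narrows one verbal term to a single notation, which is the sense in which the classification provides context to search terms.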
9. MSAC and UDC
the UDC system proved to be the most suitable for creating a multilingual common indexing tool
all the participating libraries used it, even if in different versions
in MSAC, UDC is applied as an enumerative classification, with functionality very similar to that of DDC
UDC numbers, both single and complex (pre-combined), are treated as single numbers
present revisions of UDC - more faceted structure
frequent need to use combination of numbers like 821 Literature and 94 History
number 821 for literature has to be combined with the common auxiliary for language, e.g. 821.162.3 Czech literature
class number captions (descriptions) added to the retrieval system and available for search in the end-user interface – most effective and user-friendly
in MSAC system UDC class numbers are used alongside their descriptions
10. examples
602.44 -- biotransformation / biotransformace
602.6 -- gene engineering / genové inženýrství
602.6 -- genetic engineering / genetické inženýrství
602.6 -- transgenosis / transgenoze
602.641 -- viral vectors / virové vektory
602.7 -- cloning / klonování
604.4 -- secondary metabolites / sekundární metabolity
604.6 -- genetically modified organisms / geneticky modifikované organismy
608.1 -- bioethics / bioetika
608.3 -- biological safety / biologická bezpečnost
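A minimal sketch of how class numbers used alongside their captions could be searched from an end-user interface. The names `CAPTIONS`, `search_captions` and `captions_for` are hypothetical; the rows are the examples from this slide, and the simple substring match is an assumption, not a description of the MSAC retrieval system.

```python
# (UDC number, English caption, Czech caption), taken from the examples above.
CAPTIONS = [
    ("602.44", "biotransformation", "biotransformace"),
    ("602.6", "gene engineering", "genové inženýrství"),
    ("602.6", "genetic engineering", "genetické inženýrství"),
    ("602.6", "transgenosis", "transgenoze"),
    ("602.641", "viral vectors", "virové vektory"),
    ("602.7", "cloning", "klonování"),
    ("604.4", "secondary metabolites", "sekundární metabolity"),
    ("604.6", "genetically modified organisms", "geneticky modifikované organismy"),
    ("608.1", "bioethics", "bioetika"),
    ("608.3", "biological safety", "biologická bezpečnost"),
]


def search_captions(query):
    """Return rows whose English or Czech caption contains the query text."""
    q = query.casefold()
    return [row for row in CAPTIONS
            if q in row[1].casefold() or q in row[2].casefold()]


def captions_for(udc):
    """Return every (English, Czech) caption pair attached to one class number."""
    return [(en, cs) for number, en, cs in CAPTIONS if number == udc]


for number, en, cs in search_captions("klon"):
    print(number, en, cs)
```

Because several captions can share one number (602.6 above), a caption search returns class numbers that can then be used as the actual search key.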
11. citation order / UDC MRF in electronic form
Citation order
UDC offers the facility to adapt the citation order to local requirements
international exchange of information demands consistency in building UDC class numbers
the same citation order should be adopted
UDC MRF in electronic form
national language versions of UDC MRF in electronic form have not been prepared yet
the language equivalents of controlled terms created by participating libraries are being added to the Czech Subject Authority file
12. Czech National Subject Authority File - CZENAS
integrated indexing and retrieval tool in which verbal controlled terms are being linked to UDC equivalent notations
respecting IFLA recommendation - to consider possible relationships between subject authority records and classification
respecting LC practice
topical authority file - thesaurus in which the following kinds of relationships between terms are defined:
equivalence (expressed: USE)
hierarchy (expressed: BT-Broader term; NT-Narrower term)
association (expressed: RT-Related term)
Czech authority file of topical terms - base for multilingual controlled vocabulary
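The three kinds of relationship named above (equivalence, hierarchy, association) can be sketched as a small data structure. The classes `TermRecord` and `Thesaurus` are hypothetical, and the sample relationships are invented for illustration only; they are not taken from CZENAS.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    preferred: str
    used_for: list = field(default_factory=list)   # non-preferred terms (USE)
    broader: list = field(default_factory=list)    # BT
    narrower: list = field(default_factory=list)   # NT
    related: list = field(default_factory=list)    # RT


class Thesaurus:
    def __init__(self):
        self.records = {}   # preferred term -> TermRecord
        self.use = {}       # non-preferred term -> preferred term

    def add(self, record):
        self.records[record.preferred] = record
        for variant in record.used_for:
            self.use[variant] = record.preferred

    def resolve(self, term):
        """Follow a USE reference; return the preferred term, or None if unknown."""
        if term in self.records:
            return term
        return self.use.get(term)


# Hypothetical record, for illustration only.
thesaurus = Thesaurus()
thesaurus.add(TermRecord(
    preferred="genetic engineering",
    used_for=["gene engineering"],
    broader=["biotechnology"],
    narrower=["cloning"],
    related=["bioethics"],
))
print(thesaurus.resolve("gene engineering"))
```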
13. formats
MSAC supports both UNIMARC and MARC 21
UNIMARC: Croatia, Lithuania
Comarc (based on UNIMARC): Slovenia, Macedonia
MARC 21: Latvia, Slovakia, Czechia
intention - to respect MARC formats as much as possible, but in view of specific needs identified, some extensions and corrections have to be introduced
fields for entering combinations of language variants and UDC notations extended by
subfield “b” (UDC equivalent notation)
subfield “c” (UDC qualifier)
UNIMARC - tag 450: subfields a, b, c
MARC 21 - tag 750: subfields a, b, c
the MARC 21 Format for Authority Data had to be extended by a special field 089 for entering the UDC number
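A rough sketch of the extended linking field described above, with subfield a for the language variant, b for the UDC equivalent notation and c for the UDC qualifier. The class `LinkingField`, the helper `linking_tag` and the `$a ... $b ...` display form are assumptions made for illustration; indicators are omitted, and reading the qualifier as the subject domain is also an assumption.

```python
from dataclasses import dataclass

# Tag used for the language-variant field in each format, as listed on this slide.
LINKING_TAGS = {"UNIMARC": "450", "MARC 21": "750"}


def linking_tag(marc_format):
    """Return the tag of the linking field for the given format."""
    return LINKING_TAGS[marc_format]


@dataclass
class LinkingField:
    tag: str
    a: str        # language variant of the term
    b: str        # UDC equivalent notation
    c: str = ""   # UDC qualifier (assumed here to be the subject domain)

    def render(self):
        """Render the field in a human-readable display form (not ISO 2709)."""
        parts = [f"$a {self.a}", f"$b {self.b}"]
        if self.c:
            parts.append(f"$c {self.c}")
        return f"{self.tag} " + " ".join(parts)


field_750 = LinkingField(linking_tag("MARC 21"), "death", "616-036.88", "medicine")
print(field_750.render())
```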
14. English equivalents/approval process
English equivalents of preferred terms, mostly LCSH terms, are being chosen
If LCSH equivalents are not found (LC terms being too broad), reference sources such as the LC titles and subtitles file, encyclopedias, manuals, language vocabularies, web pages and full-text databases are consulted.
Approval process:
Proposals of preferred terms linked to UDC class numbers and English equivalents are sent to the editorial staff for approval; the approved authority records are then entered into the authority database via a special programme procedure.
15. mapping process
is done intellectually
consists in establishing equivalents between the subject controlled terms used in indexing systems of participating libraries through a switching language
switching language: UDC notations based on UDC MRF and English equivalents
mapping links are defined between preferred terms represented by isolated lexical units only
subject heading strings as a whole are excluded and are not mapped
authority records as a whole are excluded and are not mapped
links are established only between topical main headings (main entries), UDC numbers and language equivalents
16. combination of verbal expressions – UDC notations
simple combination
one verbal expression is mapped to one simple UDC notation
painting / malířství – UDC 75
one verbal expression is mapped to one compound/complex UDC notation
medical law / medicínské právo – UDC 34:61
history of law / právní dějiny – UDC 34(091)
Anglo-American law / angloamerické právo – UDC 34(410+73)
complex combination
one verbal expression is mapped to multiple UDC notations
death/smrt – UDC equivalent 128 (metaphysics)
death/smrt – UDC equivalent 2-186 (theological anthropology)
death/smrt – UDC equivalent 233-186 (Hinduism)
death/smrt – UDC equivalent 393 (ethnography)
death/smrt – UDC equivalent 616-036.88 (medicine)
one UDC notation is mapped to multiple verbal expressions
34 -- law / právo * laws / zákony * legal aspects / právní aspekty * legal regulations / právní předpisy
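The simple and complex combinations above can be sketched as a table of links between UDC notations and language equivalents, with UDC acting as the switching language. The names `LINKS`, `build_indexes` and `switch` are hypothetical; the rows are examples from this slide, and the structure is an illustration rather than the MSAC implementation.

```python
from collections import defaultdict

# (UDC notation, language code, term), taken from the examples on this slide.
LINKS = [
    ("75", "en", "painting"), ("75", "cs", "malířství"),
    ("34:61", "en", "medical law"), ("34:61", "cs", "medicínské právo"),
    ("34", "en", "law"), ("34", "cs", "právo"),
    ("34", "en", "laws"), ("34", "cs", "zákony"),
    ("128", "en", "death"), ("128", "cs", "smrt"),
    ("2-186", "en", "death"), ("2-186", "cs", "smrt"),
    ("233-186", "en", "death"), ("233-186", "cs", "smrt"),
    ("393", "en", "death"), ("393", "cs", "smrt"),
    ("616-036.88", "en", "death"), ("616-036.88", "cs", "smrt"),
]


def build_indexes(links):
    """Index the links by (language, term) and by UDC notation."""
    by_term = defaultdict(list)
    by_notation = defaultdict(lambda: defaultdict(list))
    for udc, lang, term in links:
        if udc not in by_term[(lang, term)]:
            by_term[(lang, term)].append(udc)
        by_notation[udc][lang].append(term)
    return by_term, by_notation


def switch(term, source_lang, target_lang, by_term, by_notation):
    """Map a term to its target-language equivalents through UDC notations."""
    result = {}
    for udc in by_term.get((source_lang, term), []):
        result[udc] = list(by_notation[udc].get(target_lang, []))
    return result


by_term, by_notation = build_indexes(LINKS)
print(switch("malířství", "cs", "en", by_term, by_notation))
print(switch("law", "en", "cs", by_term, by_notation))
```

In this layout, adding a further language means adding rows with a new language code; the notations and the existing rows stay untouched.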
17. MSAC indexes
Topical terms – Multilingual
Topical terms – Czech
Topical terms – English
Topical terms – Croatian
Topical terms – Latvian
Topical terms – Lithuanian
Topical terms – Macedonian
Topical terms – Slovak
Topical terms – Slovenian
UDC
Subject fields: Astronomy, Demography, Law, Politics, Sociology, Sport, Theater
22. MSAC – two phases
phase 1:
development of Czech topical authority file
integration of language variants of participating libraries in Czech subject authority file
phase 2:
combinations of UDC-natural languages and English expressions to be inserted into the special fields of respective bibliographic records of cooperating libraries
process: semiautomatic - intellectual checking of data
access via Z39.50 protocol or
small testing database (created at the NL CR)
after completing the authorization and authentication procedure, users are offered
access via one single interface in the UIG
both Czech and English interfaces and both Czech and English languages for searching
23. Uniform Information Gateway (UIG)
allows uniform and easy access to both traditional and electronic resources (local and remote)
developed by the Czech National Library and Charles University
offers an extended services feature (SFX) - navigation from the source to other related targets is possible
basic software: MetaLib and SFX
MetaLib - parallel browser
enables searching of catalogues, full texts, databases and archives
is not limited to any predefined interfaces
uses Z39.50 for communication
SFX is a context-sensitive linking system between Web resources that provides and coordinates cooperation between resources and targets
RESOURCE - entity through which we have just made a search
TARGET - entity where the service is being provided
OpenURL is a mechanism that makes open linking in the Web-based information environment possible
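As a loose illustration of the OpenURL idea, the link passed from a RESOURCE to a link resolver can be thought of as a resolver address followed by citation metadata in the query string. The resolver address, the function `build_openurl` and the sample metadata below are all hypothetical; the key names follow the style of early OpenURL query strings and should be checked against the resolver actually used.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_openurl(resolver_base, metadata):
    """Append citation metadata to a resolver address as a query string."""
    return resolver_base + "?" + urlencode(metadata)


# Hypothetical resolver address and placeholder metadata, for illustration only.
url = build_openurl(
    "https://resolver.example.org/sfx",
    {"genre": "book", "title": "Example title", "date": "2005"},
)
print(url)

# The resolver (the service pointing to TARGETs) can recover the metadata again.
print(parse_qs(urlsplit(url).query))
```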
24. MSAC and retrieval process in UIG
MetaLib
enables rephrasing of queries into a format that is appropriate for the resource selected
sends the queries and receives answers (results)
transforms them into its own format and outputs them
offers and performs deduplication of selected documents
enables personalized elements - My Resource List of selected databases: Czech Authority Database and those of cooperating libraries (since October 2005)
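The retrieval steps listed above (send the query to each selected resource, collect the answers, convert them to one format, deduplicate) can be sketched as follows. This is not MetaLib code: the function names, the in-memory sources, the placeholder records and the title-plus-year deduplication key are all assumptions, and the sources are queried one after another rather than in parallel.

```python
def make_source(records):
    """Build a toy search function over an in-memory list of records."""
    def search(query):
        q = query.casefold()
        return [r for r in records if q in r["subject"].casefold()]
    return search


def dedup_key(record):
    """Key used to treat two records as the same document (an assumption)."""
    return (record["title"].casefold().strip(), record.get("year"))


def federated_search(query, sources):
    """Query every source, tag each hit with its origin, and drop duplicates."""
    merged, seen = [], set()
    for name, search in sources.items():
        for record in search(query):
            key = dedup_key(record)
            if key in seen:
                continue
            seen.add(key)
            merged.append({**record, "source": name})
    return merged


# Placeholder records, for illustration only.
sources = {
    "library_a": make_source([
        {"title": "Example Title A", "year": 2004, "subject": "law"},
        {"title": "Example Title B", "year": 2005, "subject": "painting"},
    ]),
    "library_b": make_source([
        {"title": "example title a ", "year": 2004, "subject": "Law"},
        {"title": "Example Title C", "year": 2003, "subject": "law"},
    ]),
}
for hit in federated_search("law", sources):
    print(hit["source"], hit["title"])
```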
25. MSAC: application in UIG
31. future development
the idea of creating a multilingual subject retrieval tool, or of introducing a mapping scheme into existing systems, is considered an essential element of The European Library service
MSAC project – beginning phase
problems:
only voluntary work of teams of participating libraries
communication almost only via e-mails
no external financial support
new perspective:
joining the TEL-ME-MOR project (The European Library: Modular Extensions for Mediating Online Resources), funded by the European Commission under the Sixth Framework Programme of the Information Society Technologies (IST) Programme, to which the ten new member states of the European Union have been invited
integration with the MACS project?
32. Multilingual Access to Subjects (MACS) project
goal - to integrate the most developed and most widely used subject indexing systems: LCSH, RAMEAU and SWD
the feasibility of linking these Subject Heading Languages was investigated
the approach of creating links between LCSH, RAMEAU and SWD/RSWK was tested in the fields of sport and theatre
a prototype was created
ways to extend the use of the MACS project have been discussed
crossing the language barrier
adding new subject indexing systems
or investigating the use of other tools such as classifications if the same are available across several institutions
demanding significant resources
33. comparison of MACS and MSAC
MACS - fully functional prototype :
MSAC - first stage of a multilingual initiative
MACS - linking existing verbal Subject Heading Languages :
MSAC - creating a multilingual retrieval system based on UDC
MACS - all SHLs of equal status, no pivot language :
MSAC - switching language UDC and English equivalents
MACS and MSAC
common nouns (in MSAC also special named entities such as Washington Declaration, 1918, October 18th)
only headings judged to be synonymous in meaning are mapped as equivalent headings
only preferred forms mapped, hierarchical structures and thesaural relationships not mapped
syntactical structures of subject headings strings not mapped
34. Multilingual subject access
A challenge
Thank you for your attention
MSAC: http://sigma.nkp.cz/eng/auv
CZENAS: http://sigma.nkp.cz/eng/aut
JIB: http://www.jib.cz/