CIG Conference, University of Strathclyde, Glasgow, 3-5 September 2008. Multilingualism and subject heading languages: how the MACS project is providing multilingual subject access in Europe. Patrice Landry, Head of Indexing and Classification Swiss National Library patrice.landry@nb.admin.ch.
Overview of the presentation • New challenges of subject access • Subject access in a networked environment: issues and initiatives • MACS: • Standards to the rescue & linking manual • What was done recently? • Search interface • Other issues under consideration • Conclusion
New Challenges of subject access: traditional approach of libraries • Most libraries are committed to providing controlled vocabulary access to large collections of printed documents • Standard subject indexing tools are still the best means of providing quality access • Most national libraries and research libraries are using current standard indexing tools
Coping with extended mandate • Libraries are working hard to integrate a great quantity of electronic documents while maintaining their bibliographic access commitment to printed document collections • Full-text access to electronic collections represents a new challenge for researchers needing in-depth access to documents in all subjects • Pressure toward a “keyword approach” to subject access
How to cope with the old and the new • Libraries will continue to respond to scholars needing in-depth access to documents in all subjects • Libraries will continue to invest in developing new means of ensuring subject access through computer assisted indexing • Libraries are looking at expanding the use of subject access tools through interoperability projects • Establishing interoperability between subject indexing tools may be one of the solutions to maintain or improve access to expanded collections
Subject access issues in networked environments • Many different types of controlled subject vocabularies used for access to resources in various networks (subject headings, thesauri, classification schemes and ontologies) and in different languages • Subject queries across databases or networks limited by heterogeneous language environment • Lack of interoperability between subject indexing tools limits access and use of libraries’ catalogues and databases
Interoperability: some basic approaches • Merging (integration) of thesauri: UNESCO, TermSciences • Co-occurrence: LCSH/CSH = RVM • Translation: LCSH → RVM • Automatic translation: Dandelon • Adaptation / derivation: RVM → RAMEAU; LCSH → FAST • Mapping: between similar languages (MACS, LCSH-ES) or between different types (OCLC’s LCSH/DDC, CrissCross) • Switching: using an existing language (DDC): HILT, or (UDC): MSAC
Categories of subject access interoperability projects and initiatives • Between subject headings: MACS, LCSH-ES, RVM • Between thesauri: Merimee, UMLS Metathesaurus, TermSciences • Between subject headings and classification (UDC): MSAC • Between subject headings and classification (DDC): OCLC’s WebDewey, CrissCross • Between various types of subject indexing tools: HILT
Overview of multilingual interoperability projects that focus only on subject headings • Not many interoperability projects use only subject headings • Linking subject headings is challenging: methodological constraints, linguistic resources and long-term commitment need to be resolved • The best known are: MACS, LCSH-ES, RVM-LCSH/CSH
What is MACS? • A project that is developing a system that offers multilingual subject access using current subject heading languages (SHLs) • A project that is based on a coordinated approach between national libraries • A system that will permit users to search library catalogues in the language of their choice
Why the MACS approach? • In 1997, the need to find a « neutral » solution for linking SHLs forced some national libraries to find a solution not based on translation • Approach to add value to existing metadata instead of creating new data (value added data) • Linking work and management outside of each library’s authority files • Info at: http://macs.cenl.org
Partners (on behalf of CENL, Conference of European National Librarians): 3 SHLs • The British Library (Library of Congress): LCSH (Library of Congress Subject Headings), English • Bibliothèque nationale de France: RAMEAU (Répertoire d’Autorité-Matière Encyclopédique et Alphabétique Unifié), French • Deutsche Nationalbibliothek: SWD/RSWK (Schlagwortnormdatei / Regeln für den Schlagwortkatalog), German • Swiss National Library (project leader): SWD/RSWK, German
Basic principles • Equality of languages and SHLs (no pivot) with autonomy of each SHL (only local, MACS is an external link database) • Establishment of equivalences (no translation) between the SHLs involved (no new thesaurus) • Equivalence links conceived as concept clusters • MACS = mappings and numeric identifiers • Consistency of results (goal = user retrieval) • Extensible to other SHLs
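A minimal sketch of the "concept cluster" idea above, assuming a cluster is simply a numeric identifier with the authorised headings of each SHL attached to it. The class name, method names and the identifier value are hypothetical and chosen for illustration; this is not the actual MACS database schema.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptCluster:
    """One link: a numeric identifier plus the authorised headings per SHL."""
    cluster_id: int
    headings: dict = field(default_factory=dict)  # SHL code -> list of headings

    def add(self, shl: str, heading: str) -> None:
        # Each SHL contributes its own authorised form; nothing is translated
        # and no SHL acts as a pivot for the others.
        self.headings.setdefault(shl, []).append(heading)

    def equivalents(self, shl: str) -> list:
        # Return a copy so callers cannot modify the cluster by accident.
        return list(self.headings.get(shl, []))


# Identifier 1 is illustrative; the headings come from the linking manual example.
cluster = ConceptCluster(cluster_id=1)
cluster.add("LCSH", "Theology")
cluster.add("RAMEAU", "Théologie")
cluster.add("SWD", "Theologie")
```

Because the cluster only stores references to headings, adding a further SHL means adding one more key, which matches the "extensible to other SHLs" principle.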
Subject searching in a MONOLINGUAL environment? [Diagram: documents in Italian, French, German and English; the German indexer assigns headings in the German SH only; the Italian, French and English SHs are not connected to it (?); an English-speaking user has to search in German]
Subject indexing and searching in a MULTILINGUAL environment [Diagram: documents in Italian, French, German and English; the German indexer assigns headings in the German SH, which is linked to the Italian, French and English SHs; an English-speaking user can search in English]
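The multilingual scenario can be sketched as a query expansion step: a heading entered in the user's SHL is looked up in the link store and replaced by the equivalent headings of the other SHLs before each catalogue is searched. The following is a rough illustration only; the link identifiers, function names and in-memory dictionaries are assumptions, not the TEL search interface.

```python
# Hypothetical link store: numeric link identifier -> headings per SHL.
LINKS = {
    101: {"LCSH": ["Sprinting"], "SWD": ["Kurzstreckenlauf"],
          "RAMEAU": ["Course de vitesse"]},
    102: {"LCSH": ["Theology"], "SWD": ["Theologie"],
          "RAMEAU": ["Théologie"]},
}


def build_index(links):
    """Map (SHL, case-folded heading) to the identifiers of links containing it."""
    index = {}
    for link_id, by_shl in links.items():
        for shl, headings in by_shl.items():
            for heading in headings:
                index.setdefault((shl, heading.casefold()), set()).add(link_id)
    return index


def expand_query(term, source_shl, links, index):
    """Return the equivalent headings in every other SHL for a source heading."""
    result = {}
    for link_id in sorted(index.get((source_shl, term.casefold()), ())):
        for shl, headings in links[link_id].items():
            if shl == source_shl:
                continue
            bucket = result.setdefault(shl, [])
            for heading in headings:
                if heading not in bucket:  # avoid duplicates across links
                    bucket.append(heading)
    return result


INDEX = build_index(LINKS)
expanded = expand_query("sprinting", "LCSH", LINKS, INDEX)
```

Here `expanded` holds the SWD and RAMEAU headings that a German or French catalogue would be searched with; a heading that has no link yet simply yields an empty result.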
Milestones • Proposal & feasibility study (1997-1999) • Prototype development (2000-2001) • Testing & Link Management Interface (LMI) upgrade to production database (2002-April 2004) • New Link Management Interface production database accepted by partners (2005) • New project proposal: June 2005 (revised August 2006) • Move to production: adding SWD headings to RAMEAU-LCSH links (2007) (SNL and DNB) • Integration in The European Library: tests in 2007, search interface development in August 2008
Display of links (display is according to the partner’s SHL, i.e. the source language)
Display of edit function where work on adding or modifying a link is done
All terms in links are authorised headings from authority records
Link strategy • Each partner works from its own SHL (used as source language) • Links to target languages: LCSH or RAMEAU • Already 102’300 RAMEAU-LCSH links (from the RAMEAU authority file, mostly derived from the Quebec Répertoire de vedettes-matière)
Linking Work Using SWD • Work officially started in March 2007 at the SNL (MACS task is part of the indexing workload with annual individual goals) • 0.75 FTE of indexing staff resources used for MACS • Production in the first 12 months: 8’350 (total in the LMI 13’500 – August 2008); production target for 2009: +7’000 links with SWD • SNL has approximately 30’000 SWD topical headings in its Helveticat database – multilingual access to its collection should be completed by 2010 • DNB – In the process of hiring 6 staff members to work on MACS links (work to start this autumn)
MACS linking manual : a necessary condition • A manual for link creation is required • The only existing methodological considerations available are from the final report of the feasibility studies (1999) • Need to adjust the MACS approach in a networked environment (the MACS approach was initially developed in a closed environment – list of terms selected in a few domains in 3 SHLs)
Standards to the rescue • Development of a new standard: BRITISH STANDARD BS 8723-4:2007 Structured vocabularies for information retrieval — Guide. Part 4: Interoperability between vocabularies • Part 4 of the BS deals with all subject heading languages (not limited to thesauri as for ISO 5964) • Also: ANSI/NISO Z39.19-2005 Guidelines for the construction, format and management of monolingual controlled vocabularies (Chapter 10)
Important elements of the BS8723 • Recognizes that linking can be done between linguistically different subject heading languages (SHLs), that have different semantic structures and application principles • Deals with pre or post coordination SHLs • Presents several linking scenarios • Gives directives on creating and validating concordances • Methodology based on the notion of linking from a source language to target languages
Impact of the BS8723, Part 4 on MACS • Validates an approach that was developed without relevant standards support (ISO 5964 was the only one available in 1999) • The MACS approach can be further refined, in particular the work organisation of each partner library (notion of source language) • Supports the creation of complex links, for example « one to many », « many to one »
Example of the MACS linking manual (1) Types of links / levels of equivalence • One-to-one: exact equivalence - Exact equivalence at the linguistic level: Theology / Théologie / Theologie - Exact equivalence at the semantic level: Sprinting / Kurzstreckenlauf / Course de vitesse - Exact equivalence at the subject headings level (indexing): Track-athletics--Coaches / Leichtathletiktrainer / Athlétisme + Entraîneurs
Example of the MACS linking manual (2) • One-to-two: partial equivalence (semantic level) - Using UF (use for): Coureurs / Runners (Sports) / Läufer; Coureurs / Long distance runners / Langstreckenläufer - According to scope note: Sprinting / Kurzstreckenlauf / Course de vitesse; Sprinting / Vierhundertmeterlauf / Course de vitesse - Using BT (broader term) / NT (narrower term): Jumping / Sprung / Sauts (athlétisme); Jumping / Hochsprung / Saut en hauteur
MACS linking manual (3) Types of links / levels of equivalence that were not discussed in 1999 • One-to-two: partial equivalences (linguistic level)? • One-to-many: partial equivalences (linguistic level)? The LCSH are generally broader (less specific) than SWD and RAMEAU • One-to-many: partial equivalences (semantic level)? For example in the area of music
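The link types in the manual examples can be illustrated by counting how many target headings a source heading is linked to. This is a sketch under stated assumptions: the function name and the pair list are hypothetical, the pairs are taken from the LCSH and SWD examples above, and cardinality alone does not decide the level of equivalence (exact or partial), which remains an indexer's judgement.

```python
from collections import defaultdict


def link_cardinality(pairs):
    """Label each source heading by the number of distinct targets linked to it."""
    targets = defaultdict(set)
    for source, target in pairs:
        targets[source].add(target)

    labels = {}
    for source, found in targets.items():
        count = len(found)
        if count == 1:
            labels[source] = "one-to-one"
        elif count == 2:
            labels[source] = "one-to-two"
        else:
            labels[source] = "one-to-many"
    return labels


# (LCSH source heading, SWD target heading) pairs from the manual examples.
LCSH_TO_SWD = [
    ("Theology", "Theologie"),
    ("Sprinting", "Kurzstreckenlauf"),
    ("Sprinting", "Vierhundertmeterlauf"),
    ("Jumping", "Sprung"),
    ("Jumping", "Hochsprung"),
]

labels = link_cardinality(LCSH_TO_SWD)
```

A report of this kind could help quality control spot source headings that have accumulated many targets, which is where partial equivalences are most likely.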
What was done recently? • Library of Congress agreement to load the LCSH in the LMI • Loading all of the LCSH, RAMEAU and SWD subject headings in the LMI with monthly updating • Improving some functionalities in the LMI (displaying the SHL in prescribed order, features to improve quality control, LMI maintenance and features for TEL) • Search interface prototype developed by TEL (The European Library)
Other issues under consideration? Extension to other subject heading languages • Tests conducted using the Italian “Nuovo Soggettario” in 2007 (new SHL standard in Italy) • Contains about 20’000 headings • Italian is one of the national languages of Switzerland • Tests conducted using about 500 headings in a few domains
Future plans? • Extending MACS to other types of subject headings (geographical, corporate and name headings) • Exploring non-manual linking methods (SKOS) • Extending MACS to more than 4 SHLs
Conclusion • MACS has reached the critical phase of production (link creation and maintenance) • MACS can be expanded to other SHLs (no limits) • Search interface is still a critical issue in MACS • MACS will continue to be a CENL project with international partners (e.g. Library of Congress)
THANK YOU / MERCI / DANKE Questions?