
FRBR Work Match activities at DBC

FRBR Work Match activities at DBC: where are we and where are we going. Author: Hans-Henrik Lund (hhl@dbc.dk). ELAG 2002, Roma, 17.04.2002.


Presentation Transcript


  1. FRBR Work Match activities at DBC: Where are we and where are we going. Author: Hans-Henrik Lund (hhl@dbc.dk). ELAG 2002, Roma, 17.04.2002

  2. What do we have
     • A record collection of 16.5 million MARC records from 172 different ’libraries’
     • Including:
       • the Danish national bibliography: 1.4 million
       • BNB: 1.3 million
       • LC: 3.3 million
     • All converted to danMARC2

  3. What do we want
     • Make this collection available to the end user as a ”work” collection (and not as a collection of records).
     • We have defined two works as different if the language or the material type is different.

  4. Example

  5. How do we do this
     • We have matched the entire database on an ”edition/manifestation” level (in clusters). If you want the system to handle orders, it is important to maintain the edition level.
     • By making clusters based on manifestation, the logical number of records was reduced from 16.5 to 12.3 million.

  6. From manifestation to work
     • The result of a search is matched, on the fly, on work level (in the test version).
     • An author search for ”Stephen King” yields 362 cataloguing records, 231 manifestations/clusters and 102 works.
     • The benefit of this approach is that we can change the criteria for a ”work” and test them.
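The on-the-fly grouping can be sketched as follows. This is only an illustration under stated assumptions: the hit layout, the field names and the choice of work key (author, title, language, material type) are hypothetical, and the hits are made up. The only rule taken from the slides is that a different language or material type means a different work.

```python
from collections import defaultdict

def group_by_work(hits):
    """Group search hits by an assumed work key.

    Returns a dict mapping each work key to the set of manifestation
    cluster ids that belong to it.
    """
    works = defaultdict(set)
    for hit in hits:
        # Language and material type are part of the key, so a translation
        # or another material type ends up as a separate work.
        key = (hit["author"], hit["title"], hit["language"], hit["material"])
        works[key].add(hit["cluster"])
    return dict(works)

# Made-up hits for illustration only (not DBC data).
hits = [
    {"cluster": 1, "author": "KING STEPHEN", "title": "CARRIE", "language": "eng", "material": "book"},
    {"cluster": 1, "author": "KING STEPHEN", "title": "CARRIE", "language": "eng", "material": "book"},
    {"cluster": 2, "author": "KING STEPHEN", "title": "CARRIE", "language": "eng", "material": "book"},
    {"cluster": 3, "author": "KING STEPHEN", "title": "CARRIE", "language": "dan", "material": "book"},
]
works = group_by_work(hits)
clusters = {h["cluster"] for h in hits}
print(len(hits), "records,", len(clusters), "clusters,", len(works), "works")
# prints: 4 records, 3 clusters, 2 works
```

Because the grouping happens at search time, changing the definition of a work only means changing the key tuple, which fits the slide's point about testing different criteria.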

  7. The match program
     • The match program works in two phases.
     • First it makes a key. This key is like a hash key. The key could be based on the title and/or a known identifier (ISSN, ISBN etc.).
     • Second it takes two records at a time, with the same key, and compares them according to the rules in the match-script.
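A minimal sketch of the two phases, assuming records are plain dictionaries. The functions `make_key` and `same_manifestation` are hypothetical stand-ins: the real key definition and the match-script rules are not given in the slides.

```python
from collections import defaultdict
from itertools import combinations

def make_key(record):
    """Phase 1: build a blocking key (hypothetical definition)."""
    if record.get("isbn"):
        return "ISBN:" + record["isbn"]
    # Fall back to the start of the upper-cased title.
    return "TITLE:" + record["title"].upper()[:20]

def same_manifestation(a, b):
    """Phase 2 stand-in for the match-script rules (assumption)."""
    return a["title"].upper() == b["title"].upper() and a["year"] == b["year"]

def match(records):
    """Compare records pairwise, but only within the same key."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[make_key(rec)].append(rec)
    pairs = []
    for bucket in buckets.values():
        for a, b in combinations(bucket, 2):
            if same_manifestation(a, b):
                pairs.append((a["id"], b["id"]))
    return pairs

# Made-up records; the ISBN is a placeholder, not a real identifier.
records = [
    {"id": 1, "isbn": "0000000000", "title": "Hamlet", "year": 1980},
    {"id": 2, "isbn": "0000000000", "title": "HAMLET", "year": 1980},
    {"id": 3, "isbn": None, "title": "Hamlet", "year": 1980},
]
print(match(records))  # prints: [(1, 2)]
```

Record 3 is never compared with the other two because its key differs. That is the trade-off of a key-based first phase: it keeps the number of pairwise comparisons manageable on a large collection, at the cost of missing matches whose keys disagree.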

  8. Normalization of the text
     • København’s freds kommité → KOBENHAVNS FREDS KOMMITE
     • Hans Krüger → HANS KRYGER
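The two examples on this slide can be reproduced with a small normalizer. This is a sketch, not the DBC routine: the rules (map ø to o and ü to y, strip remaining diacritics, drop punctuation, upper-case) are inferred from the two examples only, and other letters such as æ or å are left unhandled.

```python
import unicodedata

# Letter substitutions inferred from the slide's examples (assumption).
SPECIAL = str.maketrans({"ø": "o", "Ø": "O", "ü": "y", "Ü": "Y"})

def normalize(text):
    """Return an upper-case, punctuation-free form of text without diacritics."""
    text = text.translate(SPECIAL)
    # Decompose accented letters and drop the combining marks (é -> e).
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep only letters, digits and whitespace, then collapse the whitespace.
    kept = "".join(ch for ch in stripped if ch.isalnum() or ch.isspace())
    return " ".join(kept.upper().split())

print(normalize("København’s freds kommité"))  # prints: KOBENHAVNS FREDS KOMMITE
print(normalize("Hans Krüger"))                # prints: HANS KRYGER
```

The special-case table is applied before the Unicode decomposition, otherwise ü would decompose to u plus a diaeresis and end up as U rather than Y.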

  9. 3 different operands
     • alike
     • not_alike
     • alike_or_missing
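The slide only names the three comparisons, so their exact semantics are not known. One plausible reading, stated here as an assumption, is that `alike` and `not_alike` require both values to be present, while `alike_or_missing` also accepts a value that is absent in either record. Missing values are modelled as `None`.

```python
def alike(a, b):
    """True when both values are present and equal (assumed semantics)."""
    return a is not None and b is not None and a == b

def not_alike(a, b):
    """True when both values are present and differ (assumed semantics)."""
    return a is not None and b is not None and a != b

def alike_or_missing(a, b):
    """True when the values are equal or at least one is missing (assumed semantics)."""
    return a is None or b is None or a == b

print(alike("dan", "dan"))            # prints: True
print(not_alike("dan", None))         # prints: False
print(alike_or_missing("dan", None))  # prints: True
```

Under this reading, `alike_or_missing` suits fields that many records leave empty, where an absent value should not block a match.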

  10. Logical fields
     • A logical field containing data from many subfields
       • maintitle = 245*a | 239*t | 240*a & 240*d & 240*e & 240*f & 240*h
     • A logical field containing only parts of a subfield
       • author = 700*a & 700*h:1
       • 100 *a Rifbjerg *h Klaus = 100 *a Rifbjerg *h K.
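A sketch of how such logical fields could be evaluated, under several assumptions: `|` is read as "first non-empty alternative", `&` as concatenation, `:1` as "first character only", and `&` is assumed to bind tighter than `|`. The record layout (a dictionary keyed by tag and subfield code) and the function names are hypothetical.

```python
def sub(record, tag, code, length=None):
    """Return subfield tag*code, optionally cut to its first `length` characters."""
    value = record.get((tag, code), "")
    return value[:length] if length else value

def author(record, tag="100"):
    """author = tag*a & tag*h:1 (surname plus first letter of the forename)."""
    parts = [sub(record, tag, "a"), sub(record, tag, "h", 1)]
    return " ".join(p for p in parts if p)

def maintitle(record):
    """maintitle = 245*a | 239*t | 240*a & 240*d & 240*e & 240*f & 240*h"""
    for alternative in (sub(record, "245", "a"), sub(record, "239", "t")):
        if alternative:
            return alternative
    parts = [sub(record, "240", code) for code in "adefh"]
    return " ".join(p for p in parts if p)

full = {("100", "a"): "Rifbjerg", ("100", "h"): "Klaus"}
abbreviated = {("100", "a"): "Rifbjerg", ("100", "h"): "K."}
print(author(full))                         # prints: Rifbjerg K
print(author(full) == author(abbreviated))  # prints: True
```

Cutting the forename to one character is what makes "Klaus" and "K." compare as equal, as in the slide's example.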

  11. Conversion of text
     • 250*a
       • udg:edition + ed:edition + udgave:edition ...
       • first:1 + 1st:1 + second:2 + 3rd:3 ...
       • rev:revised + revideret:revised
     • 041*a
       • und: + mul: + mis:
     • 260*b
       • Det Schønbergske forlag:schønberg
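The conversion rules read like per-subfield lookup tables, which can be sketched as below. The table contains only the pairs shown on the slide (the elided entries are not filled in). How the tables are applied is an assumption: here a whole-value rule is tried first, then a word-by-word replacement.

```python
import re

# Only the pairs listed on the slide; an empty target removes the value.
CONVERSIONS = {
    "250a": {"udg": "edition", "ed": "edition", "udgave": "edition",
             "first": "1", "1st": "1", "second": "2", "3rd": "3",
             "rev": "revised", "revideret": "revised"},
    "041a": {"und": "", "mul": "", "mis": ""},
    "260b": {"det schønbergske forlag": "schønberg"},
}

def convert(subfield, value):
    """Apply the conversion table of `subfield` to `value` (hypothetical helper)."""
    table = CONVERSIONS.get(subfield, {})
    lowered = value.lower().strip()
    if lowered in table:  # whole-value rule, e.g. a publisher name
        return table[lowered]
    words = re.findall(r"\w+", lowered)
    return " ".join(table.get(word, word) for word in words)

print(convert("250a", "3. udg."))                  # prints: 3 edition
print(convert("250a", "Second ed."))               # prints: 2 edition
print(convert("260b", "Det Schønbergske forlag"))  # prints: schønberg
```

Mapping the language codes `und`, `mul` and `mis` to an empty string presumably lets them be treated as missing in a later comparison, though the slides do not say so explicitly.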

  12. Edition comparison
     • We make a temporary field containing only the words recognized from the edition field (after text conversion)
     • “EDITION” & ( @digit | “REVISED” | “NEW” )
     • 250 00 *a 3. ed. *x 12. reprint. = EDITION 3
     • 250 00 *a 3. ed.,4. rep. = EDITION 3
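A sketch of the temporary edition field, assuming it is built from subfield *a only and that the qualifier (a digit, REVISED or NEW) is taken from the token directly before or after EDITION. The adjacency rule is an assumption chosen so that both slide examples yield EDITION 3. The word table is a hypothetical subset.

```python
import re

# Hypothetical subset of the word table used to recognize edition terms.
WORDS = {"ed": "EDITION", "udg": "EDITION", "udgave": "EDITION", "edition": "EDITION",
         "rev": "REVISED", "revideret": "REVISED", "revised": "REVISED",
         "new": "NEW"}

def edition_key(subfield_a):
    """Return 'EDITION <qualifier>' or None when no edition statement is recognized."""
    tokens = [WORDS.get(t, t) for t in re.findall(r"\w+", subfield_a.lower())]
    for i, token in enumerate(tokens):
        if token != "EDITION":
            continue
        # Look at the token just before EDITION, then the one just after it.
        neighbours = tokens[max(i - 1, 0):i] + tokens[i + 1:i + 2]
        for candidate in neighbours:
            if candidate.isdigit() or candidate in ("REVISED", "NEW"):
                return "EDITION " + candidate
    return None

print(edition_key("3. ed."))          # prints: EDITION 3
print(edition_key("3. ed.,4. rep."))  # prints: EDITION 3
print(edition_key("Rev. udg."))       # prints: EDITION REVISED
```

Unrecognized words such as "rep" are simply ignored, so reprint information does not affect the comparison.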

  13. Year comparison
     • 260 *c 1980 = 260 *c 1979-1982
     • 260 *c 1998-2001 = 260 *c 1999-2002
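The year rule can be sketched as an interval overlap test. This assumes a 260*c value holds either a single four-digit year or two four-digit years forming a range, and that two values count as alike when their ranges overlap. Both assumptions are inferred from the two examples on the slide.

```python
import re

def year_range(value):
    """Return (first, last) year found in a 260*c value, or None if there is none."""
    years = [int(y) for y in re.findall(r"\d{4}", value)]
    if not years:
        return None
    return min(years), max(years)

def years_alike(a, b):
    """True when the year ranges of a and b overlap (assumed rule)."""
    range_a, range_b = year_range(a), year_range(b)
    if range_a is None or range_b is None:
        return False
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]

print(years_alike("1980", "1979-1982"))       # prints: True
print(years_alike("1998-2001", "1999-2002"))  # prints: True
print(years_alike("1980", "1985-1990"))       # prints: False
```

A single year is treated as a range of length one, so the same overlap test covers both slide examples.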

  14. Many match-rules

  15. Problems
     • Different cataloguing practices
     • Errors (typing etc.)
     • More than one work in the same MARC record
     • A CD can contain works by many different artists

  16. Development strategy
     • The syntax and features of the match-script have been developed along with the project, in collaboration between the librarian and the programmer.
     • The librarian had an online test program for the match-script.

  17. The match test program

  18. The future?
     • It depends on the result of this project
     • Perhaps the cost/benefit is not good enough
     • Perhaps we will actually make a publication database with records stored as works?

  19. The End

  20. The script language • An example

  21. Some test results
     • Boligsikring = 145 manifestations, 29 works
     • Mankell and bøger and dansk = 44 manifestations, 19 works
     • Verdi and opera and cd = 111 manifestations, 35 works

  22. Example of key definition
