FRBR information exchange
Thomas Hickey & Jenny Toves
OCLC Research
Current FRBR information exchange
• Sets of MARC-21 records
• Both bibliographic and authority
• Sometimes extended
• pKeys
• Unique pKeys
• Lists of sets of control numbers
• xISBN web service
• superWork records
Some background
• Our FRBRization has been done primarily at the work level
• We have FRBRized OCLC WorldCat
• ~60,000,000 records
• ~1,000,000,000 holdings
• Now used in Open WorldCat and FictionFinder
• Will be visible in FirstSearch displays this fall
• Norwegian BIBSYS records
• Finnish national bibliography (now in WorldCat)
• Electronic thesis metadata
• Processing done on a 24-node Beowulf Linux cluster
MARC 21 bibliographic data
• Basic method of accepting information
• Other formats get mapped into it
• Fields we use:
• Author main entry
• Titles
• ISBN
• Personal name added entries
• Language
• Extensions
• BIBSYS use of 490 fields to indicate hierarchy
MARC 21 authority data
• Map personal names using cross references
• Map author-titles using cross references
• Fields we currently use
• 008 fixed field
• 100, 130, 400
• Extensions
• Files of additional cross references
• Common title patterns
• xISBN matching
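Mapping names through cross references can be pictured as a lookup table from variant forms (such as 400 see-from tracings) to the established heading (such as a 100 field). The following is a minimal sketch under that assumption; the function names and the input shape (pairs of an established heading and its variants) are hypothetical and are not taken from the OCLC code.

```python
def build_name_map(authority_records):
    """Map every known form of a name to its established heading.

    authority_records: iterable of (established_heading, [variant_forms]).
    This input shape is an assumption made for illustration.
    """
    name_map = {}
    for established, variants in authority_records:
        # The established form always maps to itself.
        name_map[established] = established
        for variant in variants:
            # Keep the first mapping seen if a variant is ambiguous.
            name_map.setdefault(variant, established)
    return name_map


def resolve_name(name, name_map):
    """Return the established heading, or the name unchanged if unknown."""
    return name_map.get(name, name)
```

A name that has no authority record passes through unchanged, so bibliographic records without authority coverage can still be keyed.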
pKeys
• An author-title key for matching
• Derived from MARC-like records & authority data
ocm00019613 shakespeare, william\1564 1616/hamlet
ocm00615676 /hamlet/shakespeare, william\1564 1616
ocm14055779 hamlet motion picture 1948
ocm00290352 /hamlet/ocm00290352
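To make the key shape concrete, here is a rough sketch that reproduces the first and last sample keys above. The normalization rules (lowercasing, replacing punctuation with spaces) and the function names are assumptions inferred from the samples; the actual derivation also draws on authority data and is not shown in the slides.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation (except commas) into single spaces."""
    text = re.sub(r"[^a-z0-9, ]+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip(" ,")


def make_pkey(title, author=None, dates=None, control_number=None):
    """Build an author-title key; hypothetical helper, not the OCLC routine."""
    t = normalize(title)
    if author:
        a = normalize(author)
        if dates:
            # Dates follow the author after a backslash, as in the samples.
            a += "\\" + normalize(dates)
        return a + "/" + t
    # Assumed fallback when no author is known: qualify the title
    # with the record's own control number so it stays distinct.
    return "/" + t + "/" + (control_number or "")


print(make_pkey("Hamlet", "Shakespeare, William", "1564-1616"))
print(make_pkey("Hamlet", control_number="ocm00290352"))
```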
Unique pKeys
• pKeys that have been sorted and counted
692 sw00008899 milton, john\1608 1674/poems
691 sw00255854 puccini, giacomo\1858 1924/tosca
690 sw00020874 chaucer, geoffrey\d 1400/canterbury tales
688 sw00237074 melville, herman\1819 1891/moby dick
682 sw03620985 china/laws etc
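Sorting and counting pKeys amounts to a frequency table. A small sketch, assuming the input is simply an iterable of pKey strings; the function name is hypothetical and the work identifiers shown in the listing are left out.

```python
from collections import Counter


def unique_pkeys(pkeys):
    """Return (pkey, count) pairs, most frequent first, ties in key order."""
    counts = Counter(pkeys)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


for pkey, count in unique_pkeys(["china/laws etc", "/hamlet/x", "china/laws etc"]):
    print(count, pkey)
```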
Lists of control numbers
sw00000089 00206765 01261413 00000089 01236648 03975229 08360541 07363127
sw00000169 00000169 01647333 00420563 10957239 05205626 02325844 07299473 08244692 08555721 24509677 02533498 03967788 24728032 10130242 04849080 09477230 23323184 22051264 38870301 54266609 56760701 08366329
sw00000182 00000182 00102731
sw00000201 00000201 02786659
sw00000210 00000210 09175561
sw00000245 00000245 34103639
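Each line above reads as a work identifier followed by the control numbers of the records in that work-set. A minimal parsing sketch, assuming whitespace-separated fields with the work identifier first; the function names are hypothetical.

```python
def parse_work_sets(lines):
    """Parse 'work-id control-number ...' lines into a dict of lists."""
    work_sets = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        work_sets[fields[0]] = fields[1:]
    return work_sets


def invert_work_sets(work_sets):
    """Map each control number back to the work-set that contains it."""
    return {
        number: work_id
        for work_id, members in work_sets.items()
        for number in members
    }


sets = parse_work_sets(["sw00000182 00000182 00102731"])
print(invert_work_sets(sets)["00102731"])
```

The inverted mapping is what a lookup by record would need: given a control number, find its work-set.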
xISBN web service
• Takes an ISBN as input
• Returns list of ISBNs in associated work
• Significant processing
• Starts with control-number list of work-sets
• Uses ISBNs to pull work-sets together
• Allows fuzzy-matching on author/title
• Ends up with consistent clusters
• In general larger than those in control-number list
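The step of using ISBNs to pull work-sets together can be sketched as a union-find merge: any two work-sets that share an ISBN end up in the same cluster. This is an illustration only. The input shape (work identifier mapped to a set of ISBNs) and the function names are assumptions, and the fuzzy author/title matching mentioned above is left out entirely.

```python
def cluster_by_isbn(work_isbns):
    """Merge work-sets that share at least one ISBN.

    work_isbns: dict mapping a work id to the set of ISBNs found in it.
    Returns a list of ISBN sets, one per merged cluster.
    """
    parent = {work: work for work in work_isbns}

    def find(work):
        # Follow parent links to the root, halving the path as we go.
        while parent[work] != work:
            parent[work] = parent[parent[work]]
            work = parent[work]
        return work

    first_seen = {}  # ISBN -> first work-set in which it appeared
    for work, isbns in work_isbns.items():
        for isbn in isbns:
            if isbn in first_seen:
                root_a, root_b = find(first_seen[isbn]), find(work)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_seen[isbn] = work

    clusters = {}
    for work, isbns in work_isbns.items():
        clusters.setdefault(find(work), set()).update(isbns)
    return list(clusters.values())


def xisbn_lookup(isbn, clusters):
    """Return the sorted ISBNs of the cluster containing isbn, or []."""
    for cluster in clusters:
        if isbn in cluster:
            return sorted(cluster)
    return []
```

Because merging is transitive, the resulting clusters can be larger than any single work-set in the control-number list, which is consistent with the last bullet above.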
xISBN examples
[0130188549, 0130188476]:
sw11067396 barnea, amir/agency problems and financial contracting
sw13096363 barnea, amir/agency problems on financial contracting
[000713407x, 0007126360, 0007134053, 0007134061, 0007126441]:
sw48486275 /collins new school dictionary/ocm48486275
sw49740193 /collins new school dictionary/ocm49740193
sw49740203 /collins new school dictionary/ocm49740203
xISBN XML response
<?xml version="1.0" encoding="UTF-8" ?>
<idlist>
  <isbn>000713407x</isbn>
  <isbn>0007126360</isbn>
  <isbn>0007134053</isbn>
  <isbn>0007134061</isbn>
  <isbn>0007126441</isbn>
</idlist>
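A client can read a response of this shape with Python's standard library. The sketch below parses only the idlist/isbn structure shown above; the function name is hypothetical, and the service URL and request format are not given in the slides, so no network call is attempted.

```python
import xml.etree.ElementTree as ET

SAMPLE_RESPONSE = (
    '<?xml version="1.0" encoding="UTF-8" ?>'
    "<idlist>"
    "<isbn>000713407x</isbn>"
    "<isbn>0007126360</isbn>"
    "<isbn>0007134053</isbn>"
    "<isbn>0007134061</isbn>"
    "<isbn>0007126441</isbn>"
    "</idlist>"
)


def parse_idlist(xml_text):
    """Return the ISBNs listed in an <idlist> response, in document order."""
    if isinstance(xml_text, str):
        # Parse as bytes so the declared encoding is honoured.
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("isbn") if el.text]


print(parse_idlist(SAMPLE_RESPONSE))
```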
superWorks format
• Developed for FictionFinder
• XML format
• Includes expression-level information
• All the information needed
• We are adapting it to the Curiouser project
superWork record layout
• pKey
• # manifestations, holdings, sw-id, control #s
• publication dates
• expressions
• expression
• classes
• language
• authors
• titles
• subjects
• components
• author, title, publication data
Summary
• Simpler when only work-level relationships are needed
• Even for work-level relationships, a number of different formats are useful
• Information needed for an interface gets much more complicated