Virtual International Authority File ALA, June 2006
Virtual International Authority File • Link authority records from national bibliographic agencies • Build on their authority work • Expand the concept of universal bibliographic control • Allow national or regional variations in authorized form to co-exist • Support needs for variations in preferred language, script, and spelling
Other controlled vocabularies A&I controlled vocabularies (Library) authority files “Ontologies” End-user Semantic Web Building Blocks
Project Goal Demonstrate feasibility of linking personal names across: • Personennamendatei (PND) • Library of Congress Name Authority File (LCNAF)
What is the VIAF? • System: links between files; Web browser access; multilingual and multi-script • Maintenance: national agencies control their records; records harvested from national systems • Scalable: any number of national authority files
Matching Variations In the LCNAF and PND authority files: • Same name, same person • Same name, different people • Different names, same person • Missing person in one file
Same Name, Different People Two Different People – One Name Adams, Mike • PND: a golfer • LCNAF: author of a Beatles collector's guide
Same Person, Different Names One Person – Two Names • LCNAF: Morel, Pierre • PND: Morellus, Petrus
Enhancing the Authorities (diagram labels) • Bibliographic Record • Derived Authority • Authority Record • Enhanced Authority
Mining the Bibliographic Record
• Usage • Language • LC Control Number • LC Classification • Title • Publisher • Place of Publication • Date of Publication • Material Type • Authors
LDR 00826ccm 2200289 a 4500
001 ocm10025532
005 20031229650847.0
008 840627s1982 nyuuua n eng
010 $a 84758340
040 $a DLC $c DLC
019 $a 17706440
020 $c $2.95
028 22 $a 48418 $b G. Schirmer
045 2 $b d198006 $b d198007
048 $b va01 $b ve01 $a ka01
050 00 $a M1529.3 $b .T
100 1 $a Thomson, Virgil, $d 1896-
245 14 $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson].
260 $a New York : $b G. Schirmer, $c c1982.
300 $a 1 score (11 p.) ; $c 31 cm.
500 $a For soprano, baritone, and piano.
650 0 $a Vocal duets with piano.
600 10 $a Larson, Jack $x Musical settings.
700 1 $a Larson, Jack.
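Mining a record like this starts with splitting each field into its subfields. The sketch below is only an illustration of that step: the function name `parse_subfields` is hypothetical, and it assumes the whitespace-delimited `$x value` layout in which the record is printed on this slide, not the binary MARC exchange format or any particular MARC library.

```python
import re
from typing import List, Tuple

# "$x value", where the value runs until the next " $y " or the end of the field.
# A dollar sign inside a value (e.g. the price "$2.95") is not followed by
# "<code><space>", so it is left alone.
_SUBFIELD = re.compile(r"\$([a-z0-9])\s+(.*?)(?=\s+\$[a-z0-9]\s|$)")

def parse_subfields(body: str) -> List[Tuple[str, str]]:
    """Split a printed field body into (subfield code, value) pairs."""
    return [(m.group(1), m.group(2)) for m in _SUBFIELD.finditer(body)]

# Field bodies copied from the record on this slide (tag and indicators removed).
author = parse_subfields("$a Thomson, Virgil, $d 1896-")
title = parse_subfields(
    "$a The cat : $b duet for soprano and baritone / "
    "$c Virgil Thomson ; [words by Jack Larson]."
)
price = parse_subfields("$c $2.95")
```

The pairs keep the original punctuation; cleaning it up is a separate normalization step.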
Derived Authority Record
• All text is normalized • Subjects are grouped into broad subject areas • Coauthor • Publication date is by decade • Material type is coded
00525nz 2200229n 4500
001 xlc 1
003 OCoLC
005 20040721111415.0
008 040721nneanz||abbn n and d
040 $a OCoLC $b eng $c OCoLC $f viaf
100 1 $a Larson, Jack.
903 $a 84758340
910 14 $a the cat $b duet for soprano and baritone
921 $a g schirmer
922 $a nyu
930 $a jack larson
940 $a eng
942 $a 234
943 $a 198x
944 $a cm
950 1 $a thomson, virgil $d 1896
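The slide does not spell out the normalization rules, so the following is a rough sketch that merely reproduces the values visible in the derived record ("g schirmer", "thomson, virgil", "198x"). The function names and the specific rules (lowercase, drop diacritics, keep commas inside a name, replace other punctuation with spaces) are assumptions, not the project's actual algorithm.

```python
import re
import unicodedata
from typing import Optional

def normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep ASCII letters, digits and commas; turn everything else into a space.
    cleaned = re.sub(r"[^0-9a-z,]+", " ", no_marks.lower())
    # Drop leading/trailing spaces and commas left over from MARC punctuation.
    return cleaned.strip(" ,")

def decade(date: str) -> Optional[str]:
    """Reduce a publication date such as 'c1982.' to its decade, '198x'."""
    found = re.search(r"\d{4}", date)
    return found.group(0)[:3] + "x" if found else None

publisher = normalize("G. Schirmer,")    # from 260 $b
coauthor = normalize("Thomson, Virgil,") # from 100 $a
title = normalize("The cat :")           # from 245 $a
pub_decade = decade("c1982.")            # from 260 $c
```

Non-Latin scripts would need different handling; this sketch simply discards characters outside a-z and 0-9.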
Strong Matching Attributes • A work (title) in common • Common control numbers (ISBN, ISSN, or LCCN) • Exact birth and death year • Joint authors • Name as subject
Weaker Attributes • Only one of birth/death date(s) (allows some variation) • Subject area of works (two levels) • Format (books, films, musical scores, etc.) • Language • Publisher • Partial title match • Date of publication • Country • Role (author, illustrator, composer, etc.)
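One way to picture how strong and weaker attributes could combine is sketched below. This is an illustration only: the slides do not give the decision rule, so the attribute keys, the "any one strong attribute is enough" rule, and the threshold of three weak attributes are all assumptions, and the sample records use invented values.

```python
from typing import Dict, List, Set, Tuple

Record = Dict[str, Set[str]]

# Hypothetical keys standing in for the attributes listed on the slides.
STRONG = ("titles", "control_numbers", "life_dates", "joint_authors", "subject_of")
WEAK = ("subject_areas", "formats", "languages", "publishers",
        "decades", "countries", "roles")

def shared(a: Record, b: Record, keys: Tuple[str, ...]) -> List[str]:
    """Return the attribute keys for which the two records share a value."""
    return [k for k in keys if a.get(k, set()) & b.get(k, set())]

def compare(a: Record, b: Record, weak_needed: int = 3) -> Tuple[str, List[str]]:
    """Classify a candidate pair that already has the same normalized name."""
    strong = shared(a, b, STRONG)
    if strong:
        return "match", strong
    weak = shared(a, b, WEAK)
    if len(weak) >= weak_needed:
        return "possible", weak
    return "no match", weak

# Invented sample data: two records for "Adams, Mike" with nothing in common.
pnd_adams = {"subject_areas": {"sports"}, "languages": {"ger"}}
lc_adams = {"subject_areas": {"music"}, "languages": {"eng"}}
same_name_result = compare(pnd_adams, lc_adams)

# Invented sample data: two records that share a normalized title.
rec_a = {"titles": {"the cat"}, "languages": {"eng"}}
rec_b = {"titles": {"the cat"}, "languages": {"eng"}, "formats": {"cm"}}
shared_title_result = compare(rec_a, rec_b)
```

Here the two "Adams, Mike" records stay unlinked, while the shared title alone is treated as enough to link the second pair.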
Exact name match with dates Standard Number Corporate name Joint author Language Publisher Subject Decade Role Exact title match
LC Names • Established Names: 4,187,973 • Names from Bib Records: 3,440,706 • Active Established Names: 2,556,824 • Uncontrolled Names: 883,882 • Orphaned Names: 1,631,149
DDB Names • Established Names: 2,659,276 • Names from Bib Records: 2,319,829 • Active Established Names: 2,013,618 • Uncontrolled (Undif’d) Names: 306,211 • Orphaned Names: 645,658
Results • Matches 558,618 • Complex Matches 70,797 • Unique Matches 487,821
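The counts on the LC, DDB, and Results slides are arithmetically consistent with one another, which the short check below makes explicit. The reading behind the formulas is an inference from the numbers, not something the slides state: uncontrolled names are taken to be names from bibliographic records that are not active established names, and orphaned names to be established names that are not active.

```python
# Figures copied from the slides.
lc = {"established": 4_187_973, "from_bib": 3_440_706, "active": 2_556_824,
      "uncontrolled": 883_882, "orphaned": 1_631_149}
ddb = {"established": 2_659_276, "from_bib": 2_319_829, "active": 2_013_618,
       "uncontrolled": 306_211, "orphaned": 645_658}
matches, complex_matches, unique_matches = 558_618, 70_797, 487_821

def consistent(f: dict) -> bool:
    """Check the assumed relationships between the five counts for one file."""
    return (f["from_bib"] - f["active"] == f["uncontrolled"]
            and f["established"] - f["active"] == f["orphaned"])

lc_ok = consistent(lc)
ddb_ok = consistent(ddb)
results_ok = matches - complex_matches == unique_matches
```

All three checks hold for the published figures.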
VIAF File • LC Names: 4,187,973 • DDB Names: 2,659,276 • Common: 558,618 (70% of potential)
Next Steps • Move to incremental updates • Start harvesting national files • Bring up Web interface (to full files) • Make OAI accessible • Bring in new participants • Handle non-Roman matching • Move to other types of authorities: corporate names, geographic names, …
Stage 3: Build OAI Server OAI Server(s) LCNAF DDB/PND
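Harvesting from an OAI server is done with plain HTTP requests defined by OAI-PMH. As a minimal sketch, the code below builds an incremental `ListRecords` request. The base URL is a placeholder, and the `marc21` metadata prefix is an assumption about what such a server would offer; the slides name neither.

```python
from urllib.parse import urlencode

def list_records_url(base_url: str, metadata_prefix: str, since: str) -> str:
    """Build an OAI-PMH ListRecords URL for records changed since a date."""
    query = urlencode({
        "verb": "ListRecords",              # standard OAI-PMH verb
        "metadataPrefix": metadata_prefix,  # format the server is asked to return
        "from": since,                      # selective harvesting by datestamp
    })
    return f"{base_url}?{query}"

# Placeholder endpoint, not a real VIAF, LC, or DDB address.
url = list_records_url("https://example.org/oai", "marc21", "2006-06-01")
```

A harvester would fetch this URL and keep following the `resumptionToken` the server returns until the list is complete.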
Stage 5: Build End User Interface with Unicode displays User’s cookie specifies Hangul is preferred. Display 700 form, building on local system’s authority structure
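The script preference described on this slide can be pictured as choosing, among the linked forms of a name, the first one written in the user's preferred script. The sketch below is an assumption-laden illustration: `pick_form` and `uses_script` are hypothetical names, the linked forms are assumed to have been collected from the records already, and the sample name is invented.

```python
import unicodedata
from typing import List

def uses_script(text: str, script: str) -> bool:
    """True if every letter's Unicode name starts with the script name."""
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith(script) for ch in letters
    )

def pick_form(forms: List[str], preferred_script: str) -> str:
    """Return the first form in the preferred script, else the first form."""
    for form in forms:
        if uses_script(form, preferred_script):
            return form
    return forms[0]

# Invented example: a romanized heading and a linked Hangul form.
forms = ["Kim, Minsu", "김민수"]
shown_to_hangul_user = pick_form(forms, "HANGUL")
shown_to_other_user = pick_form(forms, "CYRILLIC")
```

When no form exists in the preferred script, the sketch falls back to the first form, standing in for the local system's own authorized heading.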
Thank you T. Hickey http://errol.oclc.org/laf/n82-54463.html ALA June 2006