Library 1.1 • Kristin Antelman, Charley Pennell • NCSU Libraries, North Carolina State University • ALCTS/CCS Cataloging Norms Discussion Group, ALA Annual, Washington, DC, 23 June 2007
The state of the catalog 2007 • Libraries have a considerable investment in legacy bibliographic metadata created under various content standards (local rules, ALA 1949, AACR, AACR2, AACR2rev.) • RDA (1997- ) is slow in coming, and has been taking increasing heat as it tries to satisfy perceived needs in multiple user communities • New community-of-interest-based content standards are emerging to replace AACRx (DACS, CCO, DCRM, CSDGM) • Our current integrated library systems are basically maxed-out inventory management systems with a veneer of public service functionality and little interoperability with the emerging Web 2.0 world
The state of the catalog 2007 • The cult of MARC, which has served us well for almost 40 years, is keeping us from moving ahead • An increasing percentage of core library work is being done in applications outside of the ILS which are not bound by its limitations, e.g. ILL, ERMS, full-text & database searching, collection development • Library users’ search expectations have been conditioned by interactions with commercial Websites, with which Libraries can barely afford to compete, but must • Libraries are becoming increasingly virtual as users interact with us online (e-resources, Second Life)
Endeca at NCSU Libraries • Went live in January 2006 • Works with a text version of a daily snapshot of Libraries’ MARC & other metadata • Used to improve the discovery portion of the library catalog • Interoperates with ILS for holdings, current availability status • Web2 interface still present for known item & authority searching
Endeca features • Commercial-strength search/sort speeds • Site-customizable relevance ranking • Faceted browse • True browsing (LC classification) • Spell-checking • “Did you mean?” • Automatic word stemming
Endeca hierarchies • Classification browse • Format/Item type • Geographic names • Chronological periods
Classification browse • Uses Library of Congress classification outline to filter search results by call number • Have added LC call numbers to most of our e-book and e-journal records so these are not lost from browse • Lost from call number browse: SuDocs, Microforms, Archives/Manuscripts, collections with accession #
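The call number filtering described above can be illustrated with a minimal sketch. It is not Endeca's implementation; the record layout (dicts with a `call_number` key), the sample records, and the helper names are assumptions made for illustration, and the outline excerpt covers only a few top-level LC classes.

```python
import re

# Small excerpt of the LC Classification outline (top-level classes only).
LC_OUTLINE = {
    "P": "Language and literature",
    "Q": "Science",
    "T": "Technology",
}

# An LC call number starts with one to three capital letters, then a number.
CALL_NUMBER = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def lc_class(call_number):
    """Return (class letters, class number), or None if not LC-shaped."""
    match = CALL_NUMBER.match(call_number)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


def filter_by_class(records, prefix):
    """Keep records whose LC class letters start with the given prefix."""
    kept = []
    for record in records:
        parsed = lc_class(record.get("call_number", ""))
        if parsed is not None and parsed[0].startswith(prefix):
            kept.append(record)
    return kept


# Hypothetical sample records; the last two have no LC call number.
records = [
    {"title": "War and Peace", "call_number": "PG3366 .V6 2007"},
    {"title": "Database design", "call_number": "QA76.9 .D3"},
    {"title": "Newspaper on film", "call_number": "Microfilm 1234"},
    {"title": "Manuscript collection"},
]

for hit in filter_by_class(records, "P"):
    letters, _ = lc_class(hit["call_number"])
    print(hit["title"], "|", LC_OUTLINE[letters[0]])
```

Records without an LC-shaped call number simply never match any class, which is the same effect the slide describes for microforms and accession-numbered collections.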
Future “catalog” development • Enhance local catalogs • Enable new uses on the web • Properties of the semantic web • Frameworks for content • Frameworks for services
Enhance local catalogs • Improvements to catalog functionality, e.g., bridging keyword and authority searching • Social networking • Data feeds from the catalog into other platforms
Enable new uses on the web • Human vs. machine interpretation • The HTML web (and our data!): semantics for readers; syntactic (hand-coded) linkages • vs. the semantic web: semantics for computers; derived “smart” linkages
Principles of the semantic web 1. Everything can be identified by URIs 2. Links explicitly identify relationships (e.g., “has subject”) 3. Partial information is tolerated 4. Evolution is supported 5. Minimalist design: standardize no more than is necessary 6. Missing doesn’t mean broken • Koivunen and Miller (2001), Talis
Frameworks for content • Subject content • Resource Description Framework (RDF) • Simple Knowledge Organization System (SKOS) • Web Ontology Language (OWL) • Names • Friend of a Friend (FOAF)
Frameworks for services • Web services, or • Service-oriented architecture • What is it?
RDA/DCMI agreement (May 2007) • Commitment to work together to: • develop an RDA Element Vocabulary • expose RDA Value Vocabularies • develop an RDA Application Profile, based on FRBR and FRAD • Separate elements from instructions • Make definitions, relationships explicit • Documents community understanding • Guidance for crosswalks, tools • Specifies controlled vocabularies and encoding
Stefan Gradmann, “rdfs:frbr--Towards an Implementation Model for Library Catalogs Using Semantic Web Technology,” CCQ 39:3/4 (2005)
Simplified representation [diagram: graph of FRBR entities identified by URIs] • Nodes: work:867-87, exp:756, man:008, creator:123 • Link labels: has expression, has manifestation, has title, has creator, is known as • Literal values: War and Peace; Leo Tolstoy; Лев Толстои • Namespaces: http://www.concepts.org/work/, http://www.concepts.org/expression, http://www.concepts.org/manifestation, http://www.concepts.org/creator/ • Node URIs: http://www.concepts.org/work/867-87, http://www.concepts.org/expression/756, http://www.concepts.org/manifestation/008, http://www.concepts.org/creator/123
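One way to read the diagram is as a set of subject/link/object statements. The sketch below uses plain Python tuples rather than an RDF library. The URIs, link labels, and literal values come from the slide, but which node each link attaches to is an assumption (the flattened slide text does not preserve the arrows), and the helper names are hypothetical.

```python
BASE = "http://www.concepts.org/"

WORK = BASE + "work/867-87"
EXPRESSION = BASE + "expression/756"
MANIFESTATION = BASE + "manifestation/008"
CREATOR = BASE + "creator/123"

# Assumed wiring: work -> expression -> manifestation, with title and
# creator attached to the work. Each statement is (subject, link, object).
TRIPLES = [
    (WORK, "has title", "War and Peace"),
    (WORK, "has creator", CREATOR),
    (WORK, "has expression", EXPRESSION),
    (EXPRESSION, "has manifestation", MANIFESTATION),
    (CREATOR, "is known as", "Leo Tolstoy"),
    (CREATOR, "is known as", "Лев Толстои"),
]


def objects(subject, link):
    """Return every object linked from the subject by the given link label."""
    return [o for s, p, o in TRIPLES if s == subject and p == link]


def names_for_work(work_uri):
    """Follow 'has creator', then 'is known as', to collect creator names."""
    names = []
    for creator in objects(work_uri, "has creator"):
        names.extend(objects(creator, "is known as"))
    return names


print(names_for_work(WORK))
```

Because the creator is a URI rather than a text string, both name forms hang off a single identity, so a machine can follow the links without matching headings.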
What will our “catalogs” be then? • The catalog (semantic web) should: • recognize clusters of knowledge • show lineage of publications, authors • make previously unknown connections visible • show authoritativeness of sources • show popularity/use • Source: Timothy Burke (Bibliographic Control Working Group, March 2007)
What could this mean? What should this mean? • Others will remix • Librarians will add value
Hurdles to Library 2.0 • Data • Technical • Economic • Cultural
Hurdles: Data • Historic catalog data has fewer, often more general (less specific) subject headings • Catalogers inconsistent in application of subject headings • Our subject tools (LCC/LCSH) not hierarchical • MARC data not granular enough (5xx fields) • Useful bib data not in controlled form • Other humans can read our data, other machines cannot
Hurdles: Data • 260: Publication, distribution, etc. • 260 $aWashington, DC :$bThe Society,$c1982- • 260 $aNew York :$bWiley,$cc2005. • 505: Contents note (not structured for use) • 505 00 $g1.$tEmerging Technologies in Surfactant-Enhanced Subsurface Remediation /$rDavid A. Sabatini, Robert C. Knox and Jeffrey H. Harwell --$g2.$tImpact of Surfactant Flushing on the Solubilization and Mobilization of Dense Nonaqueous-Phase Liquids /$rL. M. Abriola, K. D. Pennell, G. A. Pope, T. J. Dekker and D. J. Luning-Prak --$g3.$tA Quantitative Structure - Activity Relationship for Solubilization of Nonpolar Compounds by Nonionic Surfactant Micelles /$rChad T. Jafvert, Wei Chu and Patricia L. Van Hoof -- • 508: Creation/production credits • 508 $aMusic, Laxmikant Pyarelal. • 511: Participant or performer • 511 1- $aJohn Wayne, Maureen O'Hara, Ben Johnson, Harry Carey, Jr., Chill Wills, J. Carrol Naish, Victor McLaglen, Claude Jarman, Jr. • 538: System details note • 538 -- $aSystem requirements: Windows Vista, 2GB RAM, DVD-ROM drive. • 586: Awards note • 586 -- $aNational Book Award, 1981
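To make the granularity problem concrete, here is a minimal sketch of what a program has to do to pull structure out of field text like the examples above. It works on the display strings shown on the slide, not on binary MARC, and assumes that `$` or `|` followed by one lowercase letter or digit always marks a subfield (which would misfire on values that contain those characters). The function names are hypothetical.

```python
import re

# A subfield delimiter ("$" or "|") followed by a one-character code.
SUBFIELD = re.compile(r"[$|]([a-z0-9])")


def parse_subfields(field_text):
    """Split display-form field text into (code, value) pairs."""
    parts = SUBFIELD.split(field_text)
    # parts is [text before first delimiter, code, value, code, value, ...]
    return [(code, value.strip()) for code, value in zip(parts[1::2], parts[2::2])]


def contents_entries(field_text):
    """Group 505 subfields g/t/r into one dict per listed chapter."""
    entries = []
    for code, value in parse_subfields(field_text):
        if code == "g":
            entries.append({"number": value.rstrip(". "), "title": "", "responsibility": ""})
        elif code == "t" and entries:
            # Drop the trailing ISBD " /" punctuation.
            entries[-1]["title"] = value.rstrip(" /")
        elif code == "r" and entries:
            # Drop the trailing " --" separator.
            entries[-1]["responsibility"] = value.rstrip(" -")
    return entries


print(parse_subfields("$aNew York :$bWiley,$cc2005."))

sample_505 = (
    "$g1.$tEmerging Technologies in Surfactant-Enhanced Subsurface Remediation /"
    "$rDavid A. Sabatini, Robert C. Knox and Jeffrey H. Harwell --"
)
print(contents_entries(sample_505))
```

Even after parsing, the values keep their embedded punctuation (`New York :`, `c2005.`) and the responsibility statement is still one undivided string of names, which is the point the slide makes about data that humans can read and machines cannot.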
Hurdles: Technical • FRBR “Work” records • Format of available tools • Hierarchy tables • Value tables
Work records • No public repository of “work level” records exists • OCLC Research working on this issue • FRBR models for library catalogs • OCLC: embedding OCLC “work numbers” in bib records based on 001/035 match • LC: based on LC NAF • VTLS: manual connections made during cataloging operation • Depth questions: to translations, to editions, or to everything (print, MF, video)?
Tools: Print only • National bibliographies • Print tools for libraries without the financial resources to use paid HTTP services • CIP • Dewey classification • AACR2 • ASIS/T thesaurus!
Tools: Electronic, no Web services • ClassWeb (LCC, LCSH, Juvenile SH) • Cataloger’s Desktop (AACR2, LCRI, SCM, CONSER documentation, MARC formats) • HTML tables/databases • Authority files • MARC format documentation • Thesauri • Library catalogs
Hurdles: Economic • Library & vendor personnel trends • Library funding trends • Ownership of tools
Hurdles: Economic – Personnel • Recent personnel trends in original cataloging are away from professional subject/language specialists, towards paraprofessional generalists • Copy cataloging increasingly outsourced • Production expectation trends in libraries are paralleled in service agencies • Cost and competition for programming/IT personnel
Hurdles: Economic – Ownership • Authority files: LC, NLM, OCLC, Getty, IMDb • Bibliographic records: OCLC, LC, BM, individual libraries • Book jackets/TOC/reviews: Amazon, Syndetics Solutions, BNA • Dewey Decimal Classification: OCLC • LCC/LCSH/LCNAF: LC • Thesauri: LC, NLM, Getty
Hurdles: Cultural • Liberal values, conservative actions • Library tradition of sharing (ILL, Union Lists, national libraries, MARBI, JSC) • The worship of MARC • Librarians feel proprietary about data that they paid to create (OCLC, Z39.50) • Resources on the World Wide Web should be free
What free electronic services ARE available? • DDC summaries (Excel, OCLC): http://www.oclc.org/research/researchworks/ddc/terms.htm • GSAFD: Guidelines on Subject Access to Individual Works of Fiction & Drama (MARC/XML/ASCII, Gary Strawn): http://www.library.northwestern.edu/public/gsafd/ • MeSH (MARC/XML/ASCII, NLM): http://www.nlm.nih.gov/mesh/filelist.html • Newspaper genre list (XML, OCLC): http://www.oclc.org/research/projects/termservices/resources/ngl.htm • WorldCat Identities: http://orlabs.oclc.org/Identities/ • Z39.50, SRW/SRU
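Of the services listed, SRU is the simplest to call from a script, since a search is an ordinary HTTP GET with a few named parameters. The sketch below only builds the request URL. The parameter names (`operation`, `version`, `query`, `maximumRecords`) are those of SRU 1.1; the server address and the CQL index `dc.title` are assumptions, since each server publishes its own base URL and supported indexes.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sru_search_url(base_url, cql_query, max_records=10):
    """Build an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server address, used for illustration only.
url = sru_search_url("http://sru.example.org/catalog", 'dc.title = "war and peace"', 5)
print(url)

# Decoding the query string recovers the original CQL query.
print(parse_qs(urlsplit(url).query)["query"][0])
```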
Coming… • NSDL (National Science Digital Library) Registry: http://sandbox.metadataregistry.org/vocabulary/list.html • Publisher name server: http://www.oclc.org/research/projects/publisherns/ • Works records (OCLC) • VIAF (Virtual International Authority File): Deutsche Nationalbibliothek, Library of Congress, Bibliothèque nationale de France, and OCLC, based on SKOS & OWL
Find out more • Greenberg, Jane, & Méndez, Eva (Eds.). (2007) “Knitting the semantic Web”. Cataloging & classification quarterly, 43(3/4). • Hillmann, Diane. (2007) “Structures and standards for our bibliographic future”, presentation for LC Working Group on the Future of Bibliographic Control, Chicago, IL, 9 May 2007. Available at: http://www.loc.gov/bibliographic-future/meetings/docs/hillmann-may9-2007.ppt • Tillett, Barbara B., & Harper, Corey. (2007) “Library of Congress controlled vocabularies, the Virtual International Authority File, and their application to the Semantic Web”. Available at: http://www.ifla.org/IV/ifla73/papers/147-Tillet_Harper-en.pdf • Vizine-Goetz, Diane. (2004) “Making knowledge organization schemes more accessible to people and computers”. OCLC newsletter, 266. Available at: http://www.oclc.org/news/publications/newsletters/oclc/2004/266/downloads/research.pdf