Advances in Library Discovery Services Marshall Breeding Director for Innovative Technology and Research Vanderbilt University Library Founder and Publisher, Library Technology Guides http://www.librarytechnology.org/ http://twitter.com/mbreeding The State of the Art in 2011 May 20, 2011 Internet Librarian 2011
Abstract • Marshall Breeding will provide a look into the next generation of library catalogs. The initial phase of next-generation catalogs extended beyond the capability of the ILS online catalog module with relevancy-based search, faceted navigation, and extended scope. The current wave of discovery systems extends search to Web-scale capacity, addressing library subscriptions of scholarly content at the article level in addition to local physical and digital collections.
Evolution of library collection discovery tools • Bound handwritten catalogs • Card Catalogs • Library online catalogs – OPACs • Next-Gen Catalogs / Discovery interfaces • Social Discovery • Web-scale discovery services • Comprehensive presentation layer services
Bound Catalog National Library of Colombia
Card Catalog National Library of Argentina
Card Catalog University of Kansas Library
Online Card Catalog Salem International University
Online Catalog ILS Data Search: Search Results
Discovery Interface ILS Data Digital Collections Search: Local Index ProQuest Search Results EBSCOhost MetaSearch Engine … MLA Bibliography ABC-CLIO Real-time query and responses
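The real-time query-and-response pattern of the metasearch engine on this slide can be sketched as follows. This is a minimal illustration, not any vendor's implementation: the targets are in-memory stubs, the function names `make_stub_target` and `metasearch` are hypothetical, and the sample titles are invented. A real metasearch engine would send the query to each remote database over the network and wait for every response.

```python
from concurrent.futures import ThreadPoolExecutor

def make_stub_target(name, titles):
    """Return a function that stands in for one remote database (assumption: no real network call)."""
    def query(term):
        return [{"source": name, "title": t} for t in titles if term.lower() in t.lower()]
    return query

# Source names echo the slide; the titles are invented sample data.
TARGETS = [
    make_stub_target("ProQuest", ["Discovery services in academic libraries", "Serials pricing"]),
    make_stub_target("EBSCOhost", ["Evaluating discovery interfaces"]),
    make_stub_target("MLA Bibliography", ["Modern poetry criticism"]),
]

def metasearch(term, targets):
    """Send the same query to every target at search time and merge the responses."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        # pool.map returns results in the same order as `targets`.
        result_lists = list(pool.map(lambda target: target(term), targets))
    return [hit for hits in result_lists for hit in hits]

if __name__ == "__main__":
    for hit in metasearch("discovery", TARGETS):
        print(hit["source"], "|", hit["title"])
```

Because every search waits on every target, response time is bounded by the slowest source, which is one motivation for the pre-built indexes of the Web-scale model.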
Web-scale Discovery ILS Data Digital Collections Search: ProQuest EBSCOhost Search Results Consolidated Index … MLA Bibliography HathiTrust Pre-built harvesting and indexing
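The pre-built harvesting and indexing step on this slide can be sketched as a single inverted index built ahead of time from every source, so that a search touches only the local index. This is a toy sketch under stated assumptions: the records are invented, the function names `build_consolidated_index` and `search` are hypothetical, and production services use a search engine rather than Python dictionaries.

```python
from collections import defaultdict

# Source names echo the slide; the records themselves are invented sample data.
SOURCES = {
    "ILS Data": [{"id": "ils-1", "title": "Introduction to Library Automation"}],
    "Digital Collections": [{"id": "dc-1", "title": "Historic Campus Photographs"}],
    "ProQuest": [{"id": "pq-1", "title": "Library Discovery Services Survey"}],
}

def build_consolidated_index(sources):
    """Harvest every record from every source into one term -> record-id index."""
    index = defaultdict(set)
    records = {}
    for source, recs in sources.items():
        for rec in recs:
            records[rec["id"]] = {**rec, "source": source}
            for term in rec["title"].lower().split():
                index[term].add(rec["id"])
    return index, records

def search(index, records, query):
    """Return ids of records whose titles contain all query terms (no remote calls)."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(records[i]["id"] for i in hits)

if __name__ == "__main__":
    index, records = build_consolidated_index(SOURCES)
    print(search(index, records, "library"))
```

The indexing cost is paid once at harvest time; at query time the ILS record and the article record are retrieved from the same structure.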
Legacy ILS Model / Extended Discovery Discovery Service Search: Digital Collections Search Engine Consolidated index LMS ProQuest API Layer EBSCOhost … JSTOR Other Resources
Web-scale Search + Federated Search ILS Data Digital Collections Search: ProQuest … Consolidated Index Search Results MLA Bibliography ABC-CLIO Pre-built harvesting and indexing FedSearch Non-harvestable Resources Interim model to deal with resources not possible to harvest into consolidated index
Encore Synergy ILS Data Digital Collections Search: Local Index ProQuest Local Index Results … EBSCOhost Remote Search Results … MLA Bibliography Web Services Local Index Results ABC-CLIO
Social Discovery ILS Data Search: Digital Collections Local Index Search Results Web site data … User Contributed Content
Unified Search Model ILS Data Search: Digital Collections Discovery Index Search Results Web site data … Consolidated Indexes of Articles User Contributed Content
Library Web Presence Public Interfaces: Presentation Layer Subject Guides Integrated Library System Library Web site Article, Databases, E-Book collections
New Library Management Model Discovery Service Search: Self-Check / Automated Return Library Management System Digital Coll Search Engine Consolidated index ProQuest API Layer Stock Management EBSCO … Enterprise Resource Planning Smart Card / Payment systems JSTOR Learning Management Authentication Service Other Resources
Discovery from Local to Web-scale • Initial products focused on technology • AquaBrowser, Endeca, Primo, Encore, VuFind • Mostly locally-installed software • Current phase focused on pre-populated indexes that aim to deliver Web-scale discovery • Summon (Serials Solutions) • WorldCat Local (OCLC) • EBSCO Discovery Service (EBSCO) • Primo Central • Encore with Article Integration
Social Discovery • Builds on modernized library catalog interfaces • Strong emphasis on Web 2.0 concepts • Users invited to contribute reviews, ratings, preferences, reading lists, etc. • User-supplied data becomes part of the discovery process • Users help each other to find interesting library materials • Example: Leverage use data for a recommendation service of scholarly content based on link resolver data: Ex Libris bX service
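The idea of recommending scholarly content from link resolver usage can be illustrated with a simple co-usage count: items requested in the same session are treated as related. This is a generic sketch and not the bX algorithm, which is not described in these slides; the session data and the names `build_co_usage` and `recommend` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical link-resolver sessions: each list holds items requested together.
SESSIONS = [
    ["article-A", "article-B", "article-C"],
    ["article-A", "article-B"],
    ["article-B", "article-D"],
]

def build_co_usage(sessions):
    """Count how often each pair of items appears in the same session."""
    co_usage = {}
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            co_usage.setdefault(a, Counter())[b] += 1
            co_usage.setdefault(b, Counter())[a] += 1
    return co_usage

def recommend(co_usage, item, limit=3):
    """Return the items most often used alongside `item`, most frequent first."""
    return [other for other, _ in co_usage.get(item, Counter()).most_common(limit)]

if __name__ == "__main__":
    co_usage = build_co_usage(SESSIONS)
    print(recommend(co_usage, "article-A"))
```

No user has to write a review or rating for this to work; the recommendation is a by-product of ordinary use.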
Differentiation in Discovery • Products increasingly specialized between public and academic libraries • Public libraries: emphasis on engagement with physical collection • Academic libraries: concern for discovery of heterogeneous material types, especially books + articles + digital objects
Continued emphasis on Index-based search • Serials Solutions: Summon • Ex Libris: Primo Central • OCLC: WorldCat Local • EBSCO: EBSCO Discovery Service • [Innovative: Encore Synergy]
Adoption trends • Great interest by academic libraries in Summon, EDS, Primo Central, WorldCat Local • Public Libraries: BiblioCommons adopted by major municipal libraries and consortia • Vendor specific discovery: LS2 PAC, Enterprise, Encore, Axiell Arena, Infor Iguana • AquaBrowser currently losing ground • New SaaS version from Serials Solutions
Association of Research Libraries www.librarytechnology.org/arl-discovery.pl
Pre-populated discovery indexes • New-generation interface • Harvested local content • ILS metadata • Institutional repositories, ETDs, Digital Collection platforms • Vendor-supplied indexes of library content • E-journals, databases, e-books • Full-text and metadata corresponding to e-content subscriptions • Book collections beyond local library collections
The Battle of the Mega Index • Working toward comprehensive representation of potential library content: ~1 billion items • Well within the thresholds of the capacity of modern search engine technologies • Apache SOLR used by most
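For readers unfamiliar with Solr, a search against a Solr-based index is an HTTP request to a `select` handler. The sketch below only builds such a request URL and sends nothing. The host, core name (`discovery`), and facet field names are assumptions for illustration; the parameters `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; no discovery vendor's actual URL is implied.
BASE_URL = "http://localhost:8983/solr/discovery/select"

def build_solr_query(text, facet_fields, rows=10):
    """Build a Solr select URL that asks for matching records plus facet counts."""
    params = {
        "q": text,
        "rows": rows,
        "wt": "json",
        "facet": "true",
        "facet.field": facet_fields,  # repeated once per field via doseq=True
    }
    return BASE_URL + "?" + urlencode(params, doseq=True)

if __name__ == "__main__":
    print(build_solr_query("open access", ["format", "language"]))
```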
Building the Index: Business strategies • Deals with publishers and providers to expose metadata and full-text for discovery • Interesting relationship among discovery service providers • Publishing business: Serials Solutions (ProQuest), EBSCO • Technology business: Ex Libris, OCLC (?) • Serials Solutions: ProQuest content + growing array of third party content • EDS: EBSCOhost content + growing array of third party content • OCLC & Ex Libris: Indexes built entirely out of third party content
The Challenge for Open Source • Open source discovery interfaces: • VuFind (Villanova University) • Blacklight (University of Virginia) • No open content mega index • Discovery has shifted from primarily a technology product to a content-driven product
Discovery Services and Publishers • Discovery services based on a central index depend on publishers and other content providers to cooperate in providing access to metadata or full text data • Not a publishing model – Users access content through publisher site
What’s in the Index? • Important to understand which resources from a library's collection are represented, or not, in its discovery service • Point of differentiation in selecting a discovery service • Point of differentiation in selecting content
Open Discovery Initiative • Project underway to address issues related to information providers, discovery service providers, and libraries • Protocols for transfer of content • Transparency of what is transferred and indexed • Rights or restrictions on how discovery services use content • Initial meeting at ALA Annual • Proposal under consideration by NISO • “Proposed New Work Item: Standards and Best Practices for Library Discovery Services Based on Indexed Search”
Citations / Metadata > Full Text • Citations or structured metadata provide key data to power search & retrieval and faceted navigation • Indexing full-text of content amplifies access • Important to understand depth of indexing • Currency, dates covered, full-text or citation • Many other factors
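Why structured metadata matters for faceted navigation can be shown with a few records: facets are counts over metadata fields, which full text alone does not supply. The records, field names, and the functions `facet_counts` and `apply_facet` below are hypothetical illustrations.

```python
from collections import Counter

# Invented sample records with structured metadata fields.
RECORDS = [
    {"title": "Next-Gen Library Catalogs", "format": "Book", "year": 2010},
    {"title": "Discovery layer usability", "format": "Article", "year": 2011},
    {"title": "Relevancy ranking at scale", "format": "Article", "year": 2010},
]

def facet_counts(records, field):
    """Count how many records carry each value of a metadata field."""
    return Counter(r[field] for r in records if field in r)

def apply_facet(records, field, value):
    """Narrow a result set to records whose field equals the selected facet value."""
    return [r for r in records if r.get(field) == value]

if __name__ == "__main__":
    print(facet_counts(RECORDS, "format"))
    print([r["title"] for r in apply_facet(RECORDS, "format", "Article")])
```

A record indexed only as full text, with no `format` or `year` field, would be searchable but would not appear in any facet.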
HathiTrust: • HathiTrust will expose SOLR index to discovery providers (Summon, Primo Central, WorldCat Local, EDS) • Introduces full-text book search into discovery services • A total of 8.4 million volumes • 4.6 million books • 200,000 serial titles • 3 billion pages of text
Challenge for Relevancy • Technically feasible to index hundreds of millions or billions of records through Lucene or SOLR • Difficult to order records in ways that make sense • Many fairly equivalent candidates returned for any given query • Must rely on use-based and social factors to improve relevancy rankings
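One way to picture use-based relevancy is to blend the text-match score with a damped usage signal, so that usage separates candidates whose text scores are nearly equal. This is a sketch under stated assumptions, not any discovery service's ranking formula: the weight, the logarithmic damping, the sample scores, and the names `blended_score` and `rerank` are all hypothetical.

```python
import math

def blended_score(text_score, usage_count, usage_weight=0.3):
    """Add a log-damped usage signal to the text-match score."""
    return text_score + usage_weight * math.log1p(usage_count)

def rerank(candidates, usage_weight=0.3):
    """Order candidates by blended score, highest first."""
    return sorted(
        candidates,
        key=lambda c: blended_score(c["text_score"], c["usage"], usage_weight),
        reverse=True,
    )

# Invented candidates with nearly identical text scores but different usage.
CANDIDATES = [
    {"id": "a", "text_score": 1.00, "usage": 0},
    {"id": "b", "text_score": 0.98, "usage": 50},
    {"id": "c", "text_score": 0.99, "usage": 5},
]

if __name__ == "__main__":
    print([c["id"] for c in rerank(CANDIDATES)])
```

Setting `usage_weight` to 0 reproduces the pure text ordering, which makes the effect of the usage signal easy to compare.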
From Discovery to Management • Serials Solutions: Summon > Web-scale management Solution • OCLC: WorldCat Local > Web-scale management Solution • Ex Libris: Primo > Alma
Re-coupled Discovery? • Decoupled interfaces emerged from broken online catalogs • Poor interfaces, inadequate scope • Inefficient integration between automation and discovery platforms • New wave of more tightly integrated suites: • Alma > Primo • Web-scale Management Services > WorldCat Local • Serials Solutions Web-scale Management Solution > Summon • Still possible to decouple, but more effort, worse results
Integration with e-book lending services • Current environment reflects weak integration: • Library catalog populated with MARC records representing e-book collection • Library users linked into e-book vendor site • Uses ILS patron authentication for patron validation and authorization • Need to move to deeper integration with more seamless user experience
Next-Gen Library Catalogs Marshall Breeding Neal-Schuman Publishers March 2010 Volume 1 of The Tech Set