Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006
Overview • The problem • Quick demo • Technical overview • Implementation process • Use data • Assessment data • Next steps
Why did we do this? • Existing catalogs are hard to use: • known-item searching works pretty well, but … • users often do keyword searches on topics and get large result sets returned in system sort order • catalogs are unforgiving of spelling errors and do no stemming • NO RELEVANCY!
Catalog value is buried • Subject headings are not leveraged in searching • they should be browsed or linked from, not searched • Data from the item record is not leveraged • should be able to filter by item type, location, circulation status, popularity
What does the Endeca software do? • Provides search software for ecommerce companies • Faceted browse of structured metadata; goal is to expose the ontology
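To make "faceted browse of structured metadata" concrete, here is a minimal sketch in Python. It is not Endeca's API: the records, field names, and the `refine` and `facet_counts` helpers are all hypothetical, and simply illustrate narrowing a result set by a selected value and then counting the remaining values in each dimension.

```python
from collections import Counter

# Hypothetical catalog records; the field names are illustrative only.
RECORDS = [
    {"title": "Intro to Statistics", "format": "Book", "language": "English", "topic": "Statistics"},
    {"title": "Statistique appliquee", "format": "Book", "language": "French", "topic": "Statistics"},
    {"title": "Statistics Lectures", "format": "DVD", "language": "English", "topic": "Statistics"},
    {"title": "Organic Chemistry", "format": "Book", "language": "English", "topic": "Chemistry"},
]


def refine(records, selections):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(dim) == value for dim, value in selections.items())
    ]


def facet_counts(records, dimensions):
    """Count how many records fall under each value of each dimension."""
    return {dim: Counter(r[dim] for r in records if dim in r) for dim in dimensions}


# The user picks "Subject: Topic = Statistics"; the remaining dimensions
# are shown with counts so the user can keep narrowing.
results = refine(RECORDS, {"topic": "Statistics"})
counts = facet_counts(results, ["format", "language"])
```

Each count tells the user how many results a further refinement would leave, which is what lets subject headings and item data be browsed rather than searched.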
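Endeca technical overview • NCSU exports and reformats raw MARC data into flat text files • Endeca Information Access Platform: the Data Foundry parses the text files and builds indices for the MDEX Engine • The NCSU Web Application queries the MDEX Engine over HTTP • The client browser reaches the NCSU Web Application over HTTP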
Integrating Endeca - Enhancements • MarcAdapter plugin for raw MARC data. • Eliminate need for external MARC 21 translation and file merging • Partial Updates • Update circulation data multiple times throughout the day
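The idea behind partial updates can be sketched as follows. This is an assumption-laden illustration, not Endeca's update mechanism: the in-memory `index`, the `apply_partial_update` helper, and the record ids are hypothetical. The point is that a circulation feed overwrites only the fields that changed, so the full bibliographic data does not have to be re-exported and re-indexed each time.

```python
# Hypothetical in-memory "index": record id -> indexed fields.
index = {
    "rec1": {"title": "Intro to Statistics", "availability": "Available"},
    "rec2": {"title": "Organic Chemistry", "availability": "Available"},
}


def apply_partial_update(index, updates):
    """Overwrite only the fields named in each update.

    Updates for ids that are not in the index are skipped.
    Returns the number of records that were updated.
    """
    applied = 0
    for rec_id, changed_fields in updates.items():
        if rec_id in index:
            index[rec_id].update(changed_fields)
            applied += 1
    return applied


# A circulation feed that arrives several times a day (hypothetical data).
circ_feed = {
    "rec2": {"availability": "Checked out"},
    "rec9": {"availability": "Available"},  # unknown id, ignored
}
applied = apply_partial_update(index, circ_feed)
```

Skipping unknown ids is a design choice made for this sketch; a real pipeline might instead queue them for the next full rebuild.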
Implementation process • Timeline • License / negotiation: Spring 2005 • Acquire: Summer 2005 • Implementation: August 2005 – January 12, 2006 • 7 representative team members • functional requirements, metadata, interface issues (total of 40-60 hours) • project manager: approximately 10 hours per week for 20 weeks • Java-trained librarian (30-40 hrs/wk for 14 weeks) • It doesn’t have to be perfect!
Key decision points • Search interface
Main search page • Endeca • Web2
A few major issues • Search interface • Selecting dimensions and their order
Dimensions • 1. Subject: Topic • 2. Subject: Genre • 3. Format • 4. Library • 5. Subject: Region • 6. Subject: Era • 7. Language • 8. Author • 9. Availability • 10. Library of Congress Classification
A few major issues • Search interface • Selecting dimensions and their order • Defining the relevance algorithm
Relevance defined • Relevance ranking in Endeca – select from a variety of modules and order them based on importance • At NCSU… • Original query term(s) (no thesaurus, stemming, spell correction) • Exact phrase match • Field ranking (Title higher than Author higher than Table of Contents, etc.) • Number of fields that contain term(s) …
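One way to picture "select modules and order them based on importance" is a lexicographic sort: each module scores a record, and a later module only breaks ties left by the earlier ones. The Python sketch below makes that assumption explicit. The module functions, the numeric field weights, and the sample records are hypothetical, and the first NCSU module (preferring the original query terms over thesaurus, stemming, or spell-corrected matches) is left out because the sketch does no query expansion.

```python
# Assumed weights encoding Title > Author > Table of Contents.
FIELD_WEIGHTS = {"title": 3, "author": 2, "toc": 1}


def exact_phrase(query, record):
    """1 if the whole query appears as a phrase in any field, else 0."""
    q = query.lower()
    return int(any(q in record.get(f, "").lower() for f in FIELD_WEIGHTS))


def best_field(query, record):
    """Weight of the highest-ranked field that contains every query term."""
    terms = query.lower().split()
    hits = [
        weight for field, weight in FIELD_WEIGHTS.items()
        if all(t in record.get(field, "").lower() for t in terms)
    ]
    return max(hits, default=0)


def fields_matched(query, record):
    """Number of fields that contain at least one query term."""
    terms = query.lower().split()
    return sum(
        any(t in record.get(f, "").lower() for t in terms) for f in FIELD_WEIGHTS
    )


# Order matters: earlier modules dominate later ones.
MODULES = [exact_phrase, best_field, fields_matched]


def rank(query, records):
    """Sort records by the tuple of module scores, best first."""
    return sorted(
        records,
        key=lambda r: tuple(module(query, r) for module in MODULES),
        reverse=True,
    )


records = [
    {"id": "r3", "title": "War and civil society", "author": "", "toc": ""},
    {"id": "r2", "title": "Reconstruction", "author": "Foner", "toc": "After the Civil War"},
    {"id": "r1", "title": "The Civil War", "author": "Foote", "toc": ""},
]
ranked_ids = [r["id"] for r in rank("civil war", records)]
```

In this toy data, r2 outranks r3 even though r3 has both terms in its title: the exact-phrase module comes first, so a phrase match in the table of contents beats a non-phrase match in a higher-ranked field.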
Some user reaction • “The new Endeca system is incredible. It would be difficult to exaggerate how much better it is than our old online card catalog (and therefore that of most other universities). I've found myself searching the catalog just for fun, whereas before it was a chore to find what I needed.” - NCSU Undergrad, Statistics • “The new library catalog search features are a big improvement over the old system. Not only is the search extremely fast, but seemingly it's much more intelligent as well.” - NCSU faculty, Psychology
Testing relevance • Are search results in Endeca more likely to be relevant to a user’s query than search results in Web2 OPAC? • 100 topical user searches from 1 month in fall 2005 • How many of top 5 results relevant? • 40% relevant in Web2 OPAC • 68% relevant in Endeca catalog
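A measure like "how many of the top 5 results are relevant" can be computed as mean precision at 5. The sketch below assumes that is how the percentages were derived, which the slides do not state; the function names and the judgments are made up for illustration and are not the study's data.

```python
def precision_at_k(judgments, k=5):
    """Fraction of the top-k results judged relevant.

    `judgments` is a list of booleans in ranked order; result sets with
    fewer than k items are still divided by k.
    """
    return sum(judgments[:k]) / k


def mean_precision_at_k(all_judgments, k=5):
    """Average precision-at-k over a set of searches."""
    return sum(precision_at_k(j, k) for j in all_judgments) / len(all_judgments)


# Two hypothetical searches with hand-assigned relevance judgments.
toy_judgments = [
    [True, True, True, True, False],     # 4 of the top 5 relevant
    [False, True, False, False, False],  # 1 of the top 5 relevant
]
score = mean_precision_at_k(toy_judgments)
```

Running the same judged query set against both catalogs and comparing the two scores gives a like-for-like comparison of the kind reported above.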
Future plans • FRBR-ized displays • FAST (Faceted Access to Subject Terms) instead of LCSH • Enrich records with supplemental content • More integration with website search • Use Endeca to index local collections
Thank you • Project page: www.lib.ncsu.edu/endeca