Enhancing Access to Databases: MultiSearch and Database Relevancy—the Integration of Two Collaborative Projects
Shirley Rodgers, James Jackson Sanborn
Enhancing Access to Databases – LITA Forum, Norfolk 2003, Shirley Rodgers and James M Jackson Sanborn
Database Access Problems
• Locating and selecting an appropriate database
• Multiple searches through multiple database interfaces, resulting in multiple result sets
Old Database Approach
Access to databases was clunky and non-intuitive.
• Alphabetical list
• Subject lists that were long and also alphabetical
Old Subject Page [screenshot]
Multiple Search Problem [diagram: each database has its own interface and requires its own search]
Multiple Search Problem
Patrons demanded a solution.
• Old “Locate Databases by Keywords”
• 79% of searches failed (>6k)
  • geodesic domes
  • stem cell and optical nerve
  • goat milk spider silk
  • factors that explain marital happiness when spouse lives in nursing home
Two Problems, Two Solutions: Database Relevancy Project
Database Relevancy Goals:
• Intuitive display of databases
• Improved subject access
• Maintainable solution
• Leverage existing data
Database Relevancy Plan:
• Sort databases by relevancy within subject area
• Provide additional information for databases ‘important’ to a subject area
• Automatically generate lists
Technical Details
Data:
• Drawn from catalog
  • MyLibrary Subject Headings (690 $x)
  • Descriptive notes (520 $a)
  • URL (856 $u)
• Three levels of relevancy assigned
  • Core, Narrow, Broad (690 $R)
  • Assigned at the subject level
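The plan of sorting databases by relevancy within each subject area can be sketched roughly as follows. This is an illustration in Python, not the project's code (the slides describe a MARC, XML, and XSLT pipeline). The record layout, the sample titles, and the assumption that Core ranks ahead of Narrow and Narrow ahead of Broad are all hypothetical.

```python
from collections import defaultdict

# Assumed display order, following the order the slide lists the levels in.
RELEVANCY_RANK = {"Core": 0, "Narrow": 1, "Broad": 2}

def group_by_subject(records):
    """Return {subject: [titles]}, titles sorted by relevancy rank, then by title."""
    by_subject = defaultdict(list)
    for rec in records:
        # Relevancy is assigned per subject, so one database can be
        # Core for one subject and Broad for another.
        for subject, level in rec["subjects"]:
            by_subject[subject].append((RELEVANCY_RANK[level], rec["title"]))
    return {s: [title for _, title in sorted(entries)]
            for s, entries in by_subject.items()}

# Hypothetical sample records (not taken from the catalog).
records = [
    {"title": "Database B", "subjects": [("Biology", "Broad"), ("Chemistry", "Core")]},
    {"title": "Database A", "subjects": [("Biology", "Core")]},
    {"title": "Database C", "subjects": [("Biology", "Narrow")]},
]
lists = group_by_subject(records)
```

Keeping the rank in a lookup table means a change in the ordering policy touches one line rather than the sorting logic.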
MARC transformed to XML using the Perl module MARC.pm
XML transformed multiple times using XSLT, processed through Saxon and called by brief Perl scripts.
Why XML
• Much easier to manipulate using XSLT than using Perl to directly manipulate MARC
• Simpler to use than importing MARC into a second database and using ColdFusion
• Easy to test on the desktop, then move to production
Limitations of XML/XSLT
• Multiple versions of MARC.XML
• XSLT has limited string processing functionality
• Need Perl to handle multiple file generation based on hash value pairs
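The last limitation, generating many output files from key/value pairs, might look something like the sketch below. The project did this in Perl with a hash; this Python version uses a dict instead, and the subject names, file names, and XML fragments are invented for illustration.

```python
import tempfile
from pathlib import Path

def write_subject_files(subject_files, fragments, out_dir):
    """Write one file per (subject, filename) pair and return the paths written."""
    out_dir = Path(out_dir)
    written = []
    for subject, filename in sorted(subject_files.items()):
        path = out_dir / filename
        # Each subject gets its own file; a missing fragment yields an empty file.
        path.write_text(fragments.get(subject, ""), encoding="utf-8")
        written.append(path)
    return written

# Hypothetical subject-to-filename pairs and per-subject content.
subject_files = {"Biology": "biology.xml", "Chemistry": "chemistry.xml"}
fragments = {
    "Biology": "<subject name='Biology'/>",
    "Chemistry": "<subject name='Chemistry'/>",
}

with tempfile.TemporaryDirectory() as tmp:
    paths = write_subject_files(subject_files, fragments, tmp)
    names = [p.name for p in paths]
    contents = [p.read_text(encoding="utf-8") for p in paths]
```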
Detail of Record [screenshots]
Collaboration
• Stakeholders brought in early
  • Subject specialists from Collection Management and Reference
  • Gave input on “look and feel” issues and functionality
  • Given final say on database relevancy
• Technical development in DLI and Systems departments
Two Problems, Two Solutions: MultiSearch Project
The beginning of MultiSearch
• BlueAngel MetaStar for indexing in-house collections and GIS
• Wanted to learn Java and JSP
• Testing it with other Z39.50 servers
• Prototype of cross-searching 2 major database vendors
• How many vendors support Z39.50?
How can I use what I prototyped?
• Static list of databases – subject and alphabetical
• Database relevancy pages created using XML/XSLT
• JSP can access XML files
• The projects came together!
How is XML used?
• JSP Xtags can access XML files
• Subject pages use XML and XSLT to display information
<xtags:style xml='<%=xmlfile%>' xsl='<%=xslfile%>'/>
Databases listed using XML/XSLT [screenshot]
How is XML used?
• List of databases to search created from parsing XML file
<xtags:forEach select="//record">
  <xtags:variable id="url856" type="string" select="field[@type='856']/subfield[@type='u']"/>
  <xtags:variable id="dbtitle" type="string" select="./field[@type='245']/subfield[@type='a']"/>
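For readers who do not use JSP, the same extraction can be approximated in Python with the standard library's ElementTree. The paths are the ones used in the Xtags excerpt. The sample XML, its `collection` root element, and the rule of skipping records without a URL are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only the record/field/subfield shape follows the slide.
SAMPLE_XML = """
<collection>
  <record>
    <field type="245"><subfield type="a">Example Database</subfield></field>
    <field type="856"><subfield type="u">http://example.org/db</subfield></field>
  </record>
  <record>
    <field type="245"><subfield type="a">No-URL Database</subfield></field>
  </record>
</collection>
"""

def search_targets(xml_text):
    """Return (title, url) pairs for every record that has both a 245$a and an 856$u."""
    root = ET.fromstring(xml_text)
    targets = []
    for record in root.iter("record"):  # equivalent of select="//record"
        url = record.findtext("field[@type='856']/subfield[@type='u']")
        title = record.findtext("field[@type='245']/subfield[@type='a']")
        if url and title:
            targets.append((title.strip(), url.strip()))
    return targets

targets = search_targets(SAMPLE_XML)
```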
Search targets obtained from XML file using Xtags [screenshot]
Querying the Z39.50 targets is easy! Working with the data you get back is another story!
Vendor differences
• Authentication
  • Username/passwords
  • IP authentication
• Z39.50 attributes
  • Word & WordList
  • Any & Anywhere
Vendor differences: Data Formats
• MARC
• SUTRS (Simple Unstructured Text Record Syntax)
  • Requires special processing to parse the “blob” and display the data
  • Can’t merge, de-dup or sort these records
Vendor differences: Source information (773 field)
• Contains the journal title, ISSN, year, volume, issue, and pages
• Used for E-Journal Finder and SFX
• Vendors use different subfields for this information
Example with the citation details in $g:
773$t Pet Product News
773$x 0899-2177
773$g May 1997, v51, n5, p64(2)
Example with everything packed into $t:
773$x 0003-0031
773$t American-Midland-Naturalist. 2003, 149: 1, 104-120; 39 ref.
Parts of that $t string: title (American-Midland-Naturalist), period, year (2003), volume (149), issue (1), pages (104-120)
Other formats found in the 773 field:
• Aquatic Toxicology [Acquit. Toxicol.]. Vol. 59, no. 3-4, pp. 163-175. 24 Sep 2002.
• Review of Palaeobotany and Palynology, 119 (1-2) pp. 93-112, 2002
• Indian-Journal-of-Animal-Sciences. 2002, 72: 12, 1122-1124; 10 ref.
• History-and-Theory. My 02; 41(2): 250-263
Vendor differences: Source information (773 field)
Challenge: get from this:
773$x 0003-0031
773$t American-Midland-Naturalist. 2003, 149: 1, 104-120; 39 ref.
To this: [screenshot]
How is this accomplished?
• Study patterns in the 773 field for the database
• Write SFX source parsers for each format to parse the 773 field into separate fields for ISSN, ISBN, volume, issue, start page and end page
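As a rough illustration of what one such format-specific parser has to do, the sketch below handles only the "Title. YEAR, VOLUME: ISSUE, START-END; N ref." shape shown in the 773$t examples. It is written in Python with a regular expression and does not use or imitate the SFX parser API. The function name and the returned field names are assumptions.

```python
import re

# Matches citations shaped like "Title. 2003, 149: 1, 104-120; 39 ref."
PATTERN = re.compile(
    r"^(?P<title>.+?)\.\s+"          # journal title, up to the period
    r"(?P<year>\d{4}),\s+"           # four-digit year
    r"(?P<volume>\d+):\s*"           # volume, followed by a colon
    r"(?P<issue>[\d-]+),\s+"         # issue (digits, possibly a range)
    r"(?P<spage>\d+)-(?P<epage>\d+)" # start and end page
)

def parse_773t(text):
    """Return a dict of citation parts, or None if the string has another shape."""
    match = PATTERN.match(text)
    return match.groupdict() if match else None

parsed = parse_773t("American-Midland-Naturalist. 2003, 149: 1, 104-120; 39 ref.")
unparsed = parse_773t("History-and-Theory. My 02; 41(2): 250-263")
```

Returning None for an unrecognised shape, rather than guessing, reflects the point of the slide: each vendor format needs its own parser.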
How is this accomplished?
• Store parser name for each database in a database
• Look up parser name and pass it to SFX in the OpenURL
sfx.lib.ncsu.edu:9003/ncsu?sid=MULTISEARCH:zsilver2&issn=1068-5472&isbn=&atitle=Phalaenopsis+orchid+plant+named+%27Anthura+Gold%27.&pid=US-pat-Plant.+%5BWashington%2C+D.C.+%3A+U.S.+Patent+and+Trademark+Office%2C+1976-.+May+21%2C+2002.+%2812%2C639%29+3+p.
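Building such a link can be sketched with Python's standard `urllib.parse`. The host, path, and parameter names (`sid`, `issn`, `isbn`, `atitle`, `pid`) come from the example URL above. The `http` scheme, the function name, and the reading of `zsilver2` as the parser name that follows `MULTISEARCH:` are assumptions.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Host and path as shown on the slide; the http scheme is assumed.
SFX_BASE = "http://sfx.lib.ncsu.edu:9003/ncsu"

def build_openurl(parser_name, issn="", isbn="", atitle="", pid=""):
    """Assemble an OpenURL whose sid carries the per-database parser name."""
    params = [
        ("sid", f"MULTISEARCH:{parser_name}"),
        ("issn", issn),
        ("isbn", isbn),
        ("atitle", atitle),
        ("pid", pid),
    ]
    # safe=":" keeps the colon in the sid readable, as in the slide's URL.
    return SFX_BASE + "?" + urlencode(params, safe=":")

url = build_openurl(
    "zsilver2",
    issn="1068-5472",
    atitle="Phalaenopsis orchid plant named 'Anthura Gold'.",
)
```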
Vendor differences: Full Text
• 856$u: link or PDF
• 900$a: text carried in repeated fields, for example:
900$a Magazine: Horticulture, December 2002
900$a SLIP INTO THE HOLIDAYS
900$a Whether you're in a mood to celebrate or not,
Rollout of Service
• Group created to design page layouts and functionality
• Decided to display all databases on results page, not just ones with Z39.50 search capabilities
• Provide link to search the non-Z39.50 databases directly
• Load tests to measure performance with more users
• Production – August 19th, 2002
MultiSearch & Database Finder Usage Statistics
April 2003 hits:
• Homepage 272,583
• Database Finder 44,813
• Subject Pages 38,676
• MultiSearch 13,372
Post Rollout
• Continued to work to add other vendors with Z39.50 access
• Changed the look of the subject page to make MultiSearch more noticeable
MultiSearch Version 2.0
• Converted from E-Journal Finder to SFX
• Advanced Search – allow users to select databases to search
• Merging, sorting, and de-duping results
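The slides do not say how merging, sorting, and de-duping were implemented. One plausible shape, shown here only as a hedged Python sketch, is to merge the per-vendor result lists, treat records with the same ISSN, volume, and start page as duplicates, and sort newest first. The duplicate key, the sort order, and the record layout are all assumptions. The sample values reuse citations quoted earlier in the deck.

```python
def merge_results(result_sets):
    """Merge result lists, drop duplicates, sort by year (newest first), then title."""
    seen = set()
    merged = []
    for results in result_sets:
        for rec in results:
            # Hypothetical duplicate key; the first copy seen is kept.
            key = (rec["issn"], rec["volume"], rec["spage"])
            if key not in seen:
                seen.add(key)
                merged.append(rec)
    return sorted(merged, key=lambda r: (-r["year"], r["title"].lower()))

naturalist = {"title": "American-Midland-Naturalist", "issn": "0003-0031",
              "year": 2003, "volume": "149", "spage": "104"}
pet_news = {"title": "Pet Product News", "issn": "0899-2177",
            "year": 1997, "volume": "51", "spage": "64"}

vendor_a = [pet_news, naturalist]
vendor_b = [dict(naturalist)]  # same article returned by a second vendor

merged = merge_results([vendor_a, vendor_b])
titles = [rec["title"] for rec in merged]
```

A key like this only works for records already parsed into separate fields, which is consistent with the earlier note that SUTRS records cannot be merged, de-duped, or sorted.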