Data harvesting with JavaScript to enhance record display in the OPAC Doug Eriksen - Seattle University May 17 - 20, 2009
Agenda • What can you harvest? • How do you harvest it? • Why harvester.js instead of WebBridge/Pathfinder Pro? • What can you do with the data? • Examples • Implementation Details
What data can you harvest? • Standard numbers • Title • Author • Persistent URL • Logged-in patron name • Almost anything, if you can think of a use for it
JavaScript harvesting • harvester.js - Client-side script scrapes the page for data and loads it into global variables • Custom wwwoptions create global variables • manipulator.js – inserts content built from the variables into empty placeholders built into bib_display.html, botlogo, or toplogo
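The placeholder step can be sketched roughly as follows. This is not the actual manipulator.js; the helper name `fillPlaceholder` and the placeholder id `worldcatLinks` are hypothetical, chosen only for illustration. The document object is passed in as a parameter so the function quietly does nothing on pages that lack the placeholder.

```javascript
// Hypothetical helper (not from manipulator.js): writes markup into an
// empty placeholder element, if that element exists on the current page.
// `doc` is the page's document object; `placeholderId` is whatever id you
// gave the empty element in bib_display.html, toplogo, or botlogo.
function fillPlaceholder(doc, placeholderId, html) {
  var target = doc.getElementById(placeholderId);
  if (!target) {
    return false; // this page has no such placeholder, so do nothing
  }
  target.innerHTML = html;
  return true;
}

// Browser usage (assumes harvester.js has already populated bib_OCLC and
// that the page contains <div id="worldcatLinks"></div>):
//
// if (typeof bib_OCLC !== "undefined" && bib_OCLC) {
//   fillPlaceholder(document, "worldcatLinks",
//     '<a href="http://worldcat.org/oclc/' + bib_OCLC + '">View in WorldCat</a>');
// }
```

Returning a boolean makes it easy to tell whether a given page actually carried the placeholder.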
Why not WebBridge/Pathfinder Pro? • Can’t harvest all the same data • Can’t insert content into arbitrary locations on the bib_display.html page • Limited to <!--{resourcetable}--> • Can’t be inserted inside toplogo or botlogo • Can’t do Google Books previews or covers
What can you do with the data? • Pass data to another web site or service • Offer Google Books previews, cover images, formatted citations, custom exporters • Build custom links in the manner of WebBridge/Pathfinder Pro • Rewrite and improve your page • Better page title • Better print display • Better bookmarks
Examples • Google Books viewer • WorldCat.org links • Rewrite your page title • Social bookmarking • Custom print view
Google Books button This button only appears if a scan of the book (full or partial) is available for preview
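One way to decide whether to show the button is sketched below. The slides do not say which Google service is used; this sketch assumes the Google Books Dynamic Links JSONP endpoint, whose response reports a `preview` value of `full`, `partial`, or `noview` for each requested key. The function names are hypothetical, and the ISBN is assumed to come from the harvested data.

```javascript
// Assumption: Google Books Dynamic Links endpoint (jscmd=viewapi).
// Builds the JSONP request URL for a single ISBN. `callbackName` is the
// name of a global function that will receive the response object.
function buildBooksLookupUrl(isbn, callbackName) {
  return "https://books.google.com/books?jscmd=viewapi" +
    "&bibkeys=" + encodeURIComponent("ISBN:" + isbn) +
    "&callback=" + encodeURIComponent(callbackName);
}

// Returns true only when the response reports a full or partial preview
// for this ISBN; a missing entry or "noview" means no button.
function hasPreview(response, isbn) {
  var info = response ? response["ISBN:" + isbn] : null;
  return !!info && (info.preview === "full" || info.preview === "partial");
}

// Browser usage (hypothetical callback name and button id):
//
// window.showBooksButton = function (response) {
//   if (hasPreview(response, bib_ISBN)) {
//     document.getElementById("gbsButton").style.display = "inline";
//   }
// };
// var s = document.createElement("script");
// s.src = buildBooksLookupUrl(bib_ISBN, "showBooksButton");
// document.getElementsByTagName("head")[0].appendChild(s);
```

Keeping the button hidden by default and revealing it in the callback means users never see a button that leads nowhere.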
Google Books Viewer This viewer loads right on top of your page in a lightbox. No need to send users to another page or open a pop-up window.
WorldCat.org links • With a title’s OCLC# you can construct URLs that lead to four very useful resources on WorldCat.org • Other Editions • Formatted Citations • RefWorks Export • Citation File (Endnote) Export
Construct a WorldCat.org URL • For the OCLC# 63186207 these are the URLs: • http://summit.worldcat.org/oclc/63186207/editions/ • http://worldcat.org/oclc/63186207?page=citation • http://worldcat.org.proxy.seattleu.edu/oclc/63186207?page=refworks – Note that I ran this link through my proxy server so RefWorks will recognize my off-campus users • http://worldcat.org/oclc/63186207?page=endnote • Easily built using the bib_OCLC global variable populated by harvester.js • Caution: OCLC could change these URLs anytime
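The four URLs above follow a simple pattern, so they can be generated from the OCLC number. A minimal sketch, assuming the URL patterns shown on this slide still hold; the function name `buildWorldCatUrls` and the `proxyHost` parameter are hypothetical.

```javascript
// Builds the four WorldCat.org URLs shown above from an OCLC number.
// `proxyHost` is optional: when given (e.g. "proxy.seattleu.edu"), it is
// appended to the RefWorks link's host, as on the slide.
function buildWorldCatUrls(oclcNumber, proxyHost) {
  var id = encodeURIComponent(String(oclcNumber));
  var refworksHost = proxyHost ? "worldcat.org." + proxyHost : "worldcat.org";
  return {
    editions: "http://summit.worldcat.org/oclc/" + id + "/editions/",
    citation: "http://worldcat.org/oclc/" + id + "?page=citation",
    refworks: "http://" + refworksHost + "/oclc/" + id + "?page=refworks",
    endnote: "http://worldcat.org/oclc/" + id + "?page=endnote"
  };
}

// Browser usage, once harvester.js has populated bib_OCLC:
//
// var links = buildWorldCatUrls(bib_OCLC, "proxy.seattleu.edu");
```

Keeping the patterns in one function means a change on OCLC's side only has to be fixed in one place.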
Rewrite your page title • You can customize the <title> element on some pages, but many pages get a system-generated <title> • In my catalog they look like this: • <title>Seattle University Libraries/Lemieux</title> • On a bibliographic display I would like to use the title from the record as my page title
Rewrite your page title • <script type="text/javascript"> document.title = "SU library - " + bib_title; </script> • [Screenshots: page title before and after]
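On pages where no title was harvested, the one-liner would produce a title ending in "undefined" or throw an error. A slightly more defensive sketch follows; the helper name `buildPageTitle` is hypothetical, and it assumes the harvested title may carry stray whitespace.

```javascript
// Returns prefix + cleaned title, or the fallback when no usable title
// was harvested. Whitespace runs are collapsed to single spaces.
function buildPageTitle(prefix, bibTitle, fallback) {
  var title = (bibTitle || "").replace(/\s+/g, " ").trim();
  return title ? prefix + title : fallback;
}

// Browser usage: keeps the system-generated title when bib_title is absent.
//
// document.title = buildPageTitle(
//   "SU library - ",
//   typeof bib_title !== "undefined" ? bib_title : "",
//   document.title
// );
```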
Social Bookmarking • Customize a social bookmarking tool, such as the AddThis button • Use the persistent URL instead of the current URL, and set the page title of your choice
Custom Print View Default record display in my OPAC
Custom Print View Print view from my OPAC, created with a print style sheet and JavaScript that inserts the patron name and the persistent URL
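The JavaScript half of the print view might look something like this. It is a sketch only: it assumes the globals `pName` and `url_full` set by the custom wwwoptions, and the helper name, the wording of the text, and the element id `printFooter` are all hypothetical.

```javascript
// Builds the text shown only in the print view. Either value may be
// missing (for example, when no patron is logged in), so each part is
// added only if it is present.
function buildPrintFooter(patronName, persistentUrl) {
  var parts = [];
  if (patronName) {
    parts.push("Printed for: " + patronName);
  }
  if (persistentUrl) {
    parts.push("Permanent URL: " + persistentUrl);
  }
  return parts.join(" | ");
}

// Browser usage (assumes an empty <div id="printFooter"></div> that the
// screen style sheet hides and the print style sheet shows):
//
// var footer = document.getElementById("printFooter");
// if (footer) {
//   footer.textContent = buildPrintFooter(
//     typeof pName !== "undefined" ? pName : "",
//     typeof url_full !== "undefined" ? url_full : ""
//   );
// }
```

Using `textContent` rather than `innerHTML` keeps a patron name containing markup characters from being interpreted as HTML.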
Implementation Details • wwwoptions • The wwwoption LOGGEDIN_MSG typically looks like this: • LOGGEDIN_MSG=You are logged into %s as %s • Replace it with: • LOGGEDIN_MSG=You are logged into %s as <script type="text/javascript">var pName = '%s'; document.write(pName);</script>
Implementation Details • wwwoptions • The wwwoption ICON_RECORDLINK typically looks like this: • ICON_RECORDLINK=<div class="bibRecordLink"><a id="recordnum" href="%s" >Permanent URL for this record</a></div> • Replace it with this: • ICON_RECORDLINK=<script type="text/javascript">var url_tail='%s'; var url_full='http://library.seattleu.edu' + url_tail;</script>
Implementation Details • pName = logged in patron’s name • url_full = persistent URL for the record • Where do we get the rest of the data? • harvester.js
harvester.js • Scans the page for table rows • Entire page, or div of your choosing • Looks for rows with two cells, and compares the contents of the first cell to a list of labels • When it finds a match for a label it harvests the data from the second cell and places it in a variable
harvester.js • Configuration variables • var label_tableContainer="fullRecord"; • var label_ISBN="ISBN"; • var label_ISSN="ISSN"; • var label_title="Title"; • var label_author="Author"; • var label_LCCN="LCCN"; • var label_OCLC="Record #";
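The row-scanning idea can be sketched as below. This is an illustration, not the real harvester.js: the actual script stores its results in global variables such as bib_OCLC and bib_title, whereas this sketch returns an object, and the function names, the label map, and the "first match wins" rule are assumptions.

```javascript
// Label map mirroring the configuration variables above.
var labels = {
  ISBN: "ISBN", ISSN: "ISSN", title: "Title",
  author: "Author", LCCN: "LCCN", OCLC: "Record #"
};

// Collapses whitespace runs and trims the ends.
function cleanText(text) {
  return String(text).replace(/\s+/g, " ").trim();
}

// `rows` is an array of rows, each an array of cell texts. Only rows with
// exactly two cells are considered; the first cell is compared against the
// labels and the second cell is harvested. Assumption: first match wins.
function harvestRows(rows, labelMap) {
  var harvested = {};
  for (var i = 0; i < rows.length; i++) {
    var cells = rows[i];
    if (cells.length !== 2) {
      continue;
    }
    var label = cleanText(cells[0]);
    for (var key in labelMap) {
      if (Object.prototype.hasOwnProperty.call(labelMap, key) &&
          labelMap[key] === label && !(key in harvested)) {
        harvested[key] = cleanText(cells[1]);
      }
    }
  }
  return harvested;
}

// Browser-only: reads cell text from every table row inside the container
// (or the whole page when no container id is given).
function readRows(doc, containerId) {
  var root = containerId ? doc.getElementById(containerId) : doc;
  var rows = [];
  if (!root) {
    return rows;
  }
  var trs = root.getElementsByTagName("tr");
  for (var i = 0; i < trs.length; i++) {
    var texts = [];
    for (var j = 0; j < trs[i].cells.length; j++) {
      texts.push(trs[i].cells[j].textContent || "");
    }
    rows.push(texts);
  }
  return rows;
}

// Browser usage: var bib = harvestRows(readRows(document, "fullRecord"), labels);
```

Separating the DOM reading from the label matching keeps the matching logic easy to check against sample rows without loading a catalog page.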
Live examples • With code! • http://www.seattleu.edu/lemlib/iug09/
Credits • Tina Ching from Seattle University Law Library helped design and present version one of this script. • Adam Brin, from Bryn Mawr, posted a script to the IUG clearinghouse that collects the persistent URL and the book title, and builds a social bookmarking button with the AddThis service. It was based on harvesting table rows and scanning them for labels to spot the data. • Erik Still from Boeing Library Services has been a sounding board and early tester for many of my “improvements” since Tina and I first presented this code.
See it in action • Seattle University, and SU Law • http://library.seattleu.edu • University of Oregon • http://janus.uoregon.edu/ • Mt. Hood Community College • http://lrc-srv.mhcc.edu/ • Bowdoin College • http://phebe.bowdoin.edu • Dowling College • http://library.dowling.edu/