E-Journal List to OpenURL Resolver. Kerry Bouchard, Asst. University Librarian for Automated Systems, Mary Couts Burnett Library, Texas Christian University.
Floundering Epoch – (pre-1998) • Publishers begin sending our Serials librarian “this publication is now available online at…” notices. Serials has to be involved in setting up titles. • Systems cobbles together a URL-rewriting proxy program to provide off-campus access to databases and e-journals (replaced by EZproxy in 2000) -- Systems needs to be involved in setting up titles. • Introduction of web-based online catalog makes linking to online journals from bib records possible -- Cataloging needs to be involved in setting up titles. • We begin subscribing to services like JSTOR and Project Muse. More titles pile up. • Our serials librarian, Janet Douglass, founds the “E-Team” to coordinate policies and procedures for tracking e-journals.
Hunter-Gatherer Era (1998-2001) The “E-Team” begins tracking subscription e-journals. • Relational database of titles built. We record: • URL for the journal main contents (volumes/issues) pages • A local PURL. • Authentication method. With databases there is some variation, but 98% of our e-journals use IP recognition. Off-campus users are routed through proxy. • Subject assignments. Initially assigned automatically based on call number. Quickly saw need to assign multiple headings to titles, so added a subject table. • At the end of the E-Team assembly line, cataloger adds the PURL to the bib record (i.e., our bib record URLs are PURLs pointing to the local relational database.)
…Hunter-Gatherer Era (1998-2001) Example PURL: http://lib.tcu.edu/PURL/ejournal.asp?JSTOR:00030031 • Script for processing all e-journal links • Vendor identifier – script uses this to look up data on proxy method required for off-campus access, and to record a “click count” for the source in our usage log table • Journal identifier (usually ISSN)
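The PURL format above can be pulled apart with a few lines of script. This is a minimal sketch in Python, not the production script: the function name `parse_purl` and the in-memory `click_counts` counter (standing in for the usage log table) are assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

# Stand-in for the usage log table: one click count per vendor identifier.
click_counts = Counter()

def parse_purl(purl):
    """Split an e-journal PURL into (vendor identifier, journal identifier)."""
    query = urlsplit(purl).query                    # e.g. "JSTOR:00030031"
    # Split on the first colon only, since the journal part may itself be a URL.
    vendor, sep, journal = query.partition(":")
    if not (sep and vendor and journal):
        raise ValueError(f"not a vendor:journal PURL: {purl!r}")
    click_counts[vendor] += 1                       # record the click for this source
    return vendor, journal

print(parse_purl("http://lib.tcu.edu/PURL/ejournal.asp?JSTOR:00030031"))
```

The vendor identifier would then be the key for looking up the proxy method for off-campus users.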
…Hunter-Gatherer Era (1998-2001) E-Team does not attempt to track: • Dates of coverage for each source • Free publications, except a handful specifically requested by faculty. • The thousands of full-text titles “hidden” inside full text databases like Academic Search Premier.
Sample Title List (screenshot): no dates of coverage, just general notes.
Vocabulary Break You say “aggregator”, I say “fulltext database”. Is OCLC ECO an “aggregation” or a database? Maybe all we really care about are the differences between…
Staff Perspective: Do we get all titles from this source? • Examples: • Citation databases that contain full text for some / all journals cited (“Expanded Academic Index”, “Academic Search Premier”, etc.) • License agreements for everything available from a collection like Project Muse Or Only a Selection of Titles? • Examples: • E-journal collections like OCLC ECO, Highwire Press; JSTOR; IEEE journals, etc. Doesn’t really matter if all the titles in the collection are from the same publisher, since we don’t necessarily subscribe to everything from a given publisher either.
Agriculture Invented: Fulltext Journal Locator v1.0 Our then computer services librarian, James Lutz, downloads lists of journal holdings from various fulltext databases and sets up a SQL “union query” to merge all these lists with the collections tracked by the “E-Team”:
Agriculture Invented: Fulltext Journal Locator v1.0 (Screenshot callouts:) • Downloaded links (take user to database search screen) • Links to E-Team records (take user to list of available volumes/issues)
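For readers who have not written one, a union query of the kind described above might look roughly like this. It is only a sketch using Python's built-in sqlite3 module: the table names, column names, sample titles, ISSNs and example.org URLs are all invented, and the real locator's schema is not shown in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: one for E-Team records, one for a downloaded vendor list.
    CREATE TABLE eteam_titles (title TEXT, issn TEXT, purl TEXT);
    CREATE TABLE vendor_list (title TEXT, issn TEXT, search_url TEXT, coverage TEXT);
    INSERT INTO eteam_titles VALUES
        ('Sample Journal A', '1111-1111', 'http://example.org/PURL/ejournal?VendorX:11111111');
    INSERT INTO vendor_list VALUES
        ('Sample Journal A', '1111-1111', 'http://example.org/db/search', '1995 to present'),
        ('Sample Journal B', '2222-2222', 'http://example.org/db/search', '1998 to 2001');
""")

# UNION ALL keeps every row from both lists; E-Team rows have no coverage dates.
LOCATOR_QUERY = """
    SELECT title, issn, purl AS url, NULL AS coverage, 'E-Team' AS origin
    FROM eteam_titles
    UNION ALL
    SELECT title, issn, search_url, coverage, 'Vendor list'
    FROM vendor_list
    ORDER BY title, origin
"""

rows = conn.execute(LOCATOR_QUERY).fetchall()
for row in rows:
    print(row)
```

The NULL coverage column on the E-Team rows is the "looks odd next to the vendor records" problem in miniature.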
…Fulltext Journal Locator v1.0 • Vendor journal lists are inconsistent – different vendors supply different info, in different formats. • Some citation databases list every journal they *cite* -- many links turn out not to be full text. • Our local “E-Team” list doesn’t have dates of coverage – these links look odd next to the vendor records that do have dates • Finally concluded that it was too much work to keep the lists up to date.
Agribusiness: We pay someone to harvest e-journal info for us. • Summer of 2001 we begin subscribing to SerialsSolutions journal tracking system. Every two months they send an updated list of our e-journal holdings from all our sources. This: • Eliminates the need to manually download holdings lists from all our fulltext database vendors – everything is now in one file. Locator data can now be relied on to be accurate. • Gives us dates of coverage for all the collections tracked by the “E-Team” • Fall 2002 we add a MARC record feed to our S.S. contract. • Means that we have MARC records with URLs for about 25,000 titles (35,000+ links) that were not tracked by the “E-Team”.
…Now our old list is obsolete and incomplete • At this point we have around 25,000 e-journal titles in our catalog and our Fulltext Journal Locator, versus the 2,000 titles entered with subject headings by the “E-Team”. • E-Team replaces the alphabetical-by-title and subject lists of e-journals pages on the library web site with pages telling users to search the catalog or fulltext journal locator, since these lists are much more complete.
…But people really liked the old lists • Several faculty and graduate students strongly object to having these lists removed. They don’t feel that searching the online catalog is an adequate substitute for being able to browse the titles in their field. • LC subject headings don’t always work well for browsing -- no consistent mapping of specific to broad headings. • If our online catalog software had the capability to retrieve all titles in a range of call numbers, that would partly address the problem. • There would still be the problem that we want to assign more than one broad subject category to a single title in many cases, and that’s not possible if the subject assignments are derived from call numbers.
…So we put the lists back • The E-Team reinstates the title and subject lists on the web site, with modifications to the underlying relational database so that: • We can differentiate sources that allow browsing by volume and issue from sources that take users to a search screen. Only sources that allow browsing by volume and issue (the same ones previously tracked by the E-Team) are displayed on the title and subject lists. (Diagram callouts: data from SerialsSolutions; our local list of “browsable” sources, whose “SSName” field links to the S.S. “Provider” field.)
…So we put the lists back • We can continue assigning non-LC subject categories to titles, so that departments can browse “their” lists. (Subject categories are linked to ISSNs, so once we assign a subject heading to one source of the title, additional sources of the same journal automatically get mapped to the same headings.) (Diagram callouts: subject codes (“Geology”, etc.); a table linking codes to ISSNs; ISSNs link to SerialsSolutions records.)
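The two table relationships just described (browsable sources matched on provider name, subjects matched on ISSN) can be sketched as a pair of joins. Everything concrete here is an assumption for illustration: the table and column names other than the SSName/Provider pairing, the "Hypothetical" providers, the example.org URLs, and the subject assignments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Vendor feed: one row per (journal, source). Sample rows are invented.
    CREATE TABLE ss_holdings (issn TEXT, title TEXT, provider TEXT, url TEXT);
    -- Local list of sources that allow browsing by volume and issue.
    CREATE TABLE browsable_sources (ssname TEXT PRIMARY KEY);
    -- Local subject assignments, keyed by ISSN rather than by source.
    CREATE TABLE subject_issn (subject TEXT, issn TEXT);

    INSERT INTO ss_holdings VALUES
        ('0009-2541', 'Chemical Geology', 'ScienceDirect', 'http://example.org/browse/00092541'),
        ('0009-2541', 'Chemical Geology', 'Hypothetical Fulltext DB', 'http://example.org/search');
    INSERT INTO browsable_sources VALUES ('ScienceDirect');
    INSERT INTO subject_issn VALUES ('Geology', '0009-2541'), ('Chemistry', '0009-2541');
""")

SUBJECT_LIST = """
    SELECT h.title, h.provider, h.url
    FROM subject_issn AS s
    JOIN ss_holdings AS h ON h.issn = s.issn
    JOIN browsable_sources AS b ON b.ssname = h.provider
    WHERE s.subject = ?
    ORDER BY h.title, h.provider
"""

def titles_for_subject(subject):
    """Return (title, provider, url) rows for browsable sources under a subject."""
    return conn.execute(SUBJECT_LIST, (subject,)).fetchall()

print(titles_for_subject("Geology"))   # only the browsable source is listed

# A new browsable source for the same ISSN needs no new subject assignment.
conn.execute("INSERT INTO browsable_sources VALUES ('Hypothetical Ejournal Collection')")
conn.execute(
    "INSERT INTO ss_holdings VALUES ('0009-2541', 'Chemical Geology', "
    "'Hypothetical Ejournal Collection', 'http://example.org/browse2/00092541')"
)
print(titles_for_subject("Chemistry"))
```

Because subjects hang off the ISSN, one title can carry several broad categories, which is what call-number-derived headings could not do.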
Citation databases begin offering linking capabilities. Some are explicitly OpenURL based, and others are not. Either way it is possible to make journal-level linking work by simply modifying the Fulltext Journal Locator script so that it can search by ISSN as well as by words in the title. (Script will search by two ISSNs – print and electronic – if the citation links provide them.) Not all the journals in the S.S. data have ISSNs, however.
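A rough sketch of that lookup logic, in Python with an in-memory list in place of the database. The function names, the sample holdings, and in particular the choice to fall back to a title-word search when no ISSN matches are assumptions; the slides only say the script can search either way.

```python
def normalize_issn(issn):
    """Compare ISSNs without hyphens and with an upper-case check digit."""
    return issn.replace("-", "").strip().upper()

def find_sources(holdings, issns=(), title_words=""):
    """Look up sources by print/electronic ISSN, else by words in the title."""
    wanted = {normalize_issn(i) for i in issns if i}
    if wanted:
        hits = [h for h in holdings
                if h.get("issn") and normalize_issn(h["issn"]) in wanted]
        if hits:
            return hits
    # Fallback (an assumption): some records have no ISSN, so try the title.
    words = title_words.lower().split()
    if not words:
        return []
    return [h for h in holdings if all(w in h["title"].lower() for w in words)]

# Invented sample data; the second record deliberately has no ISSN.
HOLDINGS = [
    {"title": "Chemical Geology", "issn": "0009-2541", "provider": "ScienceDirect"},
    {"title": "Hypothetical Review of Geology", "issn": None,
     "provider": "Hypothetical Fulltext DB"},
]

print(find_sources(HOLDINGS, issns=["0009-2541", ""]))
print(find_sources(HOLDINGS, title_words="review geology"))
```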
Citation with “OpenURL like” link URL for the link above: http://lib.tcu.edu/PURL/OpenURL.asp?genre=journal&ISSN=0009-2541&DT=20030615&TI=Carbon%20isotope%20exchange%20rate%20of%20DIC%20in%20karst%20groundwater%2E&JN=Chemical%20Geology&VI=197&IP=1-4&AU=Gonfiantini%2C%20Roberto&spage=319 (example from Academic Search Premier) ISSN is all we need for journal-level link
Front end OpenURL resolver At this point we provide separate links for the catalog and locator, because MARC records aren’t yet available for all the titles in the locator. When that changes, a separate full text journal locator will no longer be needed for journal-level linking. (Screenshot callouts: Link to search Fulltext Journal Locator; Link to search online catalog; http://libnt4.lib.tcu.edu/PURL/JournalLocator.asp {issn passed in hidden form field}.)
OpenURL Resolver Result Screen URL in link above: http://lib.tcu.edu/PURL/ejournal.asp?ScienceDirect:http://www.sciencedirect.com/web-editions/journal/00092541 Used to retrieve local info about TCU’s ScienceDirect subscription
Article Level Linking The OpenURL provided by Academic Search Premier in the previous example contained all the information needed for building an article-level link: http://lib.tcu.edu/PURL/OpenURL.asp?genre=journal&ISSN=0009-2541&DT=20030615&TI=Carbon%20isotope%20exchange%20rate%20of%20DIC%20in%20karst%20groundwater%2E&JN=Chemical%20Geology&VI=197&IP=1-4&AU=Gonfiantini%2C%20Roberto&spage=319 • ISSN – tells us it’s a journal and which one • DT – date the article was published, in YYYYMMDD format • TI – title of the article (may or may not be needed, depending on how the target system builds article-level links) • JN – title of the journal • VI – volume • IP – issue • AU – author (required by some target systems) • spage – starting page of the article
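Pulling those elements out of the link is standard query-string parsing. A small sketch using Python's urllib.parse (the function name `parse_citation` is made up here; the parameter names are the ones in the Academic Search Premier example):

```python
from urllib.parse import parse_qs, urlsplit

EXAMPLE = (
    "http://lib.tcu.edu/PURL/OpenURL.asp?genre=journal&ISSN=0009-2541"
    "&DT=20030615"
    "&TI=Carbon%20isotope%20exchange%20rate%20of%20DIC%20in%20karst%20groundwater%2E"
    "&JN=Chemical%20Geology&VI=197&IP=1-4&AU=Gonfiantini%2C%20Roberto&spage=319"
)

def parse_citation(openurl):
    """Return the citation metadata in an OpenURL-style link as a plain dict."""
    params = parse_qs(urlsplit(openurl).query)   # values are percent-decoded lists
    return {key: values[0] for key, values in params.items()}

citation = parse_citation(EXAMPLE)
for key in ("ISSN", "DT", "JN", "VI", "IP", "AU", "spage"):
    print(key, "=", citation[key])
```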
…Article Level Linking • Two enhancements to the Fulltext Journal Locator / OpenURL Resolver script are required to make article-level linking work: • A date normalization routine, so that we can compare the date an article was published to dates of coverage for each potential source of the article • A function to convert metadata in OpenURL format to a format that can be used to create an article-level link for a given source. Each vendor may require its own function (example to follow).
Date normalization Dates supplied in the OpenURL links are already normalized (in YYYYMMDD format). However, dates we get in our S.S. data feed appear in a variety of ways: • “Winter 2002” • “12/17/1998” • “1 month ago” (as in, “2001 to 1 month ago”) • [blank] (a blank End Date field means “to present”) • “1999” (no month supplied) Fortunately, there are only a small number of variations like this, so it was not too difficult to write a date normalization routine that converts the SerialsSolutions supplied starting and ending dates for a publication to YYYYMMDD format; these can then be compared to the date of publication supplied in an OpenURL to see if we have full text for the issue in which the article was published.
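A routine along those lines might look like the sketch below. It is not the routine we run; it handles only the five variations listed, and several choices are assumptions: the season-to-month mapping, treating a year-only end date as December 31, the "99991231" stand-in for "to present", and clamping the day to 28 for relative dates.

```python
import calendar
import re
from datetime import date

OPEN_END = "99991231"  # stand-in for a blank End Date ("to present")
# Assumed season ranges (first month, last month); not taken from the vendor data.
SEASONS = {"winter": (1, 3), "spring": (4, 6), "summer": (7, 9),
           "fall": (10, 12), "autumn": (10, 12)}

def normalize_date(raw, is_end=False, today=None):
    """Convert one coverage date from the feed to a YYYYMMDD string."""
    today = today or date.today()
    text = (raw or "").strip().lower()
    if not text:
        if is_end:
            return OPEN_END
        raise ValueError("blank start date")
    match = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", text)   # 12/17/1998
    if match:
        month, day, year = (int(part) for part in match.groups())
        return f"{year:04d}{month:02d}{day:02d}"
    match = re.fullmatch(r"(\d+) (month|year)s? ago", text)       # 1 month ago
    if match:
        months_back = int(match.group(1)) * (12 if match.group(2) == "year" else 1)
        total = today.year * 12 + (today.month - 1) - months_back
        return f"{total // 12:04d}{total % 12 + 1:02d}{min(today.day, 28):02d}"
    match = re.fullmatch(r"([a-z]+) (\d{4})", text)               # Winter 2002
    if match and match.group(1) in SEASONS:
        first, last = SEASONS[match.group(1)]
        year = int(match.group(2))
        if is_end:
            return f"{year:04d}{last:02d}{calendar.monthrange(year, last)[1]:02d}"
        return f"{year:04d}{first:02d}01"
    if re.fullmatch(r"\d{4}", text):                              # 1999
        return text + ("1231" if is_end else "0101")
    raise ValueError(f"unrecognized date: {raw!r}")

def covers(article_date, start_raw, end_raw, today=None):
    """True if a YYYYMMDD article date falls inside a source's coverage range."""
    start = normalize_date(start_raw, is_end=False, today=today)
    end = normalize_date(end_raw, is_end=True, today=today)
    return start <= article_date <= end

print(covers("20030615", "2001", "1 month ago", today=date(2003, 7, 1)))
print(covers("20030615", "12/17/1998", ""))
```

Because every value ends up as an eight-digit string, plain string comparison is enough to test coverage.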
Example metadata parser - JSTOR The JSTOR web site contains documentation for building article-level links to JSTOR content using SICI codes. To construct a SICI, we take the OpenURL metadata elements we ignored earlier and convert them into a SICI. So the OpenURL data: genre=journal&ISSN=0002-9475&DT=19950901&TI=Ancient%20anagrams%2E&JN=American%20Journal%20of%20Philology&VI=116&IP=3&AU=Cameron%2C%20Alan&spage=477 Becomes: sici=0002%2D9475%2819950901%29116%3A3%3C477%3A%3E2%2E0%2ECO%3B2%2D%23&origin=tcu
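The conversion above can be sketched in a few lines. This is an illustration rather than our production parser: the function names are invented, and the trailing "2.0.CO;2-#" segment and the "origin=tcu" parameter are simply copied from the example, not derived from the SICI standard or JSTOR's documentation.

```python
def encode_all(text):
    """Percent-encode every character except ASCII letters and digits."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch.isascii() and ch.isalnum() else f"%{byte:02X}")
    return "".join(out)

def jstor_query(citation, origin="tcu"):
    """Build the sici=...&origin=... query string from OpenURL metadata."""
    sici = (
        f"{citation['ISSN']}({citation['DT']})"
        f"{citation['VI']}:{citation['IP']}"
        f"<{citation['spage']}:>"
        "2.0.CO;2-#"   # fixed tail, copied verbatim from the example
    )
    return f"sici={encode_all(sici)}&origin={origin}"

example = {"ISSN": "0002-9475", "DT": "19950901", "VI": "116",
           "IP": "3", "spage": "477"}
print(jstor_query(example))
```

The article title, journal name and author are not needed for this target, which is why each vendor ends up with its own small function.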
…Example metadata parser - JSTOR And the user sees: (screenshot callouts: article-level link using SICI; journal-level link using old method). Project Muse also supports article-level linking, but in this case the date parser saw that Muse would not have full text for an article published in 1995.
Article-level linking issues • A journal source must not only support some kind of article-level linking; it must also use links that can be derived from metadata. For example, an accession numbering system (e.g., “http://www.somevendor.com/journals/articles/23090238762.html”) won’t work. • It would be difficult to use MARC records for building article-level links: • It is much more difficult to extract and normalize dates as they’re presented in the MARC 856 fields (MARC fixed date fields can’t be used; they give the date range for the publication itself, not the dates of coverage for a particular source). • Scripting tools (PHP, ASP, Perl, etc.) make it much easier to pull data from a relational database table than from a MARC record (at least with our current ILS software).
Modern era? (Agribusiness crushes the small farmer) At least two vendors we work with have announced article-level linking solutions that may eliminate the need for local work on this. So the Systems Librarian has to shift back from writing cool metadata parsers to attending committee meetings.
…But maybe there’s still some work before we send the farmer off to the knackers… • Link from the OpenURL resolver to our Interlibrary Loan system when we don’t have access to online or print. • Still a need to track some titles and sources locally and integrate these into the hitlists for the OpenURL resolver, since there are still some sources that aren’t tracked by our e-journal info vendor. • Need to keep up with “fulltext database” vendors as they add journal-level/article-level linking capability, so that we can change our URLs to take advantage of that instead of sending users to a search screen.
In Conclusion… • If you have a list of your e-journals in a relational database • Either with data you entered yourself • Or data you get from a vendor tracking your holdings for you • And if the info on each e-journal includes an ISSN • And if someone on your staff knows how to write web server scripts that retrieve info from the e-journal database… • Then you can turn your e-journal list into an OpenURL resolver with about 20 minutes’ work. (That covers journal-level links; article-level linking takes longer.)
Kerry Bouchard Asst. University Librarian for Automated Systems Mary Couts Burnett Library Texas Christian University Email: k.bouchard@tcu.edu This presentation online: http://lib.tcu.edu/staff/bouchard/OpenURL/Amigos2003.htm