The Catalogue as Master File
CHLA 2004
Lisa Goddard and Louise White
Catalogue and Website
• Most libraries maintain both an online catalogue and a website
• We find ourselves reproducing cataloguing information on the website to provide multiple points of access
• Common duplication: bib records for databases and the database list
Database List and Catalogue
• At Memorial we maintained several lists of electronic indexes
• “Database Lists” were maintained as static HTML pages, manually updated
• Had two lists: a comprehensive one accessed from the homepage, and a selected one (catalogued databases only) accessed from Unicorn
Static HTML Lists – the Problems
• Constant updates
• Duplicating information under multiple subject headings
• Inconsistent entry style
• Catalogue and website maintained by different departments: a communications challenge
The Wish List
• Wanted a single list: easily done, but we wanted more
• Wanted one list made easy
• Wanted title and subject access
• Wanted to reflect frequent recommendations
• Wanted an end product designed for the novice user but expert friendly
The Source
• One authoritative entry point for data: the Catalogue
• Did not have to invent an entry point; it already existed
• Could harvest data and create an easy custom interface for information retrieval
Team Players
• Formed an all-branch working group
• Representatives from:
  • Cataloguing: MARC record expertise
  • Reference: controlled vocabulary for subjects and subject classification
  • Systems: data harvesting and display
Consensus on a Schedule
• Set a deadline of August 2002 (it was now April 2002)
• Tight timeline kept us focused
• Kept in close contact with the Library Instruction coordinator, who was designing a library research web tutorial
The Final Product
Characteristics:
• All required elements are included in the MARC record
• Title and subject access available
• Subject access tiered: Highly Recommended and Also Recommended
• Live URLs for databases and user guides
Elements in the Bib Record
• Item Format: EIndex
• Title: 245
• Local Note: 590
• Coverage: 856 |z
• Summary: 520
Elements in the Bib Record
• Highly Recommended: 691 indicator 1
• Also Recommended: 691 indicator 2
• Database URL: 856 indicator 40
• User Guide URL: 856 indicator 42
Subject Access
• Decided on a master list of subjects
• Controlled vocabulary: a combination of subjects and disciplines
• Some apparent duplication: English literature / Literature and Language
• Restricted the number of databases that could be Highly Recommended; otherwise the notion is diluted
What’s in a name?
• Point of much discussion
• Recognized that “database” was a legitimate term for internal use
• Needed a more descriptive term for external use
• Wanted the word “article” in there as most likely to get the attention of users
• Wanted to reinforce “index” in users’ vocabulary
What’s in a name?
• Settled on Article Indexes
• Not unique
• The majority of databases are purely indexes
• Also lists full text databases, databases of statistics…
• Thought about “Article Indexes Plus” but kept hearing “Plus what?” in our own heads
The Catalogue as Master File
Part II: Technical Overview – Creating the Interface
CHLA 2004
Louise White & Lisa Goddard
Memorial University Libraries
Technical Overview: Creating the Interface
• Library Catalogue: Report runs daily to dump Eindex MARC records as text. This file is automatically FTP’d to the web server every day.
• Perl Script: Removes extraneous data from the MARC record. Reformats data for load into the SQL database.
• SQL Database: Report scheduled to reload the table every day from the Perl text file. Can be queried by the web interface.
• Web Server: Contains the ASP interface, which formats data into dynamic web pages.
Retrieving the Bibliographic Data
• MARC records are created, updated and managed in Unicorn.
• Unicorn enforces cataloguing policies and standards to ensure record integrity.
• Unicorn applies access rights for record editing and creation.
Why not create the interface within the OPAC?
• Database engine does not support SQL queries.
• Unicorn uses proprietary, compiled code.
• Complicated to create highly customized web interfaces.
Getting the Eindex Records from the Catalogue
• Uses the catalogue’s built-in reporting tools.
• Report selects records with item format “EINDEX” and writes their MARC records to a text file.
• Report runs daily so the text file is always current.
• File is automatically FTP’d to the web server every day.
Diagram: Catalogue → Daily Report → Eindex MARC Records → FTP → Web server
The Catalogue Output
• Text file created by the Unicorn report contains well-formatted MARC records with tags and indicators:
    *** DOCUMENT BOUNDARY ***
    FORM=SERIAL
    .000. |aas a0c
    .001. |aocm38313827
    .003. |aOCoLC
    .005. |a20020605111040.0
    .006. |am e
    .007. |acr un---------
    .008. |a980204c19639999maumx p si 0 0eng d
    .035. |a(Sirsi) o38313827
    .040. |aBBH|cBBH|dOCL|dOCLCQ
    .050. 4|aZ7006|b.M64 INTERNET
    .130. 0 |aMLA international bibliography (Online : SilverPlatter
    .245. 00|aMLA international bibliography|h[electronic resource]
    .246. 3 |aModern Language Association international bibliography
    .246. 1 |iTitle bar title :|aMLA bibliography|b(Online : Silver
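The record layout above can be read mechanically. The following is a minimal sketch in Python (not the Perl used in the project) of one way to split such a dump into records and fields. The function name `parse_dump` and the shortened sample text are hypothetical, and the sketch assumes every field line has the shape `.TAG. indicators|subfields` as in the excerpt.

```python
import re

BOUNDARY = "*** DOCUMENT BOUNDARY ***"
# ".TAG." then optional indicators, then subfields introduced by "|"
FIELD_RE = re.compile(r"^\.(\d{3})\.\s*([^|]*)\|(.*)$")


def parse_dump(text):
    """Split a flat-text dump into records.

    Each record is a list of (tag, indicators, {subfield_code: value}) tuples.
    Whitespace around indicators is trimmed, so the position of a single
    indicator is not preserved (a simplification for this sketch).
    """
    records = []
    for chunk in text.split(BOUNDARY):
        fields = []
        for line in chunk.splitlines():
            match = FIELD_RE.match(line.strip())
            if not match:
                continue  # skips blank lines and headers such as FORM=SERIAL
            tag, indicators, rest = match.groups()
            subfields = {}
            for part in rest.split("|"):
                if part:
                    # first character is the subfield code; keep the first value seen
                    subfields.setdefault(part[0], part[1:].strip())
            fields.append((tag, indicators.strip(), subfields))
        if fields:
            records.append(fields)
    return records


# Shortened sample in the same layout as the slide (the .691. line is illustrative)
sample = """*** DOCUMENT BOUNDARY ***
FORM=SERIAL
.001. |aocm38313827
.245. 00|aMLA international bibliography|h[electronic resource]
.691. 2|aBiology
"""
records = parse_dump(sample)
```

Each record comes back as a plain list, so later steps can filter by tag and indicator without re-reading the file.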
Reformatting the MARC Record
• Extract only pertinent fields from the MARC record:
  • Title: .245. |a
  • Subject heading (highly recommended): .691. 1 |a
  • Subject heading (also recommended): .691. 2 |a
  • URL to resource: .856. 40 |u
  • Holdings: .856. 40 |a
  • Access note: .590. |a
  • Description: .520. |a
  • URL to user guide: .856. 42 |a
• Remove all MARC tags, subfield indicators and clutter.
• Save the results in a format that can be loaded into an SQL DB.
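As an illustration of the extraction step, here is a hedged Python sketch (the project itself used Perl) that keeps only the tag, indicator and subfield combinations listed above. The table `PERTINENT` copies the slide; the function name `extract` and the sample lines are assumptions made for the example.

```python
import re

# (label, tag, indicators or None for "any", subfield code), copied from the slide
PERTINENT = [
    ("Title", "245", None, "a"),
    ("Subject heading - highly recommended", "691", "1", "a"),
    ("Subject heading - also recommended", "691", "2", "a"),
    ("URL to resource", "856", "40", "u"),
    ("Holdings", "856", "40", "a"),
    ("Access note", "590", None, "a"),
    ("Description", "520", None, "a"),
    ("URL to user guide", "856", "42", "a"),
]

LINE_RE = re.compile(r"^\.(\d{3})\.\s*([^|]*)(\|.*)$")


def extract(lines):
    """Return {label: [values]} for the pertinent fields found in `lines`."""
    found = {label: [] for label, *_ in PERTINENT}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        tag, ind, subfields = m.group(1), m.group(2).strip(), m.group(3)
        for label, want_tag, want_ind, code in PERTINENT:
            if tag != want_tag or (want_ind is not None and ind != want_ind):
                continue
            # text after "|<code>" up to the next "|" or the end of the line
            value = re.search(r"\|" + re.escape(code) + r"([^|]*)", subfields)
            if value:
                found[label].append(value.group(1).strip())
    return found


# Illustrative lines; the 856 line is made up to show two subfields on one field
sample_lines = [
    ".040. |aBBH|cBBH|dOCL|dOCLCQ",
    ".245. 00|aMLA international bibliography|h[electronic resource]",
    ".590. |aOnline access available to MUN users only.",
    ".691. 1|aEnglish",
    ".691. 2|aAboriginal Studies",
    ".856. 40|uhttp://204.187.104.2:8590/munf9?|a1963-",
]
result = extract(sample_lines)
```

Keeping the mapping in one table means a change in cataloguing practice only touches `PERTINENT`, not the matching logic.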
Reformatting with PERL
• PERL (Practical Extraction and Report Language)
• Perl is a language that is widely used for creating and manipulating text files.
• Uses pattern matching to make decisions.
• Perl is free, open source software. It can be downloaded from: http://www.activestate.com/Products/ActivePerl/.
PERL: Selecting the Fields
    foreach my $line (@lines) {
        # choose the line if it starts with .590.
        if ($line =~ /^\.590\./) {
            print $line;    # write it to a text file
        }
        # choose the line if it starts with .691.
        if ($line =~ /^\.691\./) {
            print $line;    # write it to a text file
        }
    }
PERL: Remove MARC Tags & Clutter
• .691. 2|aBiology (“Also Recommended” for Biology)
    # If the line starts with ".691. 2"
    elsif ($record[$s] =~ /^\.691\. 2/) {
        # Split the line on the "|a" subfield code.
        # Keep everything after the "|a" and discard the rest.
        ($junk, $subjother) = split(/\|a/, $record[$s]);
        push(@others, $subjother);
        $oth++;
    }
PERL: The Final Output
• The top line tells the database which fields to create in the table:
    title*descrip*access*hold*url*guide*urlguide*subjectrec*subjectoth
• The rest of the file contains the data for each field (* is the delimiter):
    MLA international bibliography * This database contains references to literature, language, linguistics and folklore. It indexes journal articles, monographs, dissertations, working papers, proceedings, and bibliographies. It is updated 10 times per year.* Online access available to MUN users only.*1963- *http://204.187.104.2:8590/munf9?*User Guide available:*http://www.mun.ca/library/research_help/guides/MLA.html*English Folklore Language and Linguistics Literature Theatre*Aboriginal Studies
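A minimal Python sketch of producing this kind of file follows; it is an illustration, not the project's Perl. The column names are taken from the header line above, while the function name `to_delimited` and the check that rejects a stray delimiter inside a value are assumptions added for the example.

```python
COLUMNS = ["title", "descrip", "access", "hold", "url",
           "guide", "urlguide", "subjectrec", "subjectoth"]


def to_delimited(rows, delimiter="*"):
    """Render rows (dicts keyed by COLUMNS) as a header line plus one line per record."""
    lines = [delimiter.join(COLUMNS)]
    for row in rows:
        values = [str(row.get(col, "")) for col in COLUMNS]
        for value in values:
            # a delimiter or newline inside a value would shift every later column
            if delimiter in value or "\n" in value:
                raise ValueError(f"value contains delimiter or newline: {value!r}")
        lines.append(delimiter.join(values))
    return "\n".join(lines) + "\n"


row = {
    "title": "MLA international bibliography",
    "access": "Online access available to MUN users only.",
    "hold": "1963-",
    "url": "http://204.187.104.2:8590/munf9?",
    "guide": "User Guide available:",
    "urlguide": "http://www.mun.ca/library/research_help/guides/MLA.html",
    "subjectrec": "English Folklore Language and Linguistics Literature Theatre",
    "subjectoth": "Aboriginal Studies",
}
output = to_delimited([row])
```

A single-character delimiter is simple to load, but it only works while that character never appears in the catalogue data, which is why the sketch fails loudly instead of writing a misaligned line.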
Diacritics – Search and Replace
• Scheduled event on the SQL server corrects diacritics:
    execute eindexes.lgoddard.SearchAndReplace 'aaa','è'
    execute eindexes.lgoddard.SearchAndReplace 'bbb','é'
    execute eindexes.lgoddard.SearchAndReplace 'ooo','ô'
    execute eindexes.lgoddard.SearchAndReplace 'ccc','ç'
    execute eindexes.lgoddard.SearchAndReplace 'eee','é'
    execute eindexes.lgoddard.SearchAndReplace 'fff','ê'
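The same placeholder substitution can be sketched in a few lines of Python. This illustrates the idea only; it is not the stored procedure named on the slide. The token-to-character pairs are copied from the slide, and the function name `restore_diacritics` and the sample words are hypothetical.

```python
# Placeholder token -> accented character, copied from the slide
PLACEHOLDERS = {
    "aaa": "è",
    "bbb": "é",
    "ooo": "ô",
    "ccc": "ç",
    "eee": "é",
    "fff": "ê",
}


def restore_diacritics(text, placeholders=PLACEHOLDERS):
    """Replace every placeholder token in `text` with its accented character."""
    for token, char in placeholders.items():
        text = text.replace(token, char)
    return text


fixed_city = restore_diacritics("Montrbbbal")
fixed_word = restore_diacritics("Francccais")
```

One caveat of this approach is that a token which happens to occur in ordinary text would also be replaced, so the tokens need to be strings that never appear in real data.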
SQL: Structured Query Language
• Table can be queried using SQL (Structured Query Language).
• To get all the resources that are recommended for the subject “Literature”:
    SELECT * FROM ejournals WHERE subjectrec LIKE '%Literature%';
• To get all the titles that start with the letter A:
    SELECT * FROM ejournals WHERE title LIKE 'A%';
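To try the two queries without a database server, the following sketch runs them against an in-memory SQLite table from Python. SQLite stands in for the project's SQL server here, the rows are made-up sample data (the third title is hypothetical), and the helper names `by_subject` and `by_initial` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ejournals (title TEXT, subjectrec TEXT, subjectoth TEXT)")
conn.executemany(
    "INSERT INTO ejournals VALUES (?, ?, ?)",
    [
        # made-up subject assignments, for illustration only
        ("MLA international bibliography", "English Literature", "Aboriginal Studies"),
        ("Web of science citation databases", "Biology", "Chemistry"),
        ("Academic sample index", "Biology", "Literature"),  # hypothetical title
    ],
)


def by_subject(conn, subject):
    """Titles whose recommended-subject field contains `subject`."""
    rows = conn.execute(
        "SELECT title FROM ejournals WHERE subjectrec LIKE ? ORDER BY title",
        (f"%{subject}%",),  # bound parameter instead of string concatenation
    )
    return [title for (title,) in rows]


def by_initial(conn, letter):
    """Titles that start with `letter`."""
    rows = conn.execute(
        "SELECT title FROM ejournals WHERE title LIKE ? ORDER BY title",
        (f"{letter}%",),
    )
    return [title for (title,) in rows]


literature = by_subject(conn, "Literature")
starts_with_a = by_initial(conn, "A")
```

Because `LIKE '%Literature%'` is a substring match, any heading that contains the word, such as a longer compound heading, would be returned as well.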
The Web Interface: Dynamic HTML
• Eindex search page must accept a search term from the user, and then create and display a custom set of results based on that query.
• Dynamic HTML is written on the fly. The page does not exist until requested by the user.
• There are several web programming languages that will produce dynamic HTML files, including PHP, JSP, and ASP.
Server Side Languages
• ASP/PHP/JSP are server side languages, commonly used to display database results over the web.
• Server side programs do all the processing on the server, and write out HTML files that are sent to your browser.
• You do not need any special plug-ins to see these pages.
Web Interface Composed of 3 Files
• Main Search Page (index.asp)
• Option A: user chooses from title menu; sends variable “searchtext” to the Alphabetical Search Results Page (AlphaSearchResults.asp)
• Option B: user chooses from subject menu; sends variable “subhead” to the Subject Search Results Page (DBSearchResults.asp)
Structure of a Dynamic URL
http://www.library.mun.ca/eindex/DBSearchResults.asp?subhead=Biochemistry
• www.library.mun.ca: server name and domain
• /eindex/: directory on web server
• DBSearchResults.asp: name of ASP file which will receive the variable
• ?: indicates beginning of parameters
• subhead: name of variable
• Biochemistry: value of variable
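The same breakdown can be checked with Python's standard `urllib.parse` module. This is a small illustrative sketch; the variable names are arbitrary.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.library.mun.ca/eindex/DBSearchResults.asp?subhead=Biochemistry"

parts = urlsplit(url)
server = parts.netloc                                    # server name and domain
directory, _, asp_file = parts.path.rpartition("/")      # directory and ASP file name
params = parse_qs(parts.query)                           # everything after the "?"

print(server)     # www.library.mun.ca
print(directory)  # /eindex
print(asp_file)   # DBSearchResults.asp
print(params)     # {'subhead': ['Biochemistry']}
```

`parse_qs` returns a list for each variable because a query string may repeat the same name more than once.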
Main Search Page – Alphabetical Search
User selects “A”
<a href="alphaSearchResults.asp?SearchText=A">A</a>
• <a href=: opening href tag
• alphaSearchResults.asp: name of .asp file to render results page
• SearchText: name of variable expected by .asp file
• A: value of variable as selected by user
Main Search Page – Subject Search
User selects “Biology”
<option value="./DBSearchResults.asp?subhead=Biology">Biology</option>
• DBSearchResults.asp: name of .asp file to render results page
• subhead: name of variable expected by .asp file
• Biology: value of variable as selected by user
ASP: Subject Search Results
• Receives variable “subhead” from index.asp: http://library.mun.ca/eindex/DBSearchResults.asp?subhead=Biology
• Makes two SQL queries based on the value of variable “subhead”:
  • SELECT * FROM eindex WHERE subjectrec LIKE '%Biology%'
    Returns all records that have Biology in the recommended subject field
  • SELECT * FROM eindex WHERE subjectoth LIKE '%Biology%'
    Returns all records that have Biology in the other subject field
ASP: Subject Search Code
• ASP writes out the variables with HTML tags around them:
    <%
    ' loop over every record in the result set
    rs.MoveFirst
    Do While Not rs.EOF
    %>
    <tr><td colspan=2><b><a href="<%=rs("url")%>"><%=rs("title")%></a></b></td></tr>
    <tr><td><%=rs("access")%><br>Coverage: <%=rs("hold")%></td>
    <td valign="bottom"><a href="<%=rs("url2")%>">Database Guide</a></td></tr>
    <tr><td colspan=2><%=rs("descrip")%></td></tr>
    <tr><td colspan=2></td></tr>
    <tr><td colspan=2></td></tr>
    <%
    rs.MoveNext
    Loop
    %>
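For readers without an ASP environment, the query-then-render pattern can be sketched in Python with an in-memory SQLite table. This is an assumption-laden illustration rather than the production code: the function name `render_subject_results`, the heading rows, the HTML escaping and the sample row (including its description) are all additions for the example.

```python
import sqlite3
from html import escape

# (section heading, column searched) for the two result tiers
SECTIONS = (("Highly Recommended", "subjectrec"), ("Also Recommended", "subjectoth"))


def render_subject_results(conn, subhead):
    """Return HTML table rows for one subject, grouped by recommendation tier."""
    html = []
    for heading, column in SECTIONS:
        # `column` comes from the fixed SECTIONS tuple; only the search term is
        # user input, and it is passed as a bound parameter.
        rows = conn.execute(
            f"SELECT title, url, access, hold, descrip FROM eindex "
            f"WHERE {column} LIKE ? ORDER BY title",
            (f"%{subhead}%",),
        ).fetchall()
        html.append(f"<tr><th colspan=2>{escape(heading)}</th></tr>")
        for title, url, access, hold, descrip in rows:
            html.append(
                f'<tr><td colspan=2><b><a href="{escape(url)}">{escape(title)}</a></b></td></tr>'
            )
            html.append(f"<tr><td>{escape(access)}<br>Coverage: {escape(hold)}</td></tr>")
            html.append(f"<tr><td colspan=2>{escape(descrip)}</td></tr>")
    return "\n".join(html)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eindex (title, url, access, hold, descrip, subjectrec, subjectoth)"
)
conn.execute(
    "INSERT INTO eindex VALUES (?, ?, ?, ?, ?, ?, ?)",
    (
        "Web of science citation databases",
        "http://isiknowledge.com/wos",
        "Online access available to MUN users only.",  # made-up sample values
        "1963-",
        "Sample description.",
        "Biology",
        "Chemistry",
    ),
)
page = render_subject_results(conn, "Biology")
```

Escaping each value before it is written into the page keeps a stray `<` or `&` in a catalogue note from breaking the table markup.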
ASP: Output
• A line of ASP code that looks like this:
    <tr><td colspan=2>
    <a href="<%=rs("url")%>"><%=rs("title")%></a></td></tr>
• Produces a line of HTML that looks like this:
    <tr><td colspan=2>
    <a href="http://isiknowledge.com/wos">Web of science citation databases </a></td></tr>
Article Indexes End to End
• Library Catalogue: Daily Report
• Bib. records text file FTP’d to web server
• Perl script to reformat Bib. records
• Auto-load into SQL Database
• ASP queries SQL and creates HTML pages dynamically
• Browser receives HTML
The Future: Customizing Local Interfaces
• Library catalogue w/ SQL-compliant database eliminates the need for data porting and reformatting.
Diagram: Library Catalogue w/ Oracle SQL Database → ASP queries SQL and creates result pages dynamically → Browser receives HTML