The TARO Project
Texas Archival Resources Online
Fred Gilmore
Sr Operating Systems Specialist
UT Austin General Libraries
fgilmore@mail.utexas.edu
What It Is . . .
• A project to make Texas archive and manuscript collection finding aids available through the Web.
• “Finding aid”: a descriptive summary and inventory of a collection of materials housed at a specific archive; not the materials themselves.
• Currently: 1500+ searchable, browsable finding aids; 5000+ hits / day.
How It Came To Be . . .
• Two grant-funded phases:
  • Outsourced scanning, OCR, and XML tagging of existing paper finding aids
  • Training/hardware/software for creation of new finding aids
• Phase I (2000 – 2001): 14 participating repositories
• Phase II (2002 – 2003): additional 11 repositories
Participating Repositories
Alexander Architectural Archive (UT Austin)
Center For American History (UT Austin)
Benson Latin America Collection (UT Austin)
Ransom Humanities Research Center (UT Austin)
Texas State Library
Texas Tech Southwest Collection/University Archives
University of Houston Special Collections/University Archives
Rice University
Texas A&M
Houston Public Library
Austin History Center
UT San Antonio
Texas State University
Southern Methodist University
UT Medical Branch – Galveston
MD Anderson
UT El Paso
UT Pan American
UT Arlington
How It Came To Be . . .
• Why XML?
  • Compose once, format many
  • XML and related standards separate content from presentation, which makes data exchange, reuse, and description easier.
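As a rough illustration of "compose once, format many", the sketch below parses one small XML record and renders it two ways. It assumes Python's standard library; the `<did>` wrapper, the shortened record, and the helper names `fields`, `to_html`, and `to_text` are made up for illustration and are not TARO's actual pipeline.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified record; a real finding aid is far larger.
SOURCE = """<did>
  <unittitle>Thomas J. Rollins Papers</unittitle>
  <unitdate>1875-1997 and undated</unitdate>
  <abstract>The personal papers of Thomas J. Rollins.</abstract>
</did>"""


def fields(xml_text):
    """Parse the XML once and return the pieces every output format needs."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("unittitle", default="").strip(),
        "dates": root.findtext("unitdate", default="").strip(),
        "abstract": root.findtext("abstract", default="").strip(),
    }


def to_html(rec):
    """One presentation of the record: an HTML fragment."""
    return (
        f"<h1>{escape(rec['title'])}</h1>\n"
        f"<p><b>Dates:</b> {escape(rec['dates'])}</p>\n"
        f"<p>{escape(rec['abstract'])}</p>"
    )


def to_text(rec):
    """Another presentation of the same record: plain text."""
    return f"{rec['title']} ({rec['dates']})\n{rec['abstract']}"


record = fields(SOURCE)
print(to_html(record))
print(to_text(record))
```

The source record is written once; adding a third output format would mean adding one more function, not touching the XML.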
Creating Content For TARO
• Archives staff:
  • Edit or compose an XML-tagged electronic version of the finding aid (new finding aids are created using a text/XML editor such as Corel XMetaL)
  • Submit the file to the UT Austin server
. .
<unittitle label="Title:" encodinganalog="245$a">
  Thomas J. Rollins Papers,
  <unitdate type="inclusive" encodinganalog="245$f" label="Dates:" era="ce" calendar="gregorian">1875-1997 and undated</unitdate>
</unittitle>
<abstract label="Abstract:" encodinganalog="520$a">
  The personal papers of Thomas J. Rollins from 1875-1997 and undated.
</abstract>
<unitid countrycode="us" repositorycode="TxLT-SW" encodinganalog="099" label="Collection #">S 1261.1</unitid>
<repository label="Repository:" encodinganalog="852$a">
  <corpname>
    <subarea>Southwest Collection/Special Collections Library,</subarea>
. .
Creating Content For TARO
• UT Austin technical staff:
  • The XML file is moved into production, error checked, and translated into three HTML varieties for viewing.
  • HTML content is indexed for searching (keyword and fielded) and sorted into repository lists for browsing.
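The error-check and fielded-index steps could look roughly like the sketch below. It is only an assumption about how such a step might work: it checks well-formedness (not validity against the EAD DTD), and it indexes the XML elements directly, whereas the slides say the HTML output is what gets indexed. The function names, the `<did>` wrapper, and the deliberately broken `BAD` input are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (root, None) on success or (None, error message) on failure.

    This only checks well-formedness; validating against a DTD would need
    a validating parser, which the standard library does not provide.
    """
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as exc:
        return None, str(exc)


def fielded_index(root, wanted=("unittitle", "unitdate", "abstract", "unitid")):
    """Map each wanted element name to its whitespace-normalised text values."""
    index = {}
    for elem in root.iter():
        if elem.tag in wanted:
            # itertext() includes the text of nested elements as well.
            text = " ".join("".join(elem.itertext()).split())
            index.setdefault(elem.tag, []).append(text)
    return index


GOOD = """<did>
  <unittitle label="Title:">Thomas J. Rollins Papers,
    <unitdate type="inclusive">1875-1997 and undated</unitdate>
  </unittitle>
  <unitid countrycode="us" repositorycode="TxLT-SW">S 1261.1</unitid>
</did>"""
BAD = "<did><unittitle>Unclosed title</did>"  # mismatched tags on purpose

root, error = check_well_formed(GOOD)
index = fielded_index(root)
print(index["unitid"])

_, bad_error = check_well_formed(BAD)
print("rejected:", bad_error)
```

A fielded search for a collection number would then look only in `index["unitid"]`, while a keyword search would look across all values.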
Advantages
• Pages are picked up by Google, giving content higher visibility.
• Multiple views of content, including the ability to customize the view by running the XML document against a personal stylesheet.
• Processing is fully automated; HTML-translated files can be available within hours.
• DC (Dublin Core) metadata and OAI records provide additional access points.
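To show how Dublin Core metadata might be derived from a finding aid, the sketch below copies a few elements into an `oai_dc:dc` record. The two namespace URIs are the standard Dublin Core and OAI ones; the element mapping in `EAD_TO_DC`, the `<did>` wrapper, and the function name `dublin_core` are assumptions for illustration, not TARO's documented crosswalk.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Assumed mapping from finding-aid elements to Dublin Core elements.
EAD_TO_DC = {
    "unittitle": "title",
    "unitdate": "date",
    "abstract": "description",
    "unitid": "identifier",
}

SAMPLE = """<did>
  <unittitle>Thomas J. Rollins Papers,
    <unitdate>1875-1997 and undated</unitdate>
  </unittitle>
  <abstract>The personal papers of Thomas J. Rollins from 1875-1997 and undated.</abstract>
  <unitid>S 1261.1</unitid>
</did>"""


def dublin_core(ead_root):
    """Build an oai_dc:dc element from the mapped finding-aid elements."""
    dc = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for elem in ead_root.iter():
        target = EAD_TO_DC.get(elem.tag)
        if target is None:
            continue
        # Use only the element's own leading text, so the nested unitdate
        # is not repeated inside the title; drop a trailing comma.
        text = " ".join((elem.text or "").split()).rstrip(",")
        if text:
            ET.SubElement(dc, f"{{{DC_NS}}}{target}").text = text
    return dc


record = dublin_core(ET.fromstring(SAMPLE))
print(ET.tostring(record, encoding="unicode"))
```

A record like this is what an OAI harvester would typically receive, giving the collection a second route to discovery besides the HTML pages.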
Challenges
• Relationships
  • Mediating between local needs and federated site requirements.
  • Encouraging supplemental metadata creation.
• Resources
  • Introducing improvements without dedicated staff on either end.
Challenges
• Realities of the Web
  • User education: practically a meta-site, so content expectations are not met.
  • Finding aids can be large; load times are a problem.
  • XML Unicode requirements make special characters tricky.
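The special-character problem can be illustrated with a short sketch, assuming Python's standard library. The name used is an invented example, not taken from a TARO finding aid. It shows two safe ways to serialize accented characters and one common failure: HTML-style named entities, which an XML parser rejects unless a DTD declares them.

```python
import xml.etree.ElementTree as ET

name = ET.Element("persname")
# Illustrative value; \u00e9 is e-acute and \u00ed is i-acute.
name.text = "Jos\u00e9 Mart\u00ednez"

# Option 1: UTF-8 bytes; the characters are stored directly.
as_utf8 = ET.tostring(name, encoding="utf-8")

# Option 2: pure ASCII; non-ASCII characters become numeric references.
as_ascii = ET.tostring(name, encoding="us-ascii")
print(as_ascii)

# Both forms parse back to the same text.
assert ET.fromstring(as_utf8).text == ET.fromstring(as_ascii).text == name.text

# HTML-style named entities are not predefined in XML, so parsing fails
# here because no DTD declares &eacute;.
try:
    ET.fromstring("<persname>Jos&eacute;</persname>")
    named_entity_ok = True
except ET.ParseError:
    named_entity_ok = False
print(named_entity_ok)
```

Numeric references survive any editor or transfer that mangles 8-bit text, at the cost of being harder for staff to read in the raw file.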
Future Plans
• Searching: search XML directly
• Content: fund the creation and serving of pictures, sound, video
• Participation: more repositories = more content
• Access: Open Archives, RDF metadata
• Flexibility: provide stylesheet for direct XML browsing, PDF creation for hardcopy