Organized Digital Library Development … from the Bottom Up. Jody L. DeRidder, jlderidder@ua.edu, University of Alabama Libraries. Image courtesy of Life Magazine.
Libraries organize information… primarily books. Trinity College Library, Dublin, as captured by Candida Höfer in her book Libraries (Thames and Hudson, UK: 2005).
If libraries organize books… why not digital files? It’s all information! Photo credit: Flickr user "Libby", used with permission (Creative Commons).
A digital object may belong in MANY potential virtual collections…
• Slavery
• African Americans
• Sheet Music
• Tombigbee River
• Southern History
• … and more
… but it originated from ONE SINGLE ANALOG collection. Provenance trumps all!
“Gum Tree Canoe,” published by G.P. Reed (Boston: 1847). Wade Hall collection of Southern History and Culture, Hoole Special Collections, University of Alabama Libraries.
Bringing Order to Chaos: 1) Clarity 2) Low cost 3) Simple 4) Extensible
Holder ID: u0003
Collection ID: 0000023
Item ID: 0000007
Sequence ID: 0005
Archival File: u0003_0000023_0000007_0005.tif
University of Alabama Libraries
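The naming scheme on this slide can be sketched in a few lines of Python. This is only an illustration, not the library's own tooling: the names `ArchivalName`, `build_name` and `parse_name` are hypothetical, and the sketch assumes the segments are already zero-padded strings and that a page-level archival file always has exactly four underscore-separated segments.

```python
from typing import NamedTuple


class ArchivalName(NamedTuple):
    """The four identifier segments of an archival file name, plus its extension."""
    holder: str
    collection: str
    item: str
    sequence: str
    extension: str


def build_name(holder, collection, item, sequence, extension="tif"):
    """Join already zero-padded segments with underscores (hypothetical helper)."""
    return f"{holder}_{collection}_{item}_{sequence}.{extension}"


def parse_name(filename):
    """Split a name such as u0003_0000023_0000007_0005.tif into its segments."""
    stem, _, extension = filename.rpartition(".")
    parts = stem.split("_")
    if len(parts) != 4:  # assumption: holder, collection, item, sequence
        raise ValueError(f"expected four segments in {filename!r}")
    return ArchivalName(*parts, extension)


if __name__ == "__main__":
    name = build_name("u0003", "0000023", "0000007", "0005")
    print(name)
    print(parse_name(name))
```

Splitting on underscores rather than on fixed widths keeps the sketch usable for names whose item and sequence segments have different lengths.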
Holder ID: u0003. Collection ID: 0001980. u0003_0001980_0000001 is the first digitized item in the MSS 1980 collection.
The Digitization Working Area… and a Collection Folder in the Working Area. Collection folders are named for the collection identifier. Allowed subfolders include:
• Admin
• Metadata
• Scans
• Transcripts
Compound objects have their own subfolders for pages, named for the item.
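A folder convention like this one is easy to check automatically. The following is a minimal sketch, assuming the four subfolder names are spelled exactly as on the slide and that only the top level of a collection folder is checked; the function name `unexpected_subfolders` is hypothetical.

```python
from pathlib import Path

# Subfolder names as listed on the slide; exact capitalisation is an assumption.
ALLOWED_SUBFOLDERS = {"Admin", "Metadata", "Scans", "Transcripts"}


def unexpected_subfolders(collection_dir):
    """Return the names of top-level subfolders that are not on the allowed list."""
    collection_dir = Path(collection_dir)
    return sorted(
        child.name
        for child in collection_dir.iterdir()
        if child.is_dir() and child.name not in ALLOWED_SUBFOLDERS
    )
```

Calling `unexpected_subfolders` on a collection folder in the working area returns an empty list when the folder follows the convention, and the offending names otherwise.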
Bringing Content Up to the Level of the WEB!!! Greater Usability and Access == Longer Life
Protected archive area: u0003 / 0000023 / 0000007 / 0005 / u0003_0000023_0000007_0005.tif
Web accessible area: u0003 / 0000023 / 0000007 / 0005 / thumb and large-size derivatives
Images … ImageMagick: http://www.imagemagick.org (it’s free!)
Audio … LAME: http://lame.sourceforge.net
OCR … TESSERACT: http://code.google.com/p/tesseract-ocr/
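As a rough sketch of the image step, the snippet below builds ImageMagick command lines that read a TIFF from the archive area and write two derivatives into the matching directory of the web area. Several details are assumptions rather than anything stated on the slide: the pixel sizes, the JPEG output format, the `_thumb` and `_large` suffixes, and the function name `derivative_commands`. It uses the ImageMagick 6 `convert` command with `-resize`; on ImageMagick 7 the command is `magick`.

```python
from pathlib import PurePosixPath


def derivative_commands(archive_root, web_root, filename, thumb_px=200, large_px=2048):
    """Build (but do not run) ImageMagick commands for thumb and large derivatives.

    Sizes, suffixes and the .jpg format are illustrative assumptions.
    """
    stem = filename.rsplit(".", 1)[0]
    segments = stem.split("_")  # holder, collection, item, sequence
    source = PurePosixPath(archive_root, *segments, filename)
    target_dir = PurePosixPath(web_root, *segments)
    commands = []
    for suffix, size in (("thumb", thumb_px), ("large", large_px)):
        target = target_dir / f"{stem}_{suffix}.jpg"
        # -resize WxH fits the image inside the box and keeps its aspect ratio.
        commands.append(
            ["convert", str(source), "-resize", f"{size}x{size}", str(target)]
        )
    return commands


if __name__ == "__main__":
    for cmd in derivative_commands("/archive", "/web", "u0003_0000023_0000007_0005.tif"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the target directory exists. Because the archive file is only read, the protected area stays untouched.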
Identification, Organization and Consistency. Each segment of numbers (Holder ID, Collection ID, Item ID, Sequence ID) is used in the directory structure. The directory for u0003_0000003_0002_001.tif is simply: u0003/0000003/0002/001/
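The mapping from file name to directory needs no lookup table, only string splitting. A minimal sketch, assuming underscores appear only as segment separators; `directory_for` is a hypothetical name.

```python
from pathlib import PurePosixPath


def directory_for(filename):
    """Derive the storage directory from the file name alone."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension
    return PurePosixPath(*stem.split("_"))  # one directory level per segment


if __name__ == "__main__":
    print(directory_for("u0003_0000003_0002_001.tif"))
```

Because the path is computed from the name, a misplaced file can always be put back where it belongs without consulting a database.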
Dropping the Technical Metadata in… where it belongs. Using FITS, the File Information Tool Set developed by Harvard, which encapsulates JHOVE, DROID, ExifTool and other tools: http://code.google.com/p/fits/ Makes METS creation a Piece of Cake! (and redundant!)
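A hedged sketch of how the FITS step might be scripted: the snippet only builds the command line, using the `-i` (input) and `-o` (output) options of the FITS command-line script. Writing the output next to the source file, the `.fits.xml` suffix and the function name `fits_command` are assumptions for illustration; the slide does not say where the metadata file is stored.

```python
from pathlib import PurePosixPath


def fits_command(tiff_path, fits_script="fits.sh"):
    """Build (but do not run) a FITS command for one archival file.

    Assumption: the technical metadata is written beside the source file
    with a hypothetical .fits.xml suffix.
    """
    source = PurePosixPath(tiff_path)
    output = source.with_name(source.stem + ".fits.xml")
    return [fits_script, "-i", str(source), "-o", str(output)]


if __name__ == "__main__":
    path = "/archive/u0003/0000023/0000007/0005/u0003_0000023_0000007_0005.tif"
    print(" ".join(fits_command(path)))
```

The returned list can be handed to `subprocess.run` on a machine where FITS is installed; the script path will differ by installation.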
Lots of Copies Keeps Stuff Safe!! http://www.lockss.org/ An Example of the Lowest-Cost Model: The Alabama Digital Preservation Network, http://www.adpn.org/
Simple, Clear Hierarchical Organization of the storage area: Holder ID / Collection ID / Item ID / Sequence ID
ACCESS! Via Acumen (also free!): http://acumen.lib.ua.edu
• XML agnostic
• No ingest
• No metadata modifications
• All content easily accessible
• Open to search engines
Now it’s organized. But can users find what they need? Trinity College Library, Dublin, as captured by Candida Höfer in her book Libraries (Thames and Hudson, UK: 2005).
Usability Testing [results table not preserved]. Key: U = Undergraduate, G = Graduate student, PG = Postgraduate volunteer, S = University staff.
Remember why we’re here… S. R. Ranganathan (1931), paraphrased: Information is for use. Every user his/her information. Every information its user. Save the time of the user. The library is a growing organism. Image: jscreationzs / FreeDigitalPhotos.net