
Bringing Order to Chaos: Preparation and Organization for Long-Term Access

Jody L. DeRidder, University of Alabama Libraries


Presentation Transcript


  1. Bringing Order to Chaos: Preparation and Organization for Long-Term Access. Jody L. DeRidder, University of Alabama Libraries. (Image courtesy of Life Magazine.)

  2. ONLINE IS NOT ENOUGH!! Question: What would it take to reconstruct YOUR digital library in another software system, from scratch? Software changes... File formats change... Organizations change their focus and directions... and sometimes, we run out of money! (Athley, Jake. 2009. “Understanding the Digital Asset Life Cycle.” Widen Enterprises.)

  3. Where's the TIFF? The reference to the archival file is missing in OAI exports. Slide callouts: “Not valid XML!” and “No tiff??”

  4. Again, where's the TIFF? Page-level metadata AND the reference to the archival file are ALSO missing in CONTENTdm XML exports (slide callout: “No tiff??”). … Tab-delimited text export is your only hope of reconstruction.

  5. Identification…??
     • 10 possible fields in which to find an identifier: <object> <object> <url></url> <file></file> <filenb></filenb> <item></item> <filena></filena> <fullrs></fullrs> <rm></rm> <databa></databa> <identi></identi>
     • 32 different file naming schemes, each with anomalies that did not fit the collection’s own pattern.
     • Many metadata files had NO identifiers, or ones which did NOT match the filename.
     • Sometimes CONTENTdm changed the archival filename on upload…

  6. File storage is a lot like a basement closet... What happens when it's time to move??? (Images courtesy of Teemo, Master of Clowning, and of Life Magazine.)

  7. Bringing Order to Chaos: 1) Identification 2) Consistency 3) Organization 4) Documentation.
     Holder ID: u0003
     Collection ID: 0000023
     Item ID: 0000007
     Sequence ID: 0005
     Archival File: u0003_0000023_0000007_0005.tif
     University of Alabama Libraries

  8. u0003_0001980_0000001 is the first digitized item in the MSS 1980 collection: u0003 is the HOLDER ID and 0001980 is the COLLECTION ID.
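
The naming scheme on these slides can be checked mechanically. The following is a minimal sketch, not the University of Alabama's own tooling: the segment widths (letter plus 4 digits, 7 digits, 7 digits, optional 4-digit sequence, optional 3-digit sub-page) are inferred from the slide examples, and `ArchivalId` and `parse_identifier` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Assumption: segment widths are inferred from the slide examples
# (u0003_0000023_0000007_0005.tif and u0003_0000001_0000003_0002_004.tif).
ID_PATTERN = re.compile(
    r"^(?P<holder>[a-z]\d{4})"
    r"_(?P<collection>\d{7})"
    r"_(?P<item>\d{7})"
    r"(?:_(?P<sequence>\d{4})(?:_(?P<subpage>\d{3}))?)?"
    r"(?:\.(?P<ext>[A-Za-z0-9]+))?$"
)


class ArchivalId(NamedTuple):
    """Hypothetical container for the parts of an identifier."""
    holder: str
    collection: str
    item: str
    sequence: Optional[str]
    subpage: Optional[str]
    ext: Optional[str]


def parse_identifier(name: str) -> ArchivalId:
    """Split an identifier or filename into its segments, or raise ValueError."""
    match = ID_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised identifier: {name!r}")
    return ArchivalId(**match.groupdict())


if __name__ == "__main__":
    print(parse_identifier("u0003_0000023_0000007_0005.tif"))
    print(parse_identifier("u0003_0001980_0000001"))
```

Rejecting anything that does not match, rather than guessing, is the point of the exercise: anomalies surface at ingest time instead of during a migration.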

  9. (Unambiguous) Identification… depends on US!!! (not the software). Example: u0003_0001604_0000001_0004.tif, from the Tuscaloosa Service Men's Center Scrapbook, 1943-1946. MSS 1604, William Stanley Hoole Special Collections Library, University of Alabama.

  10. Consistency
     • 1800-1860: Hugh Davis Farm Journals; Voyages dans l’Amerique Septentrionale; Jesse Griffin Letter, 1813 September; Nehemiah Denton papers, 1831-1844; F.H. Petrie Letters, 1831-1833
     • 1861-1865: George S Smith Diary; Confederate Imprints; Sheet music; S. R. Norton Letters, 1864-1865
     • 1866-1899: S. D. Cabaniss Papers; Joe Wheeler; Josiah and Amelia Gorgas Family Papers
     • 1900-1919: Roland Harper; Railroad Timetables; Central Iron and Coal; Daphne Cunningham Diary; Eugene Allen Smith

  11. collection linking

  12. CONSISTENCY! In merging collections, you discover all the different metadata variations you have…
     Item Identifier Filename Identifier Title Other Title Cover Title First Line of Text First Line of Chorus Masthead Title Series Title Special Issue Title from Plate Subject(s) Description Biographic and Historical Note Scope and Content Transcript URL Provenance Funding Information Abstract Creator(s) Arranger(s) Author(s) Composer(s) Conductor(s) Diarist(s) Etcher(s) Instrumentalist(s) Folder Number Plate Number Photograph Number Source Language(s) Relation Published In Digital Collection Repository Repository Collections Is Referenced By Mode of Access Coverage Location Performance Location Place of Publication Recipient Location Sender Location States Served Rights Terms Audience Sorting Number Staff Notes Transcript Object File Name Interviewee(s) Lyricist(s) Photographer(s) Sender(s) Vocalist(s) Work(s) Publisher Digital Publisher Donor(s) Funder(s) Contributor(s) Editor(s) Interviewer(s) Performer(s) Recipient(s) Date(s) Date of Photograph Performance Date Date ISO Type(s) Genre(s) Format Album Number Bibliographic Citation Box Number Call Number Collection Number Container Number

  13. Configure it once... Then copy the config file to the other directories.
     Collection directory in /contentdbs:
       image
       supp
       index
         etc
           config.txt
     config.txt:
       Item Identifier:item:TEXT:SMALL:BLANK:BLANK:NOSEARCH:HIDE:NOVOCAB:BLANK
       Filename:filena:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:VOCAB:identi
       Identifier:identi:TEXT:SMALL:BLANK:BLANK:SEARCH:HIDE:VOCAB:identi
       Title:title:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:NOVOCAB:title
       Other Title:other:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:NOVOCAB:titlea
       Cover Title:cover:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:NOVOCAB:titlea
       First Line of Text:first:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:NOVOCAB:titlea
       First Line of Chorus:firsta:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:NOVOCAB:titlea
       Masthead Title:masthe:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:NOVOCAB:titlea
       Series Title:series:TEXT:SMALL:BLANK:BLANK:SEARCH:NOHIDE:NOVOCAB:titlea
     Copy command:
       cp coll1/index/etc/config.txt coll2/index/etc/config.txt
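
When there are many collections, the single `cp` command on this slide can be repeated in a loop. This is a rough sketch that assumes every collection directory already contains `index/etc/`, as the slide's layout suggests; `copy_config` and its parameters are hypothetical names, not part of CONTENTdm.

```python
import shutil
from pathlib import Path
from typing import Iterable, List

# Relative location of the field configuration, as shown on the slide.
CONFIG_RELATIVE = Path("index", "etc", "config.txt")


def copy_config(contentdbs_root: str, source_coll: str,
                target_colls: Iterable[str]) -> List[Path]:
    """Copy one collection's config.txt into each target collection.

    Assumption: each target already has an index/etc/ directory; if it
    does not, shutil.copy2 raises FileNotFoundError instead of creating
    directories inside the collection tree.
    """
    source = Path(contentdbs_root) / source_coll / CONFIG_RELATIVE
    copied = []
    for coll in target_colls:
        destination = Path(contentdbs_root) / coll / CONFIG_RELATIVE
        shutil.copy2(source, destination)
        copied.append(destination)
    return copied
```

A call such as `copy_config("/contentdbs", "coll1", ["coll2", "coll3"])` would mirror the slide's command for two targets.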

  14. Capturing ALL the metadata on EVERY level for preservation. Archivists Utility translates spreadsheet rows to MODS XML:
     <mods>
       <titleInfo displayLabel="title">
         <title>6th grade class picture</title>
       </titleInfo>
       <name type="corporate">
         <namePart>Ebsco Industries</namePart>
         <role><roleTerm authority="marcRelator" type="text">Funder</roleTerm></role>
       </name>
       <typeOfResource>Still Image</typeOfResource>
       <genre authority="bgtchm">Photographs</genre>
       <originInfo><dateCreated>early 1900s</dateCreated></originInfo>
       <physicalDescription>
         <extent>1 photograph : gelatin developing-out paper, black and white ; 5 x 7 in. on mount 5 x 7 in.</extent>
       </physicalDescription>
       <note displayLabel="Description">Jeff Coleman with his 6th grade classmates at Seth Mellew elementary school</note>
       <note displayLabel="Funding Information" type="sponsorship">The digitization of this collection was funded by a gift from EBSCO Industries.</note>
       <identifier type="local" displayLabel="Filename">u0001_2008002_0000001</identifier>
       <subject><geographic>United States--Alabama--Sumter County--Livingston</geographic></subject>
       <subject authority="lcsh"><topic>Coleman, Jefferson Jackson</topic></subject>
       <subject authority="lcsh"><topic>Seth Mellew Elementary School</topic></subject>
     </mods>

  15. Organization starts with the working area! Before… And after!

  16. A Collection Folder in the Working Area
     • Collection folders are named for the collection identifier.
     • Allowed subfolders include:
       • Admin
       • Metadata
       • Scans
       • Transcripts
     • Compound objects have their own subfolders for pages, named for the item.
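
A folder convention like this one is easy to enforce with a short script. The sketch below only checks the top level of one collection folder against the four subfolder names listed on the slide; the function name `unexpected_subfolders` is hypothetical, and the check is case-sensitive by assumption.

```python
from pathlib import Path
from typing import List

# Subfolder names taken from the slide.
ALLOWED_SUBFOLDERS = {"Admin", "Metadata", "Scans", "Transcripts"}


def unexpected_subfolders(collection_dir: str) -> List[str]:
    """Return the names of top-level subfolders that the convention does not allow."""
    return sorted(
        entry.name
        for entry in Path(collection_dir).iterdir()
        if entry.is_dir() and entry.name not in ALLOWED_SUBFOLDERS
    )
```

Running this before content moves out of the working area catches stray folders while the person who created them still remembers why.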

  17. Consistency and organization are cost-saving. ...and they let you AUTOMATE your work. 

  18. Lots of Copies Keeps Stuff Safe!! http://www.lockss.org/ An example of the lowest-cost model: the Alabama Digital Preservation Network, http://www.adpn.org/

  19. Simple, clear hierarchical organization in the storage area: Holder ID, then Collection ID, then Item ID, then Sequence ID.

  20. Identification, Organization and Consistency: each segment of numbers in the identifier (Holder ID, Collection ID, Item ID, Sequence ID) is used in the directory structure.
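
Because the identifier segments double as directory names, a storage location can be computed from the filename alone. The slides do not show the exact layout, so this sketch assumes one directory each for holder, collection and item, with the file stored inside the item directory; the `/storage` root and the function `storage_path` are hypothetical.

```python
from pathlib import PurePosixPath


def storage_path(filename: str, root: str = "/storage") -> PurePosixPath:
    """Derive a storage path from an identifier-style filename.

    Assumed layout (not confirmed by the slides):
    <root>/<holder>/<collection>/<item>/<filename>
    """
    stem = filename.rsplit(".", 1)[0]
    segments = stem.split("_")
    if len(segments) < 3:
        raise ValueError(f"expected holder_collection_item[...]: {filename!r}")
    holder, collection, item = segments[:3]
    return PurePosixPath(root, holder, collection, item, filename)


if __name__ == "__main__":
    print(storage_path("u0003_0000023_0000007_0005.tif"))
```

No lookup table or database is needed to find a file, which is what makes automated storage possible.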

  21. Organization and Consistency Pay Off: a consistent file organization pattern in the storage area allows automated file storage and creation of LOCKSS Manifests… a VERY good thing!

  22. DOCUMENTATION http://www.lib.ua.edu/wiki/digcoll

  23. Documentation is a wonderful thing… it helps your digital content survive … well into the future. http://www.formatregistry.org

  24. How do you know if your file has been altered? Can you verify that this is the unchanged original? (It’s not that hard.) http://www.thefreecountry.com/utilities/free-md5-sum-tools.shtml Image: Tuscaloosa Service Men's Center Scrapbook, 1943-1946. MSS 1604, William Stanley Hoole Special Collections Library, University of Alabama.
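
The slide points to MD5 tools; the same check can be scripted with Python's standard `hashlib`. This is a minimal sketch: `file_md5` and `is_unchanged` are hypothetical helper names, and it assumes a checksum was recorded when the file was first archived.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: str, recorded_md5: str) -> bool:
    """Compare the current checksum with the one recorded at archiving time."""
    return file_md5(path) == recorded_md5.strip().lower()
```

MD5 is adequate for spotting accidental corruption, which is the concern on this slide; it is not a defence against deliberate tampering.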

  25. Get a CONTENTdm Standard XML Export

  26. California Digital Library 7Train Software http://seventrain.sourceforge.net/

  27. CDL METS Descriptive Metadata is in the dmdSec

  28. California Digital Library 7train on CONTENTdm Standard XML Export… NO Item-level information beyond the title… but LOOK! You get the OCR!

  29. LIVE links… for web delivery, NOT intended for preservation. What good is this in 50 years?? (Diagram label: File System.)

  30. Where’s my JPEG? Where’s my metadata?
     • /contentdbs/{coll}/index/description/desc.all
     • /contentdbs/{coll}/supp/{dmrecord number} (then look up the parent dmrecord number in desc.all)
     • /contentdbs/{coll}/image/
     Matching it all up!! Identification is a wonderful thing.

  31. METS documents how files relate to one another in a hierarchical structure… which we already have!!!
     Holder ID: u0003
       Collection ID: 0000001
         Item ID: 0000003
           Sequence ID: 0002
             Sub-Page: 004
               File: u0003_0000001_0000003_0002_004.tif
     Metadata and documentation are stored at the applicable level.

  32. Dropping the technical metadata in… where it belongs… makes METS creation a piece of cake! (and redundant!)

  33. Output → XML Output →

  34. MIX: Metadata for Images in XML http://www.loc.gov/standards/mix/

  35. AudioMD: Audio Technical Metadata http://www.loc.gov/rr/mopic/avprot/

  36. So where does this technical information GO?? METS sections include:
     • Descriptive Metadata section: dmdSec
     • Administrative Metadata section: amdSec
     • File Group section: fileSec
     • Structural Map: structMap
     • Behavior: behaviorSec
     Put it here!
       <mets:amdSec>
         <mets:techMD ID="MIX1">
           <mets:mdWrap MDTYPE="NISOIMG">
             <mets:xmlData>
               <mix:mix>
                 <mix:ImageCreation>
     Refer to it here!
       <mets:fileSec>
         <mets:fileGrp USE="image/master">
           <mets:file ID="FID1" MIMETYPE="image/tiff" SEQ="1" CREATED="2003-01-22T00:00:00" ADMID="MIX1" GROUPID="GID1">
     Don’t forget to add the namespace at the top!
       xmlns:mix="http://www.loc.gov/standards/mix/"
       xmlns:audioMD="http://www.loc.gov/standards/AudioMD/"
     http://www.loc.gov/standards/mets/METSOverview.html
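
The link between a file and its technical metadata is only as good as the ID it points to, so it is worth checking that every `ADMID` resolves. The sketch below uses Python's standard XML parser; the function `unresolved_admids` and the `SAMPLE` document are made up for illustration, the sample is deliberately stripped down (it is not schema-valid METS), and only `techMD` elements are treated as valid targets, which is a simplification.

```python
import xml.etree.ElementTree as ET
from typing import List

METS_NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical, stripped-down sample: real techMD elements wrap a MIX record.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:amdSec>
    <mets:techMD ID="MIX1"/>
  </mets:amdSec>
  <mets:fileSec>
    <mets:fileGrp USE="image/master">
      <mets:file ID="FID1" MIMETYPE="image/tiff" ADMID="MIX1"/>
      <mets:file ID="FID2" MIMETYPE="image/tiff" ADMID="MIX2"/>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def unresolved_admids(mets_xml: str) -> List[str]:
    """List ADMID references on files that match no techMD ID in the amdSec."""
    root = ET.fromstring(mets_xml)
    tech_ids = {
        element.get("ID")
        for element in root.iterfind(".//mets:amdSec/mets:techMD", METS_NS)
    }
    missing = []
    for file_element in root.iterfind(".//mets:fileSec//mets:file", METS_NS):
        # ADMID may hold several space-separated references.
        for reference in (file_element.get("ADMID") or "").split():
            if reference not in tech_ids:
                missing.append(reference)
    return missing


if __name__ == "__main__":
    print(unresolved_admids(SAMPLE))
```

Splitting `ADMID` on whitespace also tolerates stray spaces inside the attribute value, a common copy-and-paste slip.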

  37. What’s confusing about this? Simple, Clear, Low Cost, Scalable. That’s a good thing.

  38. Bringing Order to Chaos: 1) Identification 2) Consistency 3) Organization 4) Documentation.
     Holder ID: u0003
     Collection ID: 0000023
     Item ID: 0000007
     Sequence ID: 0005
     Archival File: u0003_0000023_0000007_0005.tif
     Jody L. DeRidder, jlderidder@ua.edu, jodyderidder.com
     University of Alabama Libraries
