Use of METS in CDL Digital Special Collections Brian Tingle
California Digital Library: Co-library with UC campus libraries. 59 FTE. Organized into program and service areas. Program areas: Bibliographic Services, Collections, Digital Special Collections, Publishing, Preservation. Service areas: Business Services; Assessment, Design & [Web] Production; Data Acquisitions; Information Services (including the CDL Helpline); Infrastructure & Application Support; Project Planning.
CDL-T Technology: The CDL Technologies group is no longer an organizational entity; programming resources are now mostly distributed to the program areas. “Technical Leads” for program and service areas meet as the “Technology Council” and organize “all hands” meetings for technologists. We should have a position posted soon for the Manager of Infrastructure and Application Support, who will be responsible for technology purchases, managing data center relationships, and supervising system and database administrators. See http://jobs.ucop.edu/
Digital Special Collections: 4 FTE. Responsible for certain CDL services: Online Archive of California (EAD encoded archival descriptions); Calisphere (OAC content matched with California curriculum standards); Counting California (census and other data); UC Image Service (collections for UC instructional use).
Digital Object Existentialism: Digital artifacts and their description precede the METS. METS provides a tree structure to bind the files of the digital objects together, expressing the relationship between the files and metadata about the files. Supports arbitrary XML to encapsulate descriptive information. METS documents are not the objects; they are descriptions of the objects. Deference to curatorial discretion.
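The binding role of METS described above can be sketched in a few lines of Python. This is only an illustration, not CDL code: the sample document, its file names, and the helper names `files_by_id`, `walk`, and `outline` are all hypothetical. It reads the file inventory from the fileSec and then walks the structMap tree, resolving each fptr to the file it points at.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"mets": METS_NS}

# Hypothetical two-page object; file names and labels are made up.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="image/master">
      <mets:file ID="f1" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="page1.tif"/>
      </mets:file>
      <mets:file ID="f2" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="page2.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="book" LABEL="Sample book">
      <mets:div TYPE="page" LABEL="Page 1" ORDER="1"><mets:fptr FILEID="f1"/></mets:div>
      <mets:div TYPE="page" LABEL="Page 2" ORDER="2"><mets:fptr FILEID="f2"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def files_by_id(root):
    """Map each <file> ID to the xlink:href of its <FLocat>."""
    result = {}
    for f in root.iter("{%s}file" % METS_NS):
        loc = f.find("mets:FLocat", NS)
        result[f.get("ID")] = loc.get("{%s}href" % XLINK_NS) if loc is not None else None
    return result


def walk(div, files, depth=0):
    """Yield (depth, label, [hrefs]) for a <div> and its descendants."""
    hrefs = [files.get(p.get("FILEID")) for p in div.findall("mets:fptr", NS)]
    yield depth, div.get("LABEL"), hrefs
    for child in div.findall("mets:div", NS):
        yield from walk(child, files, depth + 1)


def outline(mets_xml):
    """Return the structMap as a flat list of (depth, label, hrefs)."""
    root = ET.fromstring(mets_xml)
    top = root.find("mets:structMap/mets:div", NS)
    return list(walk(top, files_by_id(root)))


if __name__ == "__main__":
    for depth, label, hrefs in outline(SAMPLE):
        print("  " * depth + label, hrefs)
```

The files themselves are never opened here, which matches the point of the slide: the METS document describes the object and its parts, it is not the object.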
CDL Guidelines for Digital Objects: The CDL Guidelines for Digital Objects provides specifications for all new digital objects prepared by institutions for submission to CDL for access and preservation services. They are not intended to cover all of the administrative, operational, and technical issues surrounding the creation of digital object collections. They specify the use of METS and minimum descriptive metadata, and they set requirements for preservation and access “service levels”.
METS Profiles (from CDL GDO): METS profiles describe classes of METS digital objects that share common characteristics, such as content file formats (e.g., digital images, TEI texts) or metadata encoding formats (e.g., MODS or Dublin Core). Profiles should include enough details to enable METS creators and programmers to create and process METS-encoded digital objects conforming with a particular profile. A METS profile itself is an XML document that must adhere to the METS XML Profile Schema. For information about METS profiles, see the METS website. METS files must conform to valid METS profiles, which must be declared during pre-submission discussions with CDL staff.
Content Contributors: UC campus libraries, archives, and museums; other public and private California universities; public and private California museums; California historical societies; California public libraries and library systems.
METS Creation: Historically, CDL created METS: EAD extraction; semi-automated production from submitted metadata and files. New projects use some sort of digital asset system with METS support: * GenDB (run by UC Berkeley Library); CONTENTdm (run by califa) and 7train. Anticipated future submissions: Archivists Toolkit; FileMaker based systems *; METS by hand from a recent library school graduate who read the CDL GDO.
Setting up a new profile: For each profile of METS that is supported: test ingest with voro and modify if needed; write an XSLT style sheet to extract a Dublin Core record; write an XSLT to display the objects.
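The Dublin Core extraction step can be illustrated with a small stand-in. The slide says this is done with an XSLT style sheet per profile; the sketch below uses Python instead, purely to show the shape of the mapping. It assumes a hypothetical profile that wraps MODS inside dmdSec/mdWrap/xmlData, and the three-field mapping table, the sample record, and the function name `extract_dc` are illustrative assumptions, not the CDL style sheets.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Assumed mapping for a hypothetical MODS-based profile:
# Dublin Core element name -> path inside the <mods:mods> record.
MODS_TO_DC = {
    "title": "mods:titleInfo/mods:title",
    "creator": "mods:name/mods:namePart",
    "date": "mods:originInfo/mods:dateCreated",
}

# Made-up sample record with a title and a creator but no date.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="dmd1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods>
          <mods:titleInfo><mods:title>Sample photograph</mods:title></mods:titleInfo>
          <mods:name><mods:namePart>Doe, Jane</mods:namePart></mods:name>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def extract_dc(mets_xml):
    """Return {dc_element: [values]} for the first wrapped MODS record."""
    root = ET.fromstring(mets_xml)
    mods = root.find("mets:dmdSec/mets:mdWrap/mets:xmlData/mods:mods", NS)
    record = {}
    if mods is None:
        return record  # no wrapped MODS: nothing to extract
    for dc_name, path in MODS_TO_DC.items():
        values = [el.text.strip() for el in mods.findall(path, NS)
                  if el.text and el.text.strip()]
        if values:  # omit Dublin Core elements that have no source value
            record[dc_name] = values
    return record


if __name__ == "__main__":
    print(extract_dc(SAMPLE))
```

Keeping the mapping in one table per profile mirrors the slide's workflow: supporting a new profile means writing a new mapping, not new ingest code.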
eXtensible Text Framework (http://xtf.sf.net): Integration of Saxon and Lucene plus other features. Implements “lazy parse trees”, which index the start and end of every XML element with a custom DOM. Allows near-instant creation of a DOM object without re-parsing very large XML files.
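The idea behind indexing element positions can be shown with a toy example. This is not XTF's implementation (XTF is a Java framework with its own persistent index); it is a minimal Python sketch, with hypothetical function names, that makes one pass over a document with the standard library's expat parser and records the byte offset where each element's start tag and end tag begin. A subtree can then be sliced out of the raw bytes without parsing the rest of the file again.

```python
import xml.parsers.expat


def index_elements(data: bytes):
    """Return [(name, start_offset, end_offset)] in document order.

    start_offset is the byte index of the '<' of the start tag;
    end_offset is the byte index of the '<' of the matching end tag.
    """
    parser = xml.parsers.expat.ParserCreate()
    spans = []   # one [name, start, end] entry per element
    stack = []   # indexes into spans for currently open elements

    def on_start(name, attrs):
        spans.append([name, parser.CurrentByteIndex, None])
        stack.append(len(spans) - 1)

    def on_end(name):
        spans[stack.pop()][2] = parser.CurrentByteIndex

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)
    return [tuple(s) for s in spans]


def subtree(data: bytes, span):
    """Slice one element out of the raw bytes using its indexed span.

    Assumes an explicit end tag written exactly as </name>, with no
    whitespace inside the tag and no self-closing form.
    """
    name, start, end = span
    return data[start:end + len(name) + 3]  # 3 bytes for '<', '/', '>'


if __name__ == "__main__":
    doc = b"<doc><a>one</a><b>two</b></doc>"
    index = index_elements(doc)
    print(index)
    print(subtree(doc, index[1]))
```

In a real lazy tree the index would be stored alongside the document so that the single full parse happens once, at indexing time.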
Identifying <div>s in the <structMap>. Puzzling practice: ORDER=1, ORDER=1, ORDER=2, ORDER=3, ORDER=2, ORDER=3. Optimized for access: ORDER=1, ORDER=2, ORDER=3, ORDER=4, ORDER=5, ORDER=6.
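One reading of this slide is that ORDER values which restart under each parent do not identify a <div> on their own, while a single sequence running through the whole structMap does. Under that assumption, the sketch below renumbers every div in document order. The sample structMap and the function name `renumber` are hypothetical, and the tree shape of the original diagram is not recoverable from the slide text.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DIV = "{%s}div" % METS_NS

# Hypothetical structMap in which ORDER restarts under each parent div.
SAMPLE = """<mets:structMap xmlns:mets="http://www.loc.gov/METS/">
  <mets:div TYPE="book" ORDER="1">
    <mets:div TYPE="chapter" ORDER="1">
      <mets:div TYPE="page" ORDER="1"/>
      <mets:div TYPE="page" ORDER="2"/>
    </mets:div>
    <mets:div TYPE="chapter" ORDER="2">
      <mets:div TYPE="page" ORDER="1"/>
    </mets:div>
  </mets:div>
</mets:structMap>"""


def orders(struct_map):
    """List the ORDER values of all divs in document order."""
    return [div.get("ORDER") for div in struct_map.iter(DIV)]


def renumber(struct_map):
    """Assign ORDER=1..n to every div, depth-first in document order."""
    for n, div in enumerate(struct_map.iter(DIV), start=1):
        div.set("ORDER", str(n))
    return struct_map


if __name__ == "__main__":
    struct_map = ET.fromstring(SAMPLE)
    print("before:", orders(struct_map))
    print("after: ", orders(renumber(struct_map)))
```

With unique values, a single ORDER number is enough to address any div in the object, which is convenient for building access URLs.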