Explore the latest in PREMIS standards, schema revisions, and implementations discussed at the Implementation Fair in Vienna. Learn about data dictionary changes and the extensibility of the framework. Join the conversation on enhanced preservation practices.
PREMIS Update • Rebecca Guenther, Library of Congress, rgue@loc.gov • PREMIS Implementation Fair, Vienna, Austria, 22 September 2010
Overview • Editorial Committee membership • What's new since the last PREMIS Implementation Fair (iPRES 2009) • PREMIS Data Dictionary and schema revision process • Changes to the Data Dictionary in process • Schema changes for extensibility • Data Dictionary version 2.1 • PREMIS conformance • Today’s agenda
PREMIS timeline (2002 to 2010) • Metadata Framework for Digital Preservation • PREMIS Working Group formed • PREMIS Data Dictionary released • Maintenance Activity formed • PREMIS Editorial Committee formed • PREMIS 2.0 released • PREMIS Implementation Fairs
The State of PREMIS • De facto standard for preservation metadata; in some countries mandated for cultural heritage repositories • PREMIS implementations are appearing in many places, many contexts, many forms • Some experimentation is leading to changes in the data dictionary and schema • PREMIS Implementation Fairs: attempts to consolidate implementation experiences, issues, and best practices
PREMIS Editorial Committee membership • Rebecca Guenther, Chair (Library of Congress) • Yair Brama (ExLibris) • Karin Bredenberg (Riksarkivet, Swedish National Archives) • Priscilla Caplan (Florida Center for Library Automation) • Angela Dappert (British Library) • Angela Di Iorio (Fondazione Rinascimento Digitale) • Markus Enders (British Library) • Noreen Hill (Library and Archives Canada) • Karsten Huth (Sächsisches Staatsarchiv) • David Lake (US National Archives and Records Administration) • Brian Lavoie (OCLC) • Sally Vermaaten (Statistics New Zealand) • Robert Wolfe (MIT/DSpace) • Kate Zwaard (US Government Printing Office)
PREMIS Implementation Fair at iPRES 2009 • State of PREMIS • Tools • PREMIS in METS Toolkit • Univ. of Illinois Hub and Spoke toolkit • Statistics New Zealand toolkit • Systems • ExLibris Rosetta • DAITSS • Potential data model changes • Case studies: implementations • Discussion • How to store environment information • Storing auxiliary files • Exchange
What’s new: PREMIS activities • Integration with other standards and efforts • Survey of PREMIS in METS profiles (D-Lib Magazine, Sept 2010) http://www.dlib.org/dlib/september10/vermaaten/09vermaaten.html • Extensibility: add elements about extensions as in METS • US intelligence community extending for security classification • PREMIS documentation • Understanding PREMIS: Priscilla Caplan (2009) • Gentle introduction to the PREMIS standard • Spanish, German and Italian translations • PREMIS Data Dictionary for Preservation Metadata version 2.0: Japanese translation • Workflows and registries • PREMIS tools to facilitate automated workflows: PREMIS in METS toolkit made available as open source • PREMIS controlled vocabularies in id.loc.gov
PREMIS Data Dictionary and Schema Revision Process • Send change request for consideration by the PREMIS Editorial Committee via Web form or on pigpen wiki • Non-substantive changes will be documented on the change page on the PREMIS website • Substantive changes will be brought to the PREMIS Implementers’ Group • Editorial Committee will discuss within 2 months • Decisions made • Changes made no more than twice a year • Published as addendum to Data Dictionary and/or in revision of XML schema • Community will be informed of changes and the reasons for them
Changes to Data Dictionary in process (version 2.1) • Correct links • Add linking semantic units from Agent Entity to Events and Rights: • linkingEventIdentifier • linkingRightsStatementIdentifier • Correct errors and clarify ambiguous areas • Make storage optional • Revision of extension element notes to indicate new attributes • New Agent semantic units: agentNote, agentExtension
Schema changes for extensibility • Add information about extension points modeled after METS • Allow for wrapping or reference of PREMIS metadata • Other attributes: CREATED, STATUS, ID, CHECKSUM, Location type • Include information about metadata type • MDTYPE, OTHERMDTYPE, MDTYPEURI • Additional work • Coordinate with METS Editorial Board • Define controlled values in id.loc.gov • Revise PREMIS in METS guidelines • Revise notes in Data Dictionary • Draft schema ready to go out for review
Intellectual entities • Have been out of scope and described only by an identifier in PREMIS 1.0 and 2.0 • Development of use cases for giving information about intellectual entities • Consideration of how to implement: as another level of object or a separate entity?
Use cases for describing intellectual entities • Represent a collection, FRBR work, FRBR expression, fonds, series, files (in the archival sense) in order to: • capture descriptive metadata • associate business requirements with them or reference them in business requirements (such as significant characteristics, risk definitions, guidelines for preservation actions, etc.) • express structural and derivative relationships • attach rights information • attach events and agents • Capture versioning information and metadata update events for intellectual entities like articles and issues
Adding semantic units for Intellectual Entities • Will be added as another level of object • Advantages to this approach: • Data dictionary will be more compact • Simplify the dictionary by dropping links such as linkingIntellectualEntityIdentifier • Could attach events and agents directly, and rights indirectly, to intellectual entities • Next steps • Present to PREMIS Implementers’ Group for review • Revise Data Dictionary and schema
PREMIS conformance • Experience in implementing, managing, and using PREMIS semantic units is growing • Corresponding need to cultivate a deeper understanding of what it means to be “PREMIS conformant” • Need a new conformance statement that is more detailed and more actionable • Detailed: precise definition of what conformance means in light of emerging use cases • Actionable: of practical use as a resource for assessing conformance of a given PREMIS implementation • Subgroup within PREMIS Editorial Committee formed: Brian Lavoie, Rebecca Guenther, Priscilla Caplan, Angela Dappert, Sally Vermaaten, Yair Brama
Some “use cases” for PREMIS conformance • Inter-repository data exchange, e.g., TIPR project • Repository certification, e.g., TRAC • Shared registries, e.g., PRONOM, Unified Digital Format Registry • Automated workflows/reusable tools, e.g., SIP/AIP processing • Vendor support, e.g., ExLibris Rosetta
New PREMIS conformance statement • Establish conditions required for conformance: articulate what implementers must do to assert PREMIS conformance • Describe “degrees of freedom” associated with conformance: identify areas of implementation decision-making where implementers are free to make their own choices while still remaining conformant • http://www.loc.gov/standards/premis/premisConformance_v4.pdf
1. Establish conditions required for conformance • Organize, amplify, and extend conformance conditions set forth in Data Dictionary v1.0 and v2.0 • Define conformance from multiple perspectives: • Level of semantic unit • Level of Data Dictionary • Internal to repository • Inter-repository exchange (import and export) • Provide examples of conformance and non-conformance
Examples of conformance: semantic unit • Conformant: A repository uses a relational database system with an Objekteigenschaften table and establishes in the system documentation that Objekteigenschaften shares the definition of the PREMIS semantic unit objectCharacteristics. • Non-conformant: A repository implements a metadata element objectCategory that records information defined in PREMIS semantic units objectCategory and preservationLevel.
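The two semantic-unit examples above can be checked mechanically if a repository documents which PREMIS semantic units each local element records. The following is only a minimal sketch under that assumption: the mapping table and the function name `conflated_elements` are hypothetical illustrations, not part of the conformance statement.

```python
# Hypothetical documentation of which PREMIS semantic units each local
# metadata element records. The entries mirror the two slide examples.
LOCAL_TO_PREMIS = {
    # Different name, same definition: acceptable under the naming freedom.
    "Objekteigenschaften": ["objectCharacteristics"],
    # One local element recording two semantic units: the non-conformant case.
    "objectCategory": ["objectCategory", "preservationLevel"],
}


def conflated_elements(mapping):
    """Return local element names that record more than one semantic unit."""
    return sorted(name for name, units in mapping.items() if len(units) > 1)


if __name__ == "__main__":
    for name in conflated_elements(LOCAL_TO_PREMIS):
        print(f"{name} mixes several semantic units: {LOCAL_TO_PREMIS[name]}")
```

A check like this only covers the one-element-to-many-units case; whether a local definition really matches the PREMIS definition still has to be established in the system documentation.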
Examples of conformance: Data Dictionary • Conformant: A repository that is conformant in regard to Objects also wants to record information about Events; therefore, it implements metadata elements that, at the minimum, capture all of the information specified in the semantic units eventIdentifier, eventType, and eventDateTime. • Non-conformant: The information a repository records about Events does not include information that corresponds to the PREMIS semantic unit eventType.
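As a rough illustration of the Data Dictionary-level example, the sketch below reports which of the three Event semantic units named on the slide are not covered by a repository's recorded information. The function name and the idea of passing a plain list of covered units are assumptions made for this example.

```python
# The three Event semantic units named in the slide example.
REQUIRED_EVENT_UNITS = ("eventIdentifier", "eventType", "eventDateTime")


def missing_event_units(recorded_units):
    """Return the required Event semantic units that are not covered.

    `recorded_units` is any iterable of PREMIS semantic unit names for which
    the repository records corresponding information.
    """
    recorded = set(recorded_units)
    return [unit for unit in REQUIRED_EVENT_UNITS if unit not in recorded]


if __name__ == "__main__":
    # Mirrors the non-conformant example: eventType is not recorded.
    print(missing_event_units(["eventIdentifier", "eventDateTime"]))
```

An empty result corresponds to the conformant case on the slide; any returned name marks a gap like the missing eventType in the non-conformant case.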
Internal and external conformance • Internal: A repository that satisfies the Principles of Use at both the semantic unit and Data Dictionary levels is considered internally conformant. • External (import): A repository that is import conformant must be able to accept PREMIS-conformant information in the form provided by another repository, parse it, and allocate the information to its corresponding metadata elements in the local repository system, as well as associate it with the appropriate Entities. • External (export): A repository that is export conformant must be able to extract PREMIS-conformant information from its local system, and provide it to another repository in an agreed-upon form, and associate it with its appropriate Entity.
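Import conformance, as described above, amounts to accepting PREMIS information, parsing it, and allocating it to local metadata elements. The sketch below shows that flow for a single Event using the PREMIS 2.x XML namespace. The local field names (`evt_id`, `evt_kind`, `evt_time`), the sample record, and the function `import_event` are hypothetical; a real import would also have to associate the record with the appropriate Entities.

```python
import xml.etree.ElementTree as ET

# Namespace of the PREMIS version 2 XML schema.
PREMIS_NS = {"premis": "info:lc/xmlschema/premis-v2"}

# Hypothetical local element names for two simple Event semantic units.
LOCAL_NAMES = {"eventType": "evt_kind", "eventDateTime": "evt_time"}

# Illustrative sample record, not taken from any real repository.
SAMPLE = """<premis:event xmlns:premis="info:lc/xmlschema/premis-v2">
  <premis:eventIdentifier>
    <premis:eventIdentifierType>local</premis:eventIdentifierType>
    <premis:eventIdentifierValue>E-001</premis:eventIdentifierValue>
  </premis:eventIdentifier>
  <premis:eventType>ingestion</premis:eventType>
  <premis:eventDateTime>2010-09-22T10:00:00</premis:eventDateTime>
</premis:event>"""


def import_event(xml_text):
    """Parse one PREMIS event and allocate its values to local element names."""
    root = ET.fromstring(xml_text)
    record = {
        local: root.findtext(f"premis:{unit}", namespaces=PREMIS_NS)
        for unit, local in LOCAL_NAMES.items()
    }
    # The identifier is a container unit; only its value is kept in this sketch.
    record["evt_id"] = root.findtext(
        "premis:eventIdentifier/premis:eventIdentifierValue",
        namespaces=PREMIS_NS,
    )
    return record


if __name__ == "__main__":
    print(import_event(SAMPLE))
```

Export would run the same mapping in the opposite direction, producing PREMIS-conformant information in whatever form the two repositories have agreed upon.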
2. Degrees of freedom • Naming: repository is free to implement semantic units using names different from those defined in Data Dictionary • Granularity: repository is free to distribute information defined in a semantic unit across as many metadata elements as it chooses • Level of detail: repository is free to record more detailed information for a semantic unit than what is defined in Data Dictionary • Explicit recording of information: repository is not required to explicitly record information for an implemented semantic unit (but information must be recoverable in some way when needed) • Use of controlled vocabularies: repository is free to use (or not use) controlled vocabularies. If repository uses controlled vocabularies, it can use either internally-defined or external/standardized vocabularies
Next steps for conformance • Collect feedback on draft conformance statement from PIG List and PREMIS Implementation Fair participants • Finalize draft for approval by PREMIS Editorial Committee • Post final version on Maintenance Activity Web site
Today’s topics • Data modeling • Comparison between PREMIS and PLANETS data models • PREMIS OWL ontology • PREMIS in interchange • Towards Interoperable Preservation Repositories (TIPR) (Priscilla Caplan, Florida Center for Library Automation) • ARTAT (Angela Di Iorio, Fondazione Rinascimento Digitale) • PREMIS controlled vocabularies • PREMIS vocabulary service • PREMIS events in HathiTrust • Open discussion