Global Digital Format Registry: An Update

Global Digital Format Registry An Update. July 2006. Global Digital Format Registry. “The Global Digital Format Registry (GDFR) will provide sustainable services to collect, review, store, discover, and deliver significant representation information about digital formats.”

sheila

Presentation Transcript


  1. Global Digital Format Registry: An Update, July 2006

  2. Global Digital Format Registry • “The Global Digital Format Registry (GDFR) will provide sustainable services to collect, review, store, discover, and deliver significant representation information about digital formats.” • Centrally-organized collection and review • Distributed storage, discovery, and delivery via a peer-to-peer network

  3. The GDFR project • Harvard University Library (HUL) funded for 2 years by the Mellon Foundation • Staffing and technical work subcontracted by HUL to OCLC (June 2006) • Project oversight • Steering Committee (SC) for policy oversight • Technical Working Group (TWG) for technical oversight • Active solicitation of the international stakeholder community for review and comment

  4. Deliverables • Functional requirements • Technical specifications • Implementation plan (technology platform) • Inter-nodal protocol • Reference software implementation for nodes • Released under LGPL • Editorial process • Initial population • Succession plan

  5. Schedule • Month 1: Staffing, establish public web site • Months 2-6: Consultation, design, prototyping (public discussion planned for DLF Fall Forum, Boston, November 2006) • Months 7-12: Protocol, node implementation • Months 13-18: Initial population, inter-nodal testing • Months 19-24: Integration testing

  6. What is a format? • “A serialization of an abstract information model” • A set of syntactic and semantic rules for mapping from an information model to a byte stream (and, in most instances, for mapping back) • Encompasses the nominal sense of “file format” as well as a range of conceptual models from the micro to the macro level • IEEE 754 floating point number … File system
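
The definition on this slide can be illustrated at the micro end of the range it mentions, IEEE 754 floating point. The sketch below uses Python's standard `struct` module to map an abstract number to a byte stream and back; the function names `serialize` and `deserialize` and the choice of big-endian byte order are assumptions made for the example, not part of the GDFR design.

```python
import struct


def serialize(value: float) -> bytes:
    """Map an abstract number to its IEEE 754 binary64 byte stream (big-endian)."""
    return struct.pack(">d", value)


def deserialize(data: bytes) -> float:
    """Map the 8-byte stream back to the abstract number."""
    return struct.unpack(">d", data)[0]


encoded = serialize(1.0)
print(encoded.hex())         # 3ff0000000000000
print(deserialize(encoded))  # 1.0
```

The syntactic rules here are the field layout (sign, exponent, fraction) and the byte order; the semantic rule is how those fields are interpreted as a number.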

  7. GDFR network • Peer-to-peer network communicating over a common protocol • Structured delegation for distribution • DNS analogy • “Root” node • Top-level nodes • Distribution classes • Local data • Unvetted data • Vetted data

  8. Representation Information • Identifiers • Responsibility • Classification • Relationships • Specifications • Signatures • Grammar • Tools • Assessment

  9. Identifiers • Canonical and alias identifiers in a variety of naming systems • Common usage “TIFF” • MIME “image/tiff” • PRONOM PUID “fmt/10” • LC FDD “fdd000022” • Canonical GDFR-defined identifier in the “info” URI scheme
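
One way to picture the alias scheme on this slide is a lookup table from (naming system, value) pairs to a single canonical identifier. This is a minimal sketch only: the alias values are those listed on the slide, while the system labels, the `resolve` helper, and the canonical string `info:gdfr/EXAMPLE-TIFF` are hypothetical placeholders, since the slides do not give the actual GDFR identifier syntax.

```python
from typing import Optional

# Hypothetical placeholder; the real canonical GDFR identifier is not given in the slides.
CANONICAL_TIFF = "info:gdfr/EXAMPLE-TIFF"

# Alias identifiers from the slide, keyed by (naming system, value).
# The naming-system labels are illustrative assumptions.
ALIASES = {
    ("common", "TIFF"): CANONICAL_TIFF,
    ("mime", "image/tiff"): CANONICAL_TIFF,
    ("puid", "fmt/10"): CANONICAL_TIFF,
    ("lc-fdd", "fdd000022"): CANONICAL_TIFF,
}


def resolve(system: str, value: str) -> Optional[str]:
    """Return the canonical identifier for an alias, or None if it is unknown."""
    return ALIASES.get((system, value))


print(resolve("mime", "image/tiff") == resolve("puid", "fmt/10"))  # True
```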

  10. Responsibility • Creator • Owner • Maintenance agency and process • Legal conditions for use

  11. Classification • Ontological CLASSES, abstract families, concrete formats, and relationships
      BYTESTREAM
        IMAGE
          STILL
            RASTER
              GIF
                GIF87a
                GIF89a is-new-version-of GIF87a
              JPEG
                ISO 10918-1
                JFIF is-subtype-of ISO 10918-1
              TIFF
                TIFF 4.0
                TIFF 5.0 is-new-version-of TIFF 4.0
                TIFF 6.0 is-new-version-of TIFF 5.0
                TIFF/IT is-subtype-of TIFF 6.0
                TIFF/IT/CT is-subtype-of TIFF/IT
                TIFF/IT/CT/P1 is-subtype-of TIFF/IT/CT

  12. Relationships • Subtype: ASCII is-subtype-of UTF-8; UTF-8 has-subtype ASCII • Version: TIFF 6.0 is-version-of TIFF 5.0; TIFF 5.0 has-version TIFF 6.0 • Encapsulation: WAVE can-contain μ-law; μ-law is-contained-by WAVE • Affinity: JPEG is-similar-to SPIFF; SPIFF is-similar-to JPEG
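
Each relationship on this slide is paired with an inverse, so a registry could store one direction and derive the other. The sketch below models the four pairs as (subject, relation, object) triples; the triple representation and the `with_inverses` helper are assumptions for illustration, not a description of the GDFR data model.

```python
# Relation pairs taken from the slide; affinity is its own inverse.
INVERSE = {
    "is-subtype-of": "has-subtype",
    "is-version-of": "has-version",
    "can-contain": "is-contained-by",
    "is-similar-to": "is-similar-to",
}
# Add the reverse direction so lookups work from either side.
INVERSE.update({inverse: relation for relation, inverse in list(INVERSE.items())})


def with_inverses(triples):
    """Return the triples plus the inverse statement implied by each one."""
    closed = set(triples)
    for subject, relation, obj in triples:
        closed.add((obj, INVERSE[relation], subject))
    return closed


asserted = {
    ("ASCII", "is-subtype-of", "UTF-8"),
    ("TIFF 6.0", "is-version-of", "TIFF 5.0"),
    ("WAVE", "can-contain", "μ-law"),
    ("JPEG", "is-similar-to", "SPIFF"),
}
closed = with_inverses(asserted)
print(("UTF-8", "has-subtype", "ASCII") in closed)  # True
```

A relation that is not in the table raises `KeyError`, which keeps unrecognized relationship types from being silently accepted.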

  13. Specifications • Bibliographic citation, including descriptive (e.g. ISBN) and actionable (e.g. URI) identifiers • IP considerations probably prohibit the free distribution of specification documents

  14. Signatures • External • Generally indicative • File extension(s) • Internal • Generally dispositive • Magic number • Other well-defined internal syntactic structures
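
The distinction between indicative external signatures and dispositive internal ones can be sketched as a two-stage check: trust a magic number when one matches, and fall back to the file extension only as a hint. The magic numbers below are the well-known GIF and TIFF headers; the table layout and the function names are assumptions for the example, not GDFR structures.

```python
from pathlib import PurePath
from typing import Optional

# External signatures: file extensions (generally indicative only).
EXTENSIONS = {".tif": "TIFF", ".tiff": "TIFF", ".gif": "GIF"}

# Internal signatures: magic numbers at the start of the byte stream.
MAGIC = [
    (b"GIF87a", "GIF87a"),
    (b"GIF89a", "GIF89a"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF header
    (b"MM\x00*", "TIFF"),  # big-endian TIFF header
]


def guess_by_extension(filename: str) -> Optional[str]:
    """Return the format suggested by the file extension, if any."""
    return EXTENSIONS.get(PurePath(filename).suffix.lower())


def identify_by_magic(data: bytes) -> Optional[str]:
    """Return the format whose magic number prefixes the data, if any."""
    for prefix, name in MAGIC:
        if data.startswith(prefix):
            return name
    return None


def identify(filename: str, data: bytes) -> Optional[str]:
    """Prefer the internal signature; use the extension only as a fallback."""
    return identify_by_magic(data) or guess_by_extension(filename)


# A GIF that was misnamed with a TIFF extension is still identified as GIF89a.
print(identify("scan.tif", b"GIF89a\x01\x00\x01\x00"))  # GIF89a
```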

  15. Grammar • Formal notation of a format • Typed to permit multiple parallel formulations, e.g. BNF, ABNF, BSDL, DFDL, EAST • May be feasible only for relatively simple formats

  16. Tools • Services, systems, and tools using formats as inputs or outputs • Described in terms of some functional taxonomy, e.g. edit, transform, render

  17. Assessment • Format-specific risk assessment • Typed to permit multiple parallel formulations • LC Sustainability/Quality & Functionality (SQF) • OCLC INFORM • DSTC PANIC • Cornell Virtual Remote Control (VRC)

  18. General development goals • First create a generalized registry framework, then specialize it for the GDFR application • To the extent that this does not affect other goals and schedules • Platform/network transport independent • Full information content of GDFR is expressible in XML form • GDFR network is re-instantiatable from its XML expression
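
The last two goals amount to a round-trip property: export the registry content to XML, then rebuild an equivalent registry from that XML alone. The sketch below checks the property on a toy record using Python's standard `xml.etree.ElementTree`; the element and attribute names (`registry`, `format`, `identifier`, `system`) are invented for the example and do not reflect any actual GDFR schema.

```python
import xml.etree.ElementTree as ET


def to_xml(records: dict) -> str:
    """Express {format name: {naming system: identifier}} as an XML string."""
    root = ET.Element("registry")
    for name, aliases in sorted(records.items()):
        fmt = ET.SubElement(root, "format", name=name)
        for system, value in sorted(aliases.items()):
            ET.SubElement(fmt, "identifier", system=system).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> dict:
    """Rebuild the records from their XML expression."""
    root = ET.fromstring(text)
    return {
        fmt.get("name"): {
            ident.get("system"): ident.text for ident in fmt.findall("identifier")
        }
        for fmt in root.findall("format")
    }


records = {"TIFF": {"mime": "image/tiff", "puid": "fmt/10"}}
print(from_xml(to_xml(records)) == records)  # True
```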

  19. Related Work • PRONOM: www.nationalarchives.gov.uk/pronom/ • Representation Information Registry/Repository: dev.dcc.ac.uk/twiki/bin/view/Main/DCCRegRepV04 • LC Digital Formats Web: www.digitalpreservation.gov/formats/ • NARA GDFR governance investigation
