Flexible and Extensible Digital Object and Repository Architecture (FEDORA). Sandra Payette Cornell University payette@cs.cornell.edu. Dritter Workshop der Digitalisierungszentren, October 5, 1999. http://www.cs.cornell.edu/payette/presentations/fedora-gdz.ppt.
Cornell Digital Library Research Group
• Computer Science Department
  • Bill Arms
  • Carl Lagoze
  • Sandy Payette
  • Naomi Dushay
  • David Fielding
• Affiliates
  • Anne Kenney (Cornell Library)
  • Geri Gay (Human Computer Interaction)
  • CNRI
CDLRG - Projects • Prism (DLI2) • Fedora • Harmony (IDL) • Dienst and NCSTRL • Electronic Scholarly Publishing • D-Lib • Citation Linking (IDL)
Digital Library Interoperability Cornell Digital Library Library of Congress
Principles for Digital Library Architecture
• Open Architecture
  • functionality partitioned into a set of well-defined services
  • services accessible via a well-defined protocol
• Modularization
  • promotes interoperability
  • scalable to different clientele (library, informal web)
• Federation
  • enable aggregations into logical collections
• Distribution
  • of content and services
  • of administration and management
Component-Ware Digital Libraries UI UI Gateway Service Name Service Identifiers Collection Service Query Mediator Service Index Service Repository Service Digital Objects
FEDORA
• Digital Object Model
  • container for aggregating any digital material
  • disseminations of complex types
  • global extensibility mechanisms
  • access management
• Repository Service
  • service layer for “contained” DigitalObjects
  • object lifecycle management
  • secure environment
  • open interface
FEDORA: Goals
• Distribution - of digital content and services
• Interface Stability - for digital objects
• Interoperability - for digital objects and repositories
• Extensibility - naturally evolving type system
• Flexibility - community-driven type development
• Security - rights management and access control
• Preservation - longevity of digital objects
FEDORA History • Kahn/Wilensky • Warwick Framework • Distributed Active Relationships • Cornell FEDORA (Lagoze, Payette) • CNRI Repository (Arms, Blanchi, Overly) • CNRI/FEDORA - Interoperability Project • UVA - Complex disseminators, distribution • Project Prism (DLI2)
FEDORA DigitalObjects can be...
• Simple, familiar entities
• Complex, compound, dynamic objects
FEDORA DigitalObject Model Diary Dublin Core Future MIME-typed stream of bytes Book Dissemination Service Request upon external source Internal DataStream Reference DataStream
Disseminator Type
• A set of behaviors that formally describes the functionality of any global or community-specific notion of content.
• Example behaviors: getFrame, getLength, getChapter, getPage, getSection, getArticle
Disseminator
• A generic component that associates a set of behaviors with a DigitalObject.
• Primitive Disseminator: generic behaviors
• Extensible Type Disseminator: extended behaviors
FEDORA DigitalObject application/ MARC application/ postscript image/gif image/gif image/gif image/gif Primitive Disseminator
Client communicates with generic requests
• Requests: ListDisseminatorTypes; GetMethods(Book); GetDissemination(Book.GetPage(1))
• Responses: Book, DublinCore; GetChapter(n), GetPage(n), GetTOC()
• [Figure: a DigitalObject with a Book Disseminator (GetChapter, GetTOC, GetPage), a DublinCore Disseminator, and a Primitive Disseminator over DataStreams DS1 (application/MARC) and DS2 (application/postscript)]
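The generic requests named on this slide can be illustrated with a minimal Python sketch. It is not the RAP interface or the reference implementation; the `Disseminator` and `DigitalObject` classes, the snake_case method names, and the sample page data are all assumptions made for illustration.

```python
class Disseminator:
    """Hypothetical: associates a named set of behaviors with a digital object."""

    def __init__(self, type_name, methods):
        self.type_name = type_name
        self.methods = methods  # maps behavior name -> callable(object, *args)


class DigitalObject:
    """Hypothetical container of datastreams plus attached disseminators."""

    def __init__(self, datastreams):
        self.datastreams = datastreams
        self.disseminators = {}

    def attach(self, disseminator):
        self.disseminators[disseminator.type_name] = disseminator

    # The three generic requests a client can issue against any object.
    def list_disseminator_types(self):
        return sorted(self.disseminators)

    def get_methods(self, type_name):
        return sorted(self.disseminators[type_name].methods)

    def get_dissemination(self, type_name, method, *args):
        return self.disseminators[type_name].methods[method](self, *args)


# Illustrative data only: DS2 stands in for the book's page content.
obj = DigitalObject({"DS2": ["page one text", "page two text"]})
obj.attach(Disseminator("Book", {
    "GetPage": lambda o, n: o.datastreams["DS2"][n - 1],
    "GetTOC": lambda o: [f"Page {i + 1}" for i in range(len(o.datastreams["DS2"]))],
}))
obj.attach(Disseminator("DublinCore", {
    "GetDCRecord": lambda o: {"Title": "Example Book"},
}))

types = obj.list_disseminator_types()
book_methods = obj.get_methods("Book")
first_page = obj.get_dissemination("Book", "GetPage", 1)
```

The client never needs to know the Book type in advance: it discovers the types, then the methods, then requests a dissemination by name.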
A Disseminator...
• … references a Servlet: TYPE DESCRIPTION = DublinCore, SERVLET = cornell.dli2/DC-from-MARC
• … to produce non-generic behaviors for the DigitalObject: GetMethods(DC) returns GetDCField(Title), GetDCRecord
• [Figure: a DigitalObject with a DC Disseminator (GetDCField, GetDCRecord) over DataStreams DS1 (application/MARC) and DS2 (application/postscript)]
DigitalObject Interface Stability
• Mechanisms can be updated or replaced as technology changes ...
• … and the interface to the Digital Object remains stable
• [Figure: layers labeled Structure, Mechanism, Interface; a Disseminator Type backed in turn by Servlet-1, Servlet-2, Servlet-3]
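A small Python sketch of the interface-stability idea: the behaviors a client sees stay fixed while the mechanism behind them is replaced. The two mechanism classes, the `DC_SIGNATURE` tuple, and the sample records are hypothetical stand-ins for Servlet-1 and Servlet-2, not code from FEDORA.

```python
import json

# Behaviors promised by the (hypothetical) DublinCore disseminator type.
DC_SIGNATURE = ("GetDCField", "GetDCRecord")


class KeyValueMechanism:
    """Hypothetical older mechanism: parses 'name=value' lines."""

    def __init__(self, raw):
        self._record = dict(line.split("=", 1) for line in raw.splitlines())

    def GetDCRecord(self):
        return dict(self._record)

    def GetDCField(self, name):
        return self._record[name]


class JsonMechanism:
    """Hypothetical newer mechanism: parses a JSON document."""

    def __init__(self, raw):
        self._record = json.loads(raw)

    def GetDCRecord(self):
        return dict(self._record)

    def GetDCField(self, name):
        return self._record[name]


class DublinCoreDisseminator:
    """Stable interface; the mechanism behind it can be swapped."""

    def __init__(self, mechanism):
        self.replace_mechanism(mechanism)

    def replace_mechanism(self, mechanism):
        # Accept only mechanisms that provide every behavior in the signature.
        missing = [m for m in DC_SIGNATURE
                   if not callable(getattr(mechanism, m, None))]
        if missing:
            raise TypeError(f"mechanism lacks behaviors: {missing}")
        self._mechanism = mechanism

    def GetDCRecord(self):
        return self._mechanism.GetDCRecord()

    def GetDCField(self, name):
        return self._mechanism.GetDCField(name)


dc = DublinCoreDisseminator(KeyValueMechanism("Title=Example Book\nCreator=A. Author"))
before = dc.GetDCField("Title")
dc.replace_mechanism(JsonMechanism('{"Title": "Example Book", "Creator": "A. Author"}'))
after = dc.GetDCField("Title")
```

Client code that calls `GetDCField` is unaffected by the swap, which is the property the slide describes.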
DigitalObject Extensibility: Adding New Types
• The same underlying data...
• … can be operated on in novel ways…
• … to create new disseminations not originally conceived of for the particular digital object.
• [Figure: Book and Photo Collection disseminator types over the same DigitalObject; layers labeled Structure, Mechanism, Interface]
Extensibility: a look under the hood
• DublinCore Disseminator Type Signature (Interface Definition): a DigitalObject (URNDC) whose Signature Disseminator holds the DC MethodList: GetDCField, GetDCRecord
• DublinCore Mechanism (Servlet): a DigitalObject (URNDC1) whose Servlet Disseminator holds the DC servlet
• The DC Disseminator of a DigitalObject (DataStreams application/MARC, application/postscript) names Servlet = URNDC1; GetDissemination(GetDCRecord) returns a DublinCore Record
Proliferation of Disseminator Types
• We use FEDORA DigitalObjects to store Disseminator Signatures and Servlets.
• Type Registration (via name service)
  • a Disseminator Type’s global identifier is … the URN of a DigitalObject containing a Signature
  • a Servlet’s global identifier is … the URN of a DigitalObject containing a Servlet
• Types can be globally recognizable and mechanisms can be shared.
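Type registration can be sketched as follows, assuming the name service behaves like a simple map from URN to DigitalObject. The `urn:example:...` identifiers, the dictionary layout, and the `register` and `bind` helpers are hypothetical; MARC field 245 is used only as a sample title field.

```python
# Hypothetical in-memory name service: URN -> DigitalObject (here a plain dict).
name_service = {}


def register(urn, digital_object):
    """Register a DigitalObject under a globally unique URN."""
    if urn in name_service:
        raise ValueError(f"URN already registered: {urn}")
    name_service[urn] = digital_object


# A disseminator type is identified by the URN of the object holding its signature.
register("urn:example:dc-signature", {
    "kind": "signature",
    "methods": ["GetDCField", "GetDCRecord"],
})

# A servlet is identified by the URN of the object holding the mechanism.
register("urn:example:dc-from-marc", {
    "kind": "servlet",
    "implements": "urn:example:dc-signature",
    "code": lambda marc: {"Title": marc.get("245")},
})


def bind(type_urn, servlet_urn):
    """Look up both objects and check that the servlet implements the type."""
    signature = name_service[type_urn]
    servlet = name_service[servlet_urn]
    if servlet["implements"] != type_urn:
        raise ValueError("servlet does not implement the requested type")
    return signature["methods"], servlet["code"]


methods, code = bind("urn:example:dc-signature", "urn:example:dc-from-marc")
record = code({"245": "A Title"})
```

Because both the signature and the servlet are ordinary registered objects, any repository that can resolve the URNs can recognize the type and reuse the mechanism.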
Interoperable Digital Objects and Repositories Repository Repository RAP Client Name Service Repository Identifiers Audio/Visual Archive Cornell Library Collections Image Database System
Persistent Identifiers
• In FEDORA, use them for:
  • Repositories
  • DigitalObjects
  • Disseminator Types
  • Servlet Mechanisms
• Benefits:
  • Ensure uniqueness
  • Provide stability (location independence)
  • Promote global extensibility
  • Promote interoperability
• [Figure: Name Service, Identifiers]
Identifiers - A Brief Primer
IETF Uniform Resource Name (URN) Spec
• Naming Scheme
  • The policies and procedures for creating and assigning URNs within a particular domain.
• Resolution System
  • A system that translates URNs into their location-specific identifiers (e.g., URLs).
• Registries
  • A set of global directories that provide information on which resolution systems can translate any particular URN.
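The three roles in the primer can be sketched in Python as two lookups: a registry that names the resolution system for a namespace, and a resolution system that maps a URN to locations. The namespace identifiers, resolver names, and URLs below are invented for illustration and do not describe any real resolution service.

```python
# Hypothetical registry: namespace identifier -> name of a resolution system.
REGISTRY = {
    "example-a": "resolver-a",
    "example-b": "resolver-b",
}

# Hypothetical resolution systems: URN -> location-specific identifiers (URLs).
RESOLVERS = {
    "resolver-a": {
        "urn:example-a:obj1": [
            "http://repo-one.example/obj1",
            "http://repo-two.example/obj1",
        ],
    },
    "resolver-b": {
        "urn:example-b:report7": ["http://archive.example/report7"],
    },
}


def resolve(urn):
    """Translate a URN into its current locations via registry, then resolver."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0] != "urn":
        raise ValueError(f"not a URN: {urn}")
    namespace = parts[1]
    resolver_name = REGISTRY[namespace]   # which system can resolve this URN?
    return RESOLVERS[resolver_name][urn]  # ask that system for the locations


locations = resolve("urn:example-a:obj1")
```

If an object moves, only the resolver's table changes; the URN that clients hold stays the same.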
Identifiers - Existing Solutions
• CNRI’s Handle System
  • good implementation of URN specification
  • 1 Handle >> one or more locations
  • resolve to different data types (URL, IOR, …)
• OCLC’s PURL
  • persistent URLs, not really URNs
  • 1 PURL >> only one location (an HTTP redirect)
• Community-specific Initiatives
  • Digital Object Identifier (DOI) - publishers
    • Handle System + Rights Metadata
  • PubMedID - Medline
  • BibCode - astrophysics journals
FEDORA Status
• Reference Implementation
  • CORBA IDL defines open interfaces for Repository Access Protocol (RAP)
  • Java/CORBA repository and clients
• Collaborations
  • CNRI
    • core design and interoperability
    • complex disseminations (dynamic)
  • U of Virginia
    • web integration
    • complex disseminations (e.g., e-texts)
New Research
• DLI2 - Project Prism
  • security (associating enforceable policies and mechanisms with DigitalObjects)
  • preservation (enable long-term survival of DigitalObjects in distributed environment)
• IDL - Harmony
  • aggregation and interaction of multiple, complex metadata sets in DigitalObjects
  • RDF and XML
PRISM Security Policy Enforcement
• Challenges
  • what is enforceable?
  • distributed object environment
  • interoperability and extensibility
• Monitor all operations, generic and extended
• Enforce a wide array of policies
  • basic security violations
  • rights management
  • access control
• [Figure: a DigitalObject with a DC Disseminator (GetDCField, GetDCRecord) over DataStreams application/MARC and text/x-acl]
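One way to picture "monitor all operations" is a wrapper that checks every call against a policy before forwarding it. The Python sketch below is illustrative only and does not describe the Prism design; the `PolicyMonitor` class, the ACL format, and the principals are assumptions.

```python
class PolicyViolation(Exception):
    """Raised when a principal requests an operation the policy does not allow."""


class DublinCoreBehaviors:
    """Hypothetical extended behaviors attached to a digital object."""

    def GetDCRecord(self):
        return {"Title": "Example Book"}

    def GetDCField(self, name):
        return self.GetDCRecord()[name]


class PolicyMonitor:
    """Intercepts every operation and enforces a simple access-control list."""

    def __init__(self, target, acl):
        self._target = target
        self._acl = acl  # operation name -> set of permitted principals
        self.log = []    # audit trail of (principal, operation, permitted)

    def invoke(self, principal, operation, *args):
        permitted = principal in self._acl.get(operation, set())
        self.log.append((principal, operation, permitted))
        if not permitted:
            raise PolicyViolation(f"{principal} may not call {operation}")
        return getattr(self._target, operation)(*args)


# Hypothetical policy: alice may read fields and records, bob only fields.
monitor = PolicyMonitor(DublinCoreBehaviors(), {
    "GetDCField": {"alice", "bob"},
    "GetDCRecord": {"alice"},
})

title = monitor.invoke("bob", "GetDCField", "Title")
try:
    monitor.invoke("bob", "GetDCRecord")
    denied = False
except PolicyViolation:
    denied = True
```

Operations missing from the ACL are denied by default, so newly added extended behaviors are not exposed until a policy names them.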
PRISM: Preservation Handles Fedora Repositories Preservation Service
PRISM: Preservation Policy Enforcement
• Preservation Surrogate Object: monitors DigitalObject state and catches unacceptable or risky transitions
• [Figure: a DigitalObject with Preserve and Book Disseminators over DataStreams P (preservation metadata), DS1, and DS2 (application/postscript), linked to a Preservation Service]
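The surrogate's role of catching risky transitions can be sketched as a small state monitor. The state names and the set of risky transitions below are invented for illustration; the slide does not say which states or transitions Prism tracks.

```python
# Hypothetical transitions that a preservation policy treats as risky.
RISKY_TRANSITIONS = {
    ("stored", "deleted"),
    ("stored", "format-unsupported"),
}


class PreservationSurrogate:
    """Tracks an object's state and refuses transitions the policy flags."""

    def __init__(self, state="stored"):
        self.state = state
        self.alerts = []  # refused transitions, kept for the preservation service

    def request_transition(self, new_state):
        transition = (self.state, new_state)
        if transition in RISKY_TRANSITIONS:
            self.alerts.append(transition)  # catch it; leave the state unchanged
            return False
        self.state = new_state
        return True


surrogate = PreservationSurrogate()
migrated = surrogate.request_transition("migrated")  # acceptable transition
surrogate.state = "stored"                           # reset for the next check
deleted = surrogate.request_transition("deleted")    # risky transition
```

A fuller design would presumably notify the Preservation Service when an alert is recorded; here the alerts list stands in for that step.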
References
• Payette, Blanchi, Lagoze, and Overly: Interoperability for Digital Objects and Repositories: The Cornell/CNRI Experiments, D-Lib Magazine, May 1999. http://www.dlib.org/dlib/may99/payette/05payette.html
• Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA), ECDL 1998. http://www.cs.cornell.edu/payette/papers/ECDL98/FEDORA.html
• Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries. http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690
• Daniel, Lagoze, and Payette: A Metadata Architecture for Digital Libraries, IEEE ADL 1998. http://www.cs.cornell.edu/lagoze/papers/ADL98/dar-adl.html
• FEDORA Home Page. http://www.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html
• Payette: Persistent Identifiers on the Digital Terrain, RLG DigiNews, April 1998, Volume 2, Number 2. http://www.rlg.org/preserv/diginews/diginews22.html