Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
MOA2/Cornell Architecture Meeting, December 10, 1998
Sandra Payette, Cornell University
payette@cs.cornell.edu
http://www2.cs.cornell.edu/payette/presentations/fedora-moa2.ppt
Background - CDLRG
• NCSTRL
• Dienst architecture
• Interoperability strawman proposal (see Leiner paper in next D-Lib)
• Open Architecture Research Program
• Cornell Reference Architecture for Distributed Digital Libraries (CRADDL)
• Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
• Distributed Searching
• Resource Discovery and Metadata (Dublin Core effort, STARTS)
Digital Library Interoperability
[Diagram labels: Cornell Digital Library, Library of Congress]
Cornell Reference Architecture for Distributed Digital Libraries (CRADDL)
• Open Architecture
  • functionality partitioned into a set of well-defined services
  • services accessible via a well-defined protocol
• Modularization
  • promotes interoperability
  • scalable to different clientele (research library, informal web)
• Federation
  • enables aggregations into logical collections
• Distribution
  • of content (collections) and services
  • of administration and management of the DL
CRADDL: Component-Ware Digital Libraries
[Diagram labels: Handles, UI Gateway Service, Name Service, Collection Services, Index Services, Digital Objects, Repository Services]
FEDORA
Part of our broader effort to develop a component-ware digital library architecture
• Repository Service
  • core service to provide a reliable and secure means to store and disseminate digital content
  • interoperability with other CRADDL services
• Digital Object Model
  • container for aggregating any digital material
  • disseminations of complex content types with rights management
  • global extensibility mechanisms
FEDORA: Conceptual Backdrop
• CNRI Digital Object Architecture (Kahn/Wilensky, Arms/Blanchi/Overly)
• Warwick Framework
• Distributed Active Relationships
FEDORA
• DigitalObject: container for content
  • Structure (raw data structure)
  • Interface (content views)
  • Mechanisms (executables)
• Repository: logical service
  • Service layer for “contained” DigitalObjects
  • Object lifecycle management
  • Secure environment for running mobile code
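The two abstractions on this slide can be pictured with a minimal Python sketch. It assumes a DataStream is a MIME-typed byte stream (as a later slide labels it) and that a Repository simply maps identifiers to objects. All class names, method names, and the URN are illustrative assumptions and are not taken from the FEDORA IDL.

```python
from dataclasses import dataclass, field


@dataclass
class DataStream:
    """A MIME-typed byte stream held inside a DigitalObject."""
    mime_type: str
    data: bytes


@dataclass
class DigitalObject:
    """Container for content: raw structure only, no behaviors attached."""
    urn: str
    datastreams: dict = field(default_factory=dict)

    def add_datastream(self, ds_id, stream):
        self.datastreams[ds_id] = stream


class Repository:
    """Logical service managing the lifecycle of contained objects."""

    def __init__(self):
        self._objects = {}

    def create(self, urn):
        if urn in self._objects:
            raise KeyError(f"object already exists: {urn}")
        obj = DigitalObject(urn)
        self._objects[urn] = obj
        return obj

    def get(self, urn):
        return self._objects[urn]

    def delete(self, urn):
        del self._objects[urn]


repo = Repository()
book = repo.create("urn:example:book1")  # hypothetical URN
book.add_datastream("DS1", DataStream("application/postscript", b"%!PS"))
```

The sketch covers only structure and lifecycle; interfaces and mechanisms are attached separately in the architecture, which is the point of the slides that follow.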
Digital Library Content
• Simple, familiar content types
• Complex, compound, dynamic content types
FEDORA: Goals
• Normalization of digital library content - order the chaos
• Flexible notions of content while ensuring interoperability
• Stable interfaces as underlying mechanisms change
• Naturally evolving content type system - extensibility
• Community-driven content type development
• Complex aggregations of distributed content
• Rights management - leverage existing/future schemes
Multiple “views” of a DigitalObject
[Diagram labels: Dublin Core, Diary-MOA, Book, Future; DataStream (MIME-typed byte stream)]
Digital Object is... recognizable by what it can do
[Diagram labels: getSection, getArticle, getTrack, getLabel, getChapter, getPage, getFrame, getLength]
What the client sees vs. what the object is
[Diagram labels: Book, Dublin Core, Content-Type Interfaces, Mechanism, Structure]
Content Type
A set of behaviors that formally describes the functionality of any global or domain-specific notion of content.
Disseminator
A generic component for associating a set of behaviors with a DigitalObject.
• Primitive Disseminator
• Content Type Disseminator
FEDORA DigitalObject
[Diagram labels: Content-Type Wrapper, Primitive Disseminator, Structural Kernel; MIME types application/postscript, application/MARC]
DigitalObject: Client communicates with PrimitiveDisseminator
[Diagram labels: Primitive Disseminator; Book Disseminator (GetChapter, GetTOC, GetPage); DublinCore Disseminator; DS1 application/MARC; DS2 application/postscript]
[Requests shown: ListContentTypes (Book, DublinCore); GetMethods(Book) (GetChapter(n), GetTOC(), etc.); GetDissemination(Book.GetPage(1))]
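The three requests named on this slide can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the FEDORA interface: the `attach` method, the argument layout of `GetDissemination`, and the page data are hypothetical, and method discovery here relies on Python introspection rather than a published signature.

```python
class BookDisseminator:
    """Hypothetical Book content type backed by a list of pages."""

    def __init__(self, pages):
        self._pages = pages

    def GetPage(self, n):
        return self._pages[n - 1]  # pages are numbered from 1

    def GetTOC(self):
        return [f"Page {i}" for i in range(1, len(self._pages) + 1)]


class PrimitiveDisseminator:
    """Entry point a client talks to; routes requests to content types."""

    def __init__(self):
        self._types = {}

    def attach(self, name, disseminator):
        # Assumed helper for associating a content type with the object.
        self._types[name] = disseminator

    def ListContentTypes(self):
        return sorted(self._types)

    def GetMethods(self, content_type):
        target = self._types[content_type]
        return sorted(
            m for m in dir(target)
            if not m.startswith("_") and callable(getattr(target, m))
        )

    def GetDissemination(self, content_type, method, *args):
        return getattr(self._types[content_type], method)(*args)


pd = PrimitiveDisseminator()
pd.attach("Book", BookDisseminator(["first page", "second page"]))

content_types = pd.ListContentTypes()
methods = pd.GetMethods("Book")
page_one = pd.GetDissemination("Book", "GetPage", 1)
```

The client never touches the datastreams directly; it discovers what the object can do and then asks for a dissemination by content type and method name.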
Content Type Principles
• Stability
• Orthogonality to Structure
• Extensibility
These are achieved in FEDORA through the architectural segregation of DigitalObject structure, mechanisms, and content-type interfaces.
FEDORA: Interface Stability
Mechanisms can be updated or replaced as technology changes, and the content interface to the Digital Object remains stable.
[Diagram labels: Content Type, Interface, Mechanism, Structure]
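A minimal sketch of this separation, assuming a Book interface with a single `GetPage` method and two interchangeable storage mechanisms. The mechanism classes and the `replace_mechanism` method are hypothetical names introduced only for illustration.

```python
class ListPageMechanism:
    """Original mechanism: pages stored as a list of strings."""

    def __init__(self, pages):
        self._pages = pages

    def page(self, n):
        return self._pages[n - 1]


class FormFeedPageMechanism:
    """Replacement mechanism: pages stored in one string, split on form feeds."""

    def __init__(self, blob):
        self._blob = blob

    def page(self, n):
        return self._blob.split("\f")[n - 1]


class BookInterface:
    """Stable content interface; clients only ever call GetPage."""

    def __init__(self, mechanism):
        self._mechanism = mechanism

    def replace_mechanism(self, mechanism):
        self._mechanism = mechanism

    def GetPage(self, n):
        return self._mechanism.page(n)


book = BookInterface(ListPageMechanism(["alpha", "beta"]))
before = book.GetPage(2)

# Swap the underlying mechanism; the client-facing call is unchanged.
book.replace_mechanism(FormFeedPageMechanism("alpha\fbeta"))
after = book.GetPage(2)
```

Client code written against `GetPage` keeps working across the swap, which is the stability property the slide describes.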
Digital Object Extensibility: Adding New Content Types
The same underlying data can be operated on in novel ways to create new disseminations not originally conceived of for the particular digital object.
[Diagram labels: Book, Photo Collection; Structure, Mechanism, Interface]
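As a rough illustration, the sketch below attaches a second content type to an object after the fact without touching its stored data. The dictionary layout, the image names, and the Photo Collection method names are assumptions made for the example.

```python
class BookDisseminator:
    """Presents the stored page images as a book."""

    def __init__(self, datastreams):
        self._images = datastreams["DS1"]

    def GetPage(self, n):
        return self._images[n - 1]


class PhotoCollectionDisseminator:
    """Presents the same stored images as a photo collection (hypothetical methods)."""

    def __init__(self, datastreams):
        self._images = datastreams["DS1"]

    def GetPhotoCount(self):
        return len(self._images)

    def GetThumbnailList(self):
        return [f"thumb:{name}" for name in self._images]


# Stand-in for a digital object: structure plus attached content types.
digital_object = {
    "datastreams": {"DS1": ["img-001", "img-002", "img-003"]},
    "disseminators": {},
}
ds = digital_object["datastreams"]
digital_object["disseminators"]["Book"] = BookDisseminator(ds)

# Later, a new content type is added over the unchanged datastreams.
digital_object["disseminators"]["PhotoCollection"] = PhotoCollectionDisseminator(ds)

page = digital_object["disseminators"]["Book"].GetPage(2)
thumbs = digital_object["disseminators"]["PhotoCollection"].GetThumbnailList()
```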
Content Type Extensibility
• There must be a way to identify, register and proliferate content types in the global digital library infrastructure.
• Content types must become persistent, named entities in the digital library infrastructure.
• How? Content-type definitions and mechanisms are disseminated from named DigitalObjects (using FEDORA’s own architectural abstractions).
Content Disseminator is a Generic Component that references another FEDORA DigitalObject that disseminates a content-type servlet
[Diagram labels: DC disseminator with DataStreams = DS1, ContentTypeID = URNDC1, methods GetDCField, GetDCRecord; DS1 application/MARC; DS2 application/postscript]
[Request shown: GetMethods(DC) (GetDCField(e), GetDCRecord)]
How to Achieve Content-Type Extensibility?
A Digital Object attains its extended content-type behaviors through association and delegation.
[Diagram labels: DC servlet object (URNDC1) with Servlet Disseminator, DC Mechanism, DublinCore Record; DC signature object (URNDC) with Signature Disseminator, DC MethodList (GetDCField, GetDCRecord); referencing object with DC disseminator, CTID = URNDC1, application/MARC, application/postscript]
[Request shown: GetDissemination(GetDCRecord)]
Registration and Proliferation of Content Types
• A content type becomes registered when the URN of the DigitalObject that disseminates its signature is registered (in a DL name service).
• A content type becomes usable when the URN of the DigitalObject that disseminates its servlet is registered.
• Other DigitalObjects can utilize content types by referencing them by these URNs.
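The registered/usable distinction and the delegation by URN can be sketched as follows. This is a toy model under stated assumptions: the name service is an in-memory dictionary, the signature is a plain list of method names, the servlet is a dictionary of callables, and every URN and class name is hypothetical. In the architecture itself, both the signature and the servlet are disseminated by DigitalObjects.

```python
class NameService:
    """Toy stand-in for a DL name service: maps URNs to registered objects."""

    def __init__(self):
        self._entries = {}

    def register(self, urn, obj):
        self._entries[urn] = obj

    def resolve(self, urn):
        return self._entries.get(urn)


class ContentTypeRef:
    """Reference a DigitalObject holds to a content type, by URN only."""

    def __init__(self, signature_urn, servlet_urn):
        self.signature_urn = signature_urn
        self.servlet_urn = servlet_urn

    def is_registered(self, names):
        return names.resolve(self.signature_urn) is not None

    def is_usable(self, names):
        return names.resolve(self.servlet_urn) is not None

    def invoke(self, names, method, datastreams):
        # Delegate the call to whatever servlet the URN currently resolves to.
        signature = names.resolve(self.signature_urn)
        servlet = names.resolve(self.servlet_urn)
        if signature is None or method not in signature:
            raise LookupError(f"{method} is not in a registered signature")
        if servlet is None:
            raise LookupError("no servlet registered for this content type")
        return servlet[method](datastreams)


names = NameService()
dc = ContentTypeRef("urn:example:dc-signature", "urn:example:dc-servlet")

# Step 1: registering the signature makes the content type registered.
names.register(dc.signature_urn, ["GetDCField", "GetDCRecord"])
registered_only = (dc.is_registered(names), dc.is_usable(names))

# Step 2: registering the servlet makes the content type usable.
names.register(dc.servlet_urn, {"GetDCRecord": lambda ds: {"title": ds["DS1"]}})
record = dc.invoke(names, "GetDCRecord", {"DS1": "A Sample Title"})
```

Because the referencing object stores only URNs, many objects can share one content type, and the servlet behind a URN can be replaced without editing any of them.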
Access Management
• Must have facilities to protect content
• No single solution
• Association of existing, external rights management schemes
• Accommodate new schemes
FEDORA applies the same extensibility model to rights management.
AccessManager
[Diagram labels: Disseminator protected by AccessManager; DC disseminator (URN1; GetDCField, GetDCRecord) over application/MARC; ACL Mechanism (URNACL1) over text/x-acl; Servlet Disseminator; Mechanisms; External Servlet Utilized]
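One way to picture a disseminator protected by an access manager is a wrapper that consults an access control list before delegating. The sketch assumes the ACL has already been read into a dictionary of principals and permitted method names; the `invoke` method, the `AccessDenied` exception, and the principals are hypothetical. In the slide's diagram the ACL lives in a text/x-acl datastream and the ACL mechanism is an external servlet.

```python
class AccessDenied(Exception):
    """Raised when a principal may not call the requested method."""


class DublinCoreDisseminator:
    """Hypothetical Dublin Core content type over a simple record."""

    def __init__(self, record):
        self._record = record

    def GetDCField(self, name):
        return self._record[name]

    def GetDCRecord(self):
        return dict(self._record)


class AccessManager:
    """Wraps a disseminator and checks an ACL before each call."""

    def __init__(self, disseminator, acl):
        self._disseminator = disseminator
        self._acl = acl  # assumed shape: {principal: set of method names}

    def invoke(self, principal, method, *args):
        if method not in self._acl.get(principal, set()):
            raise AccessDenied(f"{principal} may not call {method}")
        return getattr(self._disseminator, method)(*args)


dc = DublinCoreDisseminator({"title": "Sample", "creator": "Anonymous"})
manager = AccessManager(dc, {"reader": {"GetDCField"}})

title = manager.invoke("reader", "GetDCField", "title")
```

Swapping in a different rights scheme means replacing the wrapper's check, not the protected disseminator, which mirrors how content types themselves are extended.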
Current Status
• Full reference implementation
• CORBA IDL defines all component interfaces
• Java/CORBA prototype system complete
• Java client application for building and accessing digital objects
• Initial demonstration content types
  • Dublin Core
  • Article/Technical Report
  • Book (with CNRI / Library of Congress)
  • Photo
CNRI/Cornell Interoperability Project
• CNRI and Library of Congress partners
• Developed Joint Interface Definition
  • agreement on all conceptual abstractions
  • merger of RAP and FEDORA IDL
• Separate repository implementations
  • CNRI using Visigenics ORB
  • Cornell using Iona’s OrbixWeb ORB
• Test collections of Digital Objects
  • CNRI - Library of Congress materials (books, journals, photographs, speeches)
  • Cornell - NCSTRL research collections
CNRI/Cornell Interoperability Experiments
• IT0: Fundamental Communication
  • Inter-ORB communication
  • IDL recognition: request invocation; proper return types
  • Status: Success (October 1998)
• IT1: Functional Interoperability
  • create/access DigitalObjects in each repository
  • exercise all operations on each other’s repositories
  • Status: In Progress (completion 12/18)
• IT2: Content-Type Servlet Interoperability
  • dynamic loading and running of remote servlets
FEDORA: Planned Research
• Scale up: demonstrate complex content types and servlets with CNRI and LC
• Integration of new community-developed content types (e.g., MOA2)
• Access Management
• Reliability, security, integrity (DLI2 - CS/Cornell University Library)
For more information: http://www2.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html
CDLRG References
• Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries
  http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690
• Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
  http://www2.cs.cornell.edu/payette/papers/ECDL98/FEDORA.html
  http://www2.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html
• Lagoze and Fielding: Defining Collections in Distributed Digital Libraries
  http://www.dlib.org/dlib/november98/lagoze/11lagoze.html
• Distributed Search and Resource Discovery
  http://www2.cs.cornell.edu/NCSTRL/CDLRG/distsrch.htm