
The Fedora Project March 10, 2003

FEDORA is an open-source digital object repository architecture that allows clients to interact with complex, compound, and dynamic objects in a simple and interoperable manner. It addresses the shortcomings of commercial digital library products by providing facilities for managing programs and tools, enabling easy integration of new tools and services, and addressing fine-grained access control and preservation issues.


Presentation Transcript


  1. The Fedora Project • March 10, 2003 • Sandy Payette, Cornell Information Science

  2. Motivation: The Problem of Complex Content

  3. Some familiar objects • Digital Library Content: not just documents ... • Complex, compound, dynamic objects

  4. Key Research Questions • How can clients interact with heterogeneous collections of complex objects in a simple and interoperable manner? • How can complex objects be designed to be both generic and genre-specific at the same time? • How can we associate services and tools with objects to provide different presentations or transformations of the object content? • How can we associate specialized, fine-grained access control policies with specific objects, or with groups of objects? • How can we facilitate the long-term management and preservation of complex objects with dependencies on distributed content and services?

  5. Shortcomings of commercial digital library products • Narrow focus on specific media formats (e.g. image databases, document management) • Fail to effectively address interrelationships among digital entities • Fail to address interoperability; no open interfaces to facilitate sharing of services; no standard protocols for cross-system interoperability • Fail to provide facilities for managing programs and tools that are integral to delivering digital content • Not extensible; do not enable easy integration of new tools and services • Do not address fine-grained access control and preservation issues

  6. The Flexible Extensible Digital Object Repository Architecture (FEDORA) • DARPA and NSF-funded research at Cornell (1997-present) • CORBA-based reference implementation (Payette/Lagoze) • Extensive interoperability testing (with Arms/Blanchi/Overly) • Policy Enforcement (Payette/Schneider) • Interpreted and re-implemented at U of Virginia (1999-) • Simple web-oriented implementation, focused on access to collections • Java servlet and relational db • Testbed of 10,000,000 objects with performance metrics (1999-2001) • Mellon-funded FEDORA Software (2002-) • University of Virginia and Cornell: joint development • Open source • Web services and XML • Mediation of distributed services • Preservation focus

  7. The Fedora Architecture • Digital Object Model • The Repository • Web Services

  8. FEDORA Basic Object Architecture: Digital Object Model • Container to aggregate digital content of any type • Data or metadata • Local or distributed • “Behavior” definitions (like abstract interfaces) • Hooks to external services • Enables multiple “disseminations” of content

  9. Digital Object Model Functional View dynamic Application services

  10. Digital Object Model: Architectural View • Persistent ID (PID): globally unique persistent id • Disseminators: public view; access methods for obtaining “disseminations” of digital object content • System Metadata: internal view; metadata necessary to manage the object • Datastreams: protected view; content that makes up the “basis” of the object
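The four parts of the object named on this slide can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, and sample values are assumptions made for this example and are not taken from the Fedora code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Datastream:
    """Protected view: one piece of content that forms the basis of the object."""
    ds_id: str
    mime_type: str
    location: str  # content may be local or distributed, so this is a path or URL


@dataclass
class Disseminator:
    """Public view: access methods that yield disseminations of the content."""
    behavior_definition_pid: str  # hypothetical reference to a behavior definition
    methods: List[str] = field(default_factory=list)


@dataclass
class DigitalObject:
    pid: str  # globally unique persistent id
    system_metadata: Dict[str, str] = field(default_factory=dict)  # internal view
    datastreams: List[Datastream] = field(default_factory=list)
    disseminators: List[Disseminator] = field(default_factory=list)

    def list_methods(self) -> List[str]:
        """Return every access method exposed by the object's disseminators."""
        return [m for d in self.disseminators for m in d.methods]


# Hypothetical sample object; all identifiers are made up for illustration.
obj = DigitalObject(
    pid="demo:1",
    system_metadata={"state": "active"},
    datastreams=[Datastream("DS1", "image/jpeg", "http://example.org/img.jpg")],
    disseminators=[Disseminator("demo:bdef-image", ["getThumbnail", "getFullSize"])],
)
print(obj.list_methods())
```

Keeping the disseminators separate from the datastreams mirrors the slide's split between the public view (what clients may call) and the protected view (the underlying content).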

  11. Digital Object Model: Service Relationships (diagram) • Data Object: Persistent ID (PID), Disseminators, System Metadata, Datastreams • Behavior Definition Object: Persistent ID (PID), System Metadata, Datastreams, Service Definition Metadata (WSDL) • Behavior Mechanism Object: Persistent ID (PID), System Metadata, Datastreams, Service Binding Metadata (WSDL) • External Service

  12. FEDORA Basic Repository Architecture: Repository System • Object Management • Lifecycle (Ingest/create → Store → Delete → Approve → Purge) • Validation • PID Generation • Version management • Access Control • Preservation support • Object Access • Object Dissemination • Object Reflection • Service Mediation

  13. Fedora: A Programmer’s View • Understanding the system implementation • Web Services • Server Design

  14. What is a Web Service? • A distributed application that runs over the internet. • An addressable network endpoint which receives structured messages and returns structured responses. • A web application that publishes an open interface through which clients can send requests and receive responses.

  15. How is this different from plain old web applications? • Formally defined API (application programming interface) defines a set of abstract operations for a web service • Published bindings for client to run operations • Standard protocol for invoking operations on the service. • XML as standard means of encoding service requests and responses.

  16. Why are Web Services important? • Interoperability • Web applications can interact and build upon each other • Data is transferred in an interoperable manner (HTTP) • Data is encoded in an interoperable format (XML) • Works in decentralized, distributed, operating-system independent environment. • Standards-oriented • Means to expose complex operations with rich data typing (via XML Schema language typing) • Ease of integrating distributed systems via the Web • W3C effort to develop this service architecture

  17. How are Web Services Implemented? • Simple Object Access Protocol (SOAP) • SOAP is a messaging protocol that can run over different transport protocols (e.g., HTTP, SMTP) • Operation oriented (send a request to an endpoint) • Like CORBA, RMI, DCOM… but for the Web and simpler • Application APIs can be defined and published using the Web Services Description Language (WSDL) • Requests and responses sent as XML messages • Supports simple and complex data typing in requests and responses • Supports transmission of binary data within request or response packages

  18. How are Web Services Implemented? • REST (Representational State Transfer) • URI + HTTP + XML • URI/resource driven; message built into a URL • HTTP GET or POST • Response is XML data • Issues: • Not a standard, but a style of building web applications; arguably it just gives a name to how many people already build applications on the web by default, made a little more standard by using XML. • Fragile service definition: URLs change • No data typing on requests • Limited ability to transmit complex requests on a URL • W3C behind SOAP; one strong voice out there for REST (Prescod).
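To make "message built into a URL" concrete, here is a minimal sketch of a REST-style request in which the operation and its arguments are encoded as query parameters. The endpoint `http://repository.example.org/get` and the parameter names `pid`, `method`, and `size` are hypothetical, chosen only for illustration; no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; not a real Fedora URL.
BASE_URL = "http://repository.example.org/get"


def build_rest_url(pid: str, method: str, **params: str) -> str:
    """Encode an operation and its arguments into a single GET URL."""
    query = urlencode({"pid": pid, "method": method, **params})
    return f"{BASE_URL}?{query}"


url = build_rest_url("demo:1", "getThumbnail", size="small")
print(url)

# On the receiving side every argument arrives as an untyped string,
# which is the "no data typing on requests" issue listed on the slide.
received = parse_qs(urlsplit(url).query)
print(received)
```

The same sketch also hints at the other listed limitation: a nested or structured argument has no natural place in a flat query string.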

  19. Example of Web Service using SOAP (diagram) • My Application sends a SOAP Request (XML) over SOAP/HTTP to the Google Web Service: doSpellingSuggestion(payet) • The Google Web Service returns a SOAP Response (XML) over SOAP/HTTP: payette

  20. XML SOAP Request <?xml version="1.0" encoding="UTF-8"?> <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema"> <SOAP-ENV:Body> <m:doSpellingSuggestion xmlns:m="urn:GoogleSearch"> <key>/e325JlNPASJu</key> <phrase>payet</phrase> </m:doSpellingSuggestion> </SOAP-ENV:Body> </SOAP-ENV:Envelope>
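A request envelope like the one on this slide can be assembled programmatically rather than written by hand. The sketch below uses only Python's standard `xml.etree.ElementTree` module; the function name `build_spelling_request` and the placeholder key are assumptions for illustration, and the message is only built, not sent.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
GOOGLE_NS = "urn:GoogleSearch"


def build_spelling_request(key: str, phrase: str) -> bytes:
    """Build a SOAP envelope wrapping a doSpellingSuggestion call."""
    # Ask ElementTree to serialize the envelope namespace with the usual prefix.
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{GOOGLE_NS}}}doSpellingSuggestion")
    # The two arguments are unqualified child elements, as on the slide.
    ET.SubElement(call, "key").text = key
    ET.SubElement(call, "phrase").text = phrase
    return ET.tostring(envelope, encoding="UTF-8", xml_declaration=True)


# "YOUR-LICENSE-KEY" is a placeholder, not a working credential.
request = build_spelling_request("YOUR-LICENSE-KEY", "payet")
print(request.decode("utf-8"))
```

In a real client the resulting bytes would be sent in the body of an HTTP POST to the service endpoint, which is the "SOAP over HTTP" arrangement described on the earlier slides.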

  21. XML SOAP Response <?xml version="1.0" encoding="UTF-8"?> <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema"> <SOAP-ENV:Body> <ns1:doSpellingSuggestionResponse xmlns:ns1="urn:GoogleSearch" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"> <return xsi:type="xsd:string">payette</return> </ns1:doSpellingSuggestionResponse> </SOAP-ENV:Body> </SOAP-ENV:Envelope>
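On the client side, the response shown on this slide has to be unwrapped to reach the returned value. A minimal sketch, again using only the standard library, could look like this; the function name `extract_suggestion` is an assumption for illustration, and the response text is the one from the slide.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
GOOGLE_NS = "urn:GoogleSearch"

# The response from the slide, as it would arrive over HTTP.
RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:doSpellingSuggestionResponse xmlns:ns1="urn:GoogleSearch"
        SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <return xsi:type="xsd:string">payette</return>
    </ns1:doSpellingSuggestionResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def extract_suggestion(soap_response: bytes) -> str:
    """Pull the text of the <return> element out of the SOAP body."""
    root = ET.fromstring(soap_response)
    path = (
        f"{{{SOAP_ENV}}}Body/"
        f"{{{GOOGLE_NS}}}doSpellingSuggestionResponse/return"
    )
    result = root.find(path)
    if result is None or result.text is None:
        raise ValueError("no <return> element in SOAP response")
    return result.text


print(extract_suggestion(RESPONSE))  # prints "payette"
```

The `xsi:type="xsd:string"` attribute on `<return>` is where the data typing mentioned on the earlier slides shows up; this sketch ignores it and treats the value as text.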

  22. Fedora and Web Services • Fedora Repository system exposed as two related Web services • Access (API-A) and Management (API-M) • Both described using WSDL • Both have SOAP and HTTP bindings • Back-end services • Digital object behaviors implemented as linkages to other distributed web services • Service binding metadata (WSDL) stored in special Fedora objects. • Fedora Repository system acts as a mediator to these services.
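The mediator role described on this slide can be sketched as a lookup followed by a delegated call: the repository receives a dissemination request, finds the back-end service bound to that object and method, and forwards the datastream to it. Everything in the sketch below (the `Repository` class, `bind`, `disseminate`, and the stand-in thumbnail service) is hypothetical and greatly simplified; in Fedora the bindings are described with WSDL and the back ends are remote web services, not local functions.

```python
from typing import Callable, Dict, Tuple

# A back-end service is modeled as a function from a datastream location
# to some result; in practice this would be a remote web service call.
Backend = Callable[[str], str]


class Repository:
    """Toy mediator: maps (object PID, method name) to a back-end service."""

    def __init__(self) -> None:
        self._bindings: Dict[Tuple[str, str], Backend] = {}

    def bind(self, pid: str, method: str, backend: Backend) -> None:
        """Record which back-end service implements a method for an object."""
        self._bindings[(pid, method)] = backend

    def disseminate(self, pid: str, method: str, datastream_location: str) -> str:
        """Look up the bound service and forward the request to it."""
        backend = self._bindings.get((pid, method))
        if backend is None:
            raise KeyError(f"no service bound for {pid}/{method}")
        return backend(datastream_location)


def fake_thumbnail_service(location: str) -> str:
    """Stand-in for an external image service."""
    return f"thumbnail-of:{location}"


repo = Repository()
repo.bind("demo:1", "getThumbnail", fake_thumbnail_service)
result = repo.disseminate("demo:1", "getThumbnail", "http://example.org/img.jpg")
print(result)
```

The client only ever talks to the repository, which is what lets back-end services be swapped or relocated without changing how objects are accessed.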

  23. Fedora: Web Services View

  24. Fedora Server Design • 3-Tiered Architecture • Modular & Extensible • System Diagram

  25. Server Design: 3 Layers

  26. System Diagram

  27. Fedora: Implementation Technologies • Fedora Web Services Layer • Apache Axis for SOAP over HTTP • Apache Tomcat 4.1 • Core Repository System • Sun Java J2SDK1.4 • Xerces 2-2.0.2 for XML parsing and validation • Saxon 6.5 for XSLT transformation • Schematron 1.5 for validation • MySQL-3.23.52 and Mckoi relational database • Deployment Platforms • Windows 2000, NT, XP • Solaris • Linux

  28. DEMO • Local Repository • www.fedora.info

  29. Deployment Partners • Los Alamos National Laboratory: Research Library • Library of Congress: Motion Picture and Recorded Sound Division • Indiana University: Digital Library group • King's College London: Humanities Computing • NYU: Humanities Computing • Northwestern University: Academic Computing • Oxford: Oxford Digital Library and The Refugee Studies Center • Tufts: Digital Collections and Archives Department
