The Mellon-Funded Fedora Project: A Briefing for the Cornell University Library, January 24, 2002 • Sandy Payette • Thorny Staples • Ross Wayland
The Mellon Fedora Project History and Motivation
The FEDORA Open-source Development Project January 24, 2002
Digital Library Projects • Web sites with links to on-line resources • Specific, “boutique” collections • Large collections in one or two areas • A broad research collection in all media types and content areas • Ideally, the digital library includes all information
Other Library Services • Electronic Cataloger in the Cataloging Department • Digital Library Research and Development Department • Digital Services Integration (DSI) Coordinator • Digital Library Production Services
Other Services Housed in the Library • The Institute for Advanced Technology in the Humanities • The Virginia Center for Digital History • The Teaching Technologies Initiative • The Media Studies Program Offices
Information Communities Community-oriented resources Richer collections Specialized access and delivery Discipline-specific services
Managing the Collection • Provide a way to universally name all resources without respect to machine address • Track all files for resources, metadata and computer programs consistently • Enforce appropriate policies for use of Library resources • Provide a high level of security • Support preservation activities appropriately
Delivering the Collection • Deliver tools with content • Allow every resource to be used in any number of contexts • Discovery searching across the full collection • Deep searching in particular collections • Move towards a library that aware users can configure for themselves
Supporting Digital Scholarship • Supporting the creation of digital scholarly projects • Collecting born-digital scholarly projects • For preservation • Taking over responsibility for primary delivery • Supporting information communities
Metadata • Descriptive – metadata that users use to find things, like traditional library catalog records • Administrative – metadata that the library uses to manage library resources • Structural – metadata about the relationships among resources • Behavioral – computer programs that deliver digital resources to users
The Flexible Extensible Digital Object Repository Architecture (FEDORA) • Developed as an NSF-funded research project at Cornell • Interpreted and re-implemented at UVA • Testbed of 10,000,000 digital objects with very good results • Mellon gave us $1,000,000 to develop a usable system around FEDORA
Repository Development Project Goals • An efficient, scalable, freely distributable FEDORA repository system ASAP • A complete basic management interface with the initial release • Add important digital library functionality in later releases • Create multiple testbed repositories to deploy and evaluate the software • Make all software open source
Deployment Group • The Digital Library group, Indiana U. • The Humanities Computing group, New York U. • The Digital Collections and Archives Department, Tufts U. • The Humanities Computing group, King's College London • The Oxford Digital Library and The Refugee Studies Center, Oxford U. • Audio/Video Project, Library of Congress • A library/academic computing group, Northwestern University
Project Plan • Phase 1: Deliver the repository system and the full management interface • Phase 2: Add more production support • Security and policy enforcement • Collection objects • Disk management • Phase 3: Enhance end-user support • Versioning and Editions • Dynamic, Context Sensitive Behaviors • Efficiency and scale optimization
FEDORA Development Project Description: http://fedora.comm.nsdlib.org/
Fedora Architecture Research History and Overview
FEDORA Original Research Goals • Management – of distributed digital content and services • Access – via stable interfaces to digital objects • Interoperability – for digital objects and repositories • Extensibility – easy evolution of object behaviors • Flexibility – community-defined content models • Security – rights management and access control • Preservation – of content and “look and feel”
FEDORA Basic Architectural Abstractions • Digital Object: container for aggregating any digital content • Content disseminations based on behavior definitions • Extensibility of behavior mechanisms • Repository: service layer for “contained” Digital Objects • Object lifecycle management • Access management
FEDORA Digital Object • Persistent ID (PID): globally unique persistent id • Disseminators: public view; access methods for obtaining “disseminations” of digital object content • System Metadata: internal view; metadata necessary to manage the object • Datastreams: protected view; content that makes up the “basis” of the object
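The four parts of a digital object can be sketched as a small data structure. This is an illustrative model only, not the project's implementation; the class name `DigitalObject`, the method `disseminate`, and the sample PID `demo:1` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DigitalObject:
    """Hypothetical in-memory model of the four parts named on the slide."""

    pid: str  # globally unique persistent ID
    # Internal view: metadata needed to manage the object.
    system_metadata: Dict[str, str] = field(default_factory=dict)
    # Protected view: the content that forms the "basis" of the object.
    datastreams: Dict[str, bytes] = field(default_factory=dict)
    # Public view: method name -> function that produces a dissemination.
    disseminators: Dict[str, Callable[..., bytes]] = field(default_factory=dict)

    def disseminate(self, method: str) -> bytes:
        """Return a dissemination through a public access method."""
        return self.disseminators[method](self)


obj = DigitalObject(pid="demo:1")
obj.datastreams["thumb"] = b"GIF89a..."  # placeholder bytes, not a real image
obj.disseminators["get_thumbnail"] = lambda o: o.datastreams["thumb"]
thumbnail = obj.disseminate("get_thumbnail")
```

Callers only ever go through `disseminate`, so the datastreams stay behind the public view, as the slide describes.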
Digital Object Interoperability: Common Behaviors for Variable Content • Functional equivalency [Figure: Digital Object #1 and Digital Object #2 each have a Persistent ID (PID), an Image Disseminator, and System Metadata, but hold different datastreams. Datastreams shown across the two objects: mrsid, hres jpg, lres gif, thumb gif, tiff master.]
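A minimal sketch of functional equivalency: two objects store an image differently but answer the same public request. The class `ImageObject`, the method `get_image`, and the assignment of formats to each object are assumptions made for illustration; the slide does not specify them.

```python
class ImageObject:
    """Hypothetical digital object: private datastreams, shared public method."""

    def __init__(self, pid, datastreams, preferred):
        self.pid = pid
        self._datastreams = datastreams  # format label -> content bytes
        self._preferred = preferred      # datastream that backs get_image()

    def get_image(self):
        # Same public behavior regardless of how the content is stored.
        return self._datastreams[self._preferred]


# Placeholder byte strings stand in for real image data.
obj1 = ImageObject("demo:1", {"mrsid": b"<mrsid bytes>"}, preferred="mrsid")
obj2 = ImageObject(
    "demo:2",
    {"hres_jpg": b"<jpeg bytes>", "thumb_gif": b"<gif bytes>"},
    preferred="hres_jpg",
)

# A client treats both objects identically.
images = [o.get_image() for o in (obj1, obj2)]
```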
Digital Object Extensibility: Adding New Behaviors • The same underlying content can be operated on in novel ways to create new disseminations not originally conceived of [Figure: Digital Object #3, with a Persistent ID (PID), a Book Disseminator, an added Photo Disseminator, System Metadata, and Datastreams; the two views are labeled Book and Photo Collection.]
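Adding a behavior without touching the content might look like the following sketch. The datastream names and the functions `book_view` and `photo_view` are invented for this example.

```python
# Made-up datastreams for one object: a page of text and two photographs.
datastreams = {"page1": "Chapter one", "photo1": "lawn.jpg", "photo2": "pavilion.jpg"}


def book_view(ds):
    """Original behavior: present the object as a book."""
    return [v for k, v in sorted(ds.items()) if k.startswith("page")]


def photo_view(ds):
    """Behavior added later: present the same object as a photo collection."""
    return [v for k, v in sorted(ds.items()) if k.startswith("photo")]


disseminators = {"book": book_view}
snapshot = dict(datastreams)

# Extending the object only registers a new disseminator.
disseminators["photo_collection"] = photo_view
photos = disseminators["photo_collection"](datastreams)
```

The datastreams are identical before and after the extension; only the set of public behaviors grows.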
FEDORA Digital Object Architecture [Figure: three kinds of object. Data Object: Persistent ID (PID), System Metadata, Disseminators, Datastreams. Behavior Definition Object: Persistent ID (PID), System Metadata, Method Definition Metadata, Datastreams (specs). Behavior Mechanism Object: Persistent ID (PID), System Metadata, Method Implementation Metadata, Datastreams (executables).]
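The split between a behavior definition (the abstract methods) and a behavior mechanism (an implementation of them) resembles an interface and its implementations. The sketch below uses Python abstract base classes as an analogy; every class and method name is hypothetical, and FEDORA itself stores definitions and mechanisms as objects rather than as classes.

```python
from abc import ABC, abstractmethod


class ImageBehaviorDefinition(ABC):
    """Plays the role of a Behavior Definition: method signatures only."""

    @abstractmethod
    def get_thumbnail(self, datastreams):
        ...


class StdImageMechanism(ImageBehaviorDefinition):
    """One mechanism: a thumbnail is already stored as a datastream."""

    def get_thumbnail(self, datastreams):
        return datastreams["thumb"]


class MrSidImageMechanism(ImageBehaviorDefinition):
    """Another mechanism: derive a stand-in thumbnail from the master."""

    def get_thumbnail(self, datastreams):
        return b"thumb-of:" + datastreams["mrsid"]


class DataObject:
    """Data Object whose disseminator binds datastreams to a mechanism."""

    def __init__(self, pid, datastreams, mechanism):
        self.pid = pid
        self.datastreams = datastreams
        self.mechanism = mechanism

    def get_thumbnail(self):
        return self.mechanism.get_thumbnail(self.datastreams)


a = DataObject("demo:1", {"thumb": b"T1"}, StdImageMechanism())
b = DataObject("demo:2", {"mrsid": b"M2"}, MrSidImageMechanism())
thumbs = [a.get_thumbnail(), b.get_thumbnail()]
```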
Fedora Repository System [Figure: a repository with Management, Access, and Storage layers holding digital objects, each made up of a PID, Disseminators, System Metadata, and a Basis (Datastreams). Labels: digital objects with fine-grained access control; general-purpose access control.]
Access Management: Policy Enforcement • Semantics of policy language must parallel the behavioral semantics of digital objects • Fine-grained, context-sensitive policies • Extensibility for policies and enforcement mechanisms • Support for portability of digital objects • Decentralized policy management
Access Control Policies • General Purpose • “only repository managers can add new disseminators to digital objects in the repository.” • Object-Specific (e.g., “Lecture object”) • “guests may view course syllabus and slides 1-10 of Lecture 1, but may not view the lecture video or any other slides.” • “students may not view Lecture 2 video unless they submit the assignment for Lecture 1.” See research at: http://www.cs.cornell.edu/payette/prism/security/policy.htm
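The two object-specific Lecture policies can be written as a small check function. This is only a sketch of the rules as quoted, not the policy language from the cited research; the function `is_allowed`, the tuple encoding of resources, and the default of allowing students everything else are assumptions.

```python
def is_allowed(role, resource, submitted=frozenset()):
    """Hypothetical checker for the example Lecture policies.

    resource is (kind, lecture, item), e.g. ("slide", 1, 3),
    ("video", 2, None) or ("syllabus", None, None).
    submitted is the set of lecture numbers whose assignment was handed in.
    """
    kind, lecture, item = resource
    if role == "guest":
        if kind == "syllabus":
            return True
        # Guests see slides 1-10 of Lecture 1 and nothing else.
        return kind == "slide" and lecture == 1 and 1 <= item <= 10
    if role == "student":
        if kind == "video" and lecture == 2:
            # Context-sensitive rule: depends on the Lecture 1 assignment.
            return 1 in submitted
        return True  # assumption: the slide states no other student limits
    return False  # assumption: unknown roles get no access
```

For example, `is_allowed("guest", ("slide", 1, 11))` is refused, while `is_allowed("student", ("video", 2, None), submitted={1})` is granted.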
UVA Prototypes UVA Content Models and Demos
Finding Aid Content Model (Finding Aid example)
TEI Letter Content Model (TEI letter example)
TEI Book Content Model (TEI book example)
General Image Content Model (Mycenae image example)
MrSID Image Content Model (Pavilion III image example)
1-bit B/W TIFF Content Model (1-bit B/W TIFF example)
GDMS Content Model (Mycenae example) (lawn example)
Numerical Data Content Model (ICPSR survey example)
FEDORA Specifications – Part I Digital Object Storage
Metadata Encoding and Transmission Standard (METS) • XML “standard” for encoding descriptive, administrative, and structural metadata of digital library objects • Developed under auspices of the Digital Library Federation • METS standard maintained by the Network Development and MARC Standards Office of the Library of Congress http://www.loc.gov/standards/mets/
METS Schema • METS is written in the XML Schema Language • METS defines four sections for an object • Descriptive metadata • Administrative metadata • File group • Structure map • METS goals include: • Facilitate management of objects within a repository • Provide a standard format for exchange of objects between repositories • Provide a standard format for transmission of objects to users for rendering (via tools or applications)
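A skeleton with the four sections can be built with Python's standard `xml.etree.ElementTree`. The element names (`dmdSec`, `amdSec`, `fileSec`/`fileGrp`, `structMap`) and the namespace follow the METS schema as published by the Library of Congress, which may differ from the draft in use at the time of this briefing; the IDs and attribute values are made up.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


root = ET.Element(q("mets"), {"OBJID": "demo:1"})       # hypothetical object ID
ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})        # descriptive metadata
ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})        # administrative metadata
file_sec = ET.SubElement(root, q("fileSec"))            # wrapper for file groups
grp = ET.SubElement(file_sec, q("fileGrp"), {"ID": "DATASTREAMS"})
ET.SubElement(grp, q("file"), {"ID": "DS1", "MIMETYPE": "image/gif"})
struct = ET.SubElement(root, q("structMap"))            # structure map
ET.SubElement(struct, q("div"), {"TYPE": "image"})

xml_text = ET.tostring(root, encoding="unicode")
sections = [child.tag.split("}")[1] for child in root]
```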
Mapping Fedora to METS New in METS
METS : Sample Fedora Object Click here for image digital object
METS: Sample Fedora Behavior Definition Object Click here for Behavior Definition object for DC Click here for Behavior Definition object for UVA_Images
METS: Sample Fedora Behavior Mechanism Object Click here for Behavior Mechanism object for UVA_MARC_DC Click here for Behavior Mechanism object for UVA_Image_STD Click here for Behavior Mechanism object for UVA_Image_MRSID
Fedora Relational Database • Phase 1: Alternate form of object storage to support high-performance access (disseminations) • Repository system replicates from authoritative XML version of objects to relational database • Phase 2-3: Access sub-system works completely off the XML storage, as the performance of XML tools improves.
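The Phase 1 idea, replicating from the authoritative XML into a relational store that serves fast lookups, can be sketched with the standard `sqlite3` module. The XML layout, the table schema, and the function `replicate` are all hypothetical simplifications; real objects would be METS documents.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified, made-up stand-in for the authoritative XML version of an object.
OBJECT_XML = """
<object pid="demo:1">
  <datastream id="DS1" mime="image/gif" location="thumb.gif"/>
  <datastream id="DS2" mime="image/tiff" location="master.tif"/>
</object>
"""


def replicate(xml_text, conn):
    """Copy datastream records from the XML into the relational table."""
    obj = ET.fromstring(xml_text)
    pid = obj.get("pid")
    # Replace any earlier copy so the XML stays the source of truth.
    conn.execute("DELETE FROM datastream WHERE pid = ?", (pid,))
    conn.executemany(
        "INSERT INTO datastream (pid, ds_id, mime, location) VALUES (?, ?, ?, ?)",
        [
            (pid, ds.get("id"), ds.get("mime"), ds.get("location"))
            for ds in obj.iter("datastream")
        ],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datastream (pid TEXT, ds_id TEXT, mime TEXT, location TEXT)")
replicate(OBJECT_XML, conn)

# Access path: answer a dissemination lookup without parsing XML.
rows = conn.execute(
    "SELECT location FROM datastream WHERE pid = ? AND mime = ?",
    ("demo:1", "image/gif"),
).fetchall()
```

Because `replicate` deletes before inserting, running it again after the XML changes refreshes the relational copy rather than duplicating rows.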