Fedora: New Features, New Collaborations, Bright Future
Fedora Users Conference, Copenhagen, Denmark, September 28, 2005
Sandy Payette, Co-Director, Fedora Project, Cornell University
Fedora Brief History • Cornell Research (1997-present) • DARPA and NSF-funded research • First reference implementation developed • Interoperable Repositories (experiments with CNRI) • Policy Enforcement • First Application (1999-2001) • University of Virginia digital library prototype • Technical implementation: adapted to web; RDBMS storage • Scale/stress testing for 10,000,000 objects • Open Source Software (2002-present) • Andrew W. Mellon Foundation grants • Technical implementation: XML and web services • Fedora 1.0 (May 2003) • Fedora 2.0 (Jan 2005) • Fedora 2.1 (coming soon!)
Fedora Development Team • Cornell University: Sandy Payette (co-director), Chris Wilper, Carl Lagoze, Eddie Shin • University of Virginia: Thorny Staples (co-director), Ross Wayland, Ronda Grizzle, Bill Niebel, Bob Haschart, Tim Sigmon
“Fedora Inside” Known Use Cases • Digital Library Collections • Institutional Repository • Educational Software • Information Network Overlay • Digital Archives and Records Management • Digital Asset Management • File Cabinet / Document Management • Scholarly publishing
Fedora Repository and Web Services [Diagram; labels: Web Services Exposure, RDF, files, RDBMS]
The Basics: Fedora Digital Object Model (Container View) • Digital object identifier: Persistent ID (PID) • Reserved Datastreams (key object metadata): Relations (RELS-EXT), Dublin Core (DC), Audit Trail (AUDIT) • Datastreams: aggregate content or metadata items • Disseminators: pointers to service definitions to provide service-mediated views (the diagram also shows a Default Disseminator)
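The container view above can be sketched as a small data structure. This is a minimal illustration and not Fedora's own code: the class names, the `mime_type` field, and the sample PIDs are assumptions made for the example; only the PID, the reserved datastream IDs (RELS-EXT, DC, AUDIT), and the datastream/disseminator split come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Datastream IDs the slide lists as reserved.
RESERVED_DATASTREAMS = ("RELS-EXT", "DC", "AUDIT")


@dataclass
class Datastream:
    ds_id: str            # e.g. "DC" or a user-chosen ID
    mime_type: str        # hypothetical field, for illustration only
    content: bytes = b""


@dataclass
class Disseminator:
    diss_id: str
    service_definition_pid: str  # pointer to a service definition


@dataclass
class DigitalObject:
    pid: str  # persistent identifier
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    disseminators: List[Disseminator] = field(default_factory=list)

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def reserved(self) -> List[str]:
        """Reserved datastream IDs present on this object, in slide order."""
        return [d for d in RESERVED_DATASTREAMS if d in self.datastreams]

    def content_datastreams(self) -> List[str]:
        """IDs of the non-reserved datastreams, sorted."""
        return sorted(d for d in self.datastreams
                      if d not in RESERVED_DATASTREAMS)


# Hypothetical sample object.
obj = DigitalObject(pid="demo:1")
obj.add_datastream(Datastream("DC", "text/xml"))
obj.add_datastream(Datastream("IMAGE", "image/jpeg"))
obj.disseminators.append(Disseminator("DISS1", "demo:sdef-1"))
```

Keeping reserved and content datastreams in one dictionary mirrors the slide's picture of a single container; the two helper methods only separate them by ID.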
Fedora – Object Model XML • FOXML (Fedora Object XML) • Simple XML format directly expresses Fedora object model • Easily adapts to Fedora's new and planned features • Easily translated to other well-known formats • Enhanced Ingest/Export of objects • FOXML, METS (Fedora extension) • Extensible to accommodate new XML formats • Planned: METS 1.4, MPEG-21 DIDL
2.1 Release Notes • Authentication plug-ins • HTTP Basic auth • Tomcat realms and login modules • Plug-in #1: Tomcat user/password file or database • Plug-in #2: LDAP tie-in • Plug-in #3: RADIUS authentication • Support for SSL • Authorization module • XML-based policies using XACML • Repository-wide policies • Object-specific policies • Fine-grained policy enforcement • API actions × subject attributes × object attributes
XACML Policy Examples • Repository-wide Policy • [xacml-1] Deny access to the DC datastream to a specific user group • Object-specific Policy • Deny all access to the object “cornell:cs100” if the user is not a Cornellian. • Genre-oriented Policy • [xacml-2] For objects with content model of “uva-image”, permit students access to disseminations but deny them access to raw datastreams; allow professors access to both. • Time-oriented Policy • Permit students access to the “answers” datastream of learning object cs:125 after May 15, 2005 • Backend Service Security Policy • Deny callback by the external MrSID service identified as “bmech:10”
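The genre-oriented and time-oriented examples above can be sketched as rules evaluated over action, subject, and object attributes. This is not XACML and not Fedora's authorization module: the request keys, the action names `read-dissemination` and `read-datastream`, the `learning-object` content model, and the deny-overrides combining choice are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List

Request = Dict[str, object]  # hypothetical attribute bag


@dataclass
class Rule:
    effect: str                         # "Permit" or "Deny"
    applies: Callable[[Request], bool]  # target and condition combined


def evaluate(rules: List[Rule], request: Request, default: str = "Deny") -> str:
    """Deny-overrides: a matching Deny wins, then a matching Permit."""
    effects = {r.effect for r in rules if r.applies(request)}
    if "Deny" in effects:
        return "Deny"
    if "Permit" in effects:
        return "Permit"
    return default  # no rule matched


rules = [
    # Genre-oriented: "uva-image" objects.
    Rule("Permit", lambda r: r.get("content_model") == "uva-image"
         and r.get("role") in ("student", "professor")
         and r.get("action") == "read-dissemination"),
    Rule("Permit", lambda r: r.get("content_model") == "uva-image"
         and r.get("role") == "professor"
         and r.get("action") == "read-datastream"),
    Rule("Deny", lambda r: r.get("content_model") == "uva-image"
         and r.get("role") == "student"
         and r.get("action") == "read-datastream"),
    # Time-oriented: "answers" datastream of cs:125 after May 15, 2005.
    Rule("Permit", lambda r: r.get("pid") == "cs:125"
         and r.get("datastream") == "answers"
         and r.get("role") == "student"
         and r.get("today") > date(2005, 5, 15)),
]


def request(role: str, action: str, **attrs: object) -> Request:
    """Build a request; extra keyword arguments become object attributes."""
    return {"role": role, "action": action, **attrs}


image_view = request("student", "read-dissemination", content_model="uva-image")
image_raw = request("student", "read-datastream", content_model="uva-image")
prof_raw = request("professor", "read-datastream", content_model="uva-image")
early = request("student", "read-datastream", pid="cs:125",
                datastream="answers", content_model="learning-object",
                today=date(2005, 5, 1))
late = dict(early, today=date(2005, 6, 1))
```

With these rules, `evaluate(rules, early)` falls through to the default because no rule matches before the release date, while `evaluate(rules, late)` matches the time-oriented Permit.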
2.1 Release Notes • Review of RDF-based Resource Index • “Relationships” Datastream • Ontology of common relationships (RDF schema) • RDF stored in datastream identified by “RELS-EXT” • Resource Index (RI) • RDF-based index of repository (automatic indexing into Kowari triple-store) • Graph-based index includes: • Object properties and Dublin Core • Object-to-object relationships • Datastream Disseminations (and properties) • RI Search (search the repository as a graph) • Powerful querying of graph of inter-related objects • REST-based query interface (using RDQL or iTQL) • Results in different formats (triples, tuples, SPARQL)
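Searching the repository as a graph can be pictured with a tiny in-memory triple list and wildcard pattern matching. This sketch does not use Kowari, RDQL, or iTQL, and it is not the RI Search interface; the predicate names `rel:isMemberOf` and `dc:title` and the sample PIDs are placeholders chosen for the example.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical predicate names; a real repository defines its own ontology.
IS_MEMBER_OF = "rel:isMemberOf"
TITLE = "dc:title"

triples: List[Triple] = [
    ("info:fedora/demo:1", IS_MEMBER_OF, "info:fedora/demo:collection"),
    ("info:fedora/demo:2", IS_MEMBER_OF, "info:fedora/demo:collection"),
    ("info:fedora/demo:1", TITLE, "First member"),
    ("info:fedora/demo:collection", TITLE, "Demo collection"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def members_of(store: List[Triple], collection: str) -> List[str]:
    """Subjects that assert membership in the given collection, sorted."""
    return sorted(t[0] for t in match(store, p=IS_MEMBER_OF, o=collection))


def member_titles(store: List[Triple], collection: str) -> Dict[str, str]:
    """Two-step graph traversal: find members, then look up their titles."""
    titles: Dict[str, str] = {}
    for member in members_of(store, collection):
        for _, _, title in match(store, s=member, p=TITLE):
            titles[member] = title
    return titles
```

`member_titles` is the kind of two-hop question (object-to-object relationship, then an object property) that a graph query language answers in one statement.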
2.1 Release Notes • New in Fedora 2.1 for Resource Index • Resource Index corruption problems diagnosed and fixed (Kowari memory bug) • Minor RI model changes (may require modification of existing static queries by users) • Relaxation of validation rules on RELS-EXT: now accepts (objectURI --- relation/property ---> URI/literal) • Method Disseminations (and properties), with option for method × parmVal permutations • Scale and Performance Testing (NSDL 2M objects, >100M triples) • Sesame support for triplestore
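The relaxed RELS-EXT rule above (the subject is the object's own URI, the predicate is a relation or property, and the value may be a URI or a literal) can be sketched as a small check. This is an illustration of the stated shape only, not Fedora's validator; the `Literal` wrapper, the function names, and the use of a URI scheme test are assumptions.

```python
from dataclasses import dataclass
from typing import List, Union
from urllib.parse import urlparse


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking a value as a plain literal."""
    value: str


def is_uri(text: str) -> bool:
    """Loose test: the string must carry a URI scheme such as 'info:'."""
    return bool(urlparse(text).scheme)


def validate_relationship(object_uri: str,
                          subject: str,
                          predicate: str,
                          value: Union[str, Literal]) -> List[str]:
    """Return a list of problems; an empty list means the triple is accepted."""
    problems: List[str] = []
    if subject != object_uri:
        problems.append("subject must be the URI of the object itself")
    if not is_uri(predicate):
        problems.append("predicate must be a URI")
    if not isinstance(value, Literal) and not is_uri(value):
        problems.append("value must be a URI or a literal")
    return problems


me = "info:fedora/demo:1"  # hypothetical object URI
ok_uri = validate_relationship(me, me, "rel:isMemberOf",
                               "info:fedora/demo:collection")
ok_literal = validate_relationship(me, me, "rel:shelfMark", Literal("QA76"))
wrong_subject = validate_relationship(me, "info:fedora/demo:2",
                                      "rel:isMemberOf",
                                      "info:fedora/demo:collection")
```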
RI: Fedora Objects, RDF Graph View [Diagram; labels: Member Object, Collection Object]
Fedora 2.1 Release Notes • PROAI Server (Advanced OAI Provider) • Harvest multiple metadata formats • Harvest datastreams and disseminations • Support for incremental harvest by modified date • Support for OAI sets • Highly configurable via queries against Resource Index • Directory Ingest Service • Facilitate ingest of hierarchical directories of files • Submit files as .zip or .jar (with a METS manifest) • Automatically asserts parent-child relationships in RELS-EXT • Stages content and ingests as FOXML objects into repository • Directory Ingest Client • Web client (signed applet) • Browse directory trees, select dir/files, add metadata, add relations • Auto-generates METS manifest for entire collection • Packages as zip/jar and ingests into Fedora repository
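The Directory Ingest Service's step of asserting parent-child relationships from a directory hierarchy can be sketched as a pure function over file paths. This is only an illustration of the idea: the function name and sample paths are hypothetical, and the real service's METS manifest handling, staging, and FOXML generation are not modeled.

```python
from pathlib import PurePosixPath
from typing import List, Tuple


def parent_child_relations(file_paths: List[str]) -> List[Tuple[str, str]]:
    """Return sorted (child, parent) pairs for every file and directory."""
    pairs = set()
    for raw in file_paths:
        path = PurePosixPath(raw)
        # Walk upward: file -> its directory -> that directory's parent.
        # Stop at the top of a relative path (".") or at the root.
        while path.parent != path and str(path.parent) != ".":
            pairs.add((str(path), str(path.parent)))
            path = path.parent
    return sorted(pairs)


# Hypothetical directory listing, as it might appear inside a .zip file.
relations = parent_child_relations([
    "thesis/ch1/a.pdf",
    "thesis/readme.txt",
])
```

Each pair could then be written into the child object's RELS-EXT datastream as a relationship to its parent; which relationship name to use is left open here.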
2.1 Release Notes • Rebuild Utility for Repository Indices • Improved logging using log4j • Trippi.log • Kowari.log • Repository log • Handle System Plug-in for PID Generation • Command-line utility syntax changes • New Command-line utilities • fedora-reload-policies • validate-policy • fedora-rebuild • FedoraClient utility class for building new clients
You asked… • “We wish for an ‘out-of-box’ end-user client for Fedora.” • “Can’t you put the DSpace interface on top of a Fedora repository?” • “We need something to show people Fedora right away (before we get $$ for development resources).” • “We love Fedora. It would be really great if you distributed a default end-user client.”
The Answer: FIRE Client • Web-based client for “institutional repository” • End-user content submission • Object creation template for “content models” • Configurable Workflows • XACML policies coordinated with workflow • Search/Browse collections • Development in progress!
Fedora Development Priorities 2006-2007 • Fedora Framework Services • Federated Repositories • “Fedorations” with name service • Federation with other repositories (DSpace, aDORE, arXiv) • Cornell/LANL NSF Pathways project • InterDisseminator • “Content Model” Specification Language • Advanced Object Creation Workbenches • Tools for RDF browse and graph traversal • Scalability/Performance – very large repositories • Web services security and Shibboleth • Code Refactoring • Fedora as web app (.war) • Fedora Showcase and News (on new website) • Community Coordination and Co-Development
Collaboration: Fedora Community Working Groups • Preservation Working Group (Ron Jantz, Rutgers) • Requirements for preservation services • Define service APIs and technical integration with Fedora 2.1+ • Preservation metadata recommendations for Fedora • Prototyping of new services • Development plan for deployment of new services
Collaboration: Fedora Community Working Groups • Workflow Working Group (Peter Murray, OhioLink) • Sep 05: WORKFLOW WG chartered and begins work • Oct 05: Submit "terminology and problem statement" document to fedora-users for review • Nov 05: Submit modeling diagrams, workflow process descriptions, and recommendation for workflow engine to fedora-users for review • Feb 06: Release alpha-quality version of ingestion workflow engine • Apr 06: Release beta-quality version of ingestion workflow engine • Aug 06: Release production-quality version of ingestion workflow engine • Nov 06: Revise documents based upon implementation experience • Feb 07: Release alpha-quality version 2.0 of ingestion workflow engine • Apr 07: Release beta-quality version 2.0 of ingestion workflow engine • Aug 07: Release production-quality version 2.0 of ingestion workflow engine • Sep 07: Close or recharter the WG
Sample Workflows [Diagram of three processes; step labels listed as they appear in the figure, not necessarily in process order] • Ingest-oriented process: Ingest to Repo, Assign Access Policy, Validate bytestreams, Index and Register, Link to Simulation Service, SIP • Review-oriented process: Review, Review, Assign Policy, Submit, Publish, Edit thesis, Ingest To Archive • Preservation-oriented process: Format Migration, Make Copies, Diagnose Problems, Object Versioning In Repo, Ingest To Archive, Digital Object
Collaboration: Fedora Community Working Groups • Outreach Working Group (Linda Langschied, Rutgers) • Improve content of Fedora web site • More user-oriented information (currently technical focus) • Community Showcase – demos, graphics • Survey database with simple web form to profile users • Collaboration Environment • Wiki, Confluence, other? • Content Model Working Group (under charter) • Formalization of notion of Fedora content model • XML schema to define content models • Investigate ontology-based content model definition • Round up existing content models and publish to promote reuse
Fedora Community • Fedora Advisory Board • Vision • Commission Working Groups • Prioritize Development • Define Sustainability Model • Collaborative Development Opportunities • Share Tools via www.fedora.info • User-contributed Tools, Apps, Services
Fedora Community (a sampling) • General questions • Hot topics • Workflow • Digital object typing • RDF and relationships • Search and indexing • Collaboration models • Other • Demos • Encyclopedia of Chicago • NSDL