The Fedora Digital Repository Project and the National Science Digital Library (NSDL)
July 26, 2005
Dean B. Krafft, Cornell University
Fedora: Repository Middleware
• A Flexible, Extensible Digital Object Repository Architecture
• An architecture and toolkit (like IIS or SQL Server), not a vertical application
• Audience: system builders – 12 major university or national (Denmark) digital libraries
• DSpace in contrast: a vertical application with a fixed workflow targeted at users
• So far incorporated in two commercial products: VTLS’s Vital digital library, and Company X’s product – finalist for a large government contract
Fedora: Project Details
• Collaboration of Cornell and UVa
• Development team of 10 developers and leads
• Currently implemented in Java; licensed under the Mozilla Public License
• Funded by Mellon: starting second three-year, $1.4M grant
• Cornell leads: Sandy Payette & Carl Lagoze
• 20,000+ downloads, active user community
• Use cases: Digital Asset Management, Scholarly Publishing, Information Network Overlay, Institutional Repository, Digital Archive and Records Management, Digital Library
Fedora Digital Object Model: Component View
• Persistent ID (PID): digital object identifier
• Reserved Datastreams: key object metadata
  • Relations (RELS-EXT)
  • Dublin Core (DC)
  • Audit Trail (AUDIT)
• Datastreams: set of content or metadata items (local or external URL redirects)
• Disseminators: web-service methods for distributing views of recombined content
  • Default Disseminator
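
The component view above can be mirrored in a few lines of code. The following is a minimal sketch, not Fedora's actual Java implementation: the class names, fields, the `demo:1` PID, the example URL, and the `listDatastreams` disseminator are all assumptions chosen for illustration. It shows one object holding a PID, a set of datastreams (local content or an external URL), and disseminators modeled as named functions over the object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Datastream:
    """One content or metadata item; held locally or referenced by URL."""
    ds_id: str
    mime_type: str
    content: bytes = b""
    external_url: str = ""

    def is_external(self) -> bool:
        # A datastream with a URL stands in for an external/redirected item.
        return bool(self.external_url)


@dataclass
class DigitalObject:
    """Hypothetical in-memory stand-in for a repository digital object."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # Disseminators are modeled as named functions that produce a view.
    disseminators: Dict[str, Callable[["DigitalObject"], str]] = field(
        default_factory=dict
    )

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def disseminate(self, name: str) -> str:
        return self.disseminators[name](self)


def list_datastreams(obj: DigitalObject) -> str:
    """Illustrative disseminator: a sorted listing of datastream IDs."""
    return ", ".join(sorted(obj.datastreams))


def build_example() -> DigitalObject:
    obj = DigitalObject(pid="demo:1")  # assumed PID, for illustration only
    obj.add_datastream(Datastream("DC", "text/xml", b"<dc/>"))
    obj.add_datastream(Datastream("RELS-EXT", "application/rdf+xml", b"<rdf/>"))
    obj.add_datastream(
        Datastream("IMAGE", "image/jpeg", external_url="http://example.org/img.jpg")
    )
    obj.disseminators["listDatastreams"] = list_datastreams
    return obj
```

Calling `build_example().disseminate("listDatastreams")` returns the datastream IDs as one string. In the real system disseminators are web-service methods rather than local functions; the sketch only illustrates how a view is computed from an object's datastreams.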
Fedora Repository Service
• Set of SOAP/REST services: Manage, Access, Search, Query
• Fundamental store is XML, with RDBMS cache (Oracle, MySQL), and RDF triple store for relationship queries
• Modular architecture: Manage, Access, Storage, Dissemination, Authentication, Authorization, RDF Resource Index
Fedora 2.0 Capabilities
• Object-to-object Relationships
  • Ontology of common relationships (RDF schema)
  • Relationships stored in special datastream (RELS-EXT)
• Resource Index (RI)
  • RDF-based index of repository (Kowari triple-store)
  • Graph-based index includes:
    • Object properties and Dublin Core
    • Object Relationships and Object Disseminations
  • Powerful querying of graph of inter-related objects
  • REST-based query interface (using RDQL or iTQL)
  • Results in different formats (triples, tuples, SPARQL)
• Fedora 2.1 (August 2005) adds
  • Plug-in Authentication modules
  • Fine-grained Authorization using XACML XML-based policies
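
To make the Resource Index idea concrete, here is a small sketch of pattern matching over subject-predicate-object triples. It is an in-memory stand-in written for illustration only: the real index is backed by Kowari and queried with RDQL or iTQL, and the `ResourceIndex` class, the PIDs, and the predicate names used here are assumptions, not Fedora's API.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class ResourceIndex:
    """Toy triple index; a stand-in for an RDF triple store."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.append((s, p, o))

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> List[Triple]:
        # None acts as a wildcard, like an unbound variable in a query.
        return [
            t
            for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]


def build_index() -> ResourceIndex:
    ri = ResourceIndex()
    # Relationship triples, as they might be derived from RELS-EXT
    # (identifiers and predicate names are illustrative).
    ri.add("demo:article1", "isMemberOf", "demo:collection1")
    ri.add("demo:article2", "isMemberOf", "demo:collection1")
    # A property triple, as it might be derived from Dublin Core.
    ri.add("demo:article1", "dc:title", "A sample title")
    return ri
```

For example, `build_index().match(p="isMemberOf", o="demo:collection1")` returns every triple whose subject is a member of that collection. Keeping relationships and Dublin Core properties in one graph is what lets a single query combine both.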
National Science Digital Library
• K-gray Science, Technology, Engineering, and Mathematics (STEM) education
• NSF-created brand and home for digital resources of known high quality
• Community of users, contributors and institutions (as providers and consumers)
• Creates context for resources (e.g. lesson plans, standards alignment, ratings, annotations, reviews, brands)
• Guides selection & use; not just discovery
Program Details
• Major NSF Division of Undergraduate Education program, over $20M/yr funding
• Over 120 NSF grants in program
• Core Integration collaboration of UCAR, Columbia University and Cornell University
• Cornell provides core technical infrastructure: Fedora-based repository, Lucene-based search, nsdl.org portal
• Columbia: Shibboleth authentication; SDSC: Storage Resource Broker archive
What Fedora Provides NSDL
• Objects: Aggregators (collections), Metadata Providers, Agents, Resources (with local or remote content), Metadata
• Relationships: Structural (part of), Equivalence, Membership, arbitrary graph queries
• Network overlay architecture: A lens for viewing science content on the net, whether content is local, remote, or archived – it all has a repository-based URL
• Web services: disseminations are arbitrary recombinations of content
• Authentication/Authorization: Collections and services manage their own repository content
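
One kind of graph query implied by the membership relationships above is "everything under an aggregator, at any depth". The sketch below answers it with a breadth-first traversal over a plain dictionary of membership edges. The function name, the `agg:` and `res:` identifiers, and the dictionary layout are assumptions for illustration; they do not reflect the NSDL repository's actual identifiers or query interface.

```python
from collections import deque
from typing import Dict, List, Set


def resources_under(aggregator: str, members: Dict[str, List[str]]) -> Set[str]:
    """Return every object reachable from `aggregator` via membership edges."""
    seen: Set[str] = set()
    queue = deque([aggregator])
    while queue:
        node = queue.popleft()
        for child in members.get(node, []):
            if child not in seen:  # shared or cyclic members are visited once
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical membership graph: an aggregator maps to its direct members.
EXAMPLE_MEMBERS: Dict[str, List[str]] = {
    "agg:nsdl": ["agg:physics", "agg:biology"],
    "agg:physics": ["res:1", "res:2"],
    "agg:biology": ["res:2", "res:3"],
}
```

Here `resources_under("agg:nsdl", EXAMPLE_MEMBERS)` collects both nested aggregators and all three resources, with `res:2` counted once even though two aggregators share it.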
Appendix – Additional Information
• Fedora website: http://www.fedora.info
• NSDL website: http://nsdl.org
• "An Information Network Overlay Architecture for the NSDL" by Lagoze, Krafft et al.: http://www.arxiv.org/abs/cs.DL/0501080
• "Fedora: An Architecture for Complex Objects and their Relationships" by Lagoze, Payette et al.: http://www.arxiv.org/abs/cs.DL/0501012
Selected Fedora Adopters
• Current Users:
  • National Science Digital Library (NSDL): Core Integration
  • University of Virginia – digital library
  • VTLS – library systems vendor selling Fedora-based product
  • Tufts University – digital library and university records management
  • OhioLink – statewide consortium of academic libraries
  • Northwestern: Library and Academic Technologies – digital library
  • ARROW: National Library of Australia and Monash University – nationally distributed institutional repository project
  • Royal Library Denmark, National Library, and DTU – integrated national digital library
  • Rutgers University – digital library
  • Indiana University – digital library
  • American Geophysical Union – repository of back issues of journals
  • Library of Congress – National Digital Newspaper Project
  • University of Delaware – digital library
  • Hamilton College – digital library
  • Cornell CIT – Electronic File Cabinet to manage office records
  • Tibetan Buddhist Resource Center – digital library
  • Yale University – manage university records
  • DISA – South Africa, History of Apartheid resistance – record repository
• Interesting new proposals
  • Company X finalist for large government contract
  • Cornell Lab of Ornithology (data + tools + documents)
Fedora Development Consortium
• Advisory Board
  • University of Virginia
  • Tufts
  • VTLS
  • ARROW (Monash University and Nat’l Lib Australia)
  • Harris Corp.
  • Danish Royal Library and DTU
  • Northwestern University
  • NSDL – Core Integration
• Mission
  • Requirements Definition, Specifications, Joint Development
  • Commission of Working Groups
    • Content Modeling
    • Outreach and Education
    • Workflow and Service-Oriented Processes
  • Recommendation for Long-Term sustainability model
    • Governance and Funding
    • Set Fedora Free – full open source model (e.g., public SourceForge)
    • Code Maintenance (UVA until 2012; plan for beyond)