
Introducing “Pergamos”

European FEDORA User Meeting Copenhagen, 28 September 2005. Introducing “Pergamos”. A FEDORA-based Digital Library System utilizing Digital Object Prototypes. Kostas Saidis saiko@di.uoa.gr. Libraries Computer Center Department of Informatics & Telecommunications University of Athens.



  1. European FEDORA User Meeting, Copenhagen, 28 September 2005 • Introducing “Pergamos” • A FEDORA-based Digital Library System utilizing Digital Object Prototypes • Kostas Saidis, saiko@di.uoa.gr • Libraries Computer Center, Department of Informatics & Telecommunications, University of Athens

  2. Outline • Motivation – The University of Athens (UoA) DL • Digital Objects (DOs) • DO Storage (FEDORA) • DO Manipulation (DL Application Logic) • Digital Object Prototypes • Automatic DO Type Conformance • Scope of Prototypes & Collection Management • Implementation Details • A Preview of Pergamos • Discussion

  3. The UoA DL Project • Over 1 million objects originating from 8 disparate collections • Folklore notebooks, Ancient papyri, UoA Historical Archive, Byzantine music manuscripts, Theatrical photos & brochures, Informatics research papers and dissertations, Medical images, Press articles • Heterogeneous material, in terms of content type, metadata, structure, user requirements • Mostly digitized material, requiring detailed cataloging

  4. UoA DL Project Metadata • Build a Web-based DL System to handle all material • Centralized DL approach due to • Existing hardware infrastructure • Funding restrictions • Administration simplicity • FEDORA is our DO Repository

  5. UoA DL Project Metadata Contd. • Small Team • 2.5 developers, 1 librarian, 1 manager • Requirements, Specifications, Development, Digitization & Cataloging Management … • … while everyday tasks keep running! • Cataloging Personnel • Scholars & Experts in each collection’s domain (not librarians) • Strict Schedule • First Collection deadline: early 2006 • Project deadline: end of 2006

  6. Motivation • Simplify & speed up the cataloging process • Provide effective Web-based cataloging interfaces • Automate content ingestion • Decrease development time • Avoid custom coding for each content variation • Elaborate on reusable and configurable DL modules • Provide the means to treat content variations in a unified manner

  7. Digital Objects • A Digital Object is a human-generated artifact consisting of digital content and related information

  8. FEDORA • FEDORA Digital Object Model • Content Models, Datastreams, Behavior Definitions, Mechanisms & Disseminators • FEDORA is a DO Repository • Focus on how each DO part is encoded & stored • Effectively handles issues related to storage, preservation & versioning, searching & indexing, interoperability

  9. Traditional 2-tier Approach

  10. DL Application Logic • Cataloging, Workflows, Collection Building & Management, User Interfaces, etc. • DL Modules manipulate DOs at a higher level of abstraction • Focus on the overall behavior of the DO (what the DO parts are and how they behave) • DOs reflect the underlying “real world” objects – they behave according to their nature, their essence, their type

  11. DO Typing information Do we effectively capture, express and utilize the nature (type) of DOs?

  12. An example – Theatrical Collection • Albums containing photos of National Theater performances • What is a Photo DO? • A digital image • stored in various formats (e.g. high quality, www quality, thumbnail) • accompanied by the metadata required for describing the picture • What is an Album DO? • A container of Photo DOs accompanied by theatrical play metadata
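The two descriptions on this slide can be read as informal data structures. The following is a minimal sketch in Python, not the Pergamos implementation: the class names `PhotoDO` and `AlbumDO`, the attribute names, and all sample values are assumptions made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class PhotoDO:
    """One digital image kept in several renditions, plus descriptive metadata."""
    files: dict = field(default_factory=dict)      # rendition name -> file name
    metadata: dict = field(default_factory=dict)   # fields describing the picture

@dataclass
class AlbumDO:
    """A container of Photo DOs, described by theatrical play metadata."""
    metadata: dict = field(default_factory=dict)
    photos: list = field(default_factory=list)

# Invented sample values, only to show the shape of the two object kinds.
photo = PhotoDO(
    files={"high": "scene1.tiff", "web": "scene1.jpg", "thumb": "scene1_t.jpg"},
    metadata={"title": "Act I, opening scene"},
)
album = AlbumDO(metadata={"play": "Sample play"}, photos=[photo])
```

Hand-written classes like these are exactly the per-type custom code that the rest of the talk tries to avoid, which is why the later slides move the type description into a separate specification.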

  13. A 2nd example – Historical Archive • University’s Senate Session Proceedings > Folders > Sessions > Items • What is an Item DO? • A digital image (capturing 1 or 2 pages) • stored in various formats (e.g. high quality, www quality, thumbnail) • What is a Session DO? • A container of Item DOs + metadata • What is a Folder DO? • A container of Session DOs + metadata

  14. DO Typing Information • FEDORA Content Models express DO Typing information • Content Models are metadata attributes (e.g. “photo”, “album”) that we use as a guide • Humans interpret Content Models, not the DL System • Manual resolution of DO Typing issues

  15. Problems • Catalogers carry out manual XML editing at a low level of abstraction, with overly technical, complex & over-detailed semantics • Developers produce ad hoc, custom & non-reusable implementations of the behavior variations of DO types • DL modules exhibit limited evolution and configuration capabilities

  16. DO Typing Information The DL System should resolve DO Typing issues automatically (in a manner transparent to the DL Application Logic)

  17. Automatic DO Type Conformance • The designer specifies the various DO types… • … and the DL System makes DOs conform to these type specifications automatically • How?

  18. By drawing on the notions of OO

  19. The OO Viewpoint • In the OO model an object is itself aware of its “nature” and behaves accordingly • Objects are conceived as instances of a type, automatically conforming to the type’s definitions & specifications • OO types are separate entities (named either classes or prototypes)

  20. Digital Object Prototypes • A DO Prototype is a DO Type Specification, a separate entity that defines the DO’s: • Constitutional parts – metadata sets, files, structure, etc • Private behaviors – DO internal operations such as serializations, validations, assignment of default values, content conversions, etc • Public behaviors (behavior schemes) – the DO external interface, consisting of high level operations such as Detail view, Browse View, Edit View, etc
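The three ingredients of a prototype listed on this slide (constitutional parts, private behaviors, public behaviors) can be pictured as one specification object. This is a hedged sketch only: the `Prototype` class, the behavior names `assignDefaults` and `detailView` as dictionary keys, and the `language` default are hypothetical and are not taken from Pergamos.

```python
from dataclasses import dataclass, field

@dataclass
class Prototype:
    """A DO type specification, kept separate from the objects it describes."""
    name: str
    md_sets: list                                  # constitutional parts: metadata sets
    files: list                                    # constitutional parts: file renditions
    private: dict = field(default_factory=dict)    # internal operations
    public: dict = field(default_factory=dict)     # external interface

def assign_defaults(obj):
    # Private behavior: an internal operation, here filling in a default value.
    obj.setdefault("language", "el")

def detail_view(obj):
    # Public behavior: a high-level operation exposed to DL modules.
    return f"{obj['title']} ({obj['language']})"

photo_type = Prototype(
    name="photo",
    md_sets=["DC"],
    files=["high", "web", "thumb"],
    private={"assignDefaults": assign_defaults},
    public={"detailView": detail_view},
)

obj = {"title": "Rehearsal"}
photo_type.private["assignDefaults"](obj)
rendered = photo_type.public["detailView"](obj)
```

Keeping behaviors in the specification rather than in the object is the design choice that lets one generic runtime serve every type.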

  21. OO Encapsulation

  22. Photo Prototype & Instances

  23. DO Prototypes & Instances • The designer carries out the definition of DO Prototypes – the DL System handles the rest • DO Prototypes represent the realization of the Content Model notion in an OO fashion: • The process of generating a DO from a Prototype is called instantiation • The resulting object is an instance of the prototype • A DO instance automatically conforms to the Prototype’s specifications • Stored DOs vs DO instances
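Instantiation and automatic conformance can be illustrated with a small sketch. It assumes, purely for illustration, that a prototype is a set of field names with default values and that a stored DO is a plain dictionary; the `Prototype` and `Instance` classes and the sample fields are hypothetical.

```python
class Instance:
    """A DO instance: it always knows its prototype and conforms to it."""
    def __init__(self, prototype, values):
        self.prototype = prototype
        self.values = values

class Prototype:
    """Hypothetical type specification: field names and their default values."""
    def __init__(self, name, defaults):
        self.name = name
        self.defaults = defaults

    def instantiate(self, stored=None):
        # Start from the prototype's defaults, then overlay whatever was stored.
        values = dict(self.defaults)
        for key, value in (stored or {}).items():
            if key not in self.defaults:
                raise KeyError(f"{key!r} is not part of type {self.name!r}")
            values[key] = value
        return Instance(self, values)

photo = Prototype("photo", {"title": "", "rights": "UoA"})
fresh = photo.instantiate()                        # new, empty DO of this type
loaded = photo.instantiate({"title": "Scene 2"})   # stored DO turned into an instance
```

The stored form holds only what was saved, while the instance is the stored data completed and checked against its type, which is one way to read the slide's "Stored DOs vs DO instances".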

  24. 3-tier DL Architecture

  25. Digital Object Dictionary • The runtime environment in which DO instances and Prototypes operate: • Instantiation of DOs based on the prototype specifications (private behaviors: load & parse XML, assign default values, etc.) • Exposure of the public DO behaviors through a high-level, uniform API (for use by DL Modules) • Serialization of the DO instance back to FEDORA (private behaviors: serialize data structures in XML, perform validations, etc.)

  26. A DL Module performs the following steps: • Acquire the DO instance, by type name or by identifier: do = dictionary.acquireObject("type") or do = dictionary.acquireObject("uoadl:1024") • Perform operations upon it: do.getMDSet("DC").getField("title"), dictionary.executeBehavior(do, "editView") • Store the DO in the repository: dictionary.saveObject(do) • Result: a cleaner, simpler, more effective expression of DL Application Logic
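The acquire, operate, save cycle on this slide can be mimicked with an in-memory stand-in. This is a sketch under stated assumptions, not the Pergamos API: method names are rendered in Python style (`acquire_object` for the slide's `acquireObject`, and so on), a plain dictionary replaces FEDORA, the `demo:` identifiers are invented, and treating a type name as "create a new instance" is an assumption.

```python
import copy

class DigitalObject:
    def __init__(self, pid, type_name, md_sets):
        self.pid = pid
        self.type_name = type_name
        self.md_sets = md_sets            # e.g. {"DC": {"title": "..."}}

    def get_field(self, md_set, name):
        return self.md_sets[md_set][name]

class Dictionary:
    """In-memory stand-in for the DO Dictionary; a dict replaces the repository."""
    def __init__(self, prototypes):
        self.prototypes = prototypes      # type name -> {"md_sets": ..., "behaviors": ...}
        self.repository = {}              # pid -> stored DigitalObject
        self.next_id = 1

    def acquire_object(self, key):
        if key in self.repository:        # key is the pid of a stored DO
            return copy.deepcopy(self.repository[key])
        proto = self.prototypes[key]      # otherwise key must name a type
        pid = f"demo:{self.next_id}"
        self.next_id += 1
        return DigitalObject(pid, key, copy.deepcopy(proto["md_sets"]))

    def execute_behavior(self, do, behavior):
        return self.prototypes[do.type_name]["behaviors"][behavior](do)

    def save_object(self, do):
        self.repository[do.pid] = copy.deepcopy(do)

prototypes = {
    "photo": {
        "md_sets": {"DC": {"title": "untitled"}},
        "behaviors": {"detailView": lambda do: "Photo: " + do.get_field("DC", "title")},
    }
}
dictionary = Dictionary(prototypes)
do = dictionary.acquire_object("photo")          # new instance of a type
do.md_sets["DC"]["title"] = "Chorus"
dictionary.save_object(do)
again = dictionary.acquire_object(do.pid)        # existing object by identifier
view = dictionary.execute_behavior(again, "detailView")
```

The calling module never touches XML or datastreams; everything type-specific is looked up through the prototype table.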

  27. 3-tier DL Architecture Separation of Concerns

  28. 3-tier DL Architecture Separation of Concerns Storage

  29. 3-tier DL Architecture DO Typing & Instantiation Separation of Concerns Storage

  30. 3-tier DL Architecture Composition of DO behaviors DO Typing & Instantiation Separation of Concerns Storage

  31. Pergamos If it sounds like Greek…

  32. Scope of Prototypes • Should we have global DO Types? • Collection-pertinent types: A DO Prototype is defined in the context of a Collection • Support fine grained definition of collection specific kinds of material • Hierarchical naming scheme for types • Theatrical Collection Photo: dl.theatre.photo • Medical Collection Photo: dl.medical.photo • Stored in the “contentModel” metadata attribute • Avoid type collisions
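The hierarchical naming scheme keeps same-named types in different collections apart. A small sketch of how such names could be taken apart follows; the helper functions `split_type` and `same_local_type` are hypothetical, and only the two type names come from the slide.

```python
def split_type(name):
    """Split 'dl.theatre.photo' into (collection path, local type name)."""
    *collection, local = name.split(".")
    if not collection:
        raise ValueError(f"{name!r} has no collection prefix")
    return ".".join(collection), local

def same_local_type(a, b):
    """True when two types share a local name, whatever their collections."""
    return split_type(a)[1] == split_type(b)[1]

theatre = split_type("dl.theatre.photo")
medical = split_type("dl.medical.photo")
```

Both names end in `photo`, yet the full names differ, so the two definitions never collide.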

  33. Album Prototype & Instances

  34. Collection Management • DL = Hierarchy of DO instances • Collections are also DOs • The DL itself is a DO, representing the “super-collection” (the collection of all the collections) • Easily add new collections & sub-collections • All content is modeled in a unified manner & can be characterized • Allow the DL designer to work out the details of each collection independently, yet in a uniform manner
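Treating the DL, its collections, and their contents as DO instances yields a single tree. The sketch below walks such a tree; the `DO` class, the `demo:` identifiers, and the type names given to the DL and collection nodes are assumptions for illustration.

```python
class DO:
    def __init__(self, pid, type_name, children=()):
        self.pid = pid
        self.type_name = type_name
        self.children = list(children)

def walk(do, depth=0):
    """Yield (depth, DO) pairs for the whole subtree, parents before children."""
    yield depth, do
    for child in do.children:
        yield from walk(child, depth + 1)

photo = DO("demo:4", "dl.theatre.photo")
album = DO("demo:3", "dl.theatre.album", [photo])
theatre = DO("demo:2", "dl.theatre", [album])     # a collection is itself a DO
dl = DO("demo:1", "dl", [theatre])                # the "super-collection"

outline = [(depth, do.type_name) for depth, do in walk(dl)]
```

Adding a collection or sub-collection then amounts to attaching one more node, with no special case in the traversal code.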

  35. DL as a Hierarchy of DO instances

  36. Implementation details • DO Prototypes are • Specified in XML form • Stored in the “TEMPLATE” datastream of the appropriate Collection DO • Loaded, parsed & interpreted by the DO Dictionary in its bootstrap procedure • Transparent to FEDORA • DO Instances are supplied with the “CONTAINER” datastream, containing the pids of the DOs they “contain”
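Loading prototype specifications from XML at bootstrap might look roughly like the sketch below. The slides do not show the actual XML vocabulary, so every element and attribute name here (`prototypes`, `prototype`, `mdset`, `field`, `children`) is invented, and a string literal stands in for the content of a Collection DO's “TEMPLATE” datastream.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout; the real Pergamos schema is not given in the slides.
TEMPLATE = """
<prototypes collection="dl.theatre">
  <prototype name="photo">
    <mdset id="DC">
      <field name="title" mandatory="true"/>
      <field name="subject" multivalue="true"/>
    </mdset>
  </prototype>
  <prototype name="album">
    <mdset id="DC"><field name="title" mandatory="true"/></mdset>
    <children allow="photo"/>
  </prototype>
</prototypes>
"""

def load_prototypes(xml_text):
    """Parse the XML into a dict keyed by full type name."""
    root = ET.fromstring(xml_text)
    collection = root.get("collection")
    result = {}
    for proto in root.findall("prototype"):
        fields = {
            f.get("name"): {
                "mandatory": f.get("mandatory") == "true",
                "multivalue": f.get("multivalue") == "true",
            }
            for f in proto.findall("mdset/field")
        }
        children = [c.get("allow") for c in proto.findall("children")]
        full_name = collection + "." + proto.get("name")
        result[full_name] = {"fields": fields, "children": children}
    return result

prototypes = load_prototypes(TEMPLATE)
```

Because the specifications live in an ordinary datastream, the repository stores them like any other content and needs no knowledge of what they mean.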

  37. DO Prototypes in detail • MD Sets • Specification of each individual field (label, description, multi-value, mandatory, UI characteristics) • Serialization information (how to store it in FEDORA) • Field mappings (under development) • Files: automatic conversions (TIFF -> JPEG + thumbnail) • Batch Import: automatically create DOs from zip bundles • Structure: allowed children types • Browsers: browse field • Indices: e.g. subject catalog • Behavior schemes: atomic DO elements
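Field-level settings such as mandatory and multi-value, together with the allowed children types, are the kind of rules a runtime can check without per-type code. A minimal sketch, assuming a dictionary-based field specification; the function names, the specification format, and the sample values are all hypothetical.

```python
def validate(fields_spec, values):
    """Return a list of problems; an empty list means the values conform."""
    problems = []
    for name, spec in fields_spec.items():
        value = values.get(name)
        if spec.get("mandatory") and value in (None, "", []):
            problems.append(f"{name}: mandatory field is empty")
        if isinstance(value, list) and not spec.get("multivalue"):
            problems.append(f"{name}: only one value allowed")
    for name in values:
        if name not in fields_spec:
            problems.append(f"{name}: not defined by the prototype")
    return problems

def can_contain(allowed_children, child_type):
    """Structure rule: a container accepts only the listed child types."""
    return child_type in allowed_children

spec = {
    "title": {"mandatory": True},
    "subject": {"multivalue": True},
}
ok = validate(spec, {"title": "Programme", "subject": ["theatre", "1950s"]})
bad = validate(spec, {"title": "", "subject": "x", "year": "1955"})
```

The same two functions serve every type, since everything that varies sits in the specification passed to them.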

  38. Discussion

  39. Pergamos • Historical Archive (production) • Folklore Notebooks (testing) • Theatrical Collection, Medical Images & Byzantine music manuscripts (finalization of requirements & specifications) • Undergoing development … the remaining collections are coming next • Historical Archive will be published in early 2006… • … with a multi-lingual UI, hopefully!

  40. Public DO Behaviors

  41. Future Work • Fully implement the OO paradigm • OO Inheritance for DO Prototypes (e.g. the Notebook type derives from the Book type) • OO Polymorphism for DO instances (e.g. the DO “uoadl:1234” is both a Notebook & a Book) • Supply general-purpose linking capabilities that go beyond structural relations (FEDORA Metadata for Object-to-Object Relationships?) • Deliver on schedule…
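The inheritance and polymorphism items listed as future work can be pictured with a parent link between prototypes. The slides describe this as planned, not implemented, so the sketch is purely illustrative: the `Prototype` class, its methods, and the field names are hypothetical.

```python
class Prototype:
    def __init__(self, name, fields, parent=None):
        self.name = name
        self.own_fields = fields
        self.parent = parent

    def lineage(self):
        """This prototype followed by its ancestors, nearest first."""
        proto, chain = self, []
        while proto is not None:
            chain.append(proto)
            proto = proto.parent
        return chain

    def all_fields(self):
        # Inherited fields come first; a derived type may add or override.
        merged = {}
        for proto in reversed(self.lineage()):
            merged.update(proto.own_fields)
        return merged

    def is_a(self, other):
        # Polymorphism: a type counts as itself and as each of its ancestors.
        return other in self.lineage()

book = Prototype("book", {"title": "", "author": ""})
notebook = Prototype("notebook", {"collector": ""}, parent=book)
```

With this arrangement an instance of `notebook` could be handed to any module written against `book`, which is the "both a Notebook & a Book" case from the slide.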

  42. Conclusions • If in doubt, use FEDORA • Flexible & Extensible (they mean it) • 1 year of Pergamos development, 2 months of testing & 3 months of production use (Historical Archive) with no serious problems • Though, Sandy & Carl, I’d be grateful for some minutes of your time!!! • DO Prototypes: a realization of Content Models in OO terms, implemented on top of FDOM to handle DO Typing issues automatically • Detailed report on Pergamos to appear…

  43. Thank You • Questions? • Comments? • For details: • Kostas Saidis, George Pyrounakis, Mara Nikolaidou, "On the Effective Manipulation of Digital Objects: A Prototype-based Instantiation Approach", Proc. 9th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2005), Vienna, Austria, September 2005 • email: saiko@di.uoa.gr
