
Dienst Distributed Networked Publishing

Dienst Distributed Networked Publishing. Carl Lagoze, Digital Library Scientist, Cornell University. Cornell Digital Library Research Group (CDLRG). Research and Development of Component-Ware Digital Library Infrastructure.


Presentation Transcript


  1. Dienst Distributed Networked Publishing • Carl Lagoze, Digital Library Scientist, Cornell University

  2. Cornell Digital Library Research Group (CDLRG) • Research and Development of Component-Ware Digital Library Infrastructure • Developed out of the DARPA-funded Computer Science Technical Reports Project (CS-TR)

  3. Component-Ware Digital Libraries • Service-based infrastructure • Interface (protocol) of each service • Interactions between services • aggregations into logical collections and libraries • Layered approach accommodates requirements of varying clientele • research libraries - high-integrity, quality of service, security • informal collections - e.g., web

  4. CDLRG Research Projects • FEDORA • Distributed Searching and Resource Discovery • Digital Library Collection Definition • Metadata (Dublin Core and Warwick Framework) • Networked Computer Science Technical Reports Project (www.ncstrl.org)

  5. What is NCSTRL? • A Production Digital Collection • A Vehicle and Testbed for Digital Library Interoperability • A Vehicle for Exploring Policy and Organization

  6. A Production Digital Collection • A growing collection of CS research reports • A service relied on by users and publishers • Motivates solving hard, real-world problems: IPR, quality of service, federation of publishers

  7. A Testbed for Technology • Create a modular system based on a standard open architecture • Provide a testbed for demonstrating and testing new digital library components • Work with a variety of researchers: DLI, ERCIM, Los Alamos

  8. A Vehicle for Exploring Policy and Organization • Creating a self-sustaining international federated digital collection • Extending the domain and scope while maintaining a coherent collection • Policy issues: charging, IPR, liability, technical quality, relationship to other DL organizations

  9. Origins of NCSTRL • DARPA-funded CS-TR Project • CNRI, Berkeley, CMU, Cornell, MIT, Stanford • NSF-funded WATERS Project • Old Dominion, SUNY Buffalo, Virginia, Virginia Tech • Other CS Tech Reports Efforts • Harvest, UCSTRI, NZDL

  10. NCSTRL Project Participants • NCSTRL Steering Committee • NCSTRL Working Group • Cornell Digital Library Research Group • The Collection

  11. NCSTRL Steering Committee • Responsible for policy direction and oversight • How to broaden interoperability efforts into the broader community

  12. NCSTRL Working Group • Responsible for operational oversight of the current system • Membership from CS-TR and WATERS projects

  13. Cornell Digital Library Research Group • Responsible for day-to-day support and maintenance of existing system • Clearing house for technical collaborations • Evolution and Research Directions

  14. Contributing Institutions 105 Institutions in US, Europe, and Asia

  15. Dienst • is a protocol and reference implementation of a distributed digital library service • where a network of services provides • World Wide Web browser access, • uniform search over distributed indexes, • and multi-formatted documents.

  16. Dienst document model • [Diagram labels: Document; Handle (URN); metadata; decompositions; representations; logical; physical; ASCII; PostScript; TIFF]

  17. Exposing the Model through the Protocol • Documents addressable through their URNs • Document service requests • get document metadata • get document formats • get document in format • get document partition (page) in format
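
The four document service requests on this slide can be pictured with a small in-memory sketch. This is an illustration only, not the Dienst wire protocol or its reference implementation: the class names, method names, format key, and URN below are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical in-memory stand-in for a stored document."""
    urn: str
    metadata: dict
    # format name -> list of pages, each page held as bytes
    formats: dict = field(default_factory=dict)


class Repository:
    """Toy repository answering the four request types listed on the slide."""

    def __init__(self):
        self._docs = {}

    def add(self, doc):
        self._docs[doc.urn] = doc

    def get_metadata(self, urn):
        return self._docs[urn].metadata

    def get_formats(self, urn):
        return sorted(self._docs[urn].formats)

    def get_document(self, urn, fmt):
        # Whole document: all pages of the requested format, concatenated.
        return b"".join(self._docs[urn].formats[fmt])

    def get_page(self, urn, fmt, page):
        # Pages are numbered from 1.
        return self._docs[urn].formats[fmt][page - 1]


repo = Repository()
repo.add(Document(
    urn="example.site/TR-0001",  # made-up identifier
    metadata={"title": "An Example Report", "author": "A. Author"},
    formats={"ascii": [b"page one\n", b"page two\n"]},
))
```

Every call is keyed by the document's URN, which is the point of the slide: the document model is reachable only through addressable requests.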

  18. [Diagram] WWW browser and Dienst User Interface: send search request, receive unified hit list, send document request, receive MIME-typed document • Dienst User Interface and Dienst Services (Index, Index, Index; Repository, Repository, Repository): send site specific search request, receive hit list, send document request, receive MIME-typed document
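
The search half of this flow (one user query, a site-specific request to each index, one unified hit list back) might be sketched as follows. The sketch assumes each index is just a dictionary from URN to title and that matching is a case-insensitive substring test; the real services, their query language, and their ranking are not described in the slides.

```python
def search_index(index, query):
    """Return (urn, title) hits from one site's index (a dict of urn -> title)."""
    q = query.lower()
    return [(urn, title) for urn, title in index.items() if q in title.lower()]


def unified_search(indexes, query):
    """Query every index and merge the results into one hit list."""
    seen = set()
    hits = []
    for index in indexes:
        for urn, title in search_index(index, query):
            if urn not in seen:  # a document reported by two indexes is kept once
                seen.add(urn)
                hits.append((urn, title))
    return sorted(hits)


# Made-up index contents for two sites.
site_a = {"a/TR-1": "Distributed Indexing", "a/TR-2": "Compiler Design"}
site_b = {"b/TR-7": "Indexing Scanned Pages", "a/TR-1": "Distributed Indexing"}
hits = unified_search([site_a, site_b], "indexing")
```

Deduplicating by URN is a design choice of the sketch; it relies on the earlier point that documents are addressable through their URNs.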

  19. Exposing the Services through the Protocol • All protocol requests are service specific, • so the functionality of any service can be accessed by another service or a new service.

  20. Gateways to non-Conforming Sites • [Diagram labels: User Interface; Gateway Server; Standard Servers; FTP/HTTP “Repositories”]

  21. Use by External Services • [Diagram labels: User Interface; Search Engine (Z39.50)]

  22. Publishing Using Dienst: Retrospective Conversion • Scanning of legacy documents • Cornell • MIT • Stanford • Conversion to common formats • GIFs • thumbnails • PostScript

  23. Publishing with Dienst: Digital Originals • PostScript as lingua franca • “thanks Microsoft” • Form submission • author-generated descriptive metadata • Clerical clearing-house • Automatic format conversion

  24. Collection Definition in Digital Libraries • Multiple levels of selection • authors “publish” • repositories have submission policies • search engines index • objects in search engines aggregated into collections • user interface gateways provide access to multiple collections • What is “in” a digital library is defined by what can be found using its resource discovery tools
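
The last bullet can be made concrete with a small set-based sketch: a collection is modeled as the union of what its indexes cover, and a library as whatever is reachable through its collections. The function names and identifiers are hypothetical and the model is deliberately simplified.

```python
def collection_contents(indexes):
    """A collection is modeled as the union of the URNs its indexes know about."""
    contents = set()
    for index in indexes:
        contents |= set(index)
    return contents


def library_contents(collections):
    """A gateway's library is everything reachable through its collections."""
    found = set()
    for indexes in collections.values():
        found |= collection_contents(indexes)
    return found


# Made-up collections, each a list of indexes (sets of URNs).
collections = {
    "reports": [{"a/TR-1", "a/TR-2"}, {"b/TR-7"}],
    "theses": [{"c/TH-3"}],
}
repository_holdings = {"a/TR-1", "a/TR-2", "a/TR-9", "b/TR-7", "c/TH-3"}
unreachable = repository_holdings - library_contents(collections)
```

Here `a/TR-9` sits in a repository but no index covers it, so under this definition it is not "in" the library.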

  25. Defining the Collection - Collection Service

  26. Regional Structure • [Diagram label: central collection server]

  27. Connectivity Regions and Collection Views

  28. Improvements to the Protocol - Dienst 5 • Incremental enhancement to existing interoperability framework • Improved document model • versions • hierarchical part specification • binders (multi-part documents) • Implementation currently under development

  29. Dienst 5 Document Structure • Structure Request • Reveal, in XML, full or collapsed structure of a document • e.g., chapters, sections, figures, etc. • Describe multiple views of a document • e.g., bibliography, content, thumbnails
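
A rough idea of what a full versus collapsed structure response could look like, using Python's standard `xml.etree.ElementTree`. The element and attribute names (`document`, `view`, `chapter`, `page`, `n`) are invented for this sketch; the slides do not give the Dienst 5 XML vocabulary.

```python
import xml.etree.ElementTree as ET


def build_structure(urn, chapters, collapsed=False):
    """Describe a document as XML; chapters is a list of (title, page_count)."""
    root = ET.Element("document", {"urn": urn})
    view = ET.SubElement(root, "view", {"name": "content"})
    for number, (title, page_count) in enumerate(chapters, start=1):
        chapter = ET.SubElement(
            view, "chapter",
            {"n": str(number), "title": title, "pages": str(page_count)},
        )
        if not collapsed:
            # The full structure also lists every page under its chapter.
            for page in range(1, page_count + 1):
                ET.SubElement(chapter, "page", {"n": str(page)})
    return ET.tostring(root, encoding="unicode")


chapters = [("Introduction", 2), ("Design", 3)]  # made-up document
full = build_structure("example.site/TR-0001", chapters)
collapsed = build_structure("example.site/TR-0001", chapters, collapsed=True)
```

The collapsed form keeps the chapter outline and page counts but omits the per-page elements, which is one plausible reading of "full or collapsed".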

  30. Dienst 5 Document Dissemination • Disseminate Request • Access to component(s) described by Structure • e.g., disseminate chapter 2 page 5 in PostScript
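
The example request on this slide (chapter 2, page 5, in PostScript) can be sketched as a lookup over a nested mapping. The function signature and the storage layout are assumptions made for illustration, not the Dienst 5 request syntax.

```python
def disseminate(document, chapter, page, fmt):
    """Return one page of one chapter in the requested format.

    `document` maps format -> chapter number -> list of pages.
    """
    try:
        pages = document[fmt][chapter]
    except KeyError:
        raise LookupError(f"no chapter {chapter} in format {fmt!r}") from None
    if not 1 <= page <= len(pages):
        raise LookupError(f"chapter {chapter} has no page {page}")
    return pages[page - 1]


# Made-up document: chapter 1 has one page, chapter 2 has six.
document = {
    "postscript": {
        1: [b"%!PS ch1 p1"],
        2: [f"%!PS ch2 p{n}".encode() for n in range(1, 7)],
    }
}
component = disseminate(document, chapter=2, page=5, fmt="postscript")
```

Requests for a component the document does not contain raise `LookupError` rather than returning an empty result, so a caller can tell "missing" apart from "empty".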

  31. Supporting Multiple Collections • NCSTRL is currently a single collection • Other users of Dienst protocol • European gray literature, thesis, and dissertation collections • NASA space science • Mediterranean environment data and software • Los Alamos Pre-prints • Expanding the technology to multiple collections through regions

  32. Lessons Learned and Work to be Done • Intellectual property • Quality • quality of collection (reviewing) • quality of metadata • quality of service • Resisting information entropy • Richer “documents” • Archiving and Preservation
