The Digital Library: Current Technologies and Challenges

The Digital Library: Current Technologies and Challenges. William H. Mischo w-mischo@uiuc.edu Grainger Engineering Library Information Center University of Illinois at Urbana-Champaign SLA Global 2000 October 18, 2000. Outline. Definition of Digital Library. Elements of a Digital Library.

Presentation Transcript


  1. The Digital Library: Current Technologies and Challenges William H. Mischo w-mischo@uiuc.edu Grainger Engineering Library Information Center University of Illinois at Urbana-Champaign SLA Global 2000 October 18, 2000

  2. Outline • Definition of Digital Library. • Elements of a Digital Library. • Full-Text Document Technologies. • Illinois Testbed. • XML: its Role and Importance. • Distributed Repository model. • Role of Libraries and Librarians.

  3. The Digital Library • ‘Digital’, ‘Virtual’, ‘Electronic’ Library as network-based library without regard to place and time. • Implementation issues. • Digital Collections vs. Digital Library. • Must Emphasize Integration of Collections and Services.

  4. Elements of DL • Collections. • Services. • Technologies and Standards. • Integration of All.

  5. Full-Text Technologies • Continuum of Web-Enabled Technologies. • Evolving Technologies and Standards. • All Presently being Utilized. • Role and History of Markup. • XML: its Role and Importance. • The Smart Document.

  6. Illinois DLI-I Project • Funded under DLI-I by NSF, DARPA, and NASA, 1994--1998. Awards made to 6 universities. • Large-Scale Testbed, Distributed Repository Models, Evaluation, Web Software. • CNRI D-Lib Test Suite Program, 1998--2001. • Collaborating Partners Program. AIP, APS, ASCE, IEE, NRL, ASM, ACM, NTT Learning Systems, Elsevier.

  7. Illinois Testbed • American Institute of Physics--APL, JAP, RSI • 16,000+ articles, 1995--. • American Physical Society--PRL • 10,000+ articles, 1995--, weekly updates. • ASCE Journals (25 titles) • 9,000+ articles, 1995--. • IEE Proceedings and Electronics Letters • 8,500+ articles, 1993--. • ASM (ASM International) Handbook. • ACM (Association for Computing Machinery). • Elsevier Science.

  8. Project Issues • Evolution of the Document. • Information Environment. • Use of Metalanguages & Transformations (SGML, XML). • Searching over Full-Text of Journals vs. Abstract & Index Service Database. • Rendering and Styling (SGML, XML, MathML). • Dynamic Metadata for Normalization, Linking. • Breadth and Depth of Collections. • User Needs.

  9. Accomplishments • Process & Retrieve from Multiple Publishers & Heterogeneous DTDs. • Cross-Repository Searching. • SGML to XML Conversion. • Metadata Extraction, Representation, Merging. • Transformation & Rendering Technologies. • Dynamic Linking: Forward/Backward, from/to A & I Services.
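
Processing documents from multiple publishers with heterogeneous DTDs can be illustrated with a minimal sketch. The publisher names, element paths, and sample documents below are hypothetical and are not taken from the testbed; the sketch only shows the general idea of mapping differently named elements onto one common metadata record.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings: common field name -> element path in each publisher's markup.
FIELD_PATHS = {
    "publisher_a": {"title": "front/title", "year": "front/pubdate/year"},
    "publisher_b": {"title": "head/arttitle", "year": "head/date"},
}

def extract_metadata(publisher: str, xml_text: str) -> dict:
    """Pull the common fields out of one publisher's document."""
    root = ET.fromstring(xml_text)
    record = {"source": publisher}
    for field, path in FIELD_PATHS[publisher].items():
        record[field] = root.findtext(path, default="").strip()
    return record

# Two made-up documents that follow different (hypothetical) DTDs.
doc_a = "<art><front><title>Thin Films</title><pubdate><year>1999</year></pubdate></front></art>"
doc_b = "<paper><head><arttitle>Bridge Loads</arttitle><date>1998</date></head></paper>"

# Merged list of normalized records, ready for cross-repository indexing.
merged = [extract_metadata("publisher_a", doc_a), extract_metadata("publisher_b", doc_b)]
for record in merged:
    print(record)
```

Keeping the per-publisher differences in a lookup table means a new DTD only needs a new table entry, not new extraction code.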

  10. Ongoing Investigations • Support simultaneous searching of A & I Services, Distributed Repositories, enhanced navigation, expanded gateway functions. • Metadata Harvesting: Replicative or Distributed Approaches. • Z39.50 protocols, HTTP Harvesting, Spider Technology. • Archiving of Electronic Resources. • Local Resolution of Resources.
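
Simultaneous searching of A & I services and distributed repositories might be sketched as below. The repositories here are hypothetical in-memory lists standing in for remote services; a real gateway would issue Z39.50 or HTTP queries instead of substring matches.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for remote repositories and A & I services.
REPOSITORIES = {
    "publisher_repo": ["Thin film growth", "Laser cooling of atoms"],
    "a_and_i_service": ["Laser ablation review", "Bridge load analysis"],
}

def search_one(name: str, query: str) -> list:
    """Case-insensitive substring match against one repository's titles."""
    return [(name, title) for title in REPOSITORIES[name] if query.lower() in title.lower()]

def search_all(query: str) -> list:
    """Query every repository concurrently and merge the hits, sorted by title."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda name: search_one(name, query), REPOSITORIES))
    hits = [hit for repo_hits in result_lists for hit in repo_hits]
    return sorted(hits, key=lambda hit: hit[1])

print(search_all("laser"))
```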

  11. XML (eXtensible Markup Language) • Subset of SGML, a Data Description Language (Metalanguage). • Allows fine-granularity markup of content and structure. Authors can create their own elements (extensible). • Tags define the Structure of the Document, not its Presentation Format. • Two levels of conformance: well-formed (correct document structure, no DTD required) and valid (well-formed and checked against a DTD). • Displays natively only in IE 5.0 and Netscape 6.0. • Powers B2B, compatible with Relational DBs.
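
A minimal sketch of a well-formedness check, using Python's standard `xml.etree.ElementTree` parser. The element names are invented for illustration. This parser checks well-formedness only; it does not validate against a DTD, which would need a validating parser.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Author-defined elements: none of these names is predeclared by XML itself.
good = "<article><title>Sample</title><author>A. Writer</author></article>"
bad = "<article><title>Sample</author></article>"  # mismatched end tag

print(is_well_formed(good))
print(is_well_formed(bad))
```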

  12. Role of XML • “If you ask 20 people in the industry, ‘what is XML?’ You’ll get 20 different answers.” -- Dale Fuller, CEO, Inprise Corporation. • Vendor-Neutral, Platform-Independent Structured Information Standard. • Document Representation and Interchange Standard. • Applications can externalize their data as XML. • XML data, CSS presentation layer, XSL to modify the structure of the document.

  13. Distributed Repository Model • Information Environment in which we Operate. • Web-Based and Publisher-Centric. • Multiple Relationships and Nodes. • Need for Gateway and Navigation Tools. • Need for Integration, Linking. • Publisher Repository approaches to Retrieval. • A and I Service Issues.

  14. Distributed Repository Issues • Integration of discrete publisher repositories, local and remote A & I services, OPAC, Web resources, and local data within gateway and navigation tools. • Issues for user access: • need to identify appropriate publisher repository, but presently interfaces are different and full-text and controlled vocabulary searching often not offered. • A & Is: not full-text but offer controlled vocabulary, no links to full-text repositories.

  15. Distributed Repository Search • Needed feature set: • A & Is: need links to full-text at article level via Digital Object Identifier (DOI), vocabulary switching within controlled vocabularies. Will we see consolidation of A & I services? Other information providers? PubMed/PubRef, PubSCIENCE (DOE/OSTI) • Publisher metadata repository for central searching; deposit metadata in conjunction with DOI. • Browser technology that fully incorporates XML, CSS.

  16. Digital Object Identifier (DOI) • DOI is both a unique identifier of a piece of digital content AND a system to access that content digitally. • ‘The ISBN for the 21st Century’ -- Norman Paskin. • DOI system has two main parts (the identifier and a directory system) and a third logical component, a database. • Developed by AAP (Association of American Publishers), now managed by International DOI Foundation.

  17. DOI Construction • First open standard for content identification. • DOI is a number that identifies a digital object: • 10.1063/S000369519903216 • 10 Registration Agency Prefix • 1063 Publisher Prefix • S000369519903216 Suffix (Publisher-assigned ID) • Suffix can be SICI or PII. • The DOI, together with the URL pointing to the digital object, is registered with the International DOI Foundation. • 10.1234/4356 | http://www.pubsite.org/apr99/artl1.pdf
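
The three-part construction on this slide can be sketched as a small parser. The `DoiParts` type and `parse_doi` function are hypothetical names introduced for illustration; the split rules are inferred from the slide's example, not from the DOI specification.

```python
from typing import NamedTuple

class DoiParts(NamedTuple):
    directory: str  # "10" in the slide's example
    publisher: str  # publisher prefix, e.g. "1063"
    suffix: str     # publisher-assigned ID

def parse_doi(doi: str) -> DoiParts:
    """Split a DOI of the form directory.publisher/suffix into its parts."""
    # Split on the first "/" only, since a suffix may itself contain slashes.
    prefix, slash, suffix = doi.partition("/")
    directory, dot, publisher = prefix.partition(".")
    if not (slash and dot and directory and publisher and suffix):
        raise ValueError(f"not a DOI of the form directory.publisher/suffix: {doi!r}")
    return DoiParts(directory, publisher, suffix)

parts = parse_doi("10.1063/S000369519903216")
print(parts.directory, parts.publisher, parts.suffix)
```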

  18. Using a DOI • DOIs are resolved using the Handle System technology from CNRI (Corporation for National Research Initiatives). • Retrieval of an object is a two-step process: the link is sent to a central directory where the current Web address is stored, and the location is sent back to the browser with a special message to redirect to that address, e.g.: • dx.doi.org/10.100/1 redirects to www.pub/art1.pdf • CrossRef Project: major Sci-Tech professional societies and commercial publishers.
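
The two-step retrieval can be sketched with an in-memory directory. This is not the Handle System API: the `DIRECTORY` table and `resolve` function are hypothetical, and the choice of HTTP status 302 for the redirect message is an assumption. The registered pair is the example from the DOI Construction slide.

```python
# Hypothetical in-memory stand-in for the central directory: DOI -> current URL.
DIRECTORY = {"10.1234/4356": "http://www.pubsite.org/apr99/artl1.pdf"}

def resolve(doi: str) -> tuple:
    """Step 1: look up the DOI. Step 2: answer with a redirect to its current URL."""
    location = DIRECTORY.get(doi)
    if location is None:
        return 404, {}
    return 302, {"Location": location}  # 302 is assumed, not taken from the slides

status, headers = resolve("10.1234/4356")
print(status, headers["Location"])

# If the publisher moves the file, only the directory entry changes; the DOI does not.
DIRECTORY["10.1234/4356"] = "http://www.pubsite.org/archive/artl1.pdf"
moved_status, moved_headers = resolve("10.1234/4356")
print(moved_status, moved_headers["Location"])
```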

  19. Reference Linking • In some fields, e.g. Physics, publishers have linking agreements already in place. • Alternatives to DOI: • PubMed/PubRef (National Library of Medicine) • PubSCIENCE (DOE/OSTI) • OpCit project • Proprietary Link Managers (AIP, APS) • System design calls for one URL for each DOI; underlying technology can handle multiple URLs however.

  20. Current Work • Pilot Project involving CNRI, SFX, Academic Ideal. • OpenURL Protocol. • Recent Letter to CrossRef and IDF. • Demonstration Project at Illinois and OhioLink. • Local Resolver. • Localizing Name Resolution for AIP, ASCE, Elsevier, other publishers. • Use of CrossRef Metadata Database for identifying Publisher from DOI and linking to Local Copy, A & I Services, Library Assistance.
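
A local resolver that prefers a locally held copy might look like the sketch below. The local base URL and the `choose_link` function are hypothetical. Identifying the publisher from the DOI prefix alone is a simplification of the CrossRef metadata lookup named on the slide.

```python
# Hypothetical map: DOI prefix -> base URL of a locally held copy.
LOCAL_COPIES = {"10.1063": "http://fulltext.library.example/10.1063/"}
GLOBAL_RESOLVER = "http://dx.doi.org/"

def choose_link(doi: str) -> str:
    """Return a local-copy URL when the library holds one, else the global resolver URL."""
    prefix, _, suffix = doi.partition("/")
    local_base = LOCAL_COPIES.get(prefix)
    if local_base is not None:
        return local_base + suffix
    return GLOBAL_RESOLVER + doi

print(choose_link("10.1063/S000369519903216"))
print(choose_link("10.1234/4356"))
```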

  21. Computer Technologies • XML Appliances: Intel XML Accelerator. • Thin Desktops: • Legacy-free PCs; • Network appliances (Sun Rays). • Ubiquitous Computing: • Pocket PCs --Windows CE machines; • PalmPilots.

  22. Wireless Technologies • Wireless Computing • Security issues; • Bandwidth and throughput limited; • CDPD (Cellular Digital Packet Data); • Web clipping vs. portable HTML; • Cell Phone/Pocket PC combination. • With Pocket Devices, use by patrons and staff for remote search, processing.

  23. Role of the Sci-Tech Library • Function of Library: • Collect source materials; • Organize materials; • Provide access to materials. • Change: above activities are now distributed, not confined to a specific place. • Question: How do the support services for these activities need to change?

  24. Issues • Library as Function not Place. • Acknowledgment of and Support for the Library’s Role in the Campus Information Infrastructure. • Provide a ‘Digital Library’ out of digital collections. • Moving up on the Information Food Chain: personal collection, colleague, e-mail, Web, Library. • Archiving issues (Open Archives Initiative); Archive implies an access mechanism.

  25. 4th Generation Information System • Simultaneous Searching of Multiple Resources. • Remote Reference and Instruction (Collaboration and Whiteboard--apply Help Desk Software). • Software-Aided Search Navigation and Modification. • Dynamic Links to Full-Text. Appropriate Copy problem. • One-Stop-Shopping.

  26. Role of the Academic Librarian • In addition to Raising money & dealing with Publishers/vendors. • Experts in Information Seeking Process, Research, and Instructional Programs. • Knowledge of Emerging Information Technologies. • Ability to Work Effectively at Campus Level. • Ability to Train, Mobilize, and Enthuse Staff. • Cooperative Endeavors with other Departments, Grant Agencies, and Government Agencies.