
Creating Working Digital Libraries

Howard Besser, UCLA School of Education & Information, http://www.gseis.ucla.edu/~howard


Presentation Transcript


  1. Creating Working Digital Libraries Howard Besser UCLA School of Education & Information http://www.gseis.ucla.edu/~howard

  2. Creating Working Digital Libraries • Moving from Digital Collections to Digital Libraries • Interoperability • Importance of Standards • Longevity • Best Practices for Managing Digital Projects • Some Wild Musings

  3. Moving from Digital Collections to Digital Libraries • What’s the difference? • Recent history of Library Automation

  4. Developmental Stages • Experiment with methods • Build real operational systems • Build interoperable operational systems

  5. Traditional Digital Library Model [Diagram: four DLs, each with its own search & presentation layer, accessed by users]

  6. Ideal Digital Library Model [Diagram: four DLs behind a single search & presentation layer, accessed by users]

  7. Developmental Stages • Experiment with methods • Build real operational systems • Build interoperable operational systems • For DL Initiatives • For OPACs • For I & A Services • For Image Retrieval

  8. Key problems we’re facing • Discovery • Interoperability • Longevity

  9. For Interoperability Digital Libraries Need Standards • Descriptive Metadata for consistent description • Discovery Metadata for finding • Administrative Metadata for viewing and maintaining • Structural Metadata for navigation • ... Terms & Conditions Metadata for controlling access...

  10. Metadata is not just indexing terms • CBIR attributes used for retrieval on color, shape, texture, etc. • Structural attributes used for page-turning • Administrative attributes used for managing a digital work over time • IPR attributes to limit unauthorized use • Identification attributes to determine what application software is needed to view a particular digital work • Can be located anywhere

  11. Why are Standards and Metadata consensus important? • Managing digital files over time • Longevity • Interoperability • Veracity • Recording in a consistent manner • Will give vendors incentive to create applications that support this

  12. Why Standards? • Why do we need standards? • To make information universally available to users • facilitate sharing and interchange of information • To preserve information (make it safe from changes in hardware and software) • Standards only work if communities widely accept them, but they’re necessary for communities to work together

  13. Serious Longevity Problems • What we know from prior widespread digital file formats • Images separating from their metadata • Inaccessibility of software needed to view an image • Inability to even decode the file format of an image

  14. Journal Archiving • License, don’t own; may not be even able to obtain right to make archival copy • Increasingly no paper back-up at all • Usually we don’t have the important redundancy factor • Stanford’s LOCKSS Project (Lots of Copies Keeps Stuff Safe) and its problems (http://lockss.stanford.edu)

  15. The Short Life of Digital Info: Digital Longevity Problems • Disappearing Information • The Viewing Problem • The Scrambling Problem • The Inter-relation Problem • The Custodial Problem • The Translation Problem

  16. The Viewing Problem • Digital Info requires a whole infrastructure to view it • Each piece of that infrastructure is changing at an incredibly rapid rate • How can we ever hope to deal with all the permutations and combinations

  17. The Scrambling Problem • Dangers from: • Compression to ease storage & delivery • Container Architecture to enhance digital commerce

  18. The Inter-relation Problem • -Info is increasingly inter-related to other info • -How do we make our own Info persist when it points to and integrates with Info owned by others? • -What is the boundary of a set of information (or even of a digital object)?

  19. The Custodial Problem • How do we decide what to save? • Who should save it? • How should they save it? • -methods for later access: emulation, migration, etc. • -issues of authenticity and evidence

  20. The Translation Problem • Content translated into new delivery devices changes meaning • -A photo vs. a painting • -If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format? • Behaviors

  21. Pieces of the Solution (1/2) • -We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats • -We should discourage scrambling • -We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects

  22. Pieces of the Solution (2/2) • -People and organizations wishing to make information persist need guidelines on how to go about doing it • -We need to better understand how translating from one storage or display format to another affects the meaning of a work • -We need to save the “behaviors” of a digital object, not just its “contents”
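
One existing form of format self-identification is the signature ("magic number") at the start of many image files. The sketch below checks a file header against a few well-known signatures. The table is a small illustrative subset, and the function name `identify_format` is an assumption made for this example, not part of any standard.

```python
from typing import Optional

# A few widely documented file signatures (illustrative subset only).
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a format name if the header starts with a known signature."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return None  # the file does not identify itself in a way we recognize
```

A file whose header matches none of the signatures returns `None`, which is the situation the slides warn about: without a readable, standardized marker the format cannot even be guessed.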

  23. Metadata can be the first line of defense • Can tell you • where the file is (if you can’t find the file) • where more info about the file is (if you have the file but most other metadata has become separated) • what the file format is • what the compression scheme is • what application program and version is needed for the file
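
The questions on this slide can be read as fields of a small record kept alongside each file. The following is a minimal sketch under that assumption; the class `FileMetadata`, its field names, and the sample values are hypothetical and do not come from any metadata standard.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FileMetadata:
    """Hypothetical record: one field per question the slide says metadata can answer."""
    file_location: Optional[str] = None        # where the file is
    more_info_location: Optional[str] = None   # where more info about the file is
    file_format: Optional[str] = None          # what the file format is
    compression_scheme: Optional[str] = None   # what the compression scheme is
    application: Optional[str] = None          # what application program is needed
    application_version: Optional[str] = None  # which version of that program


def unanswered(record: FileMetadata) -> List[str]:
    """List the questions this record cannot answer (fields left empty)."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Hypothetical example: location, format, and compression were recorded; the rest was not.
example = FileMetadata(
    file_location="masters/0001.tif",
    file_format="TIFF",
    compression_scheme="none",
)
```

Calling `unanswered(example)` reports the gaps that would make the file harder to use later, which is one way to audit a collection before the metadata and the files drift apart.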

  24. Groups Working on the Big Longevity Problem http://sunsite.Berkeley.EDU/Imaging/Databases/Longevity/ • CPA Task Force • Getty “Time & Bits” Conference & follow-up • NEDLIB, CURL, Michigan • Internet Archive • Long Now

  25. Migration/Refreshing • Impact on evidential value

  26. Best Practices for Managing Digital Projects • Who will your users be? • Best Practices Guidelines • Workflow and Management Issues

  27. Why are you Managing this Information? • Organizational mission & type • Users • Uses

  28. Scanning Best Practices • Think about users (and potential users), uses, and type of material/collection • Scan at the highest quality that does not exceed the likely potential users/uses/material • Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery • Many documents which appear to be bitonal actually are better represented with greyscale scans • Include color bar and ruler in the scan • Use objective measurements to determine scanner settings (do NOT attempt to make the image good on your particular monitor or use image processing to color correct) • Don’t use lossy compression • Store in a common (standardized) file format • Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)
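
To make the master-versus-derivative and bitonal-versus-greyscale trade-offs concrete, the arithmetic for an uncompressed scan is pixels wide times pixels high times bits per pixel. The sketch below is a rough estimate only: the function name and the sample page size and resolution are assumptions for illustration, and real files add headers and any row padding.

```python
def uncompressed_size_bytes(width_in: float, height_in: float,
                            ppi: int, bits_per_pixel: int) -> int:
    """Approximate size of an uncompressed scan, ignoring file headers."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical 8 x 10 inch page scanned at 300 pixels per inch.
bitonal = uncompressed_size_bytes(8, 10, 300, 1)     # 1 bit per pixel
greyscale = uncompressed_size_bytes(8, 10, 300, 8)   # 8 bits per pixel
color = uncompressed_size_bytes(8, 10, 300, 24)      # 24 bits per pixel
```

Under these assumptions the greyscale master is eight times the size of the bitonal one and the color master three times larger again, which is the cost to weigh against the quality gain when choosing scanner settings.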

  29. Why Scale is important

  30. Digital Object Behaviors • Book example

  31. Metadata Standards (from MOA2) • Administrative Metadata • for enhancing resource management • Structural Metadata • for reflecting internal hierarchies and relationships btwn parts • Raw/Seared/Cooked
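
Structural metadata of the kind described here can be pictured as a tree of divisions that point at page image files. This is a minimal sketch, not the MOA2 encoding: the `Division` class, the labels, and the file names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Division:
    """Hypothetical structural unit: a book, a chapter, or another part."""
    label: str
    files: List[str] = field(default_factory=list)           # page images at this level
    children: List["Division"] = field(default_factory=list)  # nested parts


def reading_order(division: Division) -> List[str]:
    """Flatten the hierarchy into the page sequence a page-turner would follow."""
    pages = list(division.files)
    for child in division.children:
        pages.extend(reading_order(child))
    return pages


# Hypothetical book with front matter and one chapter.
book = Division("Book", children=[
    Division("Front matter", files=["p001.tif"]),
    Division("Chapter 1", files=["p002.tif", "p003.tif"]),
])
```

Here `reading_order(book)` yields the three page files in sequence, while the tree itself still records which pages belong to which part.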

  32. Workflow and Management Issues • Managing multiple image files • Persistent Identification • Making your works accessible throughout the Net

  33. The number of variant forms of a work can be enormous • different views of the same object • different scans of the same photo • different resolutions • different compression schemes • different compression ratios • different file storage formats • different details of the same image • ...

  34. Image Families

  35. Identification/Provenance • how to deal with different versions (browse, hi-res, medium res) derived from the same scan or different encoding schemes (TIFF, PICT, JFIF) • Vocabulary Standards to express this • VRA Surrogate Categories • CIMI's “Image Elements”
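
One way to keep track of versions derived from the same scan is to have each file record what it was derived from. The sketch below assumes that approach; the `ImageFile` class, the role names, and the identifiers are hypothetical and are not taken from the VRA or CIMI vocabularies.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ImageFile:
    """Hypothetical member of an image family."""
    file_id: str
    role: str                           # e.g. "master", "hi-res", "browse"
    file_format: str                    # e.g. "TIFF", "JFIF"
    derived_from: Optional[str] = None  # file_id of the parent, None for the master


def provenance(file_id: str, family: Dict[str, ImageFile]) -> List[str]:
    """Walk from a derivative back to the master it ultimately came from."""
    chain: List[str] = []
    current: Optional[str] = file_id
    while current is not None:
        chain.append(current)
        current = family[current].derived_from
    return chain


# Hypothetical family: one master scan and two derivatives.
family = {f.file_id: f for f in [
    ImageFile("scan-001", "master", "TIFF"),
    ImageFile("scan-001-hi", "hi-res", "JFIF", derived_from="scan-001"),
    ImageFile("scan-001-browse", "browse", "JFIF", derived_from="scan-001-hi"),
]}
```

With these records, `provenance("scan-001-browse", family)` traces the browse image through the hi-res derivative back to the master scan.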

  36. Persistent IDs--the Problem • Need to separate work ID from work location • URNs probably won’t be ready until 2003 • Becomes a business process issue when one organization maintains the resource and another organization references it (i.e., licensed from vendors or managed by separate administrative structures)

  37. More Persistent IDs--the Approach for today • PURLs • Handles • HTTP redirects • And worry about costs now and conversion costs when URNs become feasible

  38. Data Set Management: More issues with referencing IDs • References for mirror sites • References for back-up sites when main site is down or bottle-necked • References for off-site copies and archival copies
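
Separating a work's ID from its location, with mirror and back-up copies as fallbacks, comes down to an indirection table. The sketch below illustrates the idea only; it is not how PURL or Handle servers are implemented, and the identifier, the example.org URLs, and the `is_available` callback are all assumptions.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical table: persistent ID -> known locations, preferred first.
LOCATIONS: Dict[str, List[str]] = {
    "example/photo-42": [
        "http://main.example.org/images/photo-42.tif",
        "http://mirror.example.org/images/photo-42.tif",
    ],
}


def resolve(persistent_id: str,
            table: Dict[str, List[str]],
            is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable location for an ID, or None if there is none."""
    for location in table.get(persistent_id, []):
        if is_available(location):
            return location
    return None
```

Citations carry only the ID; when the resource moves, or the main site is down, only the table or the availability check changes. For instance, passing `lambda url: "main." not in url` as `is_available` simulates a main-site outage and makes `resolve` return the mirror.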

  39. Making your works accessible throughout the Net • The DLF/Mellon meeting • An administrative and political issue as much as a technical one

  40. Some Wild Musings • Movement towards packages and away from MARC • The disappearance of OPACs

  41. Containers and Packages of Metadata: Warwick, not MARC • modular • overlapping • extensible • community-based • designed for a networked world to aid commonality btwn communities while still providing full functionality within each community

  42. DC Qualifiers • allows one community to express important nuances and qualifications, while still making the basic importance available to communities with simple needs • our community can reflect alternate title, transliterated title, and main title, yet they will all be found under a simple Web search under “title”
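
The title example can be sketched as a "dumb-down" step: a simple search ignores the qualifier and sees only the base element. The dotted `element.qualifier` notation, the qualifier names, and the sample values below are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical qualified record as (element[.qualifier], value) pairs.
QualifiedRecord = List[Tuple[str, str]]


def dumb_down(record: QualifiedRecord) -> Dict[str, List[str]]:
    """Drop qualifiers so every value is filed under its base element."""
    simple: Dict[str, List[str]] = {}
    for element, value in record:
        base = element.split(".", 1)[0]
        simple.setdefault(base, []).append(value)
    return simple


record: QualifiedRecord = [
    ("title", "Main Title"),
    ("title.alternative", "Alternate Title"),
    ("title.transliterated", "Transliterated Title"),
    ("creator", "Example Author"),
]
```

After `dumb_down(record)`, all three titles sit under `"title"`, so a community with simple needs finds them, while the original record keeps the distinctions for the community that wants them.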

  43. Crosswalks • mapping btwn differing metadata structures • eliminate the need for monolithic, universally adopted standards • focus on flexibility and interoperability • RDF-based metadata registries

  44. Crosswalk Example
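
As an illustration only (not the example shown on the original slide), a crosswalk can be sketched as a lookup table from one scheme's fields to another's. The handful of MARC-to-Dublin-Core pairs below follows commonly published mappings, but the table is deliberately incomplete and the flat `"260b"` style of naming subfields is an assumption of this sketch.

```python
from typing import Dict, List

# Small illustrative subset of a MARC -> Dublin Core mapping.
MARC_TO_DC: Dict[str, str] = {
    "100": "creator",
    "245": "title",
    "260b": "publisher",
    "260c": "date",
    "520": "description",
    "650": "subject",
}


def crosswalk(source: Dict[str, List[str]],
              mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Translate a record field by field; unmapped fields are dropped."""
    target: Dict[str, List[str]] = {}
    for source_field, values in source.items():
        element = mapping.get(source_field)
        if element is None:
            continue  # no equivalent in the target scheme
        target.setdefault(element, []).extend(values)
    return target


# Hypothetical source record, including one local field with no mapping.
marc_record = {
    "245": ["A Title"],
    "650": ["Libraries", "Metadata"],
    "999": ["local note"],
}
```

Running `crosswalk(marc_record, MARC_TO_DC)` keeps the title and subjects and silently drops the local field, which shows both the benefit of a crosswalk and its usual cost: detail without a counterpart in the target scheme is lost.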

  45. Do we still need OPACs? • Why repeat almost identical bibliographic descriptions in each local system? • Why not store only local information locally, and link to bibliographic descriptions stored in the major utilities? • Could our acquisition systems for monographs begin to use the acquisition systems imposed on us by our parent organizations (like those for supplies)?

  46. Creating Working Digital Libraries • Moving from Digital Collections to Digital Libraries • Interoperability • Importance of Standards • Longevity • Best Practices for Managing Digital Projects • Some Wild Musings

  47. Creating Working Digital Libraries Howard Besser UCLA School of Education & Information http://www.getty.edu/gri/standard/intrometadata/ http://www.ifla.org/II/metadata.htm http://sunsite.Berkeley.EDU/Imaging/Databases/#standards http://sunsite.Berkeley.EDU/moa2/ http://sunsite.Berkeley.EDU/Longevity/ http://purl.oclc.org/metadata/dublin_core/ http://www.gseis.ucla.edu/~howard/image-meta.html http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/ http://sunsite.berkeley.edu/Metadata/sp2000.html http://www.gseis.ucla.edu/~howard/
