Digital Atheneum Project, University of Kentucky
New Approaches to Restoring, Editing and Searching Humanities Collections
by Arpita Goenka
Principal investigators: Brent Seales, James Griffioen, Kevin Kiernan
Introduction
• The project is developing new methods to restore previously lost writings and make them accessible.
• It focuses on recovering the manuscripts of the famous Cottonian Library Collection in the British Library.
• Many of the manuscripts were badly damaged by a fire in 1731 and have deteriorated further through neglect, misuse and earlier conservation methods.
• Even with the best digitization techniques, the remaining text is difficult or impossible to decipher.
• For damaged manuscripts, the search engine must be designed to accommodate partial and distorted forms.
Project focus
• New digitization techniques to illuminate all possible information from the original manuscript.
• Restoration algorithms to attempt to fill in the most likely missing information.
• Complex, data-specific content search techniques to identify the imperfect representations found in severely damaged manuscripts.
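As a rough illustration of searching for partial forms, the sketch below treats each unreadable character in a damaged word as a wildcard and lists the lexicon entries that still fit. The function names, the `?` marker and the sample lexicon are assumptions made for this example; the slides do not describe the project's actual search techniques at this level of detail.

```python
import re

def partial_form_to_regex(form, unknown="?"):
    """Turn a damaged word into a regex; `unknown` marks an unreadable character."""
    pieces = ["." if ch == unknown else re.escape(ch) for ch in form]
    return re.compile("".join(pieces))

def candidates(form, lexicon):
    """Return lexicon words that are consistent with the readable characters."""
    pattern = partial_form_to_regex(form)
    return [word for word in lexicon if pattern.fullmatch(word)]

# Hypothetical lexicon; a real one would come from the transcriptions.
lexicon = ["cotton", "cutten", "carton", "cot"]
print(candidates("c?tt?n", lexicon))
```

This only handles characters lost in place. Distorted forms, or gaps of unknown length, would need a looser measure such as edit distance.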
Using Illumination to Enhance Digitization
• The images must be obtained using extremely high-resolution cameras.
• Many parts of the damaged documents are invisible without special lighting techniques.
• The manuscripts are rarely completely flat or planar.
Illumination Technique: using ultraviolet lighting to reveal formerly invisible text
Limitations of 2-D imagery
• When a non-planar object is imaged, it can appear warped or crinkled.
• It is difficult to tell whether the warping is part of the original or an artifact of the object’s shape.
• 2-D images give an insufficient sense of the object's look and feel.
• Items such as wax seals, coins, etc. have an inherent 3-D shape.
3-D Acquisition setup
• A light projector is introduced alongside the camera to capture 3-D models.
• The projector and camera are used to triangulate 3-D points on the artifact’s surface.
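One common way to triangulate with a projector and a camera is to intersect the camera's viewing ray with a plane of light cast by the projector. The sketch below shows that geometry only. The coordinate frame, the function names and the sample numbers are assumptions for illustration, and the slides do not say which triangulation method the project uses.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def triangulate(cam_origin, cam_dir, plane_point, plane_normal):
    """Intersect a camera ray with a projector light plane.

    cam_origin, cam_dir: the ray through the pixel that sees the lit point.
    plane_point, plane_normal: any point on the light plane and its normal.
    """
    denom = dot(plane_normal, cam_dir)
    if abs(denom) < 1e-12:
        raise ValueError("camera ray is parallel to the light plane")
    diff = [p - o for p, o in zip(plane_point, cam_origin)]
    t = dot(plane_normal, diff) / denom
    return [o + t * d for o, d in zip(cam_origin, cam_dir)]

# Assumed setup: camera at the origin looking along z, projector one unit
# along x, casting a vertical light plane with normal (2, 0, 1).
point = triangulate([0, 0, 0], [0, 0, 1], [1, 0, 0], [2, 0, 1])
print(point)
```

Repeating this for every lit pixel yields a cloud of surface points. A real system also needs the camera and projector to be calibrated first.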
Depth Information
• Depth information may also help solve the problem of accurately reuniting physically separate fragments.
Searching Images with Computational Methods
• Creating document-specific image processing algorithms that can locate, identify and classify individual letterforms.
• By analyzing several letterforms, computer models are built that can perform probabilistic pattern matching of damaged letterforms.
• Transcriptions aid significantly in searching by narrowing the search space and assisting an editor who is struggling to decipher a charred leaf.
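A toy version of probabilistic matching against damaged letterforms might look like the sketch below. Each letter is a tiny binary template, unreadable pixels are marked `?` and skipped, and the readable pixels are scored under an assumed independent noise model. The templates, the `flip_prob` parameter and the noise model are all assumptions for illustration, not the project's actual models.

```python
import math

# Hypothetical 3x3 templates: "1" is ink, "0" is background.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}

def letter_posteriors(observed, templates=TEMPLATES, flip_prob=0.1):
    """Return P(letter | observed pixels), assuming a uniform prior.

    observed uses "?" for pixels destroyed by damage; those are ignored.
    flip_prob is the assumed chance that a readable pixel is wrong.
    """
    log_scores = {}
    for letter, rows in templates.items():
        log_p = 0.0
        for obs_row, tpl_row in zip(observed, rows):
            for obs, tpl in zip(obs_row, tpl_row):
                if obs == "?":
                    continue  # no evidence from an unreadable pixel
                log_p += math.log(1 - flip_prob if obs == tpl else flip_prob)
        log_scores[letter] = log_p
    peak = max(log_scores.values())
    weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Only the top row and the centre column survive.
posteriors = letter_posteriors(["111", "?1?", "?1?"])
print(max(posteriors, key=posteriors.get))
```

The more pixels are lost, the flatter the result becomes. With nothing readable, every letter is equally likely, and a transcription can then narrow the choice.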
Editing and Annotating the Damaged Manuscripts
• The transcripts and edited texts are encoded in SGML to facilitate comprehensive searches, and both are being converted to HTML or XML so that Internet browsers can display them.
• A generic toolkit is being developed to help other editors (such as humanities scholars) assemble complex editions from high-resolution digital manuscript data.
• An editor can then collect and create the components of an electronic edition for any work and use the generic toolkit to fashion a sophisticated interface for displaying the edition electronically.
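To make the markup-to-browser step concrete, the sketch below converts a small XML-encoded transcription into HTML. The element names (`leaf`, `line`, `unclear`, `gap`), the CSS class names and the sample text are assumptions invented for this example. The project's actual SGML encoding scheme is not shown in these slides.

```python
import xml.etree.ElementTree as ET
from html import escape

def line_to_html(line):
    """Render one <line> element, marking unclear and missing text."""
    parts = [escape(line.text or "")]
    for child in line:
        if child.tag == "unclear":
            # Text the editor could only partly read.
            parts.append('<span class="unclear">' + escape(child.text or "") + "</span>")
        elif child.tag == "gap":
            # Text lost entirely to damage.
            parts.append('<span class="gap">[...]</span>')
        parts.append(escape(child.tail or ""))
    return "<p>" + "".join(parts) + "</p>"

def leaf_to_html(xml_text):
    """Convert a hypothetical <leaf> transcription to a run of HTML paragraphs."""
    root = ET.fromstring(xml_text)
    return "\n".join(line_to_html(line) for line in root.findall("line"))

sample = "<leaf><line>par<unclear>ti</unclear>al</line><line>lost <gap/> text</line></leaf>"
print(leaf_to_html(sample))
```

Keeping damage in the markup rather than in the text itself lets a search index the readable characters while the display still shows where a reading is uncertain.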
Editor’s Tool
The tool must be:
• Flexible: encompass many different collections
• Usable: support non-expert computer users
• Technically sophisticated: incorporate new technical solutions
Functions include registration, mosaicing, textual correspondence and glossary construction.
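Of the functions listed, glossary construction is the simplest to sketch. The example below maps each word form in a set of transcribed lines to the places where it occurs. The `(folio, line number, text)` input format and the sample lines are assumptions for illustration, not the toolkit's actual data model.

```python
from collections import defaultdict

def build_glossary(lines):
    """Map each word form to the (folio, line number) pairs where it occurs.

    lines: iterable of (folio, line_no, text) tuples, a hypothetical format.
    """
    glossary = defaultdict(list)
    for folio, line_no, text in lines:
        for word in text.lower().split():
            word = word.strip(".,;:!?")
            if word:
                glossary[word].append((folio, line_no))
    # Sort by word form so the glossary reads alphabetically.
    return dict(sorted(glossary.items()))

transcribed = [
    ("f1r", 1, "The fire, the library"),
    ("f1v", 2, "library fire"),
]
for word, places in build_glossary(transcribed).items():
    print(word, places)
```

A scholarly glossary would also group spelling variants under one headword, which this sketch does not attempt.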