Discover how the Nikola Tesla Museum's clipping library was digitized to offer efficient access to valuable newspaper clippings on the life and work of Tesla. Learn about the challenges faced, the digital library prototype developed, and the implementation of a DBMS-based solution.
Nikola Tesla Museum Clipping Library Saša Malkov, Nenad Mitić, Žarko Mijajlović 3rd SEEDI Int. Conf., Cetinje, Montenegro, 14 September 2007
Clipping Library • The Nikola Tesla Museum holds a rich collection of newspaper clippings on the life and work of Nikola Tesla • The clippings were collected by Nikola Tesla himself, with the help of his personal secretary • Part of the library is organized in books, while many clippings remain unorganized
Digital Library Prototype • The Digitization Group at the Faculty of Mathematics undertook the development of a digital clipping library prototype • Primary goals: • Problem analysis • Identification of appropriate solutions
Problems • Significant variation in the sources and quality of the materials • Organization and modeling of data and metadata • Data access
Differences in sources and preservation level • Different digitization techniques yield different results, depending on the paper, the print type and the preservation level • Several target formats were considered • Digital image formats • PDF • DjVu format
Data organization • File systems are not appropriate • Complex data and metadata access • Limited search capabilities • Databases allow • Simpler access • Advanced searching
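The contrast on this slide can be illustrated with a minimal sketch in which each clipping's metadata and its scanned image live in one database row, so they are reachable through a single query instead of a directory walk. This is not the prototype's code: the prototype used IBM DB2 and Wafl, while the sketch substitutes Python's built-in `sqlite3` module, and the table name, column names and sample values are all hypothetical.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not taken from the museum prototype.
SCHEMA = """
CREATE TABLE clipping (
    id           INTEGER PRIMARY KEY,
    caption      TEXT NOT NULL,
    publication  TEXT,
    language     TEXT,
    published_on TEXT,   -- ISO date string, e.g. '1900-01-01'
    scan         BLOB    -- digitized image or PDF bytes
);
"""


def open_library():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_clipping(conn, caption, publication, language, published_on, scan):
    """Store one clipping: metadata and binary scan in the same row."""
    cur = conn.execute(
        "INSERT INTO clipping "
        "(caption, publication, language, published_on, scan) "
        "VALUES (?, ?, ?, ?, ?)",
        (caption, publication, language, published_on, scan),
    )
    conn.commit()
    return cur.lastrowid


# Placeholder values, invented for the example.
SAMPLE_SCAN = b"\x89PNG placeholder bytes"

conn = open_library()
new_id = add_clipping(
    conn, "Example caption", "Example Gazette", "en", "1900-01-01", SAMPLE_SCAN
)
row = conn.execute(
    "SELECT caption, language, length(scan) FROM clipping WHERE id = ?",
    (new_id,),
).fetchone()
```

Keeping the scan in a BLOB column next to its metadata is one possible design; storing only a file path in the row is a common alternative when the images are very large.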
Automatic text extraction • The primary problems are: • Different languages • The wide variety of heavily stylized typefaces used in that period • Poor material quality due to aging • Several OCR systems were evaluated • None gave satisfactory results, primarily because of the poor legibility of the materials • A significant amount of manual correction is necessary
Searching • Multi-criteria searching is essential, including searching by • Metadata • Caption • Key words • Publications • Language • Period • The clipping content • Manual corrections of the text are essential • Efficiency requires the application of indexing methods
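A rough sketch of multi-criteria searching follows, assuming a hypothetical `clipping` table and using Python's built-in `sqlite3` module as a stand-in for the DBMS. Each criterion that is supplied adds one parameterized condition, and plain indexes on two metadata columns stand in for the indexing methods mentioned on the slide. Only some of the listed criteria are covered, all names and sample rows are invented, and the substring match on content is a simplification rather than a real full-text index.

```python
import sqlite3


def build_library():
    """Create an in-memory database with a hypothetical clipping table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE clipping (
            id           INTEGER PRIMARY KEY,
            caption      TEXT,
            publication  TEXT,
            language     TEXT,
            published_on TEXT,  -- ISO date string
            content      TEXT   -- manually corrected OCR text
        );
        -- Ordinary indexes on frequently filtered metadata columns.
        CREATE INDEX clipping_language ON clipping (language);
        CREATE INDEX clipping_published_on ON clipping (published_on);
        """
    )
    return conn


def search(conn, language=None, publication=None,
           date_from=None, date_to=None, text=None):
    """Return ids of clippings matching every criterion that was given."""
    clauses, params = [], []
    if language is not None:
        clauses.append("language = ?")
        params.append(language)
    if publication is not None:
        clauses.append("publication = ?")
        params.append(publication)
    if date_from is not None:
        clauses.append("published_on >= ?")
        params.append(date_from)
    if date_to is not None:
        clauses.append("published_on <= ?")
        params.append(date_to)
    if text is not None:
        # Substring match; not backed by an index in this sketch.
        clauses.append("content LIKE ?")
        params.append(f"%{text}%")
    where = " AND ".join(clauses) if clauses else "1 = 1"
    sql = f"SELECT id FROM clipping WHERE {where} ORDER BY id"
    return [r[0] for r in conn.execute(sql, params)]


conn = build_library()
# Made-up placeholder rows, not real clippings.
conn.executemany(
    "INSERT INTO clipping (caption, publication, language, published_on, content) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Caption A", "Gazette A", "en", "1895-03-01", "text about alternating current"),
        ("Caption B", "Gazette B", "sr", "1900-06-15", "text about wireless transmission"),
        ("Caption C", "Gazette A", "en", "1901-11-30", "text about wireless transmission"),
    ],
)

english_wireless = search(conn, language="en", text="wireless")
from_1900 = search(conn, date_from="1900-01-01")
```

Values are always passed as bound parameters rather than interpolated into the SQL string, so only the fixed condition fragments are assembled dynamically.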
The solution – DBMS • The prototype is based on the IBM DB2 DBMS • Advanced SQL implementation • Efficient handling of binary content • High concurrency level • High reliability • Good experience with it • Free licensing terms
The solution – User interface • The Web application concept is • Rich in content and visual presentation • Customizable • Portable • Relatively simple to implement
The solution – Application • The library prototype is implemented in the functional programming language Wafl • Wafl is designed for automatic document generation and is particularly suited to Web development • It features very simple and efficient database access