European Metadata Initiatives: The METAe Metadata Engine Simon Tanner Higher Education Digitisation Service http://heds.herts.ac.uk
Overview
• Introduction to HEDS
• Current metadata contexts in Europe
• METAe - The Metadata Engine Project
  • Project summary
  • Project objectives
  • Description of work
  • Benefits
Introduction to HEDS
• HEDS provides advice, consultancy and a complete production service for digitisation and digital library development.
• Recent projects include:
  • Rekeying and tagging 35 million characters in Anthropology
  • 17th century Trade Directories
  • British newsreel scripts from the 1940s
  • Transparencies - artwork, manuscripts, stained glass
  • Photographic prints and postcards - local history collections
  • Microfilm: manuscripts, political pamphlets
  • Consultancy: The British Library, Oxford University, New Opportunities Fund applicants
Current metadata contexts: SCHEMAS: Forum for Metadata Schema Implementors
http://www.schemas-forum.org/
“SCHEMAS will inform schema implementers about the status and proper use of new and emerging metadata standards. The project will support development of good-practice guidelines for the use of standards in local implementations. It will investigate how metadata registries can support these aims.”
Current metadata contexts: RSLP Collection Description
http://www.ukoln.ac.uk/metadata/rslp/
“Based on a thorough modelling of collections and their catalogues, the project will develop a collection description metadata schema and associated syntax using the Resource Description Framework (RDF). We will develop a simple Web-based tool in order that projects can describe their collections and prototype a search service.”
Current metadata contexts: CEDARS: CURL Exemplars for Digital Archives
http://www.curl.ac.uk/projects/cedars.html
“There is a pressing need for a strategy for digital preservation… the CEDARS project aims to address the strategic, methodological and practical issues and will provide guidance for libraries in best practice for digital preservation.”
The CEDARS project is identifying the descriptive metadata elements that should be gathered to maximize the continued accessibility of digital resources.
Presentation of an EU project within the 5th Framework Programme http://meta-e.uibk.ac.at/
Project summary
• To make the digital conversion of printed material
  • more reliable in terms of digital preservation
  • more cost-effective in terms of automation
  • more attractive in terms of user-friendliness and accessibility.
• METAe will develop a software package to extensively automate and improve the generation of metadata.
Project summary
• The goals will be achieved by applying new technologies for character, layout and document recognition.
• The METAe package will convert the captured information into XML documents.
• XML files serve as a basis for various applications, such as: new XML search engines, navigation tools, electronic books, audio books, or the automated production of HTML, XHTML, PDF or PS files.
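The idea of XML as a common basis for derived formats can be illustrated with a minimal sketch. It assumes a made-up document structure (the element names `document`, `title`, `chapter`, `heading` and `paragraph` are hypothetical, not the METAe schema) and uses only the Python standard library to derive a simple HTML rendering from it.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample: the element names are illustrative, not the METAe schema.
SAMPLE = """<document>
  <title>Reisebilder</title>
  <chapter>
    <heading>Erstes Kapitel</heading>
    <paragraph>Die Stadt lag still.</paragraph>
  </chapter>
</document>"""


def xml_to_html(xml_text: str) -> str:
    """Derive a flat HTML fragment from the structural XML."""
    root = ET.fromstring(xml_text)
    # The document title becomes the top-level heading.
    parts = [f"<h1>{escape(root.findtext('title', default=''))}</h1>"]
    for chapter in root.iter("chapter"):
        # Each chapter heading becomes a second-level heading.
        parts.append(f"<h2>{escape(chapter.findtext('heading', default=''))}</h2>")
        for para in chapter.findall("paragraph"):
            parts.append(f"<p>{escape(para.text or '')}</p>")
    return "\n".join(parts)


print(xml_to_html(SAMPLE))
```

The same structural source could feed other outputs (for example a table of contents built only from the `heading` elements), which is the practical argument for capturing structure once in XML.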
Participants: co-ordinator & technical partners
• Co-ordinator: Leopold-Franzens-Universität, Innsbruck (A)
• Institut für Angewandte Informatik, University of Linz (A)
• Mitcom Neue Medien GmbH (G)
• CCS Compact Computer Systeme (G)
• Dipartimento di Sistemi e Informatica, University of Florence (I)
• Scuola Normale Superiore, Centro di Ricerche Informatiche per i Beni Culturali (I)
Participants: library & research partners
• Universidad de Alicante (S)
• Friedrich-Ebert-Stiftung (G)
• Cornell University Library (USA)
• Bibliothèque nationale de France (F)
• The National Library of Norway (N)
• Biblioteca Statale A. Baldini (I)
• Karl-Franzens-Universität Graz (A)
• Higher Education Digitisation Service HEDS (UK)
Project objectives
• Introduction of layout and document analysis as a key technology in future digitisation software.
• Development of capturing and conversion tools for the automated recording and generation of administrative and descriptive metadata.
• Development of an omnifont OCR engine specialised in processing old European typefaces of the 19th century („Fraktur“, Gothic fonts).
Project objectives
• Evaluation of digital preservation standards (e.g. XML, EAD, TEI or ISO 12083).
• Development of an XML search engine for tagged full texts and images.
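What searching tagged full text adds over plain-text search can be sketched as follows. This is only an illustration under stated assumptions: the sample document and its element names are hypothetical, and the linear scan stands in for whatever indexing a real XML search engine would use.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged full text; element names are assumptions for illustration.
SAMPLE = """<book>
  <chapter n="1">
    <heading>Harbour trade</heading>
    <paragraph>Ships arrived from the north.</paragraph>
  </chapter>
  <chapter n="2">
    <heading>Inland routes</heading>
    <paragraph>Trade moved slowly along the river.</paragraph>
  </chapter>
</book>"""


def search(xml_text, term, tag=None):
    """Return (tag, text) pairs whose own text contains term.

    With tag=None every element is searched; otherwise only elements
    with that tag name, which is what the structural markup makes possible.
    """
    root = ET.fromstring(xml_text)
    needle = term.lower()
    hits = []
    for elem in root.iter(tag):
        text = (elem.text or "").strip()
        if text and needle in text.lower():
            hits.append((elem.tag, text))
    return hits


# Search everywhere, then only in headings.
print(search(SAMPLE, "trade"))
print(search(SAMPLE, "trade", tag="heading"))
```

Restricting the query to `heading` elements returns only the chapter title, whereas the unrestricted query also matches body text.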
Description of work
1. Input module for scanning and importing existing metadata
2. OCR engine specialised in typefaces of the 19th century
3. Document analysis module
4. Page layout analysis module
5. Rules and controlled vocabulary for automated recognition process
6. Conversion module assembling an XML document containing all recognised metadata
7. Export module for the XML-enriched document and the scanned image
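The way these modules could chain together can be sketched as a small pipeline. Everything here is a hypothetical stand-in rather than the METAe software: the `Page` class, the function names, the toy heading rule and the output element names are all assumptions, OCR is faked by passing in the recognised lines, and document and layout analysis are collapsed into a single step.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Page:
    image_file: str                              # scanned image from the input module
    lines: list = field(default_factory=list)    # recognised text lines (OCR output)
    blocks: list = field(default_factory=list)   # (role, text) pairs from layout analysis


def ocr(page, fake_lines):
    # Stand-in for the OCR engine: the recognised lines are supplied by the caller.
    page.lines = list(fake_lines)
    return page


def analyse_layout(page):
    # Toy rule (an assumption): an all-uppercase line is a heading, anything else a paragraph.
    page.blocks = [("heading" if ln.isupper() else "paragraph", ln) for ln in page.lines]
    return page


def convert(pages, existing_metadata):
    # Conversion module: assemble one XML document from imported metadata and page blocks.
    root = ET.Element("document")
    meta = ET.SubElement(root, "metadata")
    for key, value in existing_metadata.items():
        ET.SubElement(meta, key).text = value
    for page in pages:
        page_el = ET.SubElement(root, "page", image=page.image_file)
        for role, text in page.blocks:
            ET.SubElement(page_el, role).text = text
    return root


def export(root):
    # Export module: serialise the XML; the image file is referenced, not embedded.
    return ET.tostring(root, encoding="unicode")


pages = [analyse_layout(ocr(Page("p001.tif"), ["BAKERS", "Smith, J., High Street"]))]
print(export(convert(pages, {"title": "Trade Directory"})))
```

Keeping each stage as a separate function mirrors the modular description above: a better OCR engine or a richer rule set could replace one stage without touching the others.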
Benefits
1. Reduce the need for manual post-processing of scanned content.
2. Produce a rich output, with metadata on all levels: administrative, structural and format metadata.
3. Offer new possibilities for successful long-term preservation.
4. New ways to enhance access, re-use and multi-versioning.
5. Selective and distributed correction of OCR’d content.
6. Benefits for the visually disabled and also in scenarios of functional disability.
European Metadata Initiatives: The METAe Metadata Engine Simon Tanner Higher Education Digitisation Service Email: heds@herts.ac.uk http://heds.herts.ac.uk