Automatic Evaluation of Migration Quality in Distributed Networks of Converters. ECDL 05 Doctoral Consortium. Miguel Ferreira (mferreira@dsi.uminho.pt). Supervisors: Ana Alice Baptista, José Carlos Ramalho. 2005-09-21.
Contents • Introductory concepts • Research problems • Proposed system • Methodology • Topics for discussion
Introductory concepts • Digital preservation • The set of processes and activities that ensure continued access to information and all kinds of cultural heritage existing in digital formats • Digital object • An information object, of any type or format, that is expressed in digital form • Text documents, digital photos, vector graphics, databases, Web pages, software
Strategies for digital preservation • Emulation • Reproduction of the behaviour of a hardware/software platform in a different technological environment • Encapsulation • Storing information about how the objects should be interpreted • Migration • Periodic transfer of digital materials from one hardware/software configuration to another • Others • Computer museums, viewers, Universal Virtual Computer
Migration • Advantages • Updated formats that users can read and edit • Disadvantages • Requires continuous diligence • Data loss • Variants • Migration on request • Normalisation • Distributed migration
Distributed migration • A network of remote conversion services supported by a semantic layer [Hunter et al.] • Advantages • Platform independent • Redundancy • Multiple migration paths • Cost reduction • Compatible with other migration strategies • Disadvantages • Bandwidth • Slow • Examples • PANIC • MyMorph (NLMed) • TOM
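The "multiple migration paths" advantage can be pictured as a search over a graph whose nodes are formats and whose edges are conversion services. The following is a minimal sketch under that assumption; the service names, formats and the `migration_paths` helper are hypothetical and are not taken from PANIC, MyMorph or TOM.

```python
from collections import defaultdict

# Hypothetical registry of conversion services:
# (service name, source format, target format).
CONVERTERS = [
    ("svc-a", "doc", "odt"),
    ("svc-b", "odt", "pdf"),
    ("svc-c", "doc", "pdf"),
    ("svc-d", "doc", "rtf"),
    ("svc-e", "rtf", "odt"),
]


def migration_paths(converters, source, target):
    """Return every cycle-free chain of services from source to target,
    shortest chain first. Each step is a (service, resulting format) pair."""
    graph = defaultdict(list)
    for service, src, dst in converters:
        graph[src].append((service, dst))

    paths = []

    def walk(fmt, visited, steps):
        if fmt == target:
            paths.append(list(steps))
            return
        for service, nxt in graph[fmt]:
            if nxt not in visited:  # never revisit a format within one path
                steps.append((service, nxt))
                walk(nxt, visited | {nxt}, steps)
                steps.pop()

    walk(source, {source}, [])
    return sorted(paths, key=len)


if __name__ == "__main__":
    for path in migration_paths(CONVERTERS, "doc", "pdf"):
        print(" -> ".join(f"{fmt} (via {svc})" for svc, fmt in path))
```

Redundancy shows up here as more than one path to the same target: if one service is unavailable, another chain may still reach the target format, at the price of extra conversion steps.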
How to choose a preservation strategy? • Many preservation alternatives • Lack of universal acceptance • Distinct preservation requirements • Satisfaction of the designated community • Characteristics of the collection • Budget • Framework for evaluating preservation strategies [Rauber] • Utility Analysis
Evaluation of preservation strategies • Definition of the objective tree • Assignment of measurement units (e.g. millimetre, Mb, Euro) • Identification of preservation alternatives • Execution of preservation alternatives and evaluation of the outcome • Weighting of criteria in the objective tree • Calculation of partial and total values • Ranking of alternatives
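The last three steps (weighting, calculation of partial and total values, ranking) can be sketched as a weighted sum. This is only an illustration: the criteria, weights and scores below are invented, the objective tree is flattened to its leaf criteria, and the measured outcomes are assumed to be already normalised to a common 0 to 1 scale.

```python
# Hypothetical leaf criteria of an objective tree with their weights.
WEIGHTS = {"content preserved": 0.5, "layout preserved": 0.3, "cost": 0.2}

# Hypothetical evaluation outcomes per alternative, normalised to 0..1
# (1 is best), one value per leaf criterion.
SCORES = {
    "migrate to PDF": {"content preserved": 0.9, "layout preserved": 0.8, "cost": 0.6},
    "migrate to plain text": {"content preserved": 0.7, "layout preserved": 0.1, "cost": 0.9},
    "emulation": {"content preserved": 1.0, "layout preserved": 1.0, "cost": 0.2},
}


def rank_alternatives(weights, scores):
    """Return (alternative, total value, partial values) tuples, best first."""
    total_weight = sum(weights.values())
    ranking = []
    for name, values in scores.items():
        # Partial value: normalised weight times the score on that criterion.
        partial = {c: weights[c] / total_weight * values[c] for c in weights}
        ranking.append((name, sum(partial.values()), partial))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, total, _ in rank_alternatives(WEIGHTS, SCORES):
        print(f"{name}: {total:.2f}")
```

With these invented numbers the totals are 0.84 for emulation, 0.81 for migration to PDF and 0.56 for migration to plain text; changing the weights changes the ranking, which is why the weighting step is left to the preserving institution.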
Research problems • Automation of preservation processes • Authenticity issues • Cost management • Evaluation of preservation alternatives
Research questions • Is it feasible to design and implement a system that is able to automatically: • determine the amount of data loss incurred in a migration and generate detailed migration reports for inclusion in the objects' preservation metadata? • provide recommendations of migration paths or target formats that best suit users' requirements?
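One way to picture the first question is to compare the properties extracted from an object before and after migration. The sketch below is not the proposed system: the property names, the equality-based comparison and the `migration_report` helper are all assumptions made for illustration.

```python
def migration_report(source_props, target_props):
    """Compare properties extracted before and after a migration.

    Data loss is taken here as the fraction of source properties that are
    missing or changed in the migrated object (an assumed, simplistic measure).
    """
    details = {}
    for name, before in source_props.items():
        after = target_props.get(name)  # None when the property is gone
        details[name] = {"before": before, "after": after, "preserved": before == after}

    lost = [name for name, entry in details.items() if not entry["preserved"]]
    data_loss = len(lost) / len(details) if details else 0.0
    return {"data_loss": data_loss, "lost_properties": lost, "details": details}


if __name__ == "__main__":
    # Hypothetical properties of an image before and after conversion.
    source = {"width": 800, "height": 600, "colour_depth": 24, "layers": 3}
    target = {"width": 800, "height": 600, "colour_depth": 8}
    report = migration_report(source, target)
    print(report["data_loss"], report["lost_properties"])
```

The per-property details are the kind of information that could feed a migration report in the object's preservation metadata; a real evaluator would presumably need format-aware comparison rather than plain equality.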
Methodology - proof of concept • The concepts • Automatic quantification of data loss incurred in a migration and generation of preservation metadata • Automatic recommendation of migration strategies as well as target formats • The proof (empirical validation) • Evaluator versus human experts • Advisor versus evaluation framework
Key contributions • For individual preservers, digital archives and libraries: • Outsourcing and automation of digital preservation • Generation of preservation metadata (authenticity) • Ranking of migration alternatives • For designers and programmers of converters: • Possibility of publishing their converters as services • For metadata creators and users: • Increase adoption • Help to improve future versions • Accelerate the development of XML bindings
Round-up • Service-oriented architecture (SOA) • Automatic quantification of data loss • Provides recommendations on which migration paths or target formats are best suited for each user • Simplifies the creation of preservation metadata • Based on migration • Methodology • Proof of concept with empirical validation • Evaluator versus human experts • Advisor versus evaluation framework
Topics for discussion • Relevance of research • Research methodology • System architecture • Format registry vocabulary • e.g. MIME types, TOM type descriptors, Global Digital Format Registry, PRONOM, etc. • Preservation metadata schema • e.g. PREMIS data dictionary (event entity)