Issued document 1.0
A Logical Model for Digital Archives
Rathachai Chawuthai
rathachai.chawuthai@live.com
Information Management, CSIM / AIT
Agenda
• 22nd Century
• Digital Preservation
• UCK
• Introduction
• Logical Model
• Prototype
• Related Works
Example in 22nd Century
• What is this?
• File is read-protected
• Error: No program can open file format .doc
• Please key in password
• Error: DVD unreadable
• Unreadable output: !7rò??àÕ??ߟ²ÂÚ Õ??ߟ²ÂÚ ðŽɳ !Z?g! Õr/ÕŸ/?rò?
Example in 22nd Century
• When was he born?
• Barack Obama, 44th president of the USA
• Born 08/04/1961
Overview
• Digital preservation is the active management of digital information to ensure its accessibility over time.
• Digital preservation types
  • Bit Preservation: the ability to produce a particular sequence of bits from storage media at any time.
  • Data Preservation: the ability to render the produced bit stream and produce a meaningful output from it at any time.
  • Information Preservation: the ability to understand the rendered digital object at any time.
Flouris (2007)
OAIS Workflow
[Figure: OAIS workflow. Labels: Producer, Ingest, Store, Manage, Management, Query, Access, Disseminate, Consumer; packages (pkg) flow between the stages.]
Source: OCLC.org
PREMIS Overview
• Information provided to support preservation management
  • Creator, creation date and time
  • File format
  • Software / hardware environment and version
  • Preservation activities, the persons involved, and the results
  • History of changes from preservation activities
  • Decryption code
  • Font, formatting, color, look & feel
  • Rights and agreements
Source: PREMIS from LOC.gov
Challenge
• Conceptual Level
  • Information Preservation
• Physical Level
  • Data Preservation
  • Bit Preservation
Flouris (2007)
The Theory
• “Steps towards a theory of information preservation”, Giorgos Flouris
Flouris (2007)
Underlying Community Knowledge (UCK)
• Designated Community (DC)
  • A group of people who share the same knowledge
• Underlying Community Knowledge (UCK)
  • Language
  • Contextual knowledge
  • Background knowledge
  • Common sense
Flouris (2007)
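Problem
• Producer and Consumer rely on different underlying community knowledge (UCK 1 and UCK 2)
• Producer writes Name: “Rathachai Chawuthai”, meaning First name = “Rathachai”, Family name = “Chawuthai”
• Consumer reads the same name as First name = “Chawuthai”, Family name = “Rathachai”
Flouris (2007)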
Approach
• The Delta between UCK 1 and UCK 2 is made explicit
• Producer writes Name: “Rathachai Chawuthai”, meaning First name = “Rathachai”, Family name = “Chawuthai”
• Consumer, taking the Delta into account, reads First name = “Rathachai”, Family name = “Chawuthai”
Flouris (2007)
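The Delta idea can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the relevant part of each UCK is just the order in which name parts are written; `NameConvention`, `write_name`, `read_name`, and `delta` are hypothetical names, not part of Flouris's theory or of the prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameConvention:
    """Hypothetical slice of a UCK: the order in which name parts are written."""
    order: tuple


def write_name(parts: dict, uck: NameConvention) -> str:
    """Serialize name parts in the order the writer's community expects."""
    return " ".join(parts[key] for key in uck.order)


def read_name(text: str, uck: NameConvention) -> dict:
    """Label the tokens of a name according to the reader's convention."""
    return dict(zip(uck.order, text.split()))


def delta(producer: NameConvention, consumer: NameConvention) -> dict:
    """Map each label the consumer assigns to the label the producer meant."""
    return dict(zip(consumer.order, producer.order))


producer_uck = NameConvention(order=("first", "family"))   # assumed convention
consumer_uck = NameConvention(order=("family", "first"))   # assumed convention

text = write_name({"first": "Rathachai", "family": "Chawuthai"}, producer_uck)

# Without the delta, the consumer mislabels the parts (the "Problem" slide).
naive = read_name(text, consumer_uck)

# With the delta, the labels are mapped back to what the producer meant.
mapping = delta(producer_uck, consumer_uck)
corrected = {mapping[label]: value for label, value in naive.items()}
```

Here `naive` reproduces the swapped reading, while `corrected` recovers the producer's intended first and family names.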
Motivation
• UCK A: Name = First name + Last name
• UCK B: Name = Family name + First name
• Open question (?): how does each community read the other's names?
Motivation
• Everyone should be able to understand digital information over time
Motivation
• UCK A: Name = First name + Last name
• UCK B: Name = Family name + First name
• A shared Reference sits between UCK A and UCK B
Objectives
• To develop a theory for digital archives.
• To design an information model representing contextual knowledge.
• To develop a prototype system in order to test the theory.
Scope
• Develop a theory by extending Flouris's existing theory, “Steps towards a theory of information preservation” (Underlying Community Knowledge)
• Design a “Contextual Model” from the proposed theory, using linked metadata to model contextual knowledge
  • Refers to the OAIS information model
  • Integrates with PREMIS metadata
• Build an archival system
  • Refers to the OAIS guideline
  • Supports a case study of scientific research processes
Logical Model Proposed Theory
Goal
• The theory is to
  • Provide reference contextual knowledge for identifying differences between community knowledge
• The model is to
  • Represent contextual knowledge
  • Be a reference for all underlying community knowledge
  • Identify differences between community knowledge
  • Capture change or evolution of the reference knowledge itself
  • Link concepts among designated communities through the reference contextual knowledge
UCCK: Underlying Common Community Knowledge
• A common contextual knowledge for all underlying community knowledge
UCCK
• C: a set of concepts, Ci ∈ C
• R: a set of relations, Ri ∈ R and Ri → C × C
• HC: the hierarchy of concepts (classes), HC ⊆ C × C
  • For example, HC(C1, C2) means that C1 is a sub-concept of C2
• HR: the hierarchy of relations, HR ⊆ R × R
  • For example, HR(R1, R2) means that R1 is a sub-relation of R2
• IC: a set of instances of C
• IR: a set of instances of R
• A0: a set of axioms (inference relations of logic)
[Figure: UCCK components C, R, HC, HR, IC, IR, A0]
Yildiz (2006)
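As a rough illustration of the components listed above, the following Python sketch holds them in one data structure. The field names, the sample concepts (`Thing`, `CelestialBody`, `Planet`), and the choice to treat HC as transitive and reflexive in `is_subconcept` are assumptions made for this example; they are not taken from Yildiz (2006) or from the prototype.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Container for the seven components; all field names are illustrative."""
    concepts: set = field(default_factory=set)              # C
    relations: dict = field(default_factory=dict)           # R: name -> (domain, range) in C x C
    concept_hierarchy: set = field(default_factory=set)     # HC: (sub, super) pairs
    relation_hierarchy: set = field(default_factory=set)    # HR: (sub, super) pairs
    concept_instances: dict = field(default_factory=dict)   # IC: instance -> concept
    relation_instances: set = field(default_factory=set)    # IR: (relation, subject, object)
    axioms: list = field(default_factory=list)              # A0

    def is_subconcept(self, sub: str, sup: str) -> bool:
        """Follow HC pairs upward from `sub`; assumes HC is transitive and reflexive."""
        seen, stack = set(), [sub]
        while stack:
            current = stack.pop()
            if current == sup:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(parent for child, parent in self.concept_hierarchy
                         if child == current)
        return False


# Hypothetical sample content.
ucck = Ontology(
    concepts={"Thing", "CelestialBody", "Planet"},
    concept_hierarchy={("Planet", "CelestialBody"), ("CelestialBody", "Thing")},
    concept_instances={"Pluto": "Planet"},
)

planet_is_thing = ucck.is_subconcept("Planet", "Thing")
thing_is_planet = ucck.is_subconcept("Thing", "Planet")
```

Keeping HC and HR as sets of pairs mirrors the definitions HC ⊆ C × C and HR ⊆ R × R directly; a real system would more likely delegate this to an ontology store.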
UCCK
• The UCCK consists of C, R, HC, HR, IC, IR, and A0
• UCK1 and UCK2 are each derived from the UCCK
UCCK
• UCK1 and UCK2 name the same object differently: “Pluto” and “Asteroid #1234”
• Related concepts: Solar System, Planet
• The UCCK holds a reference concept “Pluto” to which both community terms are linked
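The Pluto example suggests a simple lookup through a shared reference concept. The sketch below is a hypothetical illustration: the identifier `ucck:Pluto`, the dictionaries, and the assignment of “Pluto” to UCK1 and “Asteroid #1234” to UCK2 are assumptions, since the slides do not state which community uses which term.

```python
# Reference concepts held by the UCCK (identifier scheme is hypothetical).
UCCK = {"ucck:Pluto": {"label": "Pluto"}}

# Each UCK maps its local term to a UCCK reference concept (assignment assumed).
UCK1 = {"Pluto": "ucck:Pluto"}
UCK2 = {"Asteroid #1234": "ucck:Pluto"}


def translate(term, source_uck, target_uck):
    """Find the target community's term that shares the source term's reference concept."""
    reference = source_uck.get(term)
    if reference is None:
        return None  # the source community has no link to the UCCK for this term
    for local_term, target_reference in target_uck.items():
        if target_reference == reference:
            return local_term
    return None  # the target community has no term for this concept


to_uck2 = translate("Pluto", UCK1, UCK2)
to_uck1 = translate("Asteroid #1234", UCK2, UCK1)
unknown = translate("Vulcan", UCK1, UCK2)
```

Because every community links only to the UCCK, adding a new community requires one mapping to the reference rather than one mapping per existing community.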
UCCK
• The UCCK itself evolves along a timeline from past to future: UCCK v.1, then UCCK v.2
• UCK A, UCK B, UCK C, and UCK D are linked to the UCCK versions
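One way to picture versioned reference knowledge is a lookup of the UCCK version in force at a given time. This is only a sketch: the years attached to each version are invented placeholders, and `version_at` is a hypothetical helper, not something described in the slides.

```python
from bisect import bisect_right

# (start year, version label); the years are placeholders for illustration only.
UCCK_VERSIONS = [(1990, "UCCK v.1"), (2006, "UCCK v.2")]


def version_at(year: int):
    """Return the UCCK version in force in `year`, or None before the first version."""
    start_years = [start for start, _ in UCCK_VERSIONS]
    index = bisect_right(start_years, year)
    return UCCK_VERSIONS[index - 1][1] if index else None


before_any = version_at(1980)
in_first = version_at(2000)
in_second = version_at(2010)
```

A digital object ingested under one version can then be interpreted against the reference knowledge that was valid at ingest time, even after the UCCK has moved on.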
The Event Ontology
Raimond (2007)
The Event Ontology
• Example event: “Changing”, concerning Pluto (Pluto / Asteroid #1234)
• Year 2006
• Prague
• Mike Brown
Raimond (2007)
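The example event can be written down as subject-predicate-object triples using the Event Ontology's `time`, `place`, `agent`, `factor`, and `product` properties. This is a simplified sketch: the event identifier `ex:changing-pluto` is hypothetical, plain strings stand in for the resources the ontology expects, and the assignment of the slide's labels to roles (for instance Pluto as factor and Asteroid #1234 as product) is an assumption.

```python
EVENT = "http://purl.org/NET/c4dm/event.owl#"  # Event Ontology namespace


def describe_event(event_id, time, place, agent, factor, product):
    """Return (subject, predicate, object) triples describing one event."""
    return [
        (event_id, "rdf:type", EVENT + "Event"),
        (event_id, EVENT + "time", time),
        (event_id, EVENT + "place", place),
        (event_id, EVENT + "agent", agent),
        (event_id, EVENT + "factor", factor),    # what the event acts on (assumed role)
        (event_id, EVENT + "product", product),  # what the event yields (assumed role)
    ]


triples = describe_event(
    event_id="ex:changing-pluto",  # hypothetical identifier
    time="2006",
    place="Prague",
    agent="Mike Brown",
    factor="Pluto",
    product="Asteroid #1234",
)
```

Recording the change as an event keeps both the old and the new term reachable, together with when, where, and by whom the change happened.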
As a Consumer
• Browse digital objects
• Search for relevant digital objects across repositories
• Link to other related digital objects under contextual knowledge across systems
• Customize one's own designated community
[Figure: Consumers use the Archival Information System, which links to other Archival Information Systems]
As an Archivist
• Ingest digital objects
• Define links to other objects
• Manage metadata according to the digital object's type
• Manage contextual knowledge
• Manage relationships between documents arising from the document process
[Figure: Archivist and the Archival Information System]
As an Administrator
• Define metadata for each type of digital object
• Define underlying common community knowledge
• Define underlying community knowledge
• Define designated communities
[Figure: Administrator and the Archival Information System]
Requirements
• The system should be able to:
  • Manage a variety of types of digital objects and metadata
  • Establish semantic relationships among digital objects
  • Support semantic search
  • Provide contextual knowledge through linked metadata of digital objects for each designated community
  • Store knowledge as an ontology graph
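To make the graph-storage and search requirements concrete, here is a small in-memory triple store in Python. It is a toy sketch, not the prototype's storage layer: the `Graph` class, the predicates `about` and `derivedFrom`, and the `doc:` and `ucck:` identifiers are all hypothetical.

```python
class Graph:
    """Toy triple store: knowledge is a set of (subject, predicate, object) edges."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


graph = Graph()
graph.add("doc:1", "about", "ucck:Pluto")
graph.add("doc:2", "about", "ucck:Pluto")
graph.add("doc:2", "derivedFrom", "doc:1")

# Search by concept rather than by keyword: every object about the same reference concept.
about_pluto = [s for s, _, _ in graph.match(predicate="about", obj="ucck:Pluto")]

# Follow a semantic relationship between digital objects.
sources_of_doc2 = [o for _, _, o in graph.match(subject="doc:2", predicate="derivedFrom")]
```

Pattern matching over triples is the basic operation that query languages such as SPARQL build on, which is why a graph store satisfies both the linking and the search requirements at once.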
System architecture
• Users: Administrator, Archivist, Consumer
• Archival Application: Digital Archive User Interface, Search Interface, Administration
• Archival Service Provider: Contextual Knowledge Mapping Service, Digital Archive Core Service, Semantic Search Service, UCCK Manager, UCK Manager
• Archival Data: Digital Object, Metadata, Knowledge Base
Fedora-Commons
• Repository system
• Features
  • Collects digital objects and their relations
  • Collects metadata
  • Collects ontologies
  • Supports versioning
• The only repository system that
  • Supports semantic search
  • Provides web services
• Works as the back-end service
Source: Duraspace.org
Drupal
• Popular CMS
• Features
  • Rich user management
  • Rich content management
  • Flexible support for customized modules
• The only CMS that
  • Supports a SPARQL endpoint
• Works as the front-end service for end users
Source: Drupal.org
Islandora
• A Drupal module
• Features
  • Provides an administration panel
  • Provides fast search over the Fedora database
  • Supports many metadata formats
  • Supports many types of digital objects
• The only Drupal module that
  • Integrates with Fedora-Commons
  • Works with the GSearch service (semantic search of Fedora-Commons)
• Works as the front-end and administration service
Source: Islandora.ca
System architecture
• Archival Application: Digital Archive User Interface, Search Interface, Administration
• Archival Service Provider: Contextual Knowledge Mapping Service, Digital Archive Core Service, Semantic Search Service, UCCK Manager, UCK Manager
• Archival Data: Digital Object, Metadata, Knowledge Base
Related Works
• Works compared: CASPAR, SHAMAN, A Logical Model for Digital Archives
• Comparison criteria
  • Semantic
  • More data for each DC
  • Linked documents in process
  • Linked knowledge among digital archives across DC
References
• Weisz, T., 2007. The Kaifeng Stone Inscriptions Revisited.
• Rhys-Lewis, J., 2000. Conservation and Preservation Activities in Archives and Libraries in Developing Countries: An Advisory Guideline on Policy and Planning.
• Palfrey, J., Gasser, U. Born Digital: Understanding the First Generation of Digital Natives.
• Yuan, L., Banach, M., 2011. Institutional Repositories and Digital Preservation: Assessing Current Practices at Research Libraries.
• Flouris, G., Meghini, C., 2007. Some preliminary ideas towards a theory of digital preservation.
• CASPAR, 2005. Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval. EU-funded project (FP6-2005-IST-033572). http://www.casparpreserves.eu
• SHAMAN, 2008. Sustaining Heritage Access through Multivalent Archiving. EU-funded project (FP7-ICT-216736). http://shaman-ip.eu/
• Lagoze, C., Payette, S., Shin, E., Wilper, C., 2006. Fedora: an architecture for complex objects and their relationships.
References (continued)
• Albani, S., 2010. The ESA Approach to Long-Term Data Preservation using CASPAR.
• Borbinha, J., 2010. SHAMAN: Sustaining Heritage Access through Multivalent Archiving.
• CCSDS, 2003. 650.0-B-1 Reference Model for an Open Archival Information System (OAIS). (ISO 14721:2003) http://public.ccsds.org/publications/archive/650x0b1.pdf
• PREMIS Working Group, 2004. PREservation Metadata: Implementation Strategies. http://www.loc.gov/standards/premis/
• Berners-Lee, T., 2001. The Semantic Web.
• W3C, 2004. RDF/XML Syntax Specification. http://www.w3.org/TR/rdf-syntax-grammar/
• Yildiz, B., 2006. Ontology Evolution and Versioning: The state of the art.
• Gustman, S., Soergel, D., Oard, D., Byrne, W., Picheny, M., Ramabhadran, B., Greenberg, D., 2002. Supporting access to large digital oral history archives.
• Hayes, P., Eskridge, T. C., Saavedra, R., Reichherzer, T., Mehrotra, M., Bobrovnikoff, D., 2005. Collaborative knowledge capture in ontologies.
• Raimond, Y., Abdallah, S., 2007. The Event Ontology.