Discover ACERA's approach to system requirements, design, and architecture development for a robust platform. Explore the ERA Reference Architecture, SOA paradigm, and OAIS model integration to ensure long-term system survivability and ease of evolution. Learn about creating and managing different versions of electronic records for preservation and access.
ACERA 2011 April 6, 2011 ERA Technology and Development Strategy (System Architecture Development) Meg Phillips and Quyen Nguyen ACERA - April 2011
Agenda
• System Requirements
• Design Approach
• System Architecture
• Related Work
• Conclusion
System Requirements
• Extensibility: record types, data types, and services can be added without extensive redesign.
• Evolvability: new technologies can be inserted through standard APIs and interfaces.
• Availability: key functions must be highly available.
• Scalability: adapt to growth in record volume and user community.
• Security: protection of the system and its assets.
• User friendliness: intuitive browser interface, Section 508 compliance.
Design Approach
• Develop ERA Reference Architecture
  • Correct deficiencies in I1
  • Architecture tool to guide current and future design and development, starting with I3
• Goal is to build a Robust Platform
  • To develop, add, and enhance services and applications
  • Adaptive to changes, especially in business rules
  • Foundation for the Preservation and Access frameworks, whose components evolve at different paces
    • Access evolves quickly, driven by the Internet, Web 2.0, and social media
    • Preservation evolves slowly
• Standard Interface is key
  • Open standards from Presentation to Backend layers
  • Domain standards such as OAIS (Open Archival Information System) and PREMIS (PREservation Metadata: Implementation Strategies)
• Data-minded and Security-minded
Design Approach: Reference Architecture
• Facilitate system evolution to new technologies such as Cloud Computing, Web 2.0, Social Media, and future technologies
  • Long-term survivability of the system
  • Take advantage of new technologies: potential reduction of lifecycle cost
  • Follow federal mandates and better serve the public (e.g. Open Gov.)
• Reference Architecture helps us leverage Community Support
  • Very important because some of the territory is uncharted
  • Take advantage of community expertise
  • Reduce development cost
• Well-defined system interfaces
• Well-defined Data and Metadata Model
• Publish Reference Architecture
System Architecture: Three Pillars
[Figure: Evolvable System Architecture]
System Architecture: SOA Paradigm
[Figure: Evolvable System Architecture]
OAIS Reference Model
Designing Services [mesoa 2009]
• Service-Oriented Architecture (SOA) Paradigm:
  • Services
  • Enterprise Service Bus (ESB)
• Starting from OAIS model, design Business Services:
  • Ingest
  • Preservation
  • Access
• Design lower level services to support those Business Services
  • Tool Services: Virus Scan, File Format Identification, etc.
  • Common Services: Logging, Authorization, etc.
• Composition of low-level services into business services made possible by ESB with standards-based middleware
• Flexibility and extensibility: add and replace services
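The composition idea on this slide can be illustrated with a minimal sketch. It is not ERA's implementation: the `ServiceBus` class, the service names, and the dict-based record are all hypothetical stand-ins, and a real ESB would route messages between networked services rather than call local functions.

```python
from typing import Callable, Dict, List

# A service takes a record (a plain dict here) and returns an updated record.
Service = Callable[[dict], dict]


class ServiceBus:
    """Toy stand-in for an ESB: a name-to-service registry with composition."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def register(self, name: str, service: Service) -> None:
        # Registering under an existing name replaces the old service.
        self._services[name] = service

    def compose(self, names: List[str]) -> Service:
        # Services are looked up at call time, so later replacements take effect.
        def business_service(record: dict) -> dict:
            for name in names:
                record = self._services[name](record)
            return record

        return business_service


def virus_scan(record: dict) -> dict:
    # Placeholder result only; no real scan is performed.
    return {**record, "virus_free": True}


def identify_format(record: dict) -> dict:
    # Hypothetical identification by file extension.
    extension = record["filename"].rsplit(".", 1)[-1].lower()
    return {**record, "format": extension}


def log_event(record: dict) -> dict:
    return {**record, "log": record.get("log", []) + ["ingested"]}


bus = ServiceBus()
bus.register("virus_scan", virus_scan)
bus.register("identify_format", identify_format)
bus.register("logging", log_event)

# A business service (Ingest) composed from tool and common services.
ingest = bus.compose(["virus_scan", "identify_format", "logging"])
result = ingest({"filename": "McClellan.TIFF"})
```

Because `compose` resolves names when the business service runs, re-registering `identify_format` with a different function changes the behavior of `ingest` without touching its definition, which is the "add and replace services" property the slide describes.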
Ingest Process
• Evolvable Architecture allows integration with various file identification tools: DROID, JHOVE, JAI, pCOS, etc.
• Web Services made out of Tools (COTS, FOSS)
  • Old tools can be replaced by new tools.
  • New tools can be added.
• This capability allows the system to leverage open-source software developed by the digital library and archiving community.
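As a rough illustration of swapping and adding identification tools, the sketch below chains several identifiers behind one service. The identifiers are hypothetical stand-ins that check well-known magic bytes; they do not call DROID, JHOVE, or any other real tool, and the `IdentificationService` class is an assumption made for this example.

```python
from typing import Callable, List, Optional

# An identifier inspects the leading bytes and returns a format name or None.
Identifier = Callable[[bytes], Optional[str]]


def identify_tiff(data: bytes) -> Optional[str]:
    # TIFF files start with "II*\0" (little-endian) or "MM\0*" (big-endian).
    return "TIFF" if data[:4] in (b"II*\x00", b"MM\x00*") else None


def identify_jpeg(data: bytes) -> Optional[str]:
    return "JPEG" if data[:3] == b"\xff\xd8\xff" else None


def identify_pdf(data: bytes) -> Optional[str]:
    return "PDF" if data[:5] == b"%PDF-" else None


class IdentificationService:
    """Tries each registered tool in order until one recognizes the data."""

    def __init__(self, tools: List[Identifier]) -> None:
        self._tools = list(tools)

    def add_tool(self, tool: Identifier) -> None:
        self._tools.append(tool)

    def replace_tool(self, old: Identifier, new: Identifier) -> None:
        self._tools[self._tools.index(old)] = new

    def identify(self, data: bytes) -> str:
        for tool in self._tools:
            fmt = tool(data)
            if fmt is not None:
                return fmt
        return "unknown"


service = IdentificationService([identify_tiff, identify_jpeg])
before = service.identify(b"%PDF-1.4 ...")  # no PDF tool registered yet
service.add_tool(identify_pdf)
after = service.identify(b"%PDF-1.4 ...")
```

Callers only ever see `identify`, so the set of tools behind it can change without affecting the ingest workflow.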
Preservation: Transformation Process
• Evolvable Architecture allows integration with various transformation tools depending on the file types.
• Tools are web services
• For the same file type, a new transformation tool with better conversion can be added.
• For a new file type, a new transformation tool can also be added and used.
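One way to picture tool selection by file type is a registry keyed on the source format. This is a sketch under assumptions: the `priority` field used to rank tools, the tool names, and the formats are all hypothetical, and the slides do not say how ERA chooses between competing tools.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TransformationTool:
    name: str
    source_format: str
    target_format: str
    priority: int  # hypothetical ranking: a higher value means a better conversion


class TransformationRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, List[TransformationTool]] = {}

    def add(self, tool: TransformationTool) -> None:
        self._tools.setdefault(tool.source_format, []).append(tool)

    def select(self, source_format: str) -> TransformationTool:
        candidates = self._tools.get(source_format)
        if not candidates:
            raise LookupError(f"no transformation tool for {source_format!r}")
        return max(candidates, key=lambda tool: tool.priority)


registry = TransformationRegistry()
registry.add(TransformationTool("doc-converter-v1", "doc", "pdf", priority=1))

# Same file type: a better tool is added and becomes the selection.
registry.add(TransformationTool("doc-converter-v2", "doc", "pdf", priority=2))

# New file type: a new tool is added and used.
registry.add(TransformationTool("tiff-to-jpeg", "tiff", "jpeg", priority=1))

chosen = registry.select("doc")
```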
Preservation: Future Choice of Strategy
• Evolvable Architecture allows adding a new branch for a new preservation strategy
System Architecture: Metadata Model
[Figure: Evolvable System Architecture]
ACE: Motivation
ERA needs the capability to create and manage different versions of an electronic record (for example, for preservation or redaction) and to relate them to a single logical entity.
[Figure: an image of Gen. George B. McClellan held as McClellan.tiff (TIFF) and McClellan.jpg (JPEG), labeled Digital Master Version and Online Access Version; an MS Word .doc memorandum with an Original Version and a Preservation Version, linked by an ERA Transformation Tool.]
PREMIS-based ACE
ACE Structure
• Multiplicity of Representations and Objects
• Usage
  • Preservation transformation
  • Redaction
• Relationships
  • With Business Objects
  • Between representations
  • Between Objects
    • Multiple pages of a digitized record
• Extensible implementation which could be used in future for:
  • Archival Description
  • Technical Metadata of Digitized Materials
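The structure above might be modeled roughly as follows. This is an illustrative sketch, not the ACE schema: the class and field names are hypothetical, and pairing the TIFF with the digital master and the JPEG with the online access version is an assumption based on common practice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileObject:
    filename: str
    sequence: int = 1  # page order within a multi-page digitized record


@dataclass
class Representation:
    usage: str  # e.g. "digital master", "online access", "redacted"
    objects: List[FileObject] = field(default_factory=list)
    derived_from: Optional["Representation"] = None  # relationship between representations


@dataclass
class LogicalRecord:
    """The single logical entity that all versions relate to."""

    title: str
    representations: List[Representation] = field(default_factory=list)

    def add(self, representation: Representation) -> Representation:
        self.representations.append(representation)
        return representation

    def by_usage(self, usage: str) -> List[Representation]:
        return [r for r in self.representations if r.usage == usage]


record = LogicalRecord("Image of Gen. George B. McClellan")
master = record.add(Representation("digital master", [FileObject("McClellan.tiff")]))
access = record.add(
    Representation("online access", [FileObject("McClellan.jpg")], derived_from=master)
)
```

Adding a redacted version later is just another `Representation` on the same `LogicalRecord`, so the link back to the single logical entity is never lost.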
Archival Asset Package [nist 2010]
• Adherence to Archival Information Package (AIP) in OAIS
• Self-contained digital object
• Data model used to Import & Export between services and systems
System Architecture: Content Server
[Figure: Evolvable System Architecture]
Content Server within OAIS Model
Content Server [syscon 2010]
• A Content Server is a logical construct to store and manage both data and metadata encapsulated in an Archival Asset Package (AAP):
  • Insert, Retrieve, Update, Delete and Search
• Expose a simple interface
  • Hide specific implementation of underlying storage management system
  • Allow the system to have various technologies
  • System can evolve to new technologies
• Allocation Policy can be based on business needs and requirements
  • Different data collections: Federal, Presidential, Legislative, Census
  • Security and access control considerations
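The "simple interface that hides the storage implementation" can be sketched as an abstract class with one interchangeable backend. The method signatures and the in-memory backend are assumptions for illustration; the slides name only the five operations.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class ContentServer(ABC):
    """The interface callers see, whatever storage system sits underneath."""

    @abstractmethod
    def insert(self, aap_id: str, data: bytes, metadata: Dict[str, str]) -> None: ...

    @abstractmethod
    def retrieve(self, aap_id: str) -> Tuple[bytes, Dict[str, str]]: ...

    @abstractmethod
    def update(self, aap_id: str, metadata: Dict[str, str]) -> None: ...

    @abstractmethod
    def delete(self, aap_id: str) -> None: ...

    @abstractmethod
    def search(self, key: str, value: str) -> List[str]: ...


class InMemoryContentServer(ContentServer):
    """Hypothetical backend: a dict stands in for a real storage system."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[bytes, Dict[str, str]]] = {}

    def insert(self, aap_id, data, metadata):
        if aap_id in self._store:
            raise KeyError(f"{aap_id} already exists")
        self._store[aap_id] = (data, dict(metadata))

    def retrieve(self, aap_id):
        return self._store[aap_id]

    def update(self, aap_id, metadata):
        data, current = self._store[aap_id]
        self._store[aap_id] = (data, {**current, **metadata})

    def delete(self, aap_id):
        del self._store[aap_id]

    def search(self, key, value):
        return sorted(i for i, (_, m) in self._store.items() if m.get(key) == value)


# A hypothetical allocation policy: one server per data collection.
servers: Dict[str, ContentServer] = {
    "Federal": InMemoryContentServer(),
    "Presidential": InMemoryContentServer(),
}
servers["Federal"].insert("aap-1", b"payload", {"collection": "Federal"})
```

Replacing `InMemoryContentServer` with a class backed by a different storage technology would not change any caller, which is how the Content Server concept supports evolution.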
Related Work
• Survey of system architectures designed and developed for digital preservation and archives
• Validation of our approach
  • Evolvability, extensibility and pluggability of services achieved by SOA
    • Planets project funded by the European Union
    • National Library of Australia
    • Portuguese National Archives RODA (Repository of Authentic Digital Objects), etc.
  • Content Server similar to the Content Manager used in the system of the Royal Dutch Library, which is based on the IBM Digital Information Archiving System (DIAS)
Conclusion: Summary
• ERA Reference Architecture is evolvable and extensible thanks to the synergy of the three pillars:
  • SOA Paradigm
  • Metadata Model
  • Content Server Concept
• Based on open standards: OAIS, PREMIS, XML, Web Services
• Implemented in the I3 release
• Benefits seen in Option Year 5
  • Ease of modifying and branching existing workflows
  • Reuse of underlying services
  • Facilitate development of Preservation Transformation framework
    • Creation of Transformation Strategy and Job Definition based on XForms and workflow middleware
• Positioned to take advantage of software tools and components developed by the digital preservation community
Conclusion: Future Direction
• NARA Internal Community
  • Externalize architectural components such as ESB and Web Services to promote reuse.
  • Publish well-defined system interfaces
  • Publish well-defined Data and Metadata Model
• Federal Agencies
  • Share experience with and learn from other agencies such as LOC, GPO, NASA, and others.
• Larger Community
  • Collaboration with other archives and digital libraries
  • Collaboration with the research community on Ingest, Preservation and Access functionalities.
  • Identify areas of possible usage of Free Open Source Software
Publications
[syscon 2010] Quyen L. Nguyen, Alla Lake and Mark Huber. “Evolvable and Scalable System of Content Servers for a Large Digital Preservation Archives”. Proceedings of 4th Annual IEEE Systems Conference, April 5-8, 2010, San Diego.
[nist 2010] Quyen L. Nguyen and Dyung Le. “Archival Asset Package Design Concept for an OAIS System”. Proceedings of US Workshop Roadmap development for Digital Preservation Interoperability Framework (DPIF). NIST, Gaithersburg, Maryland, March 29-31, 2010.
[mesoa 2009] Quyen L. Nguyen. “Towards a Design Approach for an Effective System Evolution of a Large Electronic Archive Information System”. Proceedings of 3rd International Workshop on a Research Agenda for Maintenance and Evolution of Service-Oriented Systems, September 20-26, 2009, Edmonton.
[balisage 2009] Quyen L. Nguyen and Betty Harvey. “Agile Business Objects Management Application for Electronic Records Archive Transfer Process”. Proceedings of Balisage, the Markup Conference 2009, Aug 11-14, 2009, Montreal.
References
• The Consultative Committee for Space Data Systems. “Reference Model for an Open Archival Information System (OAIS)”, 2002.
• Preservation Metadata: Implementation Strategies (PREMIS). http://www.loc.gov/standards/premis/.
• Robert Kahn and Robert Wilensky. “A Framework for Distributed Digital Objects”. International Journal on Digital Libraries (2006) 6(2): 115–123.
• Adam Farquhar and Helen Hockx-Yu. “Planets: Integrated Services for Digital Preservation”. The International Journal of Digital Curation, Issue 2, Volume 2, 2007.
• IBM DIAS for The Royal Dutch Library. http://www-935.ibm.com/services/nl/dias/ref/references.html.
• National Library of Australia. http://www.nla.gov.au/dsp/documents/itag.pdf
• Jose Carlos Ramalho et al. “RODA and Crib – a Service-Oriented Digital Repository”. http://repositorium.sdum.uminho.pt/bitstream/1822/8226/1/RodaAndCrib.pdf.
Thank You!
Meg Phillips, Meg.Phillips@nara.gov
Quyen Nguyen, qnguyen@nara.gov
Backup Slides
Evolution Relativity [balisage 2009]
• Upper timeline shows evolution of the system itself.
• Lower timeline shows evolution of the external systems that created the data to be archived.
• Note the lag between the two timelines (several years).
• Challenge: the system must evolve to use the current technologies of epoch Ta in order to provide long-term access to data born of technologies at time Tc.
High-level Architecture Roadmap
[Figure legend: In Place, In Progress, Future]
Planets Interoperability Framework
• Planets’ core components:
  • Service Bus, and workflow
  • Security, monitoring, transaction manager, etc.
• Evolvability and extensibility: allow plugging of third-party services
The Royal Dutch Library
• Based on IBM’s Digital Information Archiving System (DIAS)
• Core component is Content Manager to store and manage both data and metadata
• Library Server:
  • Cataloging and indexing of metadata
  • Facilitate search and retrieval
  • Security Control for access
• Object Server:
  • Store actual digital objects
Physical Implementation of AAP
• ZIP and URL options for encapsulating files.
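The ZIP option could look something like the sketch below, which packs data files and their metadata into one self-contained archive. The layout is an assumption made for this example: the `metadata.json` entry, the `data/` prefix, and the use of JSON are hypothetical and not taken from the AAP design.

```python
import io
import json
import zipfile
from typing import Dict, Tuple


def pack_aap(metadata: dict, files: Dict[str, bytes]) -> bytes:
    """Encapsulate data files and their metadata in a single ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("metadata.json", json.dumps(metadata, sort_keys=True))
        for name, payload in files.items():
            archive.writestr(f"data/{name}", payload)
    return buffer.getvalue()


def unpack_aap(blob: bytes) -> Tuple[dict, Dict[str, bytes]]:
    """Recover the metadata and data files from a packed archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        metadata = json.loads(archive.read("metadata.json"))
        files = {
            name[len("data/"):]: archive.read(name)
            for name in archive.namelist()
            if name.startswith("data/")
        }
    return metadata, files


package = pack_aap({"title": "Memorandum"}, {"memo.doc": b"original bytes"})
metadata, files = unpack_aap(package)
```

Since the package carries both data and metadata, it can be exported from one service and imported by another without a side channel.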
N-Part Identifier
• Uniqueness
  • Within ERA
  • Can be integrated with current and future standard protocols such as Handle, DOI, PURL, etc.
• Allow access to different levels of the ACE structure
• Identifiers can be assigned in a decentralized system
• Example:
  • ID of an Electronic Asset & its Metadata: 1.1-6-200902.1
  • The N-part ID can be made globally unique by prefixing it with the ERA namespace.
  • For instance, if “era.nara.gov” is used, then the above ID becomes:
    • http://www.era.nara.gov/1.1-6-200902.1
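Prefixing an N-part ID with a namespace is a simple string operation, sketched below. The function names are hypothetical, the base URL follows the form shown on the slide, and the meaning of the individual parts of the ID is not interpreted because the slides do not define it.

```python
ERA_BASE_URL = "http://www.era.nara.gov"  # URL form taken from the slide's example


def normalize(n_part_id: str) -> str:
    # Treat en-dashes as plain hyphens so the ID is URL-friendly.
    return n_part_id.replace("\u2013", "-").strip()


def to_global_uri(n_part_id: str, base_url: str = ERA_BASE_URL) -> str:
    """Make a locally unique N-part ID globally unique by adding the namespace."""
    return f"{base_url}/{normalize(n_part_id)}"


def from_global_uri(uri: str, base_url: str = ERA_BASE_URL) -> str:
    """Recover the local N-part ID from its global form."""
    prefix = f"{base_url}/"
    if not uri.startswith(prefix):
        raise ValueError(f"{uri!r} is not in namespace {base_url!r}")
    return uri[len(prefix):]


uri = to_global_uri("1.1-6-200902.1")
local_id = from_global_uri(uri)
```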
Possible Support of HTTP Protocol
Federators Global & Local
• Standard Operations and Interfaces:
  • Put AAP
  • Get AAP
  • Update AAP
  • Delete AAP
  • Search:
    1. Metadata
    2. Asset
Potential Access of Records in ERA
• If the requested asset is in OPA’s local storage, it is sent directly to the requestor.
• If the requested asset with the given URI is in ERA, the request is pooled and forwarded to the ERA system, which pushes the asset.
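The two access paths can be sketched as a front end that serves local hits immediately and holds misses for forwarding. This is a guess at the flow, not a description of OPA or ERA: the `AccessFrontEnd` class, the `era_fetch` callback, and reading "pooled" as "queued until forwarded" are all assumptions.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional


class AccessFrontEnd:
    """Hypothetical access layer with local storage and a pool of pending requests."""

    def __init__(self, local_storage: Dict[str, bytes], era_fetch: Callable[[str], bytes]) -> None:
        self._local = dict(local_storage)
        self._era_fetch = era_fetch  # stand-in for the ERA system pushing an asset
        self._pending: Deque[str] = deque()

    def request(self, uri: str) -> Optional[bytes]:
        # Local hit: send the asset straight back to the requestor.
        if uri in self._local:
            return self._local[uri]
        # Otherwise pool the request so it can be forwarded to ERA.
        self._pending.append(uri)
        return None

    def forward_pending(self) -> Dict[str, bytes]:
        # Forward pooled requests in arrival order and collect the pushed assets.
        delivered: Dict[str, bytes] = {}
        while self._pending:
            uri = self._pending.popleft()
            delivered[uri] = self._era_fetch(uri)
        return delivered


front_end = AccessFrontEnd(
    {"uri-a": b"local copy"},
    era_fetch=lambda uri: b"from-era:" + uri.encode(),
)
local_hit = front_end.request("uri-a")
miss = front_end.request("uri-b")
pushed = front_end.forward_pending()
```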
Potential Reuse of Services
• Cross-use of Services facilitated by ESB