Archive Ingest Redesign
March 14, 2003
Implementation Review
Archive Ingest Redesign high-level requirements
• Port Ingest system from Open VMS to Unix
  • Ingest will be the last remaining back-end function on Open VMS.
  • Ingest will run under Solaris on the 15k
• Make Ingest scalable for future increase in data volume post-SM4
• Improve throughput and reliability
• Decouple Ingest from Distribution software for ease of operation and maintenance
• Improve system maintainability
  • Facilitate Ingest changes that are driven by changes in data structure during science instrument lifetimes.
Current OPUS data processing / DADS Ingest interface
• Historically, data processing and archive systems have developed independently.
  • Data processing system went from PODPS to OPUS.
  • Archive system went from DMF to ST-DADS.
  • In the past, these systems have not even operated within the same security environment.
• This paradigm does not work with the current archive philosophy.
  • On-the-Fly Reprocessing (OTFR) requires integration of data processing and archive distribution functionality.
  • Enhanced data processing, particularly database catalogs, requires closer coupling of data processing and archive systems.
• To address this change, software maintenance for data processing and archive systems is now in one branch.
Current OPUS data processing / DADS Ingest interface (cont.)
Ingest Functionality
• Extract metadata from data header keyword values and populate archive science catalog
• Write data files to archive storage media
• Catalog location and properties of files in archive database
• Validate integrity of data files
• Set proprietary status of data files
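The Ingest functions listed on the "Ingest Functionality" slide can be pictured with a minimal Python sketch. It is an illustration only, not the OPUS or DADS implementation: the function names (`make_card`, `parse_header`, `ingest_record`, `verify`), the record layout, the SHA-256 checksum, and the 365-day proprietary period are all assumptions made for this example. The only format detail relied on is that FITS headers consist of fixed-width 80-character cards.

```python
import hashlib
from datetime import date, timedelta

CARD_LEN = 80  # FITS header cards are fixed-width 80-character records


def make_card(keyword, value):
    """Hypothetical helper: build one 80-character 'KEYWORD = value' card."""
    return f"{keyword:<8}= {value:<20}".ljust(CARD_LEN)


def parse_header(header_text):
    """Return {keyword: value} for the value cards of a FITS-style header.

    Simplification: an inline comment is split off at the first '/', so
    string values that themselves contain '/' are not handled.
    """
    metadata = {}
    for start in range(0, len(header_text), CARD_LEN):
        card = header_text[start:start + CARD_LEN]
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY, or blank card: nothing to extract
        value = card[10:].split("/", 1)[0].strip()
        metadata[keyword] = value.strip("'").strip()
    return metadata


def ingest_record(file_name, header_text, data_bytes, proprietary_days=365):
    """Build one catalog record: metadata, file properties, release date."""
    metadata = parse_header(header_text)
    observed = date.fromisoformat(metadata["DATE-OBS"])
    return {
        "file_name": file_name,
        "size_bytes": len(data_bytes),
        "checksum": hashlib.sha256(data_bytes).hexdigest(),
        "metadata": metadata,
        # Assumed rule: data become public a fixed number of days after observation.
        "release_date": observed + timedelta(days=proprietary_days),
    }


def verify(record, data_bytes):
    """Integrity check: recompute the checksum and compare with the catalog."""
    return hashlib.sha256(data_bytes).hexdigest() == record["checksum"]


# Example with made-up header values.
header = "".join([
    make_card("INSTRUME", "'WFPC2   '"),
    make_card("DATE-OBS", "'2003-03-14'"),
    "COMMENT example header".ljust(CARD_LEN),
    "END".ljust(CARD_LEN),
])
record = ingest_record("example.fits", header, b"pixel data")
print(record["metadata"]["INSTRUME"], verify(record, b"pixel data"))
```

Keeping extraction, cataloging, and validation in one record-building step mirrors the single data flow the redesign aims for, but the real system would write to archive media and a database rather than return a dictionary.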
Goals of Ingest Redesign project
• Make Ingest more compatible with current science instrument design
  • It is almost impossible to enhance the fragile Open VMS DADS system for new science instruments without breaking existing functionality.
• Bring Ingest requirements up to date
  • No longer support GEIS format in archive
  • Create final archive for HST first-generation science instruments
  • No ingest of raw engineering data or subset engineering data
  • CCS is now the HST engineering data archive
• Improve operator control of the system
Status of Ingest Redesign project
• Ingest Ops Concept complete and distributed on February 20, 2003
• Requirement definition in progress
Highlights of Ingest Ops Concept
• Represents a significant simplification in the data system architecture
  • Deploy Ingest as a natural extension of data processing pipelines.
• Build Ingest on OPUS architecture
  • OPUS software system has over 7 years of operational experience on HST
  • Risk mitigated by using a proven architecture
  • Time to deployment will be reduced
• Consistent with JWST concept for data processing and archive systems
  • Same software will be used for both HST and JWST
Highlights of Ingest Ops Concept (cont.)
Highlights of Ingest Ops Concept (cont.)
• Reduces amount of data shuffling and conversions between different software systems
  • E.g., current WFPC2 science data processing pipeline
Highlights of Ingest Ops Concept (cont.)
• Reduces amount of data shuffling and conversions between different software systems (cont.)
  • Future WFPC2 science data processing pipeline
Benefits of Ops Concept
• All operations on data handled in a single data flow.
  • Create FITS file, populate header keyword values, extract metadata from keyword values, populate science component of archive catalog
  • No duplication of development effort or functionality
  • Consistent development, testing, and operations help ensure quality of archive catalog
• Facilitates delivery of header changes
  • Keyword changes can be built, tested, and deployed within a single subsystem
Benefits of Ops Concept (cont.)
• Decouples Ingest and Distribution software
  • Although both will utilize much of the same hardware, such as the Data depot, 15k, and database
• Provides opportunity for consolidation of OPUS- and DADS-based operator tools
• Provides opportunity to automate data validation
Ingest Redesign Schedule
• Ingest Operational Concept complete and distributed on February 20, 2003.
• Requirement specification in progress
  • To be completed by April 15, 2003
• The remainder of the schedule is very preliminary pending requirement scoping and build planning
  • Design review: June 2003
  • Phased development in OPUS builds between June 2003 and March 2004
  • System tests: March – April 2004
  • Deploy system: May 2004
Summary of Data Systems software ports to Solaris
• Over the last few years, HST data processing systems have been ported from Open VMS to Solaris:
  • OPUS infrastructure
    • Ported to Unix for FUSE – February 1998
    • Current version tested under Solaris
  • HST Science Instrument pipeline applications
    • Ported to Tru64 Unix – October 1999
    • Testing on Solaris in progress, minor changes anticipated
  • HST Engineering Data Processing pipelines
    • Ported to Solaris – February 2003
Summary of Data Systems software ports to Solaris (cont.)
• HST archive systems port from Open VMS to Solaris in progress:
  • Data Distribution system
    • Completion expected in summer 2003
  • Archive Ingest system
    • Completion expected in spring 2004
• With completion of the Archive Ingest system redesign project, all data systems will be running under Solaris.
• No other major system enhancement projects expected through end of HST mission.