NPP Atmosphere PEATE Critical Design Review. Part 2: Science Processing System Scott Mindock. Presented by the Atmosphere PEATE Team Space Science and Engineering Center University of Wisconsin-Madison 5 June 2008. Agenda. Design System Subsystems Software Development and Verification
NPP Atmosphere PEATE Critical Design Review • Part 2: Science Processing System • Scott Mindock • Presented by the Atmosphere PEATE Team • Space Science and Engineering Center • University of Wisconsin-Madison • 5 June 2008
Agenda Design System Subsystems Software Development and Verification Science Investigation Support Logistics
Definitions • UML - Unified Modeling Language • Use Case - Captures requirements at functional level • Activity - Describes the steps of a use case • Package - Shows software structure and dependencies • Class - Shows software structure and dependencies • Web Services - SOAP XML-based communications
Sample Activity Diagram • Like a flowchart • Ovals = Activity • Rectangle = Data • Actions A and B are decoupled • Dot = start • Circle = end
Design: Goals (1 of 3) • Maintainability - System lifetime spans years • Reusability - Subsystems and design patterns • Testability - Subsystem dependencies are managed • Scalability - Design must scale to larger and smaller systems
Design: Methodology (2 of 3) • Learn from others - IDPS, Clusters, Ocean PEATE, Disney Parkwide • Leverage existing proven technologies and standards • Java • Web Services - Provide communications between subsystems • Apache Tomcat - Web application server • Apache Axis2 - Web application providing web services • JAXB - Java Architecture for XML Binding • Ant - XML-structured, Make-like build tool • JUnit - Java unit testing • Eclipse - Java IDE • Subversion - Revision control • Prototype key features - Demonstration projects • Loosely coupled system - Manage dependencies, define interfaces
Design: Patterns (3 of 3) • All systems use XML configuration. • All configurations have an XML schema. • System persistence leverages schema-based code generation. • All systems can persist to DB or XML (scalable). • Major components are deployed as web services (decoupled, scalable). • Test patterns are used for scenario-based testing and system verification.
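A minimal sketch of the "every configuration has an XML schema" pattern, using only the JDK's javax.xml.validation API rather than JAXB code generation. The `ingest` element, its `protocol` and `url` children, and the class name are hypothetical; they are not taken from the Atmosphere PEATE schemas.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/** Checks an XML configuration string against an XML Schema (illustrative only). */
class ConfigSchemaCheck {

    // Hypothetical schema: an ingest configuration with a protocol and a URL.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='ingest'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='protocol'>"
      + "<xs:simpleType><xs:restriction base='xs:string'>"
      + "<xs:enumeration value='FTP'/>"
      + "<xs:enumeration value='HTTP'/>"
      + "<xs:enumeration value='RSYNC'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:element>"
      + "<xs:element name='url' type='xs:anyURI'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML document conforms to SCHEMA, false otherwise. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<ingest><protocol>FTP</protocol>"
                    + "<url>ftp://example.org/data</url></ingest>";
        String bad = "<ingest><protocol>GOPHER</protocol>"
                   + "<url>ftp://example.org/data</url></ingest>";
        System.out.println("good config valid: " + isValid(good));
        System.out.println("bad config valid:  " + isValid(bad));
    }
}
```

Validating at load time means a malformed configuration is rejected before any subsystem starts, which is one plausible reason to pair every configuration with a schema.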
Subsystems • Ingest : ING • Brings data into the Atmosphere PEATE • Supports FTP, HTTP and RSYNC • Data Management System : DMS • Stores data in the form of files. • Provides a Web Service to locate, store and retrieve files. • Computational Resource Grid : CRG • Provides Web Service to locate, store and retrieve jobs • Algorithm : ALG • Consumes jobs • Runs algorithms in form of binaries • Algorithm Rule Manager: ARM • Combines data with algorithms to produce jobs • Provides Web Service interface to locate, store and retrieve rules • Production Rules: RUL • XML / Java packages to express production rules
Subsystem Relationships • DMS : Data Management System - Stores Data • CRG : Computational Resource Grid - Job Management • ARM : Algorithm Rule Management - Applies Product Rules to Data • ING : Ingest System - Brings Data into System • RUL : Product Production rules - Ties software packages to data • ALG : Algorithm Host - Runs software packages • Arrows denote dependency • System design minimizes dependencies • Eases maintenance
Subsystem Relationships @ PDR • DMS : Data Management System - Stores Data • CRG : Computational Resource Grid - Processes Data • ARM : Algorithm Rule Manager - Applies Product Rules to Data • ING : Ingest System - Brings Data into System
System Components, Java Packages • DMS : Data Management System - Stores Data • CRG : Computational Resource Grid - Job Management • ARM : Algorithm Rule Management - Applies Product Rules to Data • ING : Ingest System - Brings Data into System • RUL : Product Production rules - Ties software packages to data • ALG : Algorithm Host - Runs software packages
ING: Ingest, brings data into the system ( 1 of 2 ) • Customization allowed in the form of scripts (Bash, Python) • QC • Quick Look • Metadata extraction • Notices missing or late data
ING Configuration ( 2 of 2 ) • Schema-based persistence • XML ingest type configuration • Support for FTP, RSYNC, HTTP
DMS: Data Management System ( 1 of 4 ) • Provides well-defined interface deployed as a web service • DMS is autonomous • Provides storage • Provides catalog • Spans multiple file servers on a network • DMS = Linux file system + PostgreSQL database
DMS File Properties ( 2 of 4 ) • Each file in DMS has an associated entry in the catalog • Important file characteristics are tracked • Files are distributed on storage devices on entry
DMS Configuration ( 3 of 4 ) • Schema defines DMS directory structure • DMS creates directories on installation • DMS uses round-robin method to fill file systems • Configurations not utilizing DB are restricted to one machine • Max disk utilization is specified
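The round-robin fill with a configured maximum disk utilization could be sketched as below. The class and field names, and the choice to skip a file system that would exceed the limit, are assumptions made for illustration; the slide does not say how DMS handles a full file system.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Round-robin file placement with a utilization cap (illustrative only). */
class RoundRobinPlacer {

    /** A storage area; name, capacity, and usage are hypothetical fields. */
    static final class StorageArea {
        final String name;
        final long capacityBytes;
        long usedBytes;

        StorageArea(String name, long capacityBytes) {
            this.name = name;
            this.capacityBytes = capacityBytes;
        }

        /** Fraction of capacity in use if a file of the given size were added. */
        double utilizationAfter(long sizeBytes) {
            return (double) (usedBytes + sizeBytes) / capacityBytes;
        }
    }

    private final List<StorageArea> areas;
    private final double maxUtilization; // e.g. 0.5 means fill to at most 50%
    private int next = 0;                // index of the area to try first

    RoundRobinPlacer(List<StorageArea> areas, double maxUtilization) {
        this.areas = areas;
        this.maxUtilization = maxUtilization;
    }

    /** Picks the next area in rotation that stays within the cap, if any. */
    Optional<StorageArea> place(long sizeBytes) {
        for (int i = 0; i < areas.size(); i++) {
            int index = (next + i) % areas.size();
            StorageArea area = areas.get(index);
            if (area.utilizationAfter(sizeBytes) <= maxUtilization) {
                area.usedBytes += sizeBytes;
                next = (index + 1) % areas.size(); // resume after the chosen area
                return Optional.of(area);
            }
        }
        return Optional.empty(); // every area would exceed the cap
    }

    public static void main(String[] args) {
        RoundRobinPlacer placer = new RoundRobinPlacer(
            Arrays.asList(new StorageArea("disk1", 100), new StorageArea("disk2", 100)),
            0.5);
        for (long size : new long[] {30, 30, 20, 30}) {
            String target = placer.place(size).map(a -> a.name).orElse("no space");
            System.out.println(size + " bytes -> " + target);
        }
    }
}
```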
DMS Components and Deployment ( 4 of 4 ) • File system - holds files • Database - holds file information • Public Access - implements DMS interface • Worker - manages file system • Two flavors • With DB - spans multiple machines • Without DB - single machine
CRG : Provides processors with jobs • Provides well-defined interface deployed as a web service • Accepts job requests • Provides job status • Monitors job state • Scalable • Testable
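As a rough illustration of accepting job requests and reporting job state, the sketch below keeps jobs in memory. The state names (QUEUED, RUNNING, SUCCEEDED, FAILED) and all method names are assumptions; the real CRG exposes its interface as a web service and its actual states are not listed on the slide.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;

/** In-memory job queue with per-job state tracking (illustrative only). */
class JobGridSketch {

    /** Hypothetical job states. */
    enum State { QUEUED, RUNNING, SUCCEEDED, FAILED }

    private final Map<Integer, State> states = new HashMap<>();
    private final Queue<Integer> queue = new ArrayDeque<>();
    private int nextId = 1;

    /** Accepts a job request and returns its identifier. */
    synchronized int submit() {
        int id = nextId++;
        states.put(id, State.QUEUED);
        queue.add(id);
        return id;
    }

    /** Hands the oldest queued job to a worker, marking it RUNNING. */
    synchronized Optional<Integer> claim() {
        Integer id = queue.poll();
        if (id == null) {
            return Optional.empty();
        }
        states.put(id, State.RUNNING);
        return Optional.of(id);
    }

    /** Records the outcome of a running job. */
    synchronized void finish(int id, boolean succeeded) {
        if (states.get(id) != State.RUNNING) {
            throw new IllegalStateException("job " + id + " is not running");
        }
        states.put(id, succeeded ? State.SUCCEEDED : State.FAILED);
    }

    /** Reports the current state of a job, or empty if the id is unknown. */
    synchronized Optional<State> status(int id) {
        return Optional.ofNullable(states.get(id));
    }

    public static void main(String[] args) {
        JobGridSketch grid = new JobGridSketch();
        int id = grid.submit();
        System.out.println("after submit: " + grid.status(id));
        grid.claim();
        System.out.println("after claim:  " + grid.status(id));
        grid.finish(id, true);
        System.out.println("after finish: " + grid.status(id));
    }
}
```

Keeping state transitions behind a small interface like this is what makes the component easy to test in isolation from the workers that consume jobs.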
ALG: Runs software packages • One or more per node • Retrieves data from DMS • Retrieves and runs software packages • Saves results to DMS • Consumes user jobs by using CRG Web Service
ARM: Binds data to software packages • Provides well-defined interface deployed as a web service • Assigns jobs to CRG • Monitors data in DMS • Monitors the status of jobs in CRG • Applies rules to data • Volatile logic lives here • Provides extension point for rules • Rules can be added or removed dynamically
RUL • Mechanism to deploy algorithms into APSPS • Rules have names • Rules use regular expressions • URL of software package specified • Destination DMS can be specified • Days of lease can be specified
Assessment: Rules and Software Packages • Used by ARM • Specified by user • Specifies software package to run (SoftwarePackage) • Specifies data of interest (FilterRegex) • Specifies product output location (DmsUrl) • Specifies product lifetime (DaysOfLease)
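The rule fields named on the slide (SoftwarePackage, FilterRegex, DmsUrl, DaysOfLease) suggest a structure like the sketch below, in which each file name that matches a rule's regular expression yields one job. The Java class layout, the `Job` type, the full-match semantics, and the example file names and URLs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Matches file names against production rules to create jobs (illustrative only). */
class RuleSketch {

    /** A production rule; field names mirror the slide, the class itself is assumed. */
    static final class Rule {
        final String name;
        final Pattern filterRegex;    // data of interest
        final String softwarePackage; // URL of the package to run
        final String dmsUrl;          // where products are stored
        final int daysOfLease;        // product lifetime in days

        Rule(String name, String filterRegex, String softwarePackage,
             String dmsUrl, int daysOfLease) {
            this.name = name;
            this.filterRegex = Pattern.compile(filterRegex);
            this.softwarePackage = softwarePackage;
            this.dmsUrl = dmsUrl;
            this.daysOfLease = daysOfLease;
        }

        /** Assumes the regex must match the whole file name. */
        boolean matches(String fileName) {
            return filterRegex.matcher(fileName).matches();
        }
    }

    /** A unit of work binding one input file to one software package. */
    static final class Job {
        final String ruleName;
        final String inputFile;
        final String softwarePackage;

        Job(String ruleName, String inputFile, String softwarePackage) {
            this.ruleName = ruleName;
            this.inputFile = inputFile;
            this.softwarePackage = softwarePackage;
        }
    }

    /** Produces one job for every (rule, file) pair where the rule matches. */
    static List<Job> apply(List<Rule> rules, List<String> fileNames) {
        List<Job> jobs = new ArrayList<>();
        for (Rule rule : rules) {
            for (String fileName : fileNames) {
                if (rule.matches(fileName)) {
                    jobs.add(new Job(rule.name, fileName, rule.softwarePackage));
                }
            }
        }
        return jobs;
    }

    public static void main(String[] args) {
        Rule rule = new Rule("example-rule", "granule_\\d{8}\\.h5",
            "http://example.org/packages/example.tar.gz",
            "http://example.org/dms", 30);
        List<String> files = Arrays.asList("granule_20080605.h5", "readme.txt");
        for (Job job : apply(Arrays.asList(rule), files)) {
            System.out.println(job.ruleName + " -> " + job.inputFile);
        }
    }
}
```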
Agenda Design System Subsystems Software Development and Verification Development process Testing Strategy Defect Reporting Defect Correction Configuration Management Nightly Build Unit and Regression Testing Testing Scenarios Requirements Mapping Science Investigation Support Logistics
Development Process: Spiral method Design Implement Build = Deploy to Operations Test Deploy
Testing Strategy ( 1 of 2 ) • Employ standard software industry practices • Automate with Ant (Make-like, XML-based) • Test with JUnit (Java unit testing) • Increases system quality • Tests are reproducible • Tests are run more often than they would be if they were manual • Tests are improved over time • Tests are configurable • We don't just build; the process includes testing and verification
Defect Reporting: Bugzilla ( 1 of 2 ) • Anyone with an account can report a bug • Reporter chooses software system • Reporter enters bug information • Owner of project is notified automatically by email • Owner prioritizes • Bugzilla • Defects • Features • Tasks
Defect Correction • Verify software before starting • Verify problem exists • Fix and verify • Add back to code base