NASDA IV&V ………..
July 29th, 2003 @ NASA IV&V Facility
Naoki Ishihama, NASDA
Hideki Nomoto, JAMSS/MIT
Haruka Nakao, JAMSS
Contents
• Introduction of NASDA IV&V approaches
• Project IV&V activities’ status and findings
  - Satellites IV&V (SELENE, WINDS)
  - JEM Pre-Shipping IV&V
  - JEM IV&V
  - HTV IV&V
  - Centrifuge IV&V
  - H-IIA Safety Range Ground system IV&V
• New Methodology Research
• Software QA activity and Process Improvement
1. Introduction of NASDA IV&V approaches
NASDA IV&V Functions
IV&V Planning
• Project IV&V (Assessment Report, Project Evaluation)
• Development Independent Assessment
• Shipping Validation, Pre-Launch Review
• Non-conformance Investigation/Assessment
• Research of IV&V methodology and tools
• Criteria
• Handbook
• Tools
• Software Process Improvement
• Software Quality Assurance
IV&V approaches
2. Project IV&V activities’ status and findings
Satellite IV&V Activity - WINDS, SELENE -
Satellite IV&V Project Characteristics
• Basically a non-2FT system (fail safe, 1 fail operative)
• About one third of the software/firmware is reused.
• The high-level specification defines detailed control theory and logic.
• High-speed calculation has become a requirement.
Modeling of Satellite Attitude Control Software Requirements using UPPAAL
• Real-time application
• Time constraints are hard to deal with
“What” should we do, and “how”?
• We used UPPAAL.
• UPPAAL:
  - Models networks of timed automata extended with data types
  - Verifies invariant and reachability properties of the model
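To make "invariant" and "reachability" concrete, here is a minimal Python sketch of an explicit-state search over a toy automaton with one integer clock. It is not UPPAAL and does not reproduce its dense-time semantics; the mode names, the clock bound and the timing rules are all invented for illustration.

```python
from collections import deque

# Hypothetical attitude-control automaton with one discrete clock.
# A state is (mode, clock). Names and time bounds are assumptions, not from the slides.
MAX_CLOCK = 5

def successors(state):
    mode, clock = state
    nxt = []
    if clock < MAX_CLOCK:
        nxt.append((mode, clock + 1))      # let one time unit pass
    if mode == "IDLE":
        nxt.append(("SLEWING", 0))         # command received, clock reset
    elif mode == "SLEWING" and clock >= 2:
        nxt.append(("HOLD", 0))            # a slew takes at least 2 ticks
    elif mode == "HOLD":
        nxt.append(("IDLE", 0))
    return nxt

def explore(initial):
    """Breadth-first search returning every reachable (mode, clock) state."""
    seen, queue = {initial}, deque([initial])
    while queue:
        for s in successors(queue.popleft()):
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return seen

reachable = explore(("IDLE", 0))
# Reachability query: can the system ever reach HOLD?
hold_reachable = any(mode == "HOLD" for mode, _ in reachable)
# Invariant query: "a slew never lasts longer than 3 ticks" (violated in this toy model).
slew_bounded = all(clock <= 3 for mode, clock in reachable if mode == "SLEWING")
print(hold_reachable, slew_bounded)
```

The violated invariant is the interesting case: the search finds a state such as ("SLEWING", 5) that the requirement forbids, which is the kind of timing gap a model checker is meant to expose.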
Modeling of Satellite Attitude Control Software Requirements using UPPAAL
Demo #1
Pre-shipping Verification for Japanese Experiment Module (JEM) Software
Pre-shipping Verification for JEM Software
Project Characteristics
• All development activity had already been completed → Reverse engineering
• Relatively monstrous and complex on-board system (DB) → Need tool support
• Long-term development and operation → Need to keep the design rationale and records
Equivalency Checking Tool of source code (ECT)
• The Equivalency Checking Tool of source code (ECT) is a reverse engineering tool that generates a flow diagram from source code.
• Built by modifying GCC (GNU Compiler) 2.95.3 and DOT (included in the Graphviz tool set)
Equivalency Checking Tool (ECT)
Outline of tool procedure
Equivalency Checking Tool (ECT)
Source Code → Reverse Engineering → Flow Diagram

void start_waiting_jobs ()
{
  struct child *job;

  if (waiting_jobs == 0)
    return;

  do {
    /* Check for recently deceased descendants. */
    reap_children (0, 0);

    /* Take a job off the waiting list. */
    job = waiting_jobs;
    waiting_jobs = job->next;

    /* Try to start that job. We break out of the loop as soon
       as start_waiting_job puts one back on the waiting list. */
  } while (start_waiting_job (job) && waiting_jobs != 0);

  return;
}
Equivalency Checking Tool (ECT)
Demo #2
Equivalency Checking Tool (ECT)
Medium Code

digraph start_waiting_jobs {
  label="job.c:1583: void start_waiting_jobs()";
  node [shape=box];
  edge [weight=100];
  start [shape=ellipse, height=0.01, label="START"];
  subgraph if_1586 {
    node [shape=box];
    edge [weight=100];
    Na1586 [shape=diamond, label="1586: if (waiting_jobs==0)\n"...
    start -> Na1586;
    Nb1587 [shape=ellipse, label="1587: return;", fontsize=14];
    Na1586 -> Nb1587 [taillabel="Y"];
  }
  …….
}
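The medium code above is plain DOT text, so its shape can be sketched in a few lines of Python. This is only an illustration of emitting such text from a hand-written node and edge list; the ECT derives its nodes from GCC, which this sketch does not attempt, and the function name `to_dot` is hypothetical.

```python
def to_dot(name, nodes, edges):
    """Render nodes {id: (shape, label)} and edges [(src, dst, taillabel)] as DOT text."""
    lines = [f"digraph {name} {{", "  node [shape=box];"]
    for node_id, (shape, label) in nodes.items():
        lines.append(f'  {node_id} [shape={shape}, label="{label}"];')
    for src, dst, tail in edges:
        attr = f' [taillabel="{tail}"]' if tail else ""
        lines.append(f"  {src} -> {dst}{attr};")
    lines.append("}")
    return "\n".join(lines)

# Node ids follow the naming seen in the medium code (line number in the id).
nodes = {
    "start": ("ellipse", "START"),
    "Na1586": ("diamond", "1586: if (waiting_jobs == 0)"),
    "Nb1587": ("ellipse", "1587: return;"),
}
edges = [("start", "Na1586", None), ("Na1586", "Nb1587", "Y")]
dot_text = to_dot("start_waiting_jobs", nodes, edges)
print(dot_text)
```

The resulting text can be passed to Graphviz's `dot` command to draw the flow diagram.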
Lessons Learned
• The flow diagram and additional information can be clearly compared with the detailed design specification to determine equivalence and differences.
• The comparison reveals misunderstandings and miscommunication between programmer and system designer.
Lightweight code checking tool: Splint
• Its comments (warnings) differ from those of the compiler.
• Most security attacks exploit instances of well-known classes of implementation flaws.
• Many of these flaws could be detected and eliminated before software is deployed.
• Such checking has not been integrated into the software development process.
We used Splint for:
• Lightweight static analysis to detect common security vulnerabilities
• Coding mistakes (to avoid specific code patterns which caused past non-conformances)
Benefit of a Lint tool
The GCC compiler cannot detect problems in this area.
Lightweight code checking tool: Splint

void attitude_jobs ()
{
    ……
    ……
    switch (id_flg) {
    case 1: control(); ……
    case 2: stop_attitude(); ……
    case 3: restart_attitude(); ……
    ……
    }

But case 4? case 5? ….. default? → undefined
“Who am I? Where am I? What should I do?”
GCC: No warning!
Lightweight code checking tool: Splint
Demo #3
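The kind of check illustrated here, a `switch` with no `default` label, can be approximated with a very naive text scan. The Python sketch below is an assumption-laden illustration, not how Splint works: it ignores comments, string literals and the preprocessor, and a `default` in a nested `switch` would hide a missing one in the outer `switch`. The function name and the sample source are hypothetical.

```python
import re

def switches_without_default(c_source):
    """Return the controlling expressions of switch statements that lack a default label."""
    missing = []
    for match in re.finditer(r"\bswitch\s*\(([^)]*)\)\s*\{", c_source):
        depth, pos = 1, match.end()
        while pos < len(c_source) and depth > 0:   # walk to the matching closing brace
            if c_source[pos] == "{":
                depth += 1
            elif c_source[pos] == "}":
                depth -= 1
            pos += 1
        body = c_source[match.end():pos]
        if not re.search(r"\bdefault\s*:", body):
            missing.append(match.group(1).strip())
    return missing

# Made-up, completed version of the slide's fragment.
sample = """
void attitude_jobs(void) {
    switch (id_flg) {
    case 1: control(); break;
    case 2: stop_attitude(); break;
    case 3: restart_attitude(); break;
    }
}
"""
print(switches_without_default(sample))
```

Adding a `default:` branch that puts the system into a defined state makes the report come back empty.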
Japanese Experiment Module (JEM) IV&V activity
Our first target of IV&V efforts: wrap-up of these 5 years
The biggest issue of JEM Software
• Complexity of automated FDIR
• Over 150 procedures which could conflict with each other when executed simultaneously.

if (MTL temp > max)
    stop pump-a
    close valve-a
    shutdown payload racks
    set pump-b 3500rpm
    open cross-over
    set pump-b 7000rpm

if (MTL flow rate < min)
    stop pump-a
    close valve-a
    set pump-b 3500rpm
    open cross-over
    set pump-b 7000rpm

if (NASA_H-EX_M is error)
    shutdown payload racks
    set pump-a 3500rpm
    open H-EX bypass valve
    open cross-over
    set pump-a 6000rpm
Background
• Formerly simple procedures had evolved into awfully complicated programs as system/safety constraints were recognized by the designers.

@ PDR
if (MTL flow rate < min)
    stop pump-a
    close valve-a
    set pump-b 3500rpm
    open cross-over
    set pump-b 7000rpm

@ CDR
if (MTL flow rate < min)
    mask LTL flow rate check
    mask MTL temperature check
    if (current mode != 1WCL)
        if (NASA H-EX-MTL != error)
            if (current flow rate > 3500)
                stop low priority procs
                wait 15 sec
                send message "MTL-stop procedure going on. Do not bother me."
                stop pump-a
            else
                ......
Initial JEM IV&V approach (~ 1999)
• Build a dynamic simulator to check conflicts
  - Difficult to reproduce a high-fidelity execution environment (especially for fluid dynamics in zero G)
  - The cost of execution was enormous for good coverage of possible TIMING
Corrected JEM IV&V approach (2000~)
• Build a state-machine model for static analysis
• Focus more on the software architecture
• Focus more on the development of the analysis algorithm
• Provide a top-down overhaul option (in many cases it turns out to be the cheaper way in the end)
Examples (Thermal control FDIR redesign)
• Complete matching survey for coupling of FDIR procedures
  - Simply extract critical commands and their execution conditions to build a model
  - Check conflicts with a simple matching algorithm for every possible coupling of the commands
• Detected over 800 conflicts in the first round
• Eliminated in the second round (iteration cycle)
• Overhaul of the architecture performed for:
  - Parallel execution control mechanism
  - Rebuilding the sequence and grouping from scratch
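The matching survey can be pictured with a small Python sketch. It encodes the three example procedures from the "biggest issue" slide as (target, final commanded state) pairs and compares every pair of procedures. The encoding and the conflict rule (same target, different commanded state) are assumptions made for illustration; the slides do not give the actual matching algorithm.

```python
from itertools import combinations

# Critical commands per procedure as (target, final commanded state).
# Intermediate steps (e.g. pump-b 3500rpm before 7000rpm) are dropped: an assumption.
PROCEDURES = {
    "MTL temp > max": [("pump-a", "stop"), ("valve-a", "close"),
                       ("payload racks", "shutdown"), ("pump-b", "7000rpm"),
                       ("cross-over", "open")],
    "MTL flow rate < min": [("pump-a", "stop"), ("valve-a", "close"),
                            ("pump-b", "7000rpm"), ("cross-over", "open")],
    "NASA_H-EX_M is error": [("payload racks", "shutdown"), ("pump-a", "6000rpm"),
                             ("H-EX bypass valve", "open"), ("cross-over", "open")],
}

def find_conflicts(procedures):
    """Report every pair of procedures that command the same target differently."""
    conflicts = []
    for (name_a, cmds_a), (name_b, cmds_b) in combinations(procedures.items(), 2):
        for target_a, state_a in cmds_a:
            for target_b, state_b in cmds_b:
                if target_a == target_b and state_a != state_b:
                    conflicts.append((name_a, name_b, target_a, state_a, state_b))
    return conflicts

conflicts = find_conflicts(PROCEDURES)
for conflict in conflicts:
    print(conflict)
```

With this encoding, both thermal-loop procedures clash with the heat-exchanger procedure over pump-a: one side stops it while the other runs it at 6000rpm.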
Examples (File Transfer Protocol)
• Back to basics
  - Simply build a conventional state transition matrix from the design documents
  - Apply to the table the normalization techniques that are popular in relational database design
  - Check for incomplete combinations of conditions
• Detected over 20 incomplete conditions
• Eliminated in the second round
The solution came from simply populating the table rather than spending a long time playing whack-a-mole with test non-conformances.
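A minimal Python sketch of "populating the table": list every state and event, then report each cell of the state transition matrix that the design leaves undefined. The states, events and transitions are hypothetical and are not taken from the JEM design documents.

```python
from itertools import product

# Hypothetical file-transfer states and events (assumptions for illustration).
STATES = ["IDLE", "WAIT_ACK"]
EVENTS = ["start", "ack", "timeout"]

# Transition matrix as it might be transcribed from a design document:
# (state, event) -> next state.
MATRIX = {
    ("IDLE", "start"): "WAIT_ACK",
    ("IDLE", "timeout"): "IDLE",
    ("WAIT_ACK", "ack"): "IDLE",
    ("WAIT_ACK", "timeout"): "WAIT_ACK",   # retransmit and keep waiting
}

def undefined_cells(states, events, matrix):
    """Return every (state, event) combination with no defined transition."""
    return [cell for cell in product(states, events) if cell not in matrix]

gaps = undefined_cells(STATES, EVENTS, MATRIX)
print(gaps)
```

Each reported cell is a question for the designer (for example, what should happen when an `ack` arrives in `IDLE`), which is cheaper to answer on paper than to discover in test.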
Lessons we learned
• Think static
  - Static analysis eliminates the difficulty of timing issues
  - A static model reveals architectural solutions more explicitly than dynamic analysis
  - A static model inspires us to be creative and have fun
• Think basic
  - For post-CDR systems, everyone's focus is spread across different details and can fall into a hell of complexity. Keep ourselves harmonized by looking back to the architecture.
  - Don't rely on super fancy-sexy analyses from the preliminary risk assessment stage. They are expensive.
H-II Transfer Vehicle (HTV) IV&V activity
Finding
• Validation of the FDIR methodology using a formal model revealed a significant number of undefined failure modes (especially in the quintuple voting mechanism among dissimilar navigation sensors).
• Root cause of the design flaw:
  - The FDIR design originally started from a bottom-up approach for each combination of failures
  - The number of combinations inflated as the software was adapted to additional devices and more safety/mission constraints
Issue
• The quintuple voting mechanism shall handle 2^10 (= 1,024) combinations of failures: five sensors give 10 pairwise comparisons, each of which either agrees or disagrees.
• A table that large is unrealistic to search (resource constraints).
→ ★ A top-down study to find a dedicated rule that yields a feasible solution was required.
Solution
(Figure: agreement chart among SensorA, SensorB, SensorC, SensorD and SensorE; each pair is marked Agree or Disagree)
• “Graph algorithm” to detect the failed sensor
• Rule: “Identify failure modes by identifying closed triangle(s) in the chart”
• 5 categories of failures
  - Single failure
  - Two failures
  - More than three failures
  - Unidentifiable failure
  - Incredible failure
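The triangle rule can be sketched in Python as follows. This is one possible reading of the slide, not the HTV algorithm itself: the assumption is that a closed triangle of three mutually agreeing sensors marks those sensors as healthy, and any sensor outside every triangle is suspect. The function names and the sample agreement sets are hypothetical.

```python
from itertools import combinations

SENSORS = ["A", "B", "C", "D", "E"]

def agree(*pairs):
    """Build the set of agreeing pairs, e.g. agree("BC", "BD")."""
    return {frozenset(pair) for pair in pairs}

def classify(agreements):
    """Return (healthy, suspect) based on closed triangles in the agreement chart."""
    triangles = [trio for trio in combinations(SENSORS, 3)
                 if all(frozenset(pair) in agreements
                        for pair in combinations(trio, 2))]
    healthy = sorted({sensor for trio in triangles for sensor in trio})
    suspect = [sensor for sensor in SENSORS if sensor not in healthy]
    return healthy, suspect

# Sensor A disagrees with everyone; B, C, D and E all agree with each other.
single_failure = classify(agree("BC", "BD", "BE", "CD", "CE", "DE"))
# Only two isolated agreements: no closed triangle, so nothing can be identified.
no_triangle = classify(agree("AB", "CD"))
print(single_failure, no_triangle)
```

Only the 10 triples of sensors are examined, so no table over all 10 pairwise agreement results has to be stored or searched.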
Merit
• Using the “Graph algorithm”:
  - Simple and fast (grouping of patterns, no large table search needed)
  - Mathematically complete
  - Reusable for any kind of similar voting logic
Centrifuge Rotor (CR) and Life Science Glovebox (LSG) IV&V activity
System Overview
• The Centrifuge is the experiment module for life science.
• Centrifuge Rotor (CR): rotational equipment for gravitational experiments
  - Target S/W: Rotor Control Unit (RCU), (C-RIC), Laptop
  - Software-related hazardous conditions:
    Loss of spin control: Catastrophic
    Contact of rotating equipment: Catastrophic
    Rotor dynamics hazard: Catastrophic
• Life Science Glovebox (LSG): provides an enclosed environment for biological specimens and chemicals
  - Software-related hazardous conditions:
    Specimen death: Critical
    Loss of experiment data: Critical
IV&V Process overview
(Figure: Development Process and IV&V Process side by side)
• Development Process: System Specification (software part), Software Requirement Specification; reviews: PRR, SRR
• IV&V Process: Natural Language Document → 1st Step → Formal Model → 2nd Step → Analysis
• Checks listed: Inconsistency, Completeness, Reachability; Inconsistency, Completeness, Traceability
The 1st step - Modeling
Natural language definition
• AA Mode
  - The CR shall (REQ-01001) transfer mode from every mode except Initialization Mode or B Mode to AA Mode by command.
Concerns: Ambiguous and inconsistent documentation has often caused accidents and incidents in the past.

Natural language based Model
AA Mode (REQ-01001)
  ‘CR is Stanby_Mode’ and ‘AA_Mode_Command is ON’
  Or ‘CR is AA_Mode’ and ‘AA_Mode_Command is ON’
  Or ‘CR is AB_Mode’ and ‘AA_Mode_Command is ON’
  => CR is AA_Mode
Improvements: The model can be analyzed formally with automated tools (static analysis).

Formal Model (Matrix)
CR = AA_Mode
The 2nd step - Analysis
★ Consistency Analysis on documentation
★ Completeness Analysis on documentation → DEMO #4
★ Traceability Analysis → DEMO #5
★ Hazard Reachability Analysis → DEMO #6
Natural Language Modeling and Analysis Tools (Demo #4)
Tools: (run with GTK on multiple platforms)
Function
• Natural Language Modeling
• Model Checking
• Consistency and Completeness Analysis
Component Windows
• Editor Window
• Model Viewer
• Analysis Result Viewer
Editor Window (modeling window)
(Annotations: chapter, section, Rule word, State)
AA Mode (REQ-01001)
  ‘CR is Stanby_Mode’ And ‘AA_Mode_Command is ON’
  Or ‘CR is AA_Mode’ and ‘AA_Mode_Command is ON’
  Or ‘CR is AB_Mode’ and ‘AA_Mode_Command is ON’
  => CR is AA_Mode
Modeling Viewer
Analysis Result Viewer
About Consistency Analysis
(Figure: Last State → Next State, with Condition A, Condition B and Condition C)
• An analysis that checks that no more than one state transition applies at a time (e.g. the next state matching both condition B and condition C) when the system transitions from the last state to the next state.
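As a rough illustration of this check, the Python sketch below enumerates every combination of three Boolean inputs and reports those where guards leading to different next states hold at the same time. The rule names, guards and mode names are hypothetical; the analysis tool's own algorithm is not described in the slides.

```python
from itertools import product

# Hypothetical transition rules out of one "last state": (name, guard, next state).
RULES = [
    ("rule-B", lambda a, b, c: a and b, "Mode_X"),
    ("rule-C", lambda a, b, c: a and c, "Mode_Y"),
]

def inconsistent_inputs(rules):
    """List input combinations where guards to different next states hold together."""
    clashes = []
    for values in product([True, False], repeat=3):
        targets = {nxt for _, guard, nxt in rules if guard(*values)}
        if len(targets) > 1:
            clashes.append(values)
    return clashes

clashes = inconsistent_inputs(RULES)
print(clashes)
```

Here the two rules overlap when all three conditions are true, so the specification does not say whether the next state is `Mode_X` or `Mode_Y`.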
About completeness analysis
• An analysis that checks that a state corresponding to the defined conditions always exists.
• The state of the system is set to Unknown Mode when the defined conditions do not belong to any state.
(Table legend: Mode, Parameter; T: True, F: False)
Example of completeness analysis
(Figure: decision tree on α, then β, then γ; Yes = “T”, No = “F”)
Leaves: E E D ? C C A B
Mode αβγ: TTT TTF TFT TFF FTT FTF FFT FFF
? = Unknown Mode
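A completeness check of this kind can be sketched in Python by enumerating all eight combinations of α, β and γ and reporting those that no mode definition covers. The mode rules below are one assumed reading of the tree (leaves taken left to right against TTT through FFF); they are not confirmed by the slide.

```python
from itertools import product

# Assumed mode definitions over (alpha, beta, gamma); hypothetical reading of the figure.
MODE_RULES = {
    "A": lambda a, b, g: (not a) and (not b) and g,
    "B": lambda a, b, g: (not a) and (not b) and (not g),
    "C": lambda a, b, g: (not a) and b,
    "D": lambda a, b, g: a and (not b) and g,
    "E": lambda a, b, g: a and b,
}

def unknown_combinations(rules):
    """Return the T/F patterns for which no mode is defined (Unknown Mode)."""
    unknown = []
    for values in product([True, False], repeat=3):
        if not any(rule(*values) for rule in rules.values()):
            unknown.append("".join("T" if v else "F" for v in values))
    return unknown

unknown = unknown_combinations(MODE_RULES)
print(unknown)
```

Under this reading the only uncovered pattern is TFF, which is where the system would fall into Unknown Mode.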
DEMO #5 TRAD (Traceability Diagnosis) Tool
Traceability of functions described in documents at different levels (e.g. requirements and specifications) must be confirmed.
→ Evaluated automatically by the TRAD tool
PREPARATION: Translate functions into a formal model (“Natural-Language Model”)
Example [CSM-01514]
  CR_Mode in Mode Rotation_Mode
  Operational_Mode_Transfer_CMD is ON (from Laptop)
  -> CR_Mode in Mode Operational_Mode
DEMO #5 TRAD (Traceability Diagnosis) Tool
EXECUTION:
(1) Convert the formal model of each document into a UML sequence diagram for every function
(2) Evaluate the similarity of the two diagrams → Traceability
Example (sequence diagram labels): System, Laptop, CMD is ON, Mode Change
RESULT:
- Traceability is evaluated accurately in a very short time
- Mis-traced functions were found based on traceability
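The slides do not say how TRAD scores the similarity of two diagrams. As a stand-in, the Python sketch below reduces each function to an ordered list of (sender, receiver, message) steps and compares the lists with `difflib.SequenceMatcher` from the standard library. The two step lists are invented, loosely based on the CSM-01514 example.

```python
from difflib import SequenceMatcher

# Ordered message steps, roughly what a UML sequence diagram records.
# Both lists are hypothetical examples.
requirement = [
    ("Laptop", "System", "Operational_Mode_Transfer_CMD is ON"),
    ("System", "System", "Mode change to Operational_Mode"),
]
specification = [
    ("Laptop", "System", "Operational_Mode_Transfer_CMD is ON"),
    ("System", "System", "Check CR_Mode is Rotation_Mode"),
    ("System", "System", "Mode change to Operational_Mode"),
]

def similarity(seq_a, seq_b):
    """Ratio in [0, 1]; 1.0 means the two step sequences are identical."""
    return SequenceMatcher(None, seq_a, seq_b).ratio()

score = similarity(requirement, specification)
print(score)
```

A low score for a pair of functions that are supposed to trace to each other would flag them for manual review; where to set that threshold is left open here.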
DEMO #6 Reachability Analysis Environment (RAE)
Preparation:
<1> Formal modeling of the software specification
<2> Formal modeling of the hazard
Objective of the integration:
★ Generate a software fault tree from the system hazard
★ Search for single failure points
★ Search for state variables which eliminate the hazard
★ List the initial conditions that lead to the hazards
H-ⅡA Flight Safety Control System (SIRIUS) IV&V activity
System Overview
• “SIRIUS” is the ground range safety control system for the H-IIA rocket.
• SIRIUS provides, as outputs for ground safety decisions:
  - Flight path
  - Area affected by explosion
  - Attitude
• Inputs from:
  - Radar
  - Telemeter
  - etc.
SIRIUS requirements
★ Hard real-time (msec synchronized)
★ Non-stop (of dozens of computers)
★ Built on top of a commercial OS
Analysis Approach
• Hard real-time on top of a commercial OS
  - Extensive analysis of the code implementation for:
    Timing control parameters (detailed validation of unit test results)
    Signal handling mechanism (graphical sequence modeling)
    Global variable handling (automated tool for exhaustive search of potential conflicts)
• Frequently updated for the specific requirements of each launch
  - Line-by-line comparison of all code against the previous version at the launch site
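The exhaustive search for global variable conflicts can be pictured with a small Python sketch. It takes a table of which task reads and writes which global variable and reports every variable that one task writes while another task reads or writes it. The task names, variable names and the table itself are hypothetical; the actual SIRIUS tool is not described in the slides.

```python
from itertools import combinations

# Hypothetical access table: task -> (variables read, variables written).
ACCESS = {
    "radar_input": ({"clock_ms"}, {"position"}),
    "telemetry_input": ({"clock_ms"}, {"attitude", "position"}),
    "display": ({"position", "attitude"}, set()),
}

def potential_conflicts(access):
    """Map each variable written by one task and touched by another to the tasks involved."""
    found = {}
    for (task_a, (reads_a, writes_a)), (task_b, (reads_b, writes_b)) in combinations(
            access.items(), 2):
        shared = (writes_a & (reads_b | writes_b)) | (writes_b & (reads_a | writes_a))
        for var in shared:
            found.setdefault(var, set()).update({task_a, task_b})
    return {var: sorted(tasks) for var, tasks in found.items()}

report = potential_conflicts(ACCESS)
print(report)
```

Variables that are only read, such as `clock_ms` here, are not reported. Each reported variable still needs a manual look at whether the accesses are actually protected.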
Findings
• Design flaw in process synchronization
• Fluctuation of CPU load due to commercial OS system services
• Lack of exception handling
• Task priority inversion
• Global variable conflicts
• System configuration management problem