Injecting Faults for Error Evaluation Kalynnda Berens Kalynnda.Berens@grc.nasa.gov Richard Plastow Richard.Plastow@grc.nasa.gov
Introduction • Applications often consist of software components (COTS, GOTS, open source, etc.) plus custom development, merged into a coherent package • Source code is usually not available for review of quality and reliability • Visibility into the component is limited to what is available via a public interface • What is the quality of that component? • What faults lie inside the component? • Applications interface with hardware and other software and can be influenced by failures in those systems
Fault Injection on Interfaces • Interfaces (hardware, software, human) are a major source of errors and induced faults • Software and system testing looks at anticipated off-nominal situations, but often misses unusual situations or combinations of faults • Mishap investigation has shown that multiple faults or unexpected anomalies are key players in accidents and mission failures
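One way to picture fault injection at an interface is a thin wrapper that sits between the application and a component and replaces selected calls with faults. The sketch below is illustrative only: `Sensor`, `SensorTimeout`, `control_step`, and the two fault functions are hypothetical names invented for this example and do not come from the projects described in these slides.

```python
class SensorTimeout(Exception):
    """Hypothetical fault: the sensor did not answer in time."""


class Sensor:
    """Stand-in for a component visible only through its public interface."""

    def read(self):
        return 21.5


class FaultInjector:
    """Wraps a component and replaces selected calls with injected faults."""

    def __init__(self, target, faults):
        self._target = target
        self._faults = faults  # method name -> fault function

    def __getattr__(self, name):
        # Only reached for attributes not found on the wrapper itself.
        real = getattr(self._target, name)
        fault = self._faults.get(name)
        if fault is None:
            return real  # no fault configured: pass the call through

        def injected(*args, **kwargs):
            return fault(real, *args, **kwargs)

        return injected


def timeout(real, *args, **kwargs):
    """Fault: the call never returns a value."""
    raise SensorTimeout("injected timeout")


def out_of_range(real, *args, **kwargs):
    """Fault: a valid reading is corrupted into an implausible one."""
    return real(*args, **kwargs) * 1000


def control_step(sensor):
    """Application code under test: should fail safe on any sensor problem."""
    try:
        value = sensor.read()
    except SensorTimeout:
        return "SAFE_MODE"
    if not 0.0 <= value <= 100.0:
        return "SAFE_MODE"
    return "NOMINAL"


if __name__ == "__main__":
    print(control_step(Sensor()))
    print(control_step(FaultInjector(Sensor(), {"read": timeout})))
    print(control_step(FaultInjector(Sensor(), {"read": out_of_range})))
```

The application code is unchanged between the nominal and faulted runs; only the object handed to it differs, which is what makes this style usable when the component's source is not available.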
Example System (diagram) • System Hardware • COTS Operating System • COTS Library • Application • Other Applications on same system • Input Sensors • Control Outputs • External Systems
Fault Injection Flow Diagram • Start • Obtain Source Code and Documentation • Identify Interfaces and Critical Sections • Error/Fault Research • Estimate Effort Required • Sufficient time and funds? • No: Importance Analysis, then Select Subset • Yes: proceed to Test Case Generation • Test Case Generation • Fault Injection Testing • Document Results, Metrics, Lessons Learned • Feedback to FCF Project • End
Interface Identification • Artifacts and Documentation • Software and System Requirements and Design specifications • Interface Specifications • User and Training Manuals • Other project documentation • Source code
Error Research • Sources of Error/Fault Information • Vendor documentation • Application or Hardware Interfaces • White papers • Defect list or open issue/problem reports • Public bug list • Internet Sources • Software logs • Error databases • Personnel Experience
Evaluation and Scoping • Determine level of effort, funding, time constraints • If complete effort not possible • Perform importance analysis of interfaces, software units • Safety • Complexity • Use by other system elements • Expected number or types of faults • Prioritize and select by importance
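The importance analysis above can be sketched as a weighted score over the four listed criteria, followed by selecting the top-ranked items that fit the budget. Everything numeric here is an assumption made for illustration: the weights, the 1 to 5 scales, and the placeholder interface names are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str
    safety: int           # assumed scale: 1 (low) to 5 (high)
    complexity: int       # same assumed scale
    usage: int            # use by other system elements
    expected_faults: int  # expected number or types of faults


# Assumed weights; a real project would choose and justify its own.
WEIGHTS = {"safety": 4, "complexity": 2, "usage": 2, "expected_faults": 1}


def importance(item):
    """Weighted sum of the four criteria."""
    return sum(weight * getattr(item, key) for key, weight in WEIGHTS.items())


def select_subset(items, budget):
    """Rank by importance and keep as many items as the budget allows."""
    ranked = sorted(items, key=importance, reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    candidates = [
        Interface("interface A", safety=5, complexity=2, usage=3, expected_faults=1),
        Interface("interface B", safety=1, complexity=5, usage=1, expected_faults=4),
        Interface("interface C", safety=3, complexity=3, usage=5, expected_faults=2),
    ]
    for item in select_subset(candidates, budget=2):
        print(item.name, importance(item))
```

Weighting safety most heavily is only one possible choice; the point of the sketch is that the ranking is explicit and repeatable, so the selected subset can be defended when time or funding is short.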
Testing • Test case generation based on identified errors plus permutations on possible input values • Consider multiple faults • Consider faults while system is off-nominal from a previous fault • Consider effects of system load/stress • Consider state-specific effects • Instrument software to observe effects of injected faults • External or observable effects • State changes (or lack of) • Effects on safety-critical functions
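The test case generation described above (single and multiple faults, crossed with system state and load) can be sketched as a small combinatorial generator. The fault, state, and load names below are hypothetical placeholders, and the cap of two simultaneous faults is an assumption chosen to keep the example small.

```python
from itertools import combinations, product

# Hypothetical inputs; a real list would come from the error research step.
FAULTS = ["sensor_timeout", "corrupt_message", "power_glitch"]
STATES = ["initialization", "operational", "off_nominal"]
LOADS = ["idle", "stressed"]


def generate_test_cases(faults, states, loads, max_simultaneous=2):
    """Yield one test case per (fault set, state, load) combination."""
    for count in range(1, max_simultaneous + 1):
        for fault_set in combinations(faults, count):
            for state, load in product(states, loads):
                yield {"faults": fault_set, "state": state, "load": load}


if __name__ == "__main__":
    cases = list(generate_test_cases(FAULTS, STATES, LOADS))
    print(len(cases))  # 6 fault sets x 3 states x 2 loads = 36
    print(cases[0])
```

Including `off_nominal` as a starting state covers the case where a fault arrives while the system is still recovering from an earlier one. The number of cases grows quickly with the number of faults, which is one reason the scoping step matters.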
First Project: Tempest • Written in Java 1.1 • Configurable • Cross-platform operability • Implements HTTP GET and HEAD requests and Server Side Includes • Has some basic security features • Debug Mode monitoring • Commercially available
Requirement Database • Documents found contained 80 requirements that the vendor claims to meet • Requirement table is the parent table for 5 sub-tables • Performance • Specs • HFE • Security • Misc
Standards Database • Parsed commercial standards into pseudo functions and test scenarios • Test scenarios included expected faults as well as on-the-edges and way-outside-the-box tests • Each standard has its own table and own set of tests
Tempest Results • Inappropriate system operation with modified configuration file • Non-compliance with HTTP standard • System crash with invalid port numbers • Port 49151.45 -> opened port 80 • File access in server machine outside of authorized directories • System did not operate as per user documentation
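The port-number result above (a fractional port value ending up as port 80) is the kind of finding that boundary-value probes tend to expose. As an illustration only, and not Tempest's actual code, the sketch below shows a strict parser that rejects malformed values instead of silently falling back, plus a handful of probe inputs. The function name `parse_port` and the probe list are assumptions made for this example.

```python
def parse_port(text):
    """Return a valid TCP port number or raise ValueError; never fall back."""
    stripped = text.strip()
    # Digits only: rejects "49151.45", "-1", "", and "80abc".
    if not (stripped.isascii() and stripped.isdigit()):
        raise ValueError(f"port is not a whole number: {text!r}")
    port = int(stripped)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range 1-65535: {port}")
    return port


# Probe values: the fractional port from the results above plus common edge cases.
PROBES = ["8080", "49151.45", "0", "65536", "-1", "", "80abc"]

if __name__ == "__main__":
    for probe in PROBES:
        try:
            print(repr(probe), "->", parse_port(probe))
        except ValueError as err:
            print(repr(probe), "-> rejected:", err)
```

Feeding the same probe list to a server's configuration file and observing which port it actually opens is one simple way to reproduce this class of result.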
Second Project: Fluids and Combustion Facility • Permanent, multi-user facility for ISS microgravity experiments • Two racks (fluids/FIR, combustion/CIR) • Operates for 10 years, so robustness is important • CANbus processors selected for fault injection • Health and Status Monitoring • Cannot be upgraded in flight • Mature requirements, design, and interface definition • Source code available
CANbus Processors • Air Thermal Control Unit (ATCU) • Color Camera Package (CCP) • Common IPSU Diagnostic Board • FOMA Control Unit • FSAP Diagnostic Board • Nd:YAG Laser Package • Water Thermal Control System (WTCS) • White Light Package
FIR System Diagram (labels recovered from the figure) • Input-Output Processor (IOP): IOP Main Processor, IOP HRDL Processor, IOP Video Switch Processor, IOP CAN Node Processor • FSAP: FSAP Main Processor, FSAP CAN Node Processor • Common IPSU: IPSU Main Processor, IPSU CAN Node Processor • Optics Bench CANbus: ATCU CAN Processor, WTCS CAN Processor, White Light CAN Processor, Nd:YAG CAN Processor, DCM CAN Processor, Laser Diode CAN Processor • PI Package • ECS CANbus • Ethernet
CANbus Processor State Diagram (labels recovered from the figure) • States: Power Down (P), Initialization, Operational (OP), Off-Nominal (O-N) • Transition labels: Power On, Success, Error, Operational Cmd, Power Down Cmd, Power Off
Next Steps • Complete Interface Identification and prioritization • Obtain hardware, source code for testing environment • Error/Fault search on selected interfaces/components • Test case generation, source code instrumentation, and test execution