
CIS 573 Computer Aided Verification

CIS 573 Computer Aided Verification. Carl A. Gunter Fall 1999 Part 3. Case Study. London Ambulance Service. Between October 26 and November 4 of 1992 a Computer Aided Dispatch system for the London area failed. Key system problems included need for near perfect input information

Presentation Transcript


  1. CIS 573 Computer Aided Verification Carl A. Gunter Fall 1999 Part 3

  2. Case Study London Ambulance Service • Between October 26 and November 4 of 1992 a Computer Aided Dispatch system for the London area failed. • Key system problems included • need for near perfect input information • poor interfaces between the ambulance crews and the system • unacceptable reliability and performance of the software • Consequences are difficult to measure, but severe in some cases.

  3. LAS Components

  4. Human Factors in Safety • Case Studies • USS Vincennes • Three Mile Island • Leveson Chapters 5 and 6 • Automotive Guidelines • Leveson Chapter 17

  5. Case Study USS Vincennes • On July 3, 1988 a US Navy Aegis cruiser shot down an Airbus on a regularly scheduled commercial flight. • Aegis is one of the Navy's most sophisticated weapon systems. • Aegis itself performed well. Human error was blamed: the captain received false reports from the tactical information coordinator. • Carlucci suggestion on user interface: put "an arrow on [showing] whether it's ascending or descending."

  6. Case Study Three Mile Island • On the morning of 28 March 1979 a cascading sequence of failures caused extensive damage to the nuclear power plant at Three Mile Island near Harrisburg, Pennsylvania. • Although the radiation release was small, the repairs and clean-up cost 1 to 1.8 billion dollars. • Confidence in the safety of US nuclear facilities was significantly damaged as well. • Operator error was seen as a major contributing factor.

  7. Generic Nuclear Power Plant

  8. TMI Components

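  9. [TMI accident sequence, diagram 1 of 5; figure labels only] Opens Fails Open Scram Maintenance Failure

  10. [TMI accident sequence, diagram 2 of 5; figure labels only] Failed Open High Pressure Injection Pumps Boiled Dry Operator Cuts Back Water Flow Blocks Backup Failed Closed Water Pump

  11. [TMI accident sequence, diagram 3 of 5; figure labels only] Failed Open Let Down Activated Boiled Dry Blocked Failed Closed Saturation Alarms

  12. [TMI accident sequence, diagram 4 of 5; figure labels only] Failed Open Let Down Activated Cooling Activated Saturation Shut Off Pumps High Level of Neutrons

  13. [TMI accident sequence, diagram 5 of 5; figure labels only] Failed Open Let Down Activated Closed Water Injected Hydrogen Explosion Saturation Fuel Rods Rupture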

  14. Level 2 Conditions • No training of operators for saturation in the core. • Inadequate operating procedures in place. • Failure to follow rules for PORV. • Surveillance tests not adequately verified. • Control room ill-designed. • 100 alarms in 10 seconds • Key indicators poorly placed and key information not displayed clearly (example: cooling water converting to steam had to be inferred from temp and pressure). • Instruments off scale. • Printer not able to keep up.

  15. Level 3 Root Causes • Design for controllability. • Lack of attention to human factors. • Quality assurance limited to safety-critical components. • Inadequate training. • Limited licensing procedures.

  16. Case Study MISRA Guidelines • Requirements are very domain-specific. • Given a sufficiently narrow domain, it is possible to provide more detailed assistance in requirements determination. • We look at a set of guidelines for establishing user requirements for automotive software and translating these into software requirements. • The guidelines are those of the Motor Industry Software Reliability Association (MISRA) in the UK.

  17. Scope of Guidelines

  18. Sample Life Cycle

  19. Need for Integrity Levels • An automotive system must satisfy requirements that it not cause: • Harm to humans • Legislation to be broken • Undue traffic disruption • Damage to property or the environment (e.g. emissions) • Undue financial loss to either the manufacturer or owner

  20. Controllability Levels • Uncontrollable: Failures whose effects are not controllable by the vehicle occupants, and which are likely to lead to extremely severe outcomes. The outcome cannot be influenced by a human response. • Difficult to Control: This relates to failures whose effects are not normally controllable by the vehicle occupants but could, under favorable circumstances, be influenced by a mature human response.

  21. Controllability Levels Continued • Debilitating: This relates to failures whose effects are usually controllable by a sensible human response and, whilst there is a reduction in safety margin, can usually be expected to lead to outcomes which are at worst severe. • Distracting: This relates to failures which produce operational limitations, but a normal human response will limit the outcome to no worse than minor. • Nuisance Only: This relates to failures where safety is not normally considered to be affected, and where customer satisfaction is the main consideration.

  22. Initial Integrity Level • To determine an initial integrity level: • List all hazards that result from all the failures of the system. • Assess each failure mode identified in the first step to determine the controllability category. • The failure mode with the highest associated controllability category determines the integrity level of the system.

  23. Integrity Analysis

  24. Integrity Levels
      Controllability Category | Acceptable Failure Rate | Integrity Level
      Uncontrollable           | Extremely improbable    | 4
      Difficult to Control     | Very remote             | 3
      Debilitating             | Remote                  | 2
      Distracting              | Unlikely                | 1
      Nuisance Only            | Reasonably possible     | 0

  25. Example • Here is an attempt at an analysis of a design defect in the 1983 Nissan Stanza I used to own. (It wasn't a computer error, but a computer error might display similar behavior.) • Hazard: Powertrain drive, loss of power. • Severity Factor: Powertrain performance affected. • Controllability Category: Debilitating • Integrity Level: 2
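
The procedure on slides 22 to 25 (assess each failure mode, take the worst controllability category, read off the integrity level) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Controllability`, `ACCEPTABLE_FAILURE_RATE`, and `initial_integrity_level` are hypothetical and not part of the MISRA guidelines, and the second failure mode in the example is invented purely to show the "worst case wins" rule.

```python
from enum import IntEnum


class Controllability(IntEnum):
    """Controllability categories from the slides.

    The integer value is the integrity level from the Integrity Levels table,
    so the most severe category is also the numerically largest.
    """
    NUISANCE_ONLY = 0
    DISTRACTING = 1
    DEBILITATING = 2
    DIFFICULT_TO_CONTROL = 3
    UNCONTROLLABLE = 4


# Acceptable failure rate per category, as listed in the table.
ACCEPTABLE_FAILURE_RATE = {
    Controllability.UNCONTROLLABLE: "Extremely improbable",
    Controllability.DIFFICULT_TO_CONTROL: "Very remote",
    Controllability.DEBILITATING: "Remote",
    Controllability.DISTRACTING: "Unlikely",
    Controllability.NUISANCE_ONLY: "Reasonably possible",
}


def initial_integrity_level(failure_modes):
    """Return the initial integrity level for a system.

    failure_modes maps a hazard description to its assessed Controllability.
    The worst (highest) category determines the level.
    """
    if not failure_modes:
        raise ValueError("at least one assessed failure mode is required")
    worst = max(failure_modes.values())
    return worst.value


# The first entry follows the Nissan Stanza example; the second is hypothetical.
example = {
    "powertrain drive: loss of power": Controllability.DEBILITATING,
    "hypothetical: dashboard clock resets": Controllability.NUISANCE_ONLY,
}

level = initial_integrity_level(example)
print(level, ACCEPTABLE_FAILURE_RATE[Controllability(level)])  # prints: 2 Remote
```

Encoding the categories as an ordered enumeration keeps the "highest category wins" rule a single `max` call; the assessment of each failure mode remains a human judgment that the code does not attempt to automate.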

  26. Human Error Probabilities • Extraordinary errors 10**-5: Errors for which it is difficult to conceive how they could occur. Stress free, with powerful cues pointing to success. • Regular errors 10**-4: Errors in regularly performed, commonplace simple tasks with minimum stress. • Errors of commission 10**-3: Errors such as pressing the wrong button or reading the wrong display. Reasonably complex tasks, little time available, some cues necessary.

  27. Human Errors Continued • Errors of Omission 10**-2: Errors where dependence is placed on situation and memory. Complex, unfamiliar task with little feedback and some distraction. • Complex Task Errors 10**-1: Errors in performing highly complex tasks under considerable stress with little time available. • Creative Task Errors 1 to 10**-1: Errors in processes that involve creative thinking, or unfamiliar, complex operations where time is short and stress is high.
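
As a rough illustration of how the per-task probabilities on the two slides above might be combined, the sketch below estimates the chance of at least one human error in a procedure made of several tasks. The slides do not give a combination rule: treating the tasks as statistically independent is an assumption made here for illustration only, and the names `HUMAN_ERROR_PROBABILITY` and `probability_of_any_error` are hypothetical. Creative task errors are left out because the slides give them as a range (1 to 10**-1) rather than a single value.

```python
import math

# Nominal error probabilities per task, as listed on the slides.
HUMAN_ERROR_PROBABILITY = {
    "extraordinary": 1e-5,
    "regular": 1e-4,
    "commission": 1e-3,
    "omission": 1e-2,
    "complex_task": 1e-1,
}


def probability_of_any_error(task_classes):
    """Probability that at least one task in the sequence goes wrong.

    ASSUMPTION (not from the slides): errors in different tasks are
    independent, so P(no error) is the product of the per-task
    success probabilities.
    """
    p_all_ok = math.prod(1.0 - HUMAN_ERROR_PROBABILITY[c] for c in task_classes)
    return 1.0 - p_all_ok


# Hypothetical procedure: ten routine tasks plus one that relies on memory.
procedure = ["regular"] * 10 + ["omission"]
p = probability_of_any_error(procedure)
print(f"{p:.4f}")  # prints: 0.0110
```

In this made-up procedure the single memory-dependent step contributes far more risk than the ten routine steps together, which is the kind of comparison the probability classes are meant to support.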

  28. Recommendations • Level 0 is ISO 9001 • Each of the remaining 4 levels carries a recommendation for process activities on software with hazards at that level. • Areas Covered • Specification and design • Languages and compilers • Configuration management • Testing • Verification and validation • Access for assessment

  29. Specification and Design • Structured method. • Structured method supported by CASE tool. • Formal specification for the functions at this level. • Formal specification of complete system. Automated code generation (when available).

  30. Testing • Show fitness for purpose. Test all safety requirements. Repeatable test plan. • Black box testing. • White box module testing with defined coverage. Stress testing against deadlock. Syntactic static analysis. • 100% white box module testing. 100% requirements testing. 100% integration testing. Semantic static analysis.

  31. Verification and Validation • Show tests: are suitable; have been performed; are acceptable; exercise safety features. Traceable correction. • Structured program review. Show no new faults after corrections. • Automated static analysis. Proof (argument) of safety properties. Analysis for lack of deadlock. Justify test coverage. Show tests have been suitable. • All tools to be formally validated (when available). Proof (argument) of code against specification. Proof (argument) for lack of deadlock. Show object code reflects source code.

  32. Access for Assessment • Requirements and acceptance criteria. QA and product plans. Training policy. System test results. • Design documents. Software test results. Training structure. • Techniques, processes, tools. Witness testing. Adequate training. Code. • Full access to all stages and processes.

  33. Example Architecture Study Deliverables

  34. Software Requirements

  35. Testing

  36. Subcontracts

  37. Aristocracy, Democracy, and System Design • Conceptual integrity is the most important consideration in system design. • The ratio of function to conceptual complexity is the ultimate test of system design. • To achieve conceptual integrity, a design must proceed from one mind or a small group of agreeing minds. • A conceptually integrated system is faster to build and to test. • Brooks

  38. Principles of Design • Norman offers the following two principles of good design: • Provide a good conceptual model. • Make things visible. Two important techniques are: • Provide natural mappings • Provide feedback • Donald A. Norman, The Psychology of Everyday Things.

  39. Examples of Bad Designs • Elegant doors that give no hint about whether or where to push or pull. • VCRs that provide inadequate feedback to indicate success of actions. • Telephones using too many unmemorable numerical instructions.

  40. Examples of Good Designs • Original push-button telephones • Certain kinds of single-handle faucets providing a natural mapping to desired parameters • Apple “desk-top” computer interface

  41. Do Humans Cause Most Accidents? • From Leveson, Chapter 5: • 85% of work accidents are due to unsafe acts by humans rather than unsafe conditions • 88% of all accidents are caused primarily by dangerous acts of individual workers. • 60 to 80% of accidents are caused by loss of control of energies in the system

  42. Caveats • Data may be biased or incomplete. • Positive actions are not usually recorded. • Blame may be based on assuming that operators can overcome all difficulties. • Operators intervene at the limits. • Hindsight is 20/20. • It is hard to separate operator errors from design errors.

  43. Two Examples

  44. A Model of Human Control

  45. Mental Models

  46. The Human as Monitor • The task may be impossible. • The operator is dependent on the information provided. • The information is more indirect. • Failures may be silent or masked. • Little activity may result in lowered attention or overreliance.

  47. The Human as Backup • A poorly designed interface may leave operators with diminished proficiency and increased reluctance to intervene. • Fault-intolerant systems may lead to even larger errors. • The design of the system may make it harder to manage in a crisis.

  48. The Human as Partner • The operator may simply be assigned the tasks that the designer cannot figure out how to automate. • The remaining tasks may be complex, and new tasks such as maintenance and monitoring may be added.
