Design for Safety
• Hazard Identification and Fault Tree Analysis
• Risk Assessment
• Define Safety Measures
• Create Safe Requirements
• Implement Safety (we will talk about software specifically here)
• Assure Safety Process
• Test, Test, Test, Test, Test
All of this happens in parallel, not just once per design.
CSE 466 – Fall 2000 - Introduction - 1

Hazard Identification
• Two approaches
• Hazard analysis: start from the hazard and work backwards
  • Ventilator:
    • Hypoventilation hazard ← no pressure in air reservoir ← reservoir vent stuck open (single failure)
    • Hyperventilation hazard ← pressure sensor failure and overpressure valve stuck closed (double failure)
• FMEA (Failure Modes and Effects Analysis): start from the failure and work forward
  • Fuel cell example:
    • H2 sensor stuck at normal → failure to detect internal leak; chassis vent also blocked → H2 concentration > 45 → explosion hazard (double failure)
    • H2 sensor stuck as if H2 present → system shutdown on H2 leak error code → no hazard
• Single-fault tolerance requires timing analysis too: is the first fault detected before it causes a hazard, and before a second fault can happen?

FMEA – Working Forward
• Failure mode: how a device can fail
  • Battery: never a voltage spike, only low voltage
  • Valve: stuck open? Stuck closed?
  • Motor controller: stuck fast, stuck slow?
  • Hydrogen sensor: will the failure be latent, or will it mimic the presence of hydrogen?
• FMEA
  • For each mode of each device, perform hazard analysis as in the previous flow chart
  • Huge search space
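
To get a feel for the size of the FMEA search space, the device/mode list on this slide can be written down as a worksheet. The sketch below is only illustrative: the class and method names (`FmeaWorksheet`, `singleFailures`, `doubleFailureBound`) are invented here, and the double-failure count is a rough upper bound, not a method taken from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical FMEA worksheet. Device and mode names follow the slide;
// everything else is an assumption made for illustration.
class FmeaWorksheet {
    static Map<String, List<String>> failureModes() {
        Map<String, List<String>> modes = new LinkedHashMap<>();
        modes.put("Battery", Arrays.asList("low voltage"));
        modes.put("Valve", Arrays.asList("stuck open", "stuck closed"));
        modes.put("Motor controller", Arrays.asList("stuck fast", "stuck slow"));
        modes.put("Hydrogen sensor", Arrays.asList("latent", "mimics hydrogen"));
        return modes;
    }

    // One worksheet row per (device, mode) pair; each row needs its own
    // forward hazard analysis.
    static List<String> singleFailures() {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : failureModes().entrySet()) {
            for (String mode : entry.getValue()) {
                rows.add(entry.getKey() + ": " + mode);
            }
        }
        return rows;
    }

    // Upper bound on double failures: unordered pairs of rows. Pairs of
    // modes on the same device are counted too, so this overestimates.
    static int doubleFailureBound(int singleFailureCount) {
        return singleFailureCount * (singleFailureCount - 1) / 2;
    }

    public static void main(String[] args) {
        List<String> rows = singleFailures();
        rows.forEach(System.out::println);
        System.out.println(rows.size() + " single failures, up to "
                + doubleFailureBound(rows.size()) + " double failures");
    }
}
```

Even four devices give seven single failures and up to 21 pairs to consider, which is why the search space grows so quickly on a real system.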

Fault Tree Analysis
• Pacemaker example
[Figure: pacemaker fault tree]
• AND gates are good!
• Single fault hazard
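
Why AND gates are good can be shown with a tiny fault-tree evaluator. This is a sketch under assumptions: the pacemaker tree itself is not recoverable from the slide, so the example reuses the ventilator hazards from the hazard identification slide, and all type names (`FaultNode`, `BasicEvent`, `AndGate`, `FaultTreeDemo`) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical fault-tree model; not the pacemaker tree from the figure.
interface FaultNode {
    boolean occurs(Set<String> activeFaults);
}

class BasicEvent implements FaultNode {
    private final String name;
    BasicEvent(String name) { this.name = name; }
    public boolean occurs(Set<String> activeFaults) {
        return activeFaults.contains(name);
    }
}

// An AND gate fires only if every input fault is present, so a single
// fault below it cannot cause the hazard on its own.
class AndGate implements FaultNode {
    private final FaultNode[] inputs;
    AndGate(FaultNode... inputs) { this.inputs = inputs; }
    public boolean occurs(Set<String> activeFaults) {
        for (FaultNode input : inputs) {
            if (!input.occurs(activeFaults)) return false;
        }
        return true;
    }
}

class FaultTreeDemo {
    // Double failure: both faults are needed (AND gate).
    static FaultNode hyperventilation() {
        return new AndGate(new BasicEvent("pressure sensor failure"),
                           new BasicEvent("overpressure valve stuck closed"));
    }

    // Single failure: one basic event leads straight to the hazard.
    static FaultNode hypoventilation() {
        return new BasicEvent("reservoir vent stuck open");
    }

    public static void main(String[] args) {
        Set<String> faults = new HashSet<>(Arrays.asList("pressure sensor failure"));
        System.out.println("hyperventilation after one fault: "
                + hyperventilation().occurs(faults));
        faults.add("overpressure valve stuck closed");
        System.out.println("hyperventilation after two faults: "
                + hyperventilation().occurs(faults));
    }
}
```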

2. Risk Assessment
• Determine how risky your system is (TÜV standard)

S: Extent of damage
  S1 Slight injury
  S2 Serious injury*
  S3 Few deaths*
  S4 Catastrophe
E: Exposure time
  E1 Infrequent
  E2 Continuous
G: Preventability
  G1 Possible
  G2 Impossible
W: Probability
  W1 Low
  W2 Medium
  W3 High

Path           W3   W2   W1
S1              1    -    -
S2, E1, G1      2    1    -
S2, E1, G2      3    2    1
S2, E2, G1      4    3    2
S2, E2, G2      5    4    3
S3, E1          6    5    4
S3, E2          7    6    5
S4              8    7    6

* These were "single death" and "several deaths" in the source: Hard Time

Toy oven: S2, E1, G2, W2 → 2
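
The risk graph on this slide can be read as a lookup table. The sketch below encodes it that way; the class name `RiskGraph`, the method `level`, and the use of 0 for a "-" entry are choices made here for illustration, not part of the TÜV standard.

```java
// Hypothetical encoding of the slide's risk graph as a lookup table.
class RiskGraph {
    // Rows follow the graph from top to bottom; columns are W3, W2, W1.
    // 0 stands for a "-" entry in the graph.
    private static final int[][] LEVELS = {
        {1, 0, 0},  // S1
        {2, 1, 0},  // S2, E1, G1
        {3, 2, 1},  // S2, E1, G2
        {4, 3, 2},  // S2, E2, G1
        {5, 4, 3},  // S2, E2, G2
        {6, 5, 4},  // S3, E1
        {7, 6, 5},  // S3, E2
        {8, 7, 6},  // S4
    };

    private static void requireOneOrTwo(int value, String name) {
        if (value < 1 || value > 2) {
            throw new IllegalArgumentException(name + " must be 1 or 2");
        }
    }

    // e is ignored for S1 and S4, g is ignored unless s == 2,
    // mirroring the branches of the graph.
    static int level(int s, int e, int g, int w) {
        if (w < 1 || w > 3) throw new IllegalArgumentException("W must be 1..3");
        int row;
        switch (s) {
            case 1:
                row = 0;
                break;
            case 2:
                requireOneOrTwo(e, "E");
                requireOneOrTwo(g, "G");
                row = 1 + 2 * (e - 1) + (g - 1);
                break;
            case 3:
                requireOneOrTwo(e, "E");
                row = 5 + (e - 1);
                break;
            case 4:
                row = 7;
                break;
            default:
                throw new IllegalArgumentException("S must be 1..4");
        }
        return LEVELS[row][3 - w];
    }

    public static void main(String[] args) {
        // Toy oven from the slide: S2, E1, G2, W2
        System.out.println(level(2, 1, 2, 2)); // prints 2
    }
}
```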

Example Risk Assessment

3. Define the Safety Measures
• Obviation: make it physically impossible (mechanical hookups, etc.).
• Education: educate users to prevent misuse or dangerous use.
• Alarming: inform the users/operators or higher-level automatic monitors of hazardous conditions.
• Interlocks: take steps to eliminate the hazard when conditions exist (shut off power, fuel supply, explode, etc.).
• Restrict access: high-voltage sources should be in compartments that require tools to access, w/ proper labels.
• Labeling
• Consider
  • Tolerance time
  • Supervision of the system: constant, occasional, unattended. Airport people movers have to be designed to a much higher level of safety than attended trains, even if both have fully automated control.

4. Create Safe Requirements: Specifications
• Document the safety functionality
  • e.g. The system shall NOT pass more than 10 mA through the ECG lead.
  • Typically the use of NOT implies a much more general requirement about functionality…in ALL CASES
• Create safe designs
  • Start w/ a safe architecture
  • Keep hazard/risk analysis up to date
  • Search for common mode failures
  • Assign responsibility for safe design…hire a safety engineer
  • Design systems that check for latent faults
  • Use safe design practices…this is very domain specific, we will talk about software
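
One way to read "in ALL CASES" in software is that a guard should test for "known safe" instead of "known unsafe", so that unexpected inputs also fail safe. The sketch below assumes the 10 mA limit applies to the magnitude of the current and that the measurement arrives as a `double`; `LeadCurrentGuard` and `mustDisconnect` are hypothetical names.

```java
// Hypothetical guard for the ECG lead requirement; names and the
// interpretation of the limit (magnitude, inclusive) are assumptions.
class LeadCurrentGuard {
    static final double LIMIT_MILLIAMPS = 10.0;

    // Returns true when the lead must be disconnected.
    static boolean mustDisconnect(double measuredMilliamps) {
        // NaN compares false with everything, so a broken measurement is
        // never "known safe" and therefore also forces a disconnect.
        boolean knownSafe = Math.abs(measuredMilliamps) <= LIMIT_MILLIAMPS;
        return !knownSafe;
    }

    public static void main(String[] args) {
        System.out.println(mustDisconnect(5.0));        // false
        System.out.println(mustDisconnect(-12.0));      // true
        System.out.println(mustDisconnect(Double.NaN)); // true
    }
}
```

Writing the check as `measured > LIMIT` instead would let a NaN reading pass, which is exactly the kind of case a "shall NOT" requirement has to cover.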

5. Implement Safety – Safe Software
• Language Features
  • Type and Range Safe Systems
  • Exception Handling
  • Re-use, Encapsulation
  • Objects
• Operating Systems
• Protocols
• Testing
  • Regression Testing
  • Exception Testing (Fault Seeding)
• Nuts and Bolts

Language Features
• Type and range safe systems: Pascal, Ada…Java?

    program WontCompile1;
    type
      MySubRange = 10 .. 20;
      Day = (Mo, Tu, We, Th, Fr, Sa, Su);
    var
      MyVar: MySubRange;
      MyDate: Day;
    begin
      MyVar := 9;    { will not compile: range error }
      MyDate := 0;   { will not compile: wrong type }
    end.

• True type safety also requires runtime checking.
  • a[j] := b; what must be checked here to guarantee type safety?
  • range/type of j, range/type of b
• Overhead in time and code size. But safety may require this.
• Does type-safe = safe?
• If no, then what good is a type-safe system?
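
The runtime checks behind `a[j] := b` can be made explicit. Java checks the array index on its own, but it has no subrange types, so the range of `b` has to be checked by hand. A minimal sketch, assuming the 10..20 subrange from the Pascal example; `CheckedStore` and `store` are hypothetical names.

```java
// Hypothetical helper that spells out both runtime checks for a[j] := b.
class CheckedStore {
    static final int MIN = 10;  // lower bound of the assumed subrange
    static final int MAX = 20;  // upper bound of the assumed subrange

    static void store(int[] a, int j, int b) {
        // Range of j: Java would also throw here on its own; the explicit
        // check just makes the cost visible.
        if (j < 0 || j >= a.length) {
            throw new IndexOutOfBoundsException("index " + j + " outside 0.." + (a.length - 1));
        }
        // Range of b: not expressible in Java's type system.
        if (b < MIN || b > MAX) {
            throw new IllegalArgumentException("value " + b + " outside " + MIN + ".." + MAX);
        }
        a[j] = b;
    }

    public static void main(String[] args) {
        int[] a = new int[3];
        store(a, 1, 15);
        System.out.println(a[1]); // prints 15
    }
}
```

Both checks run on every store, which is the time and code-size overhead the slide refers to.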

Guidelines
• Make it right before you make it fast
• Verify during program execution
  • Pre-condition invariants
    • Things that must be true before you attempt to perform an operation.
  • Post-condition invariants
    • Things that must be true after an operation is performed
  • e.g.

    while (item != null) {
        process(item);
        item = item->next;
    }
    assert(item == tail);  // post-condition invariant

• Exception handling
  • What should happen in the event of an exception (assert(false))?
  • Who should be responsible for this check?
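
A pre-condition and a post-condition around a list traversal might look like the sketch below. It uses explicit checks because Java's `assert` statement is disabled unless the JVM is started with `-ea`. The `Item` and `ListWalker` types and the "expected count" post-condition are assumptions chosen for illustration.

```java
// Hypothetical singly linked list node.
class Item {
    final int value;
    final Item next;
    Item(int value, Item next) { this.value = value; this.next = next; }
}

class ListWalker {
    // Sums the list and verifies that exactly expectedCount items were seen.
    static int sum(Item head, int expectedCount) {
        // Pre-condition: the caller must pass a sensible count.
        if (expectedCount < 0) {
            throw new IllegalArgumentException("expectedCount must not be negative");
        }
        int visited = 0;
        int total = 0;
        for (Item item = head; item != null; item = item.next) {
            total += item.value;
            visited++;
        }
        // Post-condition: the traversal covered the whole list, no more, no less.
        if (visited != expectedCount) {
            throw new IllegalStateException(
                    "visited " + visited + " items, expected " + expectedCount);
        }
        return total;
    }

    public static void main(String[] args) {
        Item list = new Item(1, new Item(2, new Item(3, null)));
        System.out.println(sum(list, 3)); // prints 6
    }
}
```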

Exception Handling
• It's NOT okay to just let the system crash if some operation fails! You must, at least, get into safe mode.
• Standard C: it is up to the app writer to perform error checking on the values returned by f1 and f2. This is easily put off or ignored. You can't distinguish error handling from normal flow, and there is no guarantee that all errors are handled gracefully.

    a = f1(b, &c);
    if (a) {
        switch (a) {
            case 1: /* handle exception 1 */ break;
            case 2: /* handle exception 2 */ break;
            …
        }
    }
    d = f2(e, &f);
    if (d) {
        switch (d) {
            case 1: /* handle exception 1 */ break;
            case 2: /* handle exception 2 */ break;
            …
        }
    }

Exception Handling in Java

    void myMethod() throws FatalException {
        try {
            // normal functional flow
            a = x.f1(b);   // a is return value, b is parameter
            d = x.f2(e);   // d is return value, e is parameter
        } catch (IOException ex) {
            // recover and continue
        } catch (ArrayIndexOutOfBoundsException ex) {
            // not recoverable
            throw new FatalException("I'm Dead");
        } finally {
            // finish up and exit
        }
    }

• Exceptions that are thrown, or not handled, will terminate the current procedure and raise the exception to the caller, and so on.
• Exceptions are subclassed so that you can have very general or very specific exception handlers.
• No errors go unhandled.
• Separates:
  • throwing exceptions
  • functional code
  • exception handling

Safety of Object Oriented SW
• Strongly typed at compile time
• Run-time checking is not native, but can be built into class libraries for extensive modularization and re-use. The class author can force the app to deal with exceptions by throwing them!

    class embeddedList extends embeddedObject {
        public void add(embeddedObject item) throws tooBigException {
            // reject the item when the list is already full
            if (this.len() >= this.max())
                throw new tooBigException("List size too big");
            else
                addItem2List(item);
        }
    }

• If you call embeddedList.add() you have three choices:
  • Catch the exception and handle it.
  • Catch the exception and map it into one of your exceptions by throwing an exception of a type declared in your own throws clause.
  • Declare the exception in your throws clause and let the exception pass through your method (although you might have a finally clause that cleans up first).
• The compiler will make you aware of any exceptions you forgot to consider!
• When to use exceptions and when to use status codes or other means?
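
The three choices open to a caller can be sketched side by side. The classes below are stand-ins invented for this example (`BoundedList` plays the role of embeddedList, `AppException` is a hypothetical application-level exception); only the three patterns themselves come from the slide.

```java
// Hypothetical checked exceptions for the sketch.
class TooBigException extends Exception {
    TooBigException(String message) { super(message); }
}

class AppException extends Exception {
    AppException(String message, Throwable cause) { super(message, cause); }
}

// Stand-in for the slide's embeddedList: it only tracks its length.
class BoundedList {
    private final int max;
    private int len;
    BoundedList(int max) { this.max = max; }

    void add(Object item) throws TooBigException {
        if (len >= max) throw new TooBigException("List size too big");
        len++;
    }
}

class Callers {
    // Choice 1: catch the exception and handle it locally.
    static boolean tryAdd(BoundedList list, Object item) {
        try {
            list.add(item);
            return true;
        } catch (TooBigException ex) {
            return false;  // handled: report failure as a status
        }
    }

    // Choice 2: catch it and map it to an exception of our own.
    static void addOrFail(BoundedList list, Object item) throws AppException {
        try {
            list.add(item);
        } catch (TooBigException ex) {
            throw new AppException("cannot queue item", ex);
        }
    }

    // Choice 3: declare it and let it pass through, cleaning up in finally.
    static void addAndLog(BoundedList list, Object item, StringBuilder log)
            throws TooBigException {
        try {
            list.add(item);
        } finally {
            log.append("add attempted;");  // runs whether or not add() throws
        }
    }
}
```

Removing the `throws` clause or the `catch` block from any of these methods makes the compiler reject the code, which is the "compiler will make you aware" point above.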

More Language Features
• Garbage collection
  • What is this for?
  • Is it good or bad for embedded systems?
• Inheritance
  • Means that type-safe systems can still have functions that operate on generic objects.
  • Means that we can re-use commonalities between objects.
• Encapsulation
  • Means that the creator of the data structure also gets to define how the data structure is accessed and used, and when it is used improperly.
  • Means that the data structure can change without changing the users of the data structure (is the queue an array or a linked list…who cares!)
• Re-use
  • Use trusted systems that have been thoroughly tested
    • OS
    • Networking
    • etc.
• We have a project group looking into pros/cons of embedded Java
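
The "array or linked list…who cares" point can be illustrated with a queue whose storage is hidden behind its methods. A minimal sketch, assuming a bounded queue of integer samples; `SampleQueue` is a hypothetical name, while `ArrayDeque` and `LinkedList` are the standard `java.util` classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedList;

// Hypothetical bounded queue. Callers see only offer() and poll();
// the storage choice is private and can change without affecting them.
class SampleQueue {
    private final Deque<Integer> items;
    private final int capacity;

    SampleQueue(int capacity, boolean useLinkedList) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        if (useLinkedList) {
            this.items = new LinkedList<>();
        } else {
            this.items = new ArrayDeque<>();
        }
    }

    // The class, not the caller, decides what counts as improper use:
    // adding to a full queue is refused instead of corrupting state.
    boolean offer(int sample) {
        if (items.size() >= capacity) return false;
        items.addLast(sample);
        return true;
    }

    // Returns the oldest sample, or null when the queue is empty.
    Integer poll() {
        return items.pollFirst();
    }

    public static void main(String[] args) {
        SampleQueue queue = new SampleQueue(2, true);
        queue.offer(1);
        queue.offer(2);
        System.out.println(queue.offer(3)); // prints false, queue is full
        System.out.println(queue.poll());   // prints 1
    }
}
```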

6. Testing
• Unit test (white box)
  • Requires knowledge of the detailed implementation of a single sub-system
  • Test local functionality
    • Control algorithms
    • Boundary conditions and fault response
• Integration test (gray box)
  • Distributed processor systems w/ ongoing communications
  • Subsystems are already unit tested
  • Primarily for interfaces and component interaction
  • Fault seeding includes breaking the bus, disabling a subsystem, EMI exposure, power supply fluctuation, etc.
  • Embedded systems require physical test environments
• Validation testing
  • Complete system
  • Environmental chamber
  • More fault seeding, bad user, etc.
• Fault seeding and regression testing!!!
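
In software, fault seeding can be approximated by wrapping a dependency in a test double that fails on command. The sketch below is an assumption-heavy illustration: the sensor interface, the 200 kPa limit, and every class name are invented here, and a real integration test would break the physical bus instead.

```java
// Hypothetical checked exception for a failed sensor read.
class SensorFault extends Exception {
    SensorFault(String message) { super(message); }
}

interface PressureSensor {
    double readKilopascals() throws SensorFault;
}

// Test double that passes reads through until a fault is seeded.
class SeededFaultSensor implements PressureSensor {
    private final PressureSensor real;
    private boolean busBroken;

    SeededFaultSensor(PressureSensor real) { this.real = real; }

    void breakBus() { busBroken = true; }

    public double readKilopascals() throws SensorFault {
        if (busBroken) throw new SensorFault("seeded fault: bus broken");
        return real.readKilopascals();
    }
}

class PressureController {
    static final double MAX_KILOPASCALS = 200.0;  // assumed limit
    private boolean safeMode;

    // Safe mode latches: both overpressure and a lost sensor trigger it.
    void step(PressureSensor sensor) {
        try {
            if (sensor.readKilopascals() > MAX_KILOPASCALS) safeMode = true;
        } catch (SensorFault fault) {
            safeMode = true;
        }
    }

    boolean inSafeMode() { return safeMode; }
}

class FaultSeedingTest {
    // Returns true if the controller runs normally at first and enters
    // safe mode once the fault is seeded.
    static boolean run() {
        SeededFaultSensor sensor = new SeededFaultSensor(() -> 100.0);
        PressureController controller = new PressureController();
        controller.step(sensor);
        boolean normalBefore = !controller.inSafeMode();
        sensor.breakBus();
        controller.step(sensor);
        return normalBefore && controller.inSafeMode();
    }

    public static void main(String[] args) {
        System.out.println(run() ? "PASS" : "FAIL");
    }
}
```

Keeping a test like this in the regression suite means a later change that drops the `catch` path is caught on the next run.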

7. Safe Design Process
• Mainly, the hazard/risk/FMEA analysis is a process, not an event!
• How you do things is as important as what you do.
• Standards for specification, documentation, design, review, and test
  • ISO 9000 defines quality process…one quality level is stable and predictable.
• There are many processes, but the good ones include release/test early and often!
  • Incremental analysis, development, and testing