System Accidents SD 142 Prof C.M Burns
Accidents happen because • Systems fail in unexpected ways • Failed parts can’t be isolated • In systems, parts have complex interactions that can’t be anticipated • Safety systems can make systems more dangerous
A Story In the morning your coffee pot boils dry and the pot cracks. You dig up a spare pot and make another cup of coffee. (You are very much in need of your coffee.) You’ve left enough time for class, but barely. You rush to class only to realise that you’ve forgotten your apartment keys. Unfortunately you have a co-op interview that afternoon and need to go home to change. Your roommate is in Arts and you won’t be able to find him or her until he/she wanders home at 11pm tonight. Good thing one of your classmates has a spare suit in the 4th year study room. You talk to your classmate only to learn that he/she took it to the cleaners that morning: there was a sale on and he/she didn’t have an interview for the next couple of days. You show up for your interview late and in jeans, and it goes really badly. You apologise to the interviewer, explaining... Adapted from Perrow, 1999.
The cause of your bad interview is • the mechanical failure of your coffee pot • your human error in deciding to make more coffee and forgetting your keys • external factors in the environment (your roommate’s course schedule and the dry cleaning sale) • the poor design of your apartment door lock • the procedures you used (having coffee at the start of your day, not allowing enough time, etc.)
Three Mile Island • Babcock and Wilcox (builders of the equipment) blamed the operators - human error • Metropolitan Edison (the utility) blamed the equipment - mechanical failure • The NRC (Nuclear Regulatory Commission) blamed the design of the system • The operators blamed the procedures • The President’s Commission blamed everyone
Why Systems have accidents • All systems have “events”. • An event is an occasion when something does not work according to design due to • failure • ageing • maintenance
How events become accidents • Events become accidents when • they go unnoticed • the system is tightly coupled and the event causes other events • The more complex the system is and the more tightly coupled the system is, the more sensitive it is to events.
Examples • Virginia Electric Power, 1980 • While a worker was cleaning the floor in an auxiliary building, his shirt caught on the 3-inch handle of a circuit breaker. Pulling it free, he activated the breaker, which shut off current to the control rods in the reactor. The reactor shut down automatically and it took 4 days to bring it up again, costing hundreds of thousands of dollars.
Changing a light bulb in California, 1978 • A worker changing a light bulb on a control panel in the control room dropped the bulb. The dropped bulb created a short circuit in some sensors and controls. The reactor automatically shut down. The loss of the sensors meant the operators could not monitor the plant. The shutdown caused the core to cool too rapidly. The operators came very close to cracking the reactor vessel and causing a major meltdown.
What were the 3 fundamental causes of the August 14, 2003 power blackout?
• Inadequate situation awareness at FirstEnergy Corporation, Ohio • Inadequate diagnostic support • INADEQUATE TREE TRIMMING • Source: U.S.-Canada Power System Outage Task Force, Interim Report
Levels of System Analysis • Part: e.g. a valve • Unit: functionally related collection of parts, e.g. a steam generator • Subsystem: an array of units, e.g. a steam generator and its water units • Plant: the collection of subsystems • Environment: everything beyond the plant
Incident or Accident • An event is any abnormal operation. • An incident is a failure of a part or unit. • An accident is a failure that results in the loss of a subsystem, the plant, or has an impact on the environment. • A component failure accident is a failure caused by one component • A system accident is caused by the interaction of multiple components
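The definitions above can be sketched as a small classifier. This is only an illustration: the `Level` enum mirrors the levels of system analysis, while the function name `classify` and the idea of counting interacting components are assumptions made for the example, not part of the course material.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of system analysis, ordered from smallest to largest scope."""
    PART = 1
    UNIT = 2
    SUBSYSTEM = 3
    PLANT = 4
    ENVIRONMENT = 5


def classify(highest_level_lost: Level, components_involved: int) -> str:
    """Classify a failure using the slide's definitions.

    highest_level_lost: the largest scope that was lost or affected.
    components_involved: how many components interacted to cause the
    failure (a simplifying assumption for this sketch).
    """
    # A failure confined to a part or unit is an incident.
    if highest_level_lost <= Level.UNIT:
        return "incident"
    # Losing a subsystem, the plant, or affecting the environment is an accident.
    if components_involved > 1:
        return "system accident"
    return "component failure accident"


print(classify(Level.PART, 1))
print(classify(Level.PLANT, 3))
```

Counting components is a crude stand-in for "interaction of multiple components"; a real analysis would look at how the failures interacted, not just how many there were.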
Food chain of Accidents • 15-30 system accidents • 300 incidents • 3000 events
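A quick way to read the food chain is as escalation rates between levels. The short sketch below just does that arithmetic on the three numbers from the slide; the variable names are chosen for the example.

```python
# Counts taken from the slide.
events = 3000
incidents = 300
system_accidents = (15, 30)  # low and high end of the range

# Share of events that escalate to incidents.
incident_rate = incidents / events

# Share of incidents, and of all events, that end as system accidents.
accidents_per_incident = tuple(a / incidents for a in system_accidents)
accidents_per_event = tuple(a / events for a in system_accidents)

print(f"events -> incidents: {incident_rate:.0%}")
print(f"incidents -> system accidents: "
      f"{accidents_per_incident[0]:.0%} to {accidents_per_incident[1]:.0%}")
print(f"events -> system accidents: "
      f"{accidents_per_event[0]:.1%} to {accidents_per_event[1]:.1%}")
```

In other words, about 1 event in 10 becomes an incident, and between 0.5 and 1 percent of events end as system accidents.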
Food chain of Accidents • ACCIDENT: Grid failure, Eastern NA • INCIDENT: Power line down • EVENT: Tree-trimming
Quantifying Victims • 1st party victims: the operators and people running the plant • 2nd party victims: non-cooperating personnel (e.g. passengers on a ship) • 3rd party victims: innocent bystanders • 4th party victims: future generations of humanity
Quantifying Victims • 1st party victims: • 2nd party victims: • 3rd party victims: Power Grid Failure • 4th party victims: Chernobyl
Accident-causing Characteristics of Systems • Linear interactions: a process is carried out in a sequence of steps. One failure affects the entire system “downstream”. • Common mode interaction: one component services two or more parts, so if it fails they all fail (common mode failure).
Accident-causing Characteristics of Systems • Nonlinear interactions: unexpected complex interactions, e.g. proximity interactions (figure label: leak)
Power Grid Events • Context: August afternoon, moderate loads due to air conditioning, 2 units down • 1:31 another unit goes down, loads are high • 3:05 another line goes down (buffer was now gone) • 2:14-3:59 computer failures in the FE control room (loss of situation awareness) • 3:15-3:39 3 more lines go down due to tree contact • 3:39-3:58 7 more lines trip due to overloading • 4:05-4:08 4 major lines trip due to overloading, and the failure cascades across eastern North America
Power Grid Failure • Linear interaction: 1 minor line trips and then a connected major line trips • Common mode interaction: 1 major line trips and cascades to 2 or more minor lines • Nonlinear interactions: computer failures cause a loss of SA so operators don’t react to line losses in time and the problem cascades
Accident-causing Characteristics of Systems • complexities in the system: • tight spacing of equipment • proximate production steps • many common mode connections • limited ability to isolate failed components • unintended feedback loops
Accident-causing Characteristics of Systems • tight coupling: systems that respond very quickly and sensitively to perturbations • contrast with loose coupling: systems that incorporate buffers or slack
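The contrast between tight and loose coupling can be illustrated with a toy cascade model. Everything in this sketch is an assumption made for illustration (equal load sharing among surviving lines, the function name `cascade`, and the example capacities); it is not a model of a real power grid.

```python
def cascade(capacities, total_load, first_failure=0):
    """Toy cascade: one line fails, then load is shared equally.

    capacities: maximum load each line can carry.
    total_load: load that the surviving lines must carry between them.
    first_failure: index of the line that trips first (the initial event).
    Returns the number of lines still in service when things settle.
    """
    alive = [c for i, c in enumerate(capacities) if i != first_failure]
    while alive:
        share = total_load / len(alive)
        # Any line asked to carry more than its capacity trips.
        survivors = [c for c in alive if c >= share]
        if len(survivors) == len(alive):
            break  # every remaining line can carry its share
        alive = survivors
    return len(alive)


# Loose coupling: plenty of slack, so the event is absorbed.
print(cascade([50, 50, 50, 50, 50], total_load=100))

# Tight coupling: little slack, so one trip overloads the next line.
print(cascade([24, 24, 30, 40, 60], total_load=100))
```

With slack in every line, losing one line leaves four in service. With little slack, each trip raises the share on the remaining lines until none are left, which is the pattern of the 3:39-4:08 line trips in the power grid timeline.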
Recognizing Tight Coupling • more time-dependent processes • invariant sequences (X must come before Y) • the process only works in one way (no alternative paths) • little slack; requires precise quantities and timing
Coupling and Complexity (figure: systems plotted by Coupling, loose to tight, against Interactions, linear to complex) • Toward tight coupling: nuclear plants, dams, rail transport, chemical plants, space missions • Toward loose coupling: assembly lines, mining, post office, universities • Perrow, 1999
Power Grid Failure: System Characteristics • Tight coupling • Tree branch can cause grid failure • No buffer (past 3:05pm that day) • Very precise timing for shutting down lines/reactors • Many common mode connections • Close proximity spots (tree to line) • Difficult to isolate areas on short notice • Many feedback loops
Cost of Accidents (US) • Motor vehicle accidents $722 billion/year • Workplace accidents $8.5 billion/year • Home accidents $18.2 billion/year • Public accidents $12.5 billion/year • Source: Wickens p. 352, Table 14.1
Workers’ Compensation • provide income and medical benefits to victims and their dependents • reduce court costs and litigation • encourage accident prevention • study causes of accidents
Factors Contributing to Accidents (figure, Wickens, p. 357) • Natural Factors • Human, Job, Equipment, Physical Environment, Social Environment factors • Hazard or Operator Error • Management or Design Error • System Characteristics • Accident
Employee Factors • age • ability • experience • drugs/alcohol • stress • fatigue • motivation • Source: Wickens p. 358
Job Factors • arousal (boredom) level • physical workload • mental workload • work/rest/shifts • timing • ergonomic hazards
Equipment Factors • Controls and displays • electrical/mechanical/thermal/pressure hazards • toxic substances • explosive hazards
Physical Environment Factors • Illumination • Noise • Vibration • Temperature • Humidity • Airborne pollutants • Fire and radiation
Social/Psychological Factors • Management practices • social norms • morale • training • incentives (motivation)
Assessing Hazards • Criticality is a function of severity and probability • Table 14.3, p. 431 shows a criticality scale built out of frequency and severity
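As a rough sketch of "criticality as a function of severity and probability", the example below multiplies two rank scores. The category names, the numeric ranks, and the multiplication rule are all hypothetical and chosen for illustration; they are not the scale from Table 14.3.

```python
# Hypothetical rank scales (NOT the values from Table 14.3).
SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
FREQUENCY = {"improbable": 1, "remote": 2, "occasional": 3,
             "probable": 4, "frequent": 5}


def criticality(severity: str, frequency: str) -> int:
    """Combine severity and frequency ranks into one criticality score.

    Multiplying the ranks is an assumed rule for this sketch; a published
    scale may use a lookup table instead.
    """
    return SEVERITY[severity] * FREQUENCY[frequency]


# A rare but catastrophic hazard versus a frequent but marginal one.
print(criticality("catastrophic", "improbable"))
print(criticality("marginal", "frequent"))
```

Under this assumed rule a frequent, marginal hazard can outrank a rare, catastrophic one, which is exactly the kind of trade-off a criticality scale is meant to make explicit.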
Table 14.3
FMECA • Failure mode and effects criticality analysis • “A priori” analysis to anticipate failures • From Table 14.4 p. 372
Component | Failure Mode | Component Effect | Subsystem Effect | Criticality | Comments
Blade | Come loose | damage housing | other parts | 6 |
Blade | Fracture | others loosen | uneven cut | 4 |
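An FMECA worksheet is essentially a list of records that can be ranked. The sketch below stores the two blade rows from the slide and sorts them by criticality. Two things are assumptions: the split of the flattened text into the component-effect and subsystem-effect columns, and the rule that a larger number means more critical, which the slide does not state. The class and field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str
    failure_mode: str
    component_effect: str
    subsystem_effect: str
    criticality: int


# Rows as read from the slide (column split is an assumption).
worksheet = [
    FailureMode("Blade", "Fracture", "others loosen", "uneven cut", 4),
    FailureMode("Blade", "Come loose", "damage housing", "other parts", 6),
]

# Assumed convention: higher criticality score = address first.
ranked = sorted(worksheet, key=lambda row: row.criticality, reverse=True)

for row in ranked:
    print(f"{row.criticality}: {row.component} / {row.failure_mode}")
```

Ranking the rows this way gives a simple priority list for redesign, with the loose blade ahead of the fractured blade under the assumed convention.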
FMECA • Idea is to try to predict accidents • Estimate their criticality • Inform redesign or operation of the device
Fault Tree Analysis • Analyze the events of an accident • Modification from text example: stick to events! • Example: Operator fails to detect alarm = OR (Audible warning fails, Visual warning fails)
Hints on Doing Fault Tree Analysis • Past accident: Use AND • Anticipated Accident: AND/OR • Start at the Top • Top is the last event • Follow chronologically
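A fault tree with AND/OR gates can be evaluated mechanically once the basic events are known. The sketch below uses the alarm example from the Fault Tree Analysis slide, with the OR gate as shown there; the `Event` class and the `occurs` function are names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    gate: str = "BASIC"  # "AND", "OR", or "BASIC" for a leaf event
    children: list = field(default_factory=list)


def occurs(event: Event, happened: set) -> bool:
    """Return True if the event occurs, given the set of basic events
    that actually happened."""
    if event.gate == "BASIC":
        return event.name in happened
    results = [occurs(child, happened) for child in event.children]
    # AND needs every input event; OR needs at least one.
    return all(results) if event.gate == "AND" else any(results)


# Top event and gate taken from the slide's example.
top = Event("Operator fails to detect alarm", "OR", [
    Event("Audible warning fails"),
    Event("Visual warning fails"),
])

print(occurs(top, {"Visual warning fails"}))
print(occurs(top, set()))
```

Changing the gate from "OR" to "AND" makes the same single failure insufficient to cause the top event, which is the difference the AND/OR hint is pointing at.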
Exercise: FTA • Within a group of four, use fault tree analysis to explain the coffee pot incident from the start of class
Fault tree for the coffee pot story (figure) • Top event: Bad interview • AND gates link the contributing events: Late, Jeans, No suit, Spare at cleaners, Can’t find roommate, No keys, Suit in apt, Door locks, Coffee pot etc.
Fault tree for the coffee pot story (same tree as the previous slide), annotated: Areas for improvement through design
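One way to look for areas of improvement is to list the basic events at the bottom of the tree. The sketch below encodes one possible reading of the flattened fault tree figure; the nesting of the nodes is an assumption, since the original layout is not recoverable, and the function name `basic_events` is invented for the example.

```python
# Each node is (name, gate, children); leaves have no gate and no children.
# The nesting below is an ASSUMED reading of the slide's figure.
TREE = ("Bad interview", "AND", [
    ("Late", None, []),
    ("Jeans", "AND", [
        ("Spare at cleaners", None, []),
        ("Can't find roommate", None, []),
        ("No suit", "AND", [
            ("Suit in apt", None, []),
            ("No keys", "AND", [
                ("Door locks", None, []),
                ("Coffee pot etc.", None, []),
            ]),
        ]),
    ]),
])


def basic_events(node):
    """Collect the leaf (basic) events under a node, left to right."""
    name, _gate, children = node
    if not children:
        return [name]
    found = []
    for child in children:
        found.extend(basic_events(child))
    return found


for event in basic_events(TREE):
    print(event)
```

Because every gate in a past-accident tree is AND, removing any single basic event would have prevented the top event, so each leaf is a candidate for redesign (for example, a door lock that cannot lock the keys inside).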