Human Error: From Taking Risk to Running Risk. Prof Patrick Hudson, Centre for Safety Studies, Department of Psychology, Leiden University.
Introduction - Structure • Two Types of Risk • Case studies • Piper Alpha & Herald of Free Enterprise • Human Error • The Organisational Accident Model • Examining the sources of risks • Case study DAL 39 • Solutions to human error • What to look for • Conclusion
Where am I coming from? • Psychology • Why do people do what they do? • Human error • How can people get things so wrong? • Oil and Gas industry, Aviation & Medicine • Extremely high hazard industries • The organisational model of accidents • Reason’s Swiss Cheese Model
What is safety all about? • Preventing harm to people • Safeguarding assets • Protecting the environment • Preserving reputation • If things didn’t go wrong it would be easy • Safety and profits are both about risk management
Managing risks • Safety is about managing risks to people, the environment etc - what risks do you take? • The alternative is to run the risks and hope for the best - can we run the risks? • What happens to companies that run risks? • The best make profits, the worst go bankrupt • So, we need to have a risk management process - we need understanding of the types of risk and where they come from
Risks • We can distinguish two ways to approach risk • We take a risk • We can decide the return is worth it • We run a risk • We can become victims if things go wrong • People who take risks are not always the same as those who run them
Case Study: Piper Alpha • A major disaster • Changed the way the oil and gas industry operates • Created the requirement for Safety Management Systems to be ‘living systems’ and Safety Cases to be ‘living documents’ • Had legal effects as far away as Australia
Piper Alpha Disaster • In July 1988 the Piper Alpha platform was destroyed with 167 fatalities • The immediate cause was leaking gas condensate • The disaster was made worse by a total failure of defences • By 1990 Occidental was out of business in the UK
Why do accidents happen? • Accidents are quite infrequent • An accident is often seen as being caused by one or more individuals • But: • In Piper Alpha the major problems were the platform design and the permit-to-work system • Piper Alpha had also been audited and passed by the regulator 7 days earlier
What were the risks? • Many people died because they followed procedures • The platform management failed to provide a safe workplace • The regulator had failed to audit the system
Case Study: Herald of Free Enterprise • Herald of Free Enterprise capsized outside Zeebrugge harbour • The Assistant Bosun was asleep • The bow doors were still open • 193 people died
Herald of Free Enterprise (event diagram; box labels as extracted) • Management • Trimming problem • Ship head down • High bow wave • No checking system • 15 minutes earlier • Acceleration • 5 minutes late • Chief Officer leaves G-deck • Master assumes ship ready • Loading Officer • Assistant Bosun • Bosun • Assistant Bosun asleep • Door problem • No indication • Doors open • Capsize
Herald Analysis • The assistant bosun was overworked • The masters had asked for indicators • The management had refused on grounds of cost • A Townsend Thoresen vessel left Dover with the bow doors open the next day!
Active vs Latent Failures • Analysis of disasters indicates the need to distinguish two types of human failure • Active Failures - Errors and violations that impact directly on the system and victims • Latent Failures - Accidents waiting to happen
From Error to Underlying Cause • Unsafe Acts (active errors) • Unintended actions: slips, lapses • Intended actions: mistakes, violations • Latent Conditions (as labelled on the diagram): planning, design, procedures, decisions, training, communication, accountability
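The four unsafe-act types named on this slide can be sketched as a small decision rule. This is an illustrative sketch only: the slide gives the names and the intended/unintended split, while the distinguishing questions used here (memory failure for a lapse, knowing deviation for a violation) follow the usual reading of Reason's taxonomy and are assumptions, as are the function and parameter names.

```python
from enum import Enum


class UnsafeAct(Enum):
    SLIP = "slip"            # unintended action, attention failure
    LAPSE = "lapse"          # unintended action, memory failure
    MISTAKE = "mistake"      # intended action, but the plan was wrong
    VIOLATION = "violation"  # intended action, known deviation from rules


def classify_unsafe_act(intended, memory_failure=False, knowingly_broke_rule=False):
    """Classify an unsafe act (hypothetical helper, not from the slides).

    intended:             was the action itself what the person meant to do?
    memory_failure:       for unintended actions, was a step forgotten?
    knowingly_broke_rule: for intended actions, was a rule knowingly bypassed?
    """
    if not intended:
        return UnsafeAct.LAPSE if memory_failure else UnsafeAct.SLIP
    return UnsafeAct.VIOLATION if knowingly_broke_rule else UnsafeAct.MISTAKE


# Illustrative calls with made-up inputs
print(classify_unsafe_act(intended=False, memory_failure=True).value)
print(classify_unsafe_act(intended=True, knowingly_broke_rule=True).value)
```

A label of this kind describes only the active failure; it says nothing about the latent conditions that made the act likely.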
Types of risk • The individuals making the active failures are frequently running the risks • Those accepting the latent failures are those who have taken the original risk • They expect that all will go well • Weaknesses in the system allow problems to happen • The unsafe acts of individuals are the obvious human errors - running risks
The Causes of Incidents • Triggers • Defences • Unsafe Acts • Preconditions • Underlying Causes • Decisions made (diagram labels: Immediate Causes, Underlying Causes)
Why do Accidents Happen? • Equipment • Breakdowns • Doesn’t work • People • Incompetence • Sloppiness • Risk Taking • Organisation • Allowing failures to propagate • Accidents waiting to happen
Latent Conditions = Underlying Causes • Latent Conditions represent accidents waiting to happen • Many problems are to be found. E.g.: • Poor procedures (Incorrect, unknown, out of date) • Bad design accepted • Commercial pressures not well balanced • Organisation incapable of supporting operation • Maintenance poorly scheduled • Latent conditions make errors more likely or the consequences worse • Individuals are the recipients of somebody else’s problems • Taking a risk involves accepting latent conditions, running the risk involves becoming a recipient of those problems
Classifying Latent Conditions • We can group underlying causes - Whys • Hows refer to the immediate causes • Underlying causes refer to the organisational level • Concentrating on why means we no longer concentrate upon individuals • The categories are dependent upon what you are going to do with the information
Preconditions • The reasons why an individual or group may make an error • Preconditions influence the probability • There are few effects of individual differences (accident proneness does not exist) • Preconditions that induce or make errors more likely are the result of (failure to) control • The question is: Why are the preconditions for error present?
Preconditions II • Haste • Ignorance • Design • Unusual situations • Fatigue • Habit • “Strong but Wrong” • These are the symptoms of a deeper problem
Reason’s Swiss cheese model of accident causation • Hazards • Successive layers of defences, barriers, & safeguards • Some holes due to active failures • Other holes due to latent conditions • Losses
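The arithmetic behind the Swiss cheese picture can be sketched in a few lines. This is a minimal illustration, not part of the original slides: it assumes each layer fails independently with a known probability, and all the numbers are hypothetical.

```python
from math import prod


def breach_probability(hole_probabilities):
    """Chance that a hazard passes every layer of defence.

    Assumes the layers fail independently (an assumption made for
    illustration; the slides do not state it).
    """
    probabilities = list(hole_probabilities)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(probabilities)


# Hypothetical figures: three defences that each fail 1 time in 10
sound_layers = [0.1, 0.1, 0.1]
# The same defences when a shared latent condition (for example a poor
# procedure) weakens all of them at once
degraded_layers = [0.5, 0.5, 0.5]

print(breach_probability(sound_layers))
print(breach_probability(degraded_layers))
```

With these made-up figures the chance of a loss rises from about 1 in 1,000 to 1 in 8. A latent condition that opens holes in several layers at once also breaks the independence assumption, so the simple product is best read as an optimistic estimate.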
Barriers or Controls (diagram; labels as extracted) • Hazard/Risk • Work • Undesirable outcome • HSE Management • Taking risks • Running risks
Shell’s Bow-tie Concept (diagram; labels as extracted) • HAZARD • Events and circumstances • BARRIERS • Undesirable event with potential for harm or damage • CONSEQUENCES • Harm to people and damage to assets or environment • Engineering activities • Maintenance activities • Operations activities
Case Study: DAL 39 Schiphol • An example of multiple failures • The criminal appeal found that the 3 Air Traffic Controllers were guilty of an infringement • There was no punishment (so no further appeal) • Consider what the conventional and actual risks were • Would you have spotted these? • Would they appear in a conventional risk analysis?
DAL 39 • A Delta 767 aborted take-off at Amsterdam Schiphol on discovering a 747 being towed across the runway • Reduced visibility conditions (Phase B) • The tower controller was in training, under the tower supervisor • There was another trainee present and, of the 11 people in the tower, five were changing out to rest • The incident happened between the inbound and outbound morning peaks
(Airport layout diagram; labels as extracted) • R-83 route • KLM B747 • G-3 route • DAL 39 • Fairway • Runway 06/24 • Hangar 11
DAL 39 continued • The marshalling vehicle called in unexpectedly as Charlie-8 with a towed KLM 747 from a parking apron • Radio communications were unclear and C-8 did not state exactly where he was • C-8 was given clearance • The stopbar light control box confused everyone in the tower (it was a new addition) • The controller, thinking that the tow had crossed successfully, gave DAL 39 clearance • The DAL pilots saw the 747 and stopped in time
DAL 39 Initial Analysis • Tow failed to report exact position or destination • Tow not announced in advance (as required by the procedures for Phase B) • Assistant ATCo believed the tow was moving from right to left (did not know that a tunnel was in use) • Controllers completely unfamiliar with new control box • Ground radar pictures set up to cover different arrival and departure runways meant the tow was not visible on one screen • Controller was meshing the tow in between both take-offs and landings • The tow, given clearance 1 min 40 s earlier, started off once the stopbars went out
Why did all this happen - 1? • Tow was in violation, but this appears to be routine • No clear protocols for ground vehicles and no hazard analysis • Different languages for aircraft (English) and ground vehicles (Dutch) • Poor quality of ground radio • Clearances appeared to be unlimited once given • Tower supervisor was also the on-the-job (OTJ) trainer in the middle of the rush hour • Altered control box not introduced to ATC staff
Why did all this happen - 2? • No briefings about alterations at Schiphol (It has been a building site for years) • Too many trainees in the tower in rush hour under low visibility conditions • Differences in definition of low visibility between aerodrome and ATC • No management apparent of the change in use of the S-Apron • No operational audits by LVNL or Schiphol, of practice as opposed to paper • Schiphol designed requiring crossing and the use of multiple runways for noise abatement reasons
The DAL 39 event scenario (diagram; labels as extracted) • Airport structure • Airport decides to change airport structure • Tunnel brought into use without briefings • Routine violation of tow procedures • Tower combining training and operations during difficult periods • Controller gives clearance without assurance of tow position • Pilots see 747 and abort take-off
How can we manage errors? • Risks refer to things that can go wrong • Errors represent ways in which people can fail to control the hazards • An inspector/auditor should be looking at two levels • Are the standards being adhered to? • Are the standards appropriate? • Have any hazards been missed or managed ineffectively?
Error Management (diagram) • Avoid • Reduce • Learn • Identify • Support • Check
Error management and inspection • We can uncover problems from a wide range of sources of information • Accidents • Near misses • History • Brainstorming • We can see if the best control methods are being applied • If we leave everything to the individual we have already created major problems
Error Management II (diagram: the steps Avoid, Reduce, Learn, Identify, Support and Check, each annotated with some of the questions What, Why, How, Who, Where, When)
Safety Management and Safety Culture • The level of safety management is a function of the organisational safety culture • Individuals may do their best, but that may not be enough • Is the organisation organised and systematic? • Are they satisfied with their performance, or do they feel they could do better?
The Evolution of Safety Culture (ladder, lowest to highest, with increasing informedness and increasing trust & accountability) • PATHOLOGICAL: who cares as long as we’re not caught • REACTIVE: safety is important, we do a lot every time we have an accident • CALCULATIVE: we have systems in place to manage all hazards • PROACTIVE: we work on the problems that we still find • GENERATIVE: safety is how we do business round here
The Edge (diagram; labels as extracted) • Return on Capital Invested: 6%, 10%, 15% • Inherently Safe • Normally Safe • The Edge • No need • Safety Management Systems • Safety Culture