Safety Institute – Michael Tooma Presents: Incident Reporting, Investigation and Management
Michael Tooma, Partner, Head of OHSS Asia Pacific, Norton Rose Australia
Challenge The challenge is that we do not fall victim to the temptation to oversimplify the causes of incidents. • The imperative in the aftermath of an incident is to minimise the impact of the incident. That means reducing shutdown time associated with damaged equipment, regulatory notices or industrial action. • This often leads to reactive and narrowly focused decision making on corrective actions – a new safe working procedure and training course, for example, is the most popular corrective action. • The assumption is that if we identify the cause of the incident, we can simply develop a procedure for addressing it, train workers in the procedure and require them to follow it. • That satisfies the regulator and management, and, crucially, is cheap (procedures are cheaper than engineering solutions). It fits a narrative – the worker needs instructions and supervision, or better instructions and supervision, to be safe.
But the world we live in is far more complex than that. Dekker (2011) observes: • “Rational decision-making requires a massive amount of cognitive resources and plenty of time. It also requires a world that is, in principle, completely describable. Complexity denies the possibility of all of these. In complex systems (which our world increasingly consists of) humans could not or should not even behave like perfectly rational decision-makers. In a simple world, decision-makers can have perfect and exhaustive access to information for their decisions, as well as clearly defined preferences and goals about what they want to achieve. But in complex worlds, perfect rationality (that is, full knowledge of all relevant information, possible outcomes, and relevant goals) is out of reach… In complex systems, decision-making calls for judgments under uncertainty, ambiguity and time pressure. In those settings, options that appear to work are better than perfect options that never get computed. Reasoning in complex systems is governed by people’s local understanding, by their focus of attention, goals, and knowledge, rather than some (fundamentally unknowable) global ideal. People do not make decisions according to rational theory. What matters for them is that the decision (mostly) works in their situation.”
Dekker (2011) goes on to explain that this perfectly normal reaction to the rules being imposed on us at a local level can accumulate at an organisational level with harmful consequences. • He explains: • “Local decisions that made sense at the time given the goals, knowledge and mindset of decision-makers, can cumulatively become a set of socially organized circumstances that make the system more likely to produce a harmful outcome. Locally sensible decisions about balancing safety and productivity – once made and successfully repeated – can eventually grow into unreflective, routine, taken-for-granted scripts that become part of the worldview that people all over the organization or system bring to their decision problems. Thus, the harmful outcome is not reducible to the acts or decisions by individuals in the system, but a routine by-product of the characteristics of the complex system itself.”
Consider the pioneering approach of James Reason in the Swiss Cheese theory – a theory on which most modern incident investigation techniques are based.
So how do we factor that into incident investigations? By accepting that, faced with the same facts, people will not necessarily behave in the same way. This is a challenge to the conventional wisdom around incident investigation, which is typically concerned with uncovering “the truth” and, indeed, more so “the root cause” of an incident.
The theory is that, like the holes in slices of Swiss cheese, all systems have deficiencies or inadequate defences. • The causal trajectory leads to an incident when those deficiencies in the system line up. It follows that, proactively, increasing the defence layers reduces the likelihood of an incident. • It also follows that, in attempting to analyse an incident, a better understanding of that trajectory will uncover the absent or failed defences which enabled the system failure. • Logically, then, in the aftermath of an incident, those system deficiencies are identified and addressed through corrective actions, reducing the holes in the Swiss cheese slices and therefore reducing the likelihood of a recurrence of the incident (a toy model of this alignment logic is sketched after the diagram below).
[Diagram: the Reason model – organisational deficiencies and latent failures → local factors (e.g. task and environmental conditions) → active failures (e.g. unsafe acts) → inadequate or absent defences (e.g. technical and human failures), drawn as example lines of defence. Some holes are due to active failures, others to latent conditions; the incident trajectory passes through aligned holes in every slice.]
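To make the mechanics of the model concrete, here is a minimal, hypothetical Python sketch (the layer names and hole positions are invented for illustration, not drawn from the presentation): each defence layer is modelled as a set of “holes”, and an incident occurs only when a single trajectory finds a hole in every layer.

```python
# A minimal toy model of the Swiss cheese theory (illustrative only;
# layer names and hole positions are invented for this sketch).

# Each defence layer is a set of "hole" positions - points where that
# defence would fail to stop a hazard arriving at that position.
layers = {
    "organisational": {1, 4, 7},      # latent failures, e.g. budget pressure
    "local factors":  {2, 4, 7, 9},   # task and environmental conditions
    "active":         {4, 5, 7},      # unsafe acts
    "defences":       {3, 4, 8},      # technical and human safeguards
}

def trajectory_passes(position: int) -> bool:
    """An incident trajectory at `position` succeeds only if every
    layer has a hole at that position (the holes line up)."""
    return all(position in holes for holes in layers.values())

# Only position 4 lines up across all four layers, so only that
# trajectory produces an incident; every other hazard is caught
# by at least one layer.
for position in range(10):
    if trajectory_passes(position):
        print(f"Incident: holes aligned at position {position}")
```

In this toy model, plugging any single hole at the aligned position, or adding a new intact layer, blocks that trajectory – which is exactly the orthodox corrective-action logic that the following slides go on to question.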
But in complex systems, incident trajectories are often unique. • That is, addressing what went wrong in a particular incident will only help prevent that exact sequence from recurring. • But the likelihood of the planets – or the Swiss cheese slices – aligning in exactly the same way is very remote. It is more likely that the next incident will involve a different trajectory and different holes in the Swiss cheese slices. • Addressing the specific sequence that caused the incident will not address the potential paths that the incident trajectory could have taken but for certain events.
[Diagram: the incident trajectory through the slices, with a “What if…?” branch showing an alternative path the trajectory could have taken.]
We often look at an incident sequence, amazed but relieved that things were not much worse – and that they could have been, had it not been for some “lucky” event. • But other than such a casual observation, or a remark in an incident investigation report, little is done about those “other” non-causal events. • That is, events that either prevented the incident from having a greater impact, or the possible trajectory that did not occur – the road not travelled, but which could have been travelled. • Reason (1990) himself observes that “most of the root causes of serious accidents in complex technologies are present within the system long before an obvious accident sequence can be identified”. That is, the holes are there, if only our investigation techniques could uncover all of them and not just those involved in the incident. Yet most investigation techniques are linear in their approach, seeking out the exact causal sequence – the truth of what happened – and then uncovering the root cause(s) which led to that factual sequence.
But is it appropriate in a complex world to maintain a linear view of incident causation? • Isn’t the road not travelled just as instructive for further incident prevention as the road actually travelled? • Indeed, in many respects, the “lucky” control is more instructive for incident prevention than the failed or absent control. • If we adopt that approach, building system resilience is not achieved simply by adding Swiss cheese layers, as the orthodox view of the theory may suggest, but also by uncovering and plugging holes within each layer that are reasonably related to the incident but not causally connected to it (a sketch of this “what if” pass follows the diagram below).
[Diagram, built up over several slides: the incident at the centre, with a “Why?” arrow tracing the actual trajectory back through the slices and successive “What if?” arrows fanning out along the alternative trajectories that did not occur.]
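To illustrate the “what if” question in the same hypothetical terms, the sketch below extends the earlier toy model: rather than only tracing why the actual trajectory succeeded, it enumerates near-miss positions where every layer except one had a hole, naming the single “lucky” control that held. (The layer data is the same invented example as before.)

```python
# Extending the toy model above: a "what if" pass looks for near-miss
# trajectories - positions where the hazard passed every layer except
# one - and names the single control that happened to hold.
layers = {
    "organisational": {1, 4, 7},
    "local factors":  {2, 4, 7, 9},
    "active":         {4, 5, 7},
    "defences":       {3, 4, 8},
}

def near_misses(max_position: int = 10):
    """Yield (position, lucky_layer) pairs where exactly one layer
    stopped the trajectory - the 'road not travelled'."""
    for position in range(max_position):
        blocking = [name for name, holes in layers.items()
                    if position not in holes]
        if len(blocking) == 1:
            yield position, blocking[0]

# Position 7 passed the organisational, local and active layers and
# was caught only by the final defences - the "lucky" control that a
# purely linear, why-only investigation would never examine.
for position, lucky_layer in near_misses():
    print(f"What if? Position {position} was stopped only by "
          f"the '{lucky_layer}' layer.")
```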
Why find out about what went wrong? • The proper investigation of incidents is a core part of developing a safety culture. All systems, even the best systems, fail from time to time. • The test of an effective management system is not its lack of failure but rather what is done in the aftermath of that failure. Every incident, no matter how small, represents a learning opportunity. If we properly investigate the incident and get to its root causes – the system failures or deficiencies that permitted the incident to occur – we have a chance to put in place the corrective actions necessary to avert a repeat of the incident and improve the resilience of the system against further failure. • That commitment to constant and continuous reflection, analysis and review is the essence of an effective management system and critical to achieving a positive safety culture within the business or undertaking.
We know that near misses must be investigated. • Failing to investigate near misses can result in learning opportunities being missed and, ultimately, an incident occurring with the attendant loss of life, injury and economic and reputational costs. • The seriousness of an event, and the learning opportunities which can be garnered from it, should not be downplayed just because the event does not itself result in injury or damage to plant. To the contrary, near misses present valuable opportunities to learn from mistakes and system deficiencies. In order to avoid disaster it is necessary to understand the risks that arise within an organisation. • To allow this, a culture of reporting must be encouraged within an organisation. Without a reporting culture, an organisation will be unable to gather information about incidents that have occurred and will be unable to discover their causes. Underreporting of near misses will hide issues that could be remedied before the problem develops into a disaster.
Why find out about what went right? • What went right in an incident can be just as instructive as what went wrong. • By identifying effective control features, we can replicate them across the system. • Controls that work at a local level – that are accepted by operators and fit into other complex systems – are rare. • Their effectiveness should be celebrated. That is particularly the case in near misses where, had it not been for those controls, an incident would have occurred.
Indeed, even if what went right was not a control at all but a “lucky event”, an analysis of it may be instructive as to the type of controls that might work as a final barrier to the incident causal trajectory. • The reality is that we have been attempting to learn the negative lessons from disasters since the inception of safety science as a discipline. • Major disaster report after major disaster report sets out the facts of the incident, details the deficiencies in the system, expresses outrage as to how society could allow these conditions to exist, and makes recommendations in relation to safety leadership and safety culture, with some specific design recommendations for industry consumption. This was the case in the Columbia, Piper Alpha, Exxon Valdez, BP Texas Refinery, Upper Big Branch and Deepwater Horizon reports, to name a few. The problem with that approach is that it is entirely negative.
If it was that simple to learn the lessons from disasters, surely we would have learnt them by now. • The legal and commercial consequences of failing to do so are very significant globally. We have to assume that most leaders are, at worst, agnostic towards safety. • Some may not necessarily be passionate about safety, but certainly none display the psychotic behavior which would mean that lessons, if capable of being easily applied, would be ignored. • I have never encountered any managing director who wakes up in the morning wanting to hurt their people. Yet even in Australia, which prides itself on its safety standards, we kill one person every working day on average. Globally the figure is much worse.
The reality is that the lessons from disasters, instructive as they may be, are entirely superficial. Traditional linear incident investigations have limited ability to impact incident prevention because lightning does not strike twice. • As Dekker (2011) observes: • “Reconstructing events in a complex system, then, is nonsensical: the system’s characteristics make it impossible. Investigations of past failures thus do not contain much predictive value for a complex system. After all, things rarely happen twice in exactly the same way, since the complex system itself is always in evolution, in flux.”
The utility of the lessons is therefore translated into motherhood statements about safety leadership and safety culture, without any specific means of achieving them in the localised context. • That is not to say that those lessons have no value or meaning. They do. The issue is: can we extract more from our incident investigations? Can we derive practical lessons of real meaning, value and application, and can we do this on a regular and systematic basis?
It may be more useful to find out how a potential serious incident became a near miss, or how the consequences of an incident were tempered, rather than just finding out what caused the incident. A better understanding of “what went right” can assist in creating a more resilient system. • The attraction of asking “what went right” is its positive character. • We know that reinforcement is the most important principle of behavior and a key element of most behavior change programs. • We also know that positive reinforcement is far more powerful than negative reinforcement. • We say someone has received positive reinforcement if a desired behavior they display has been followed with a reward or stimulus. • Negative reinforcement is when someone receives punishment, an aversive stimulus or the withholding of a stimulus after displaying certain behavior, usually undesirable behavior.
People are more likely to adjust their behavior to seek out praise and acceptance than out of fear of punishment. We understand this well in our private lives. • When people do things we like, we reward them with expressions of gratitude so that such behavior is repeated. • We avoid rewarding people for negative or undesired behavior but, with few exceptions, punishing people for such behavior will rarely be effective. • Harsh words directed at a fellow motorist who cuts you off on the road are unlikely to alter their driving behavior. That is because they receive an immediate reward for their bad behavior in reaching their destination sooner, and the added attention received from you as a fellow motorist, over time, is either ignored or becomes associated with the positive reward – beating traffic.
At work, people do their jobs well every day. • They follow safety procedures. • They engage in the safety programs required of them. • For that, we seem to think they deserve no reward or recognition. • Indeed, people strongly believe that it is wrong to reward them for “doing their job”, as if to single out that conduct would undermine the integration of safety into operational requirements. • By contrast, if they take a short cut, they are rewarded immediately by being able to do their job faster and, depending on the employment terms, either going home sooner, being recognized by their superiors or, in the case of contractors, making more money. Yet we are surprised that, over time, people gravitate towards short cuts.
The same is true of managers. • Beyond a certain threshold, improved productivity with no additional innovation or capital expenditure comes at the cost of the health and safety of workers. • But no such distinction is made in relation to managerial recognition and reward. • Managers receive instant positive reinforcement for day-to-day decisions they make which improve shareholder value, such as staff reductions and productivity improvements. • Doing more with less has become a management mantra – a boast of success. That assumes the status quo has inefficiencies. But where no such buffers exist, the value is derived at the long-term expense of current workers and future shareholders. That was the experience at the BP Texas refinery.
In that context, it is remarkable to me that when a near miss occurs, we don’t pause to recognise the positive behavior exhibited by people that may have averted a disaster. • That behavior may well be expected because it is consistent with the system, but so is much of the everyday private behavior for which we receive acknowledgement or other positive reactions. • As a society we expect that positive reinforcement in our private lives. We regard it as part of our culture. We drum it into our children. But in our working lives, we seem to take safe behaviour for granted. • In incident investigations, we regard it as irrelevant. How can that be? That is the moment when we are most vulnerable, when we all feel we are to blame, that we have somehow contributed to the incident. Pausing to recognize what we did right can ease much of that. It is about pulling together at a time of crisis.
If the focus of incident investigation remains solely on what went wrong, it inevitably becomes about blame. Even in organizations where a just culture is in place, the singular focus on negative behavior can be detrimental to the overall functioning of the system. It is also a missed opportunity. • As Dekker (2011) observes: • “Complex systems can remain resilient if they retain diversity: the emergence of innovative strategies can be enhanced by ensuring diversity. Diversity also begets diversity: with more inputs into problem assessment, more responses get generated, and new approaches can even grow as the combination of those inputs.”
Even if someone did something wrong that worked, we need to understand why it worked so we can capture its positive features. • That, in essence, is what Reason (1997) was describing in the flexible culture component of safety culture discussed above: • The empowerment of well-trained workers to make decisions that deviate from normal procedures but are consistent with the objectives of those procedures. Once those decisions are made, we need to understand why they worked. That is where asking “what went right” comes in.
Incident Management
I+0 (at the time of the incident):
• Attend to the injured
• Secure the site
• Notifications
• Mobilise legal
• Media
• Document management
• Preliminary evidence
• Regulator liaison
• Investigation team
Incident Notification – a change in approach to incident management
Notifiable incident:
• Death of a person
• Serious injury or illness of a person
• Dangerous incident
PCBU notified:
• Fastest possible means
• Telephone or in writing
• Keep records for 5 years
Preserve incident site:
• Until the regulator arrives
• Except to attend to the injured, to render the site safe, or as directed by police
(A minimal sketch of these notification rules as a record-keeping structure follows below.)
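The sketch below captures the notification obligations on this slide as a simple Python record structure. The class and field names are my own illustration, not a statutory schema; only the three notifiable categories, the notification methods and the 5-year retention period come from the slide.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum

class NotifiableIncident(Enum):
    """The three notifiable categories from the slide."""
    DEATH = "death of a person"
    SERIOUS_INJURY_OR_ILLNESS = "serious injury or illness of a person"
    DANGEROUS_INCIDENT = "dangerous incident"

@dataclass
class IncidentNotification:
    """Record of a notification made by the PCBU to the regulator."""
    category: NotifiableIncident
    notified_on: date
    method: str  # fastest possible means: telephone or in writing

    def retain_until(self) -> date:
        # Records must be kept for 5 years (approximated here as
        # 5 x 365 days for simplicity).
        return self.notified_on + timedelta(days=5 * 365)

# Example: a dangerous incident notified by telephone today.
n = IncidentNotification(NotifiableIncident.DANGEROUS_INCIDENT,
                         date.today(), "telephone")
print(n.retain_until())
```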
Checklist of notifiable incidents (a simple triage sketch over these questions follows the list) • Did the incident result in the death of a person? • Did the incident cause an injury or illness to a person requiring the person to have immediate treatment as an in-patient in a hospital? • Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for the amputation of any part of their body?
Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for a serious head injury? • Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for a serious eye injury? • Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for a serious burn?
Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for the separation of their skin from an underlying tissue (such as degloving or scalping)? • Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for a spinal injury?
Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for the loss of a bodily function? • Did the incident cause an injury or illness to a person requiring the person to have immediate treatment for serious lacerations? • Did the incident cause an injury or illness to a person requiring medical treatment within 48 hours of exposure to a substance?
Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to an uncontrolled escape, spillage or leakage of a substance? • Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to an uncontrolled implosion, explosion or fire?
Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to an uncontrolled escape of gas or steam? • Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to an uncontrolled escape of a pressurised substance?
Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to electric shock? • Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to the fall or release from a height of any plant, substance or thing?
Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to the collapse, overturning, failure or malfunction of, or damage to, any plant that is required to be authorised for use in accordance with the regulations? • Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to the collapse or partial collapse of a structure?
Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to the collapse or failure of an excavation or of any shoring supporting an excavation? • Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to the inrush of water, mud or gas in workings, in an underground excavation or tunnel? • Did the incident expose a worker or any other person to a serious risk to a person's health or safety emanating from an immediate or imminent exposure to the interruption of the main system of ventilation in an underground excavation or tunnel?
Did the incident result in any infection to which the carrying out of work is a significant contributing factor, including any infection that is reliably attributable to carrying out work with micro-organisms? • Did the incident result in any infection to which the carrying out of work is a significant contributing factor, including any infection that is reliably attributable to carrying out work that involves providing treatment or care to a person? • Did the incident result in any infection to which the carrying out of work is a significant contributing factor, including any infection that is reliably attributable to carrying out work that involves contact with human blood or body substances? • Did the incident result in any infection to which the carrying out of work is a significant contributing factor, including any infection that is reliably attributable to carrying out work that involves handling or contact with animals, animal hides, skins, wool or hair, animal carcasses or animal waste products? • Did the incident result in Q fever contracted in the course of work involving handling or contact with animals, animal hides, skins, wool or hair, animal carcasses or animal waste products?
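The long checklist above lends itself to a simple triage helper. The sketch below is purely illustrative (the questions are condensed and abbreviated, and it is not a substitute for working through the full list): it walks the questions in order and flags the incident as notifiable on any “yes”.

```python
# A simple triage sketch over the checklist above (questions condensed
# and abbreviated for illustration; not a substitute for the full list).
CHECKLIST = [
    "Death of a person?",
    "Immediate treatment as an in-patient in a hospital?",
    "Immediate treatment for an amputation?",
    "Immediate treatment for a serious head, eye or spinal injury?",
    "Immediate treatment for a serious burn, laceration or degloving?",
    "Medical treatment within 48 hours of exposure to a substance?",
    "Serious risk from an uncontrolled escape, spillage, explosion, "
    "fire, electric shock, collapse or inrush?",
    "Work-related infection (including Q fever)?",
]

def is_notifiable(answers: list[bool]) -> bool:
    """The incident is notifiable if any checklist question is
    answered 'yes'."""
    return any(answers)

# Example run: only the serious-burn question is answered 'yes'.
answers = [False, False, False, False, True, False, False, False]
for question, answer in zip(CHECKLIST, answers):
    print(f"{'YES' if answer else 'no ':>3}  {question}")
print("Notifiable:", is_notifiable(answers))
```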