Safety Analysis of Software-intensive Systems Tor Stålhane IDI / NTNU
What is safety A system is safe if it behaves in such a way that it does not harm people, equipment or the environment. Safety is a relationship between a system and its environment. Safety is not an add-on to a system but an integrated part that needs to be considered from day one of a development project.
What is safety analysis - 1 Safety analysis is the totality of activities that are used to identify • Hazards that may arise when a system is put into operation. • Ways to remove these hazards or reduce their consequences to an acceptable level. • Actions needed throughout the system’s development to ensure that all safety requirements are implemented.
What is safety analysis - 2 The soft side of safety analysis: Collecting and analyzing info. The problems are human related. • Collecting info from all stakeholders. • Organizing it in such a way that it can be used to create safety requirements for development, safety tests, and safety routines and procedures for the operation and maintenance of the system.
What is safety analysis - 3 The hard side of safety analysis: Defining barriers. The problems are related to humans, software and hardware: • How can we construct barriers against hazards in the software? • How can we define operating procedures for handling crises?
Collecting info - 1 All stakeholders must be involved in the safety analysis since they all possess vital info. Safety analysis is thus a people-intensive process – critically dependent on • The participants’ experience and knowledge. • Our ability to elicit relevant info.
Collecting info - 2 We need to identify • All potentially dangerous events - hazards. • The events’ consequences. • The events’ probability or frequency – at least in qualitative terms. • Important scenarios. The quality of the info from a person increases when the questions are related to a scenario.
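The items listed above can be collected in a simple hazard log. The following Python sketch shows one possible layout. The class name `Hazard`, the three-step `Level` scale and the "severity times likelihood" ranking rule are assumptions made for this illustration, and the consequences and levels in the sample entries are made up.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Qualitative three-step scale (an assumption for this sketch)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Hazard:
    event: str          # the potentially dangerous event
    consequence: str    # what happens if the event occurs
    severity: Level     # qualitative severity of the consequence
    likelihood: Level   # qualitative probability or frequency
    scenario: str = ""  # scenario in which the event was identified

    def risk_score(self) -> int:
        # Assumed ranking rule: severity times likelihood.
        return int(self.severity) * int(self.likelihood)


def rank(hazards):
    """Return the hazards sorted from highest to lowest risk score."""
    return sorted(hazards, key=lambda h: h.risk_score(), reverse=True)


# Made-up entries for an electronic patient journal.
hazard_log = [
    Hazard("Network is down", "Patient data unavailable",
           Level.MEDIUM, Level.MEDIUM, "Doctor reviews treatment plan"),
    Hazard("Wrong update", "Wrong medication recorded",
           Level.HIGH, Level.LOW, "Doctor reviews drug data"),
    Hazard("Data is lost", "Treatment history missing",
           Level.HIGH, Level.MEDIUM, "Doctor reviews documents"),
]

for h in rank(hazard_log):
    print(h.risk_score(), h.event, "| scenario:", h.scenario)
```

Keeping the scenario next to each event makes it possible to go back to the stakeholders with concrete questions.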
Tools and methods - 1 The methods that we use in safety analysis – especially in the early phases – must be able to involve all stakeholders. We need methods that are easy to • Learn and understand • Use on real-life problems • Apply to software, hardware, people, routines and procedures.
Tools and methods - 2 Which tools and methods to use depends on who participates in the process, the info available and how it is represented. The info available will depend on where in the development process we are. The way the info is represented is, at least partly, something that we can influence. We have good experience with using UML diagrams in all phases.
Tools and methods - 3 As we move from a concept to a high level design and then on to detailed design and implementation, more and more • Information will be available • Decisions will be made and thus leave us with less freedom when making new decisions. Thus, we will need different analysis methods in different phases of the system’s life cycle.
Project time and decisions [Figure: Knowledge, Experience and Freedom of decisions plotted against Time; axis labels TD, Concept, HLD, LLD, Implementation.]
The concept phase Most systems start as a concept, e.g.: • Automatic shut-down of production when we discover a gas leakage. • All patient info is kept in a central database and is available through a data network to all who need it. • Complete overview of all our trains – where they are, their speed and so on.
Electronic patient journal – Concept [Figure: top level view of the system and its stakeholders. Elements shown: Patient journal system, Primary Physician, Nurse, Physician, Lab system.]
[Figure: elements shown are Operational environment, System concept, Tools and methods, two Stakeholders (each with Experience and Knowledge), and Hazards and barriers.]
Preliminary Hazard Analysis - 1 The preliminary hazard analysis (PHA) is used early in the process. This is reflected in the level of detail required in the PHA table. We can include both hazards and the corresponding preventive actions – barriers. Barrier descriptions are converted to system requirements.
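The step from barrier descriptions to system requirements can be sketched in a few lines of Python. The column names of `PhaRow`, the `SR-n` numbering and the sample rows are assumptions made for this illustration. The slides do not prescribe a PHA table layout.

```python
from dataclasses import dataclass


@dataclass
class PhaRow:
    hazard: str   # what can go wrong
    cause: str    # why it may happen
    barrier: str  # preventive action proposed in the analysis


def barriers_to_requirements(rows, prefix="SR"):
    """Turn each barrier description into a numbered safety requirement."""
    requirements = []
    for number, row in enumerate(rows, start=1):
        requirements.append(
            f"{prefix}-{number}: The system shall {row.barrier} "
            f"(mitigates: {row.hazard})."
        )
    return requirements


# Made-up rows for an electronic patient journal.
pha_table = [
    PhaRow("Data is lost", "Disk failure",
           "keep a backup of all patient documents"),
    PhaRow("Wrong update", "Input error",
           "ask for confirmation before changing drug data"),
]

for requirement in barriers_to_requirements(pha_table):
    print(requirement)
```

Keeping the hazard in the requirement text preserves the trace from each safety requirement back to the analysis that produced it.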
Requirements Once we have decided to go ahead with the project, we need to elicit and document the requirements. These consist of two components: • The functions used to fulfil the customer’s needs • Barriers against hazards identified in the PHA
Use Case for Electronic patient journal [Figure: use case diagram. Actors: Nurse, Physician, Primary Physician, Lab system. Use cases: Medication, Diagnosis, Documents, Orders and responses, Treatment plan.]
[Figure: elements shown are Operational environment, Customer (Needs, Expectations), Requirements, Methods and tools, System concept, Hazards and barriers, two Stakeholders (each with Experience and Knowledge), and New hazards and barriers.]
Safety in the requirements phase Functional requirements – which services should the system offer to its users? Use case diagrams and textual use cases have turned out to be two efficient ways of documenting this. They • Are easy to understand for all stakeholders. • Can be used as input to several safety analysis methods.
Misuse case [Figure: misuse case diagram. Actors: Doctor, Unlucky doctor, Faulty system. Use cases: Review treatment plan, Review drug data, Review documents, Review diagnosis. Misuse cases: Network is down, Wrong update, Delete data, Data is lost. Relation shown: <<threatens>>.]
High level design When we enter high level design, all identified hazards and barriers have been converted to requirements. The high level design can be documented for instance as • Package diagrams • High level class diagrams • High level sequence diagrams
Part of electronic patient journal [Figure: diagram with the elements Patient documents, General patient info, Treatment plan, Patient drug data, Patient diagnosis.]
[Figure: elements shown are Operational environment, Extended requirements, Tools and methods, System concept, two groups of Stakeholders (each with Experience and Knowledge), Barriers and tests, and New hazards.]
Safety and design Packages and classes can be viewed as components and we can thus make our safety analysis much more detailed. Important methods that can be used at this stage are for instance: • HazOp, for architectural design. • Component FMEA
HazOp - 1 HazOp uses study nodes as units of investigation and guide words to help in the hazard identification process. This makes the method quite efficient for identifying hazards. On the other hand, HazOp also requires more information – the system’s architecture – to define the study nodes.
HazOp - 2 This is a simple version – more elaborate versions give more info and require more work.
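The combination of study nodes and guide words can be illustrated with a short Python sketch that generates one discussion prompt per pair. The guide words below are a commonly used subset and the two study nodes are taken from the patient journal example. Both lists, and the function name `hazop_prompts`, are assumptions made for this illustration.

```python
from itertools import product

# A commonly used subset of HazOp guide words (assumed for this sketch).
GUIDE_WORDS = ["NO", "MORE", "LESS", "AS WELL AS", "PART OF", "OTHER THAN"]

# Study nodes picked from the architecture (assumed for this sketch).
study_nodes = ["Patient drug data", "Treatment plan"]


def hazop_prompts(nodes, guide_words):
    """Pair every study node with every guide word."""
    return [
        f"{node} / {word}: what deviation could this mean, "
        f"and what would follow?"
        for node, word in product(nodes, guide_words)
    ]


prompts = hazop_prompts(study_nodes, GUIDE_WORDS)
for prompt in prompts:
    print(prompt)
```

The number of prompts grows as nodes times guide words, which is why the choice of study nodes has a direct effect on the workload of the analysis.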
Failure Mode Effect Analysis - 1 FMEA will systematically check each system component • How can this component fail? • What are the consequences for the component? • What are the consequences for the system? • How can we handle the hazard?
Failure Mode Effect Analysis - 3 The Failure Mode Effect Analysis: • Offers a systematic walk-through of one or more system components. • Focuses on prevention – barriers – rather than cures and fixes. • Produces an easy-to-use list of hazards and ideas on how they can be removed or handled.
Detailed design Just like the high level design, the detailed design can be documented for instance as packages, class diagrams and sequence diagrams. We have more info than we had during high level design and we can thus make a more detailed safety analysis.
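The four FMEA questions map directly onto the columns of a worksheet. The Python sketch below shows one possible record layout and a helper that lists the failure modes that still lack a barrier. The names `FmeaRow` and `unhandled` and the sample rows are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    component: str
    failure_mode: str   # How can this component fail?
    local_effect: str   # Consequences for the component
    system_effect: str  # Consequences for the system
    barrier: str = ""   # How can we handle the hazard?


def unhandled(rows):
    """Return the rows that still lack a barrier."""
    return [row for row in rows if not row.barrier.strip()]


# Made-up rows for components of an electronic patient journal.
worksheet = [
    FmeaRow("Drug DB", "Returns outdated drug description",
            "Stale record served", "Wrong medication may be ordered",
            "Show the date of the last update with every description"),
    FmeaRow("Treatment plan", "Update is not stored",
            "Plan shows old treatment", "Patient gets outdated treatment"),
]

for row in unhandled(worksheet):
    print(f"No barrier yet: {row.component} / {row.failure_mode}")
```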
[Figure: detailed design diagram. Elements shown: Patient info, Drug DB, Patient drug data, Treatment plan, Patient documents, Test results, Current treatment, Drug description. Labels: Update drug data, If changes necessary.]
[Figure: elements shown are Operational environment, High level design, Detailed design, Tools and methods, two Stakeholders (each with Experience and Knowledge), Barriers, New hazards, and Barriers and tests.]
Implementing barriers All hazard analyses must lead to barriers that have one of the following effects: • Prevent a hazard from leading to a problem. • Prevent a problem from causing a dangerous event. • Reduce the effect of a dangerous event if it cannot be prevented.
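The three effects can be used as tags when barriers are recorded. The Python sketch below tags each barrier with one effect and reports which effects are not yet addressed for a given hazard. The enum member names and the sample barriers are assumptions made for this illustration, and the check does not imply that every hazard needs a barrier of each kind.

```python
from enum import Enum


class BarrierEffect(Enum):
    PREVENT_PROBLEM = "prevent a hazard from leading to a problem"
    PREVENT_EVENT = "prevent a problem from causing a dangerous event"
    REDUCE_EFFECT = "reduce the effect of a dangerous event"


def unaddressed_effects(barriers):
    """Return the effects that no barrier in the list provides."""
    covered = {effect for _, effect in barriers}
    return [effect for effect in BarrierEffect if effect not in covered]


# Made-up barriers for the hazard "Data is lost".
barriers_for_data_loss = [
    ("Validate every write before it is committed",
     BarrierEffect.PREVENT_PROBLEM),
    ("Restore from backup", BarrierEffect.REDUCE_EFFECT),
]

missing = unaddressed_effects(barriers_for_data_loss)
print([effect.name for effect in missing])
```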
Barrier roles [Figure: Barriers 1 to 6 placed along the chain Risk, Prob., Event. Prevention: prevent risk from becoming a problem. Handling: prevent event from having bad consequences. Reduction: reduce effect of event.]
Barrier reliability [Figure: risk axis with the levels Ru (unmitigated risk from EUC), RA (acceptable risk) and RM (minimum achievable risk, when all barriers work as planned); barriers Barr. 1 to Barr. n; barrier failure is also marked.]
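One simple way to reason about barrier reliability is to multiply the unmitigated risk by the probability that each barrier fails. The Python sketch below does this under two strong assumptions that are not stated in the slides: a dangerous event must pass every barrier to cause harm, and the barriers fail independently of each other. All numbers are made up.

```python
from math import prod


def residual_risk(unmitigated_risk, failure_probabilities):
    """Risk left when an event must pass every barrier to cause harm.

    Assumes that the barriers fail independently of each other.
    """
    return unmitigated_risk * prod(failure_probabilities)


R_U = 1e-2                      # unmitigated risk (made-up number)
R_A = 1e-5                      # acceptable risk (made-up number)
barrier_pfd = [0.1, 0.1, 0.05]  # probability that each barrier fails

risk = residual_risk(R_U, barrier_pfd)
print(f"residual risk: {risk:.1e}, acceptable: {risk <= R_A}")
```

If barriers share a common cause of failure, the independence assumption does not hold and the product underestimates the remaining risk.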
Realizing barriers Barriers in software can be realized in several ways. It is important that they do not lead to a large increase in complexity. One way to realize barriers is to use patterns such as: • Façades or wrapper façades • Protected single channel • Sanity checks on values • Monitor - actuator
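Two of the patterns listed above, a façade and a sanity check on values, can be combined in a small Python sketch. The classes `DrugDataStore` and `CheckedDrugDataFacade`, the dose limits and the drug name are hypothetical and serve only as an illustration. Rejecting the update and leaving the stored data unchanged is one possible reaction, not the only one.

```python
class DrugDataStore:
    """Hypothetical component that stores the daily dose per drug."""

    def __init__(self):
        self.doses_mg = {}

    def update(self, drug, dose_mg):
        self.doses_mg[drug] = dose_mg


class CheckedDrugDataFacade:
    """Facade that puts a sanity check in front of the store."""

    def __init__(self, store, limits_mg):
        self._store = store
        self._limits_mg = limits_mg  # drug -> (low, high), assumed known

    def update(self, drug, dose_mg):
        """Store the dose if it is plausible; return True on success."""
        if drug not in self._limits_mg:
            return False  # unknown drug: reject, store unchanged
        low, high = self._limits_mg[drug]
        if not (low <= dose_mg <= high):
            return False  # implausible value: reject, store unchanged
        self._store.update(drug, dose_mg)
        return True


store = DrugDataStore()
facade = CheckedDrugDataFacade(store, {"drug A": (0.0, 100.0)})  # made-up limits

print(facade.update("drug A", 50.0))    # plausible dose
print(facade.update("drug A", 5000.0))  # implausible dose
print(store.doses_mg)
```

The barrier lives in one small class, so the store itself is unchanged and the added complexity stays local.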
Safety analysis research - 1 Research on safety analysis is concerned with some broad problem areas: How to • Implement barriers to prevent or reduce the effect of dangerous events? • Create safety analysis patterns? • Elicit the necessary information from all stakeholders?
Safety analysis research - 2 Our current research in the area of software safety has focused on: • Which methods are the easiest to understand, learn and use? • What is the relationship between method and system representation – is it e.g. easier to base an analysis on scenarios than on a requirements list?
Safety analysis research - 3 How can we • Improve the safety analysis by making earlier experiences on similar systems available to all stakeholders? • Most efficiently move from identified hazards to • Prevention, e.g. barriers • Tests – do the barriers work as intended?
Last but not least It is possible to be too safe. A chainsaw with a fully protected blade is • Absolutely safe • Absolutely useless It is not possible to be absolutely safe. Whatever you do or don’t do, the probability of dying during the next hour is more than 10⁻⁶. Make sure you have a nice day.