
Developing Medical Software: Pitfalls and Prophylactics

Elliot Jaffe. Seminar in Computer Assisted-Surgery, Medical Robots and Medical Imaging, Fall 2002.


Presentation Transcript


  1. Developing Medical Software: Pitfalls and Prophylactics • Elliot Jaffe • Seminar in Computer Assisted-Surgery, Medical Robots and Medical Imaging • Fall 2002

  2. Outline • Why should you be worried? • Case Study: Therac-25 • US Government Guidelines

  3. What? Me worry? • Software is used in medical devices • Monitoring • Planning • Surgery • Visualization • Software fails

  4. Case Study: Therac-25 • 1983 – 1987 • AECL: Atomic Energy of Canada Ltd. • 6 reported “accidents” • Changed the way software is developed and verified as part of a medical device

  5. Medical Linear Accelerators Linac: North Oakland Medical Center

  6. Therac-25 genesis • Therac-6: 6 MeV X-ray accelerator • Therac-20: 20 MeV Dual Mode (Electron/X-ray) accelerator • Upgraded with DEC PDP-11 minicomputer for ease of use • Could be operated without the computer

  7. Therac-25 • Dual Mode 25 MeV accelerator • Electron/X-Ray • Can be operated ONLY through the computer • Computer controls and monitors system • Some hardware safety mechanisms and interlocks were replaced with software • First working prototype: 1976 • First commercial product: 1982

  8. Treatment Goals • Deliver high energy radiation for the treatment of cancer • Radiation needs to be focused and controlled • Multiple energy levels • X-Ray • Electron

  9. Therac-25 Operation • Turntable to select from three modes • Visual • Electron • X-Ray • Turntable is moved mechanically • Software monitors position of turntable

  10. Turntable

  11. Operator Interface Cursor should be here during operation

  12. Therac-25: Error States • Treatment Suspend • Requires complete machine restart • Treatment Pause • Operator types “P” to proceed

  13. Therac-25: Error Messages • HTILT, VTILT, etc. • MALFUNCTION <n> • 1 <= n <= 64 • No documentation • No indication of severity • Occurred on average 40 times a day!

  14. Therac-25: Event #1 • June 1985: 10 MeV electron treatment • Patient reported: “tremendous force of heat … this red-hot sensation” • Technician replied that it was impossible • AECL claimed it was impossible • Never reported to FDA

  15. Therac-25: Event #1 • Patient received severe radiation burn • Patient’s breast was removed • Shoulder and arm were paralyzed • AECL refused to believe that it was caused by Therac-25 • Lawsuit settled out-of-court

  16. Therac-25: Event #2 • July 26, 1985 • HTILT message, Treatment Pause • Operators resumed treatment • Repeated 5 times until machine stopped • Patient reported “electric tingling shock”

  17. Therac-25: Event #2 • Patient died of cancer • Autopsy revealed that a total-hip replacement would have been required due to radiation exposure • Reported to AECL, FDA • AECL believed it to be a hardware problem

  18. Therac-25: Event #2 • AECL could not reproduce the reported behavior • AECL modified turntable • “Fixed” potential error in 3-bit turntable location identifier

  19. Turntable

  20. Therac-25: Event #2 • AECL claimed: • “analysis of the hazard rate of the new solution indicates an improvement over the old system by at least 5 orders of magnitude”

  21. Therac-25: Event #3 • December 1985 • After upgrade from event #2 • Patient developed parallel striped pattern in treatment area • AECL reported: “Could not have been produced by any malfunction of the Therac-25 or by any operator error.” • Not reported to FDA • Patient required surgery to repair tissue damage

  22. Therac-25: Event #4 • March 21, 1986 • Operator entered “x” instead of “e” • Moved cursor and corrected error • Began treatment • MALFUNCTION 54 • Continued treatment • MALFUNCTION 54 • Machine shut down

  23. Therac-25: Event #4 • Patient monitors: video and audio were broken • Patient received electric shock, started to get up and was then shocked in the arm • Patient pounded on treatment door • Patient sent home • Machine checked out ok

  24. Therac-25: Event #4 • Patient died of overdose 5 months later • AECL suggested an electrical problem in the area • Independent engineering firm checked and found no problem

  25. Therac-25: Event #5 • April 11, 1986 • Same operator • Same editing • MALFUNCTION 54 • Audio monitor (now working) reported a loud sound from machine • Patient died May 1, 1986 (three weeks later) of acute high-dose radiation to his brain

  26. Therac-25: Event #5 • Physicist took machine out of service • Reported to AECL • Operator and Physicist were able to reproduce the failure • AECL still could not reproduce the failure • FDA declares system “defective”

  27. Therac-25: Event #5 - cause • Operating system was a hand-coded real-time system developed by one programmer in the 1970s • Problem was traced to a race condition in the main loop • Result was that the x-ray beam could be used through the electron magnet
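The race described on this slide can be illustrated with a minimal Python sketch. It is not the Therac-25 code: the names `SharedPrescription`, `Prescription` and `beam_permitted` are hypothetical, and the sketch assumes the hazard arises when the operator edits the entry while the setup task is still configuring the hardware for the old entry. The guard shown is one possible mitigation: each edit bumps a version number, and the beam is allowed only if the hardware was configured for the latest version.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Prescription:
    mode: str      # "electron" or "xray" (hypothetical labels)
    version: int   # bumped on every operator edit


class SharedPrescription:
    """Prescription shared between the data-entry task and the setup task."""

    def __init__(self, mode: str) -> None:
        self._lock = threading.Lock()
        self._current = Prescription(mode, version=0)

    def edit(self, mode: str) -> None:
        # Operator edit: replace the whole record atomically under the lock.
        with self._lock:
            self._current = Prescription(mode, self._current.version + 1)

    def snapshot(self) -> Prescription:
        # Readers always receive one consistent, immutable record.
        with self._lock:
            return self._current


def beam_permitted(shared: SharedPrescription, configured: Prescription) -> bool:
    """Allow the beam only if the hardware was set up for the latest edit."""
    return shared.snapshot() == configured


shared = SharedPrescription("xray")
configured = shared.snapshot()   # setup task starts configuring for x-ray
shared.edit("electron")          # operator corrects the entry mid-setup
stale_ok = beam_permitted(shared, configured)   # setup is now out of date
configured = shared.snapshot()   # setup task re-runs for the new entry
fresh_ok = beam_permitted(shared, configured)
```

Here `stale_ok` is `False` and `fresh_ok` is `True`: the stale configuration is rejected instead of being silently used.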

  28. Therac-25: Event #6 • January 17, 1987 • Operator set turntable to field light position • Gave command to system to “set” turntable to x-ray • Ran treatment • System reported “no dose or dose rate” • Re-ran treatment • Patient died in April, 1987 of problems related to overdose • AECL and FDA notified

  29. Therac-25: Event #6 - cause • Software bug • Register overflow • 8-bit register used for multiple purposes • Once or twice in each setup phase, the register overflows, allowing the system to think that the turntable was reset
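A small Python sketch can show how an 8-bit register that doubles as a flag fails silently. It assumes, for illustration only, that the register is incremented on every pass of a setup loop and that a value of zero is read as "no position check needed"; the function names are hypothetical.

```python
def counter_flag_skips(n_passes: int) -> list[int]:
    """Passes on which an 8-bit counter used as a flag reads zero."""
    flag = 0
    skipped = []
    for i in range(1, n_passes + 1):
        flag = (flag + 1) & 0xFF   # increment, wrapping like an 8-bit register
        if flag == 0:              # zero is misread as "check not needed"
            skipped.append(i)
    return skipped


def boolean_flag_skips(n_passes: int) -> list[int]:
    """Same loop, but the flag is set to a fixed value instead of incremented."""
    skipped = []
    for i in range(1, n_passes + 1):
        flag = True                # cannot overflow
        if not flag:
            skipped.append(i)
    return skipped


wrapped = counter_flag_skips(600)   # [256, 512]
safe = boolean_flag_skips(600)      # []
```

Under these assumptions the check is skipped on every 256th pass with the counter, and never when the flag is assigned a fixed value.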

  30. Lessons Learned • Studies reported 12 lessons learned • We will cover five of them

  31. Overconfidence in Software • First safety analysis did not include software, even though it was responsible for safety of the system • When problems did occur, it was assumed to be a hardware failure

  32. Reliability vs. Safety • Therac-25 ran for three years in production without a problem • Tens of thousands of patients were treated before the first known overdose • Reliability leads to complacency • Reliability != Safety

  33. Lack of Defensive Design • Software was designed for small memory footprint • Self checks, error detection, error handling and auditing were left out
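The kind of self-check and audit trail this slide says was missing might look like the following Python sketch. All names (`check_turntable`, `InterlockError`, the `audit` logger, the mode labels) are hypothetical, and the sketch assumes an independent sensor reading of the turntable position is available to compare against the requested mode.

```python
import logging

audit = logging.getLogger("audit")  # hypothetical audit-trail logger


class InterlockError(RuntimeError):
    """Raised when a pre-beam self-check fails; the beam must stay off."""


def check_turntable(requested_mode: str, sensed_position: str) -> None:
    """Fail closed unless the sensed position matches a known requested mode."""
    allowed = {"electron", "xray"}
    if requested_mode not in allowed:
        audit.error("unknown mode requested: %r", requested_mode)
        raise InterlockError(f"unknown mode {requested_mode!r}")
    if sensed_position != requested_mode:
        audit.error("mode %r but turntable at %r", requested_mode, sensed_position)
        raise InterlockError("turntable position does not match requested mode")
    audit.info("turntable check passed for mode %r", requested_mode)


def is_blocked(requested_mode: str, sensed_position: str) -> bool:
    """Helper for demonstration: True if the check refuses the beam."""
    try:
        check_turntable(requested_mode, sensed_position)
    except InterlockError:
        return True
    return False


ok_case = is_blocked("xray", "xray")                 # check passes
mismatch_case = is_blocked("xray", "field_light")    # check refuses
```

The design choice is to raise on anything unexpected, including an unrecognised mode, so that an unanticipated state stops treatment and leaves a log entry.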

  34. Unrealistic risk assessment • First Risk Assessment did not include software • AECL claimed 5 orders of magnitude improvement from changing one microswitch • Software is harder to assess for failures than hardware

  35. Inadequate Software Engineering Practices • Software specification was after-the-fact • Dangerous design/coding practices could have been avoided • Audit trails should be built into the production software • Software should be tested at the unit, module and system level • Regression testing on all changes • GUI should be designed, not implemented

  36. Software Reuse • Therac-25 used software from Therac-20 • Reliability != Safety • Assumptions and preconditions may have changed • Sometimes it's better to rewrite from scratch

  37. US Government Guidelines • Significantly reduce the risk of death or injury • Impose standards and best practices to raise the overall level of the industry • Define minimum requirements for • New products • Derivative products

  38. Level of Concern • Major: device directly affects the patient or operator and failure could result in death or serious injury • Moderate: device directly affects the patient and failure could result in non-serious injury • Minor: failures will not result in injury

  39. Levels of Concern • Does the software • Control life support device? • Control delivery of harmful energy? • Control treatment delivery? • Provide diagnosis as basis for treatment? • Monitor vital signs? • If no to all these questions, then concern is minor
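The screening questions and the three levels can be combined into a short Python sketch. How the two slides fit together is an assumption made here: if every screening answer is "no" the concern is minor, otherwise the level follows the worst credible injury. The names `Injury` and `level_of_concern` are hypothetical and this is not an official decision procedure.

```python
from enum import Enum


class Injury(Enum):
    NONE = 0
    NON_SERIOUS = 1
    SERIOUS_OR_DEATH = 2


SCREENING_QUESTIONS = (
    "controls life support device",
    "controls delivery of harmful energy",
    "controls treatment delivery",
    "provides diagnosis as basis for treatment",
    "monitors vital signs",
)


def level_of_concern(answers: dict[str, bool], worst_case: Injury) -> str:
    """Classify a device as 'minor', 'moderate' or 'major' concern."""
    missing = [q for q in SCREENING_QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"unanswered screening questions: {missing}")
    if not any(answers[q] for q in SCREENING_QUESTIONS):
        return "minor"            # "no" to every question
    if worst_case is Injury.SERIOUS_OR_DEATH:
        return "major"
    if worst_case is Injury.NON_SERIOUS:
        return "moderate"
    return "minor"


all_no = {q: False for q in SCREENING_QUESTIONS}
linac = dict(all_no, **{"controls delivery of harmful energy": True})
linac_level = level_of_concern(linac, Injury.SERIOUS_OR_DEATH)
```

For a device like a linear accelerator, which controls delivery of harmful energy and can cause death, `linac_level` is `"major"`.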

  40. Requirements for minor concern • Software Description • Device Hazard analysis • Software functional Requirements Specification • Architecture Design chart • Validation, Verification and Testing • Release Version Number

  41. Requirements for Moderate/Major concern • Full Software Requirements Spec. • Design Specification • Traceability analysis • Development lifecycle documentation • Configuration management • Maintenance activities • Revision Level History • Unresolved Anomalies (bugs)

  42. Software Requirements Spec • Hardware requirements • Programming languages • Interface requirements • Software functional requirements • Software performance requirements

  43. Software Requirements Spec • Algorithms for therapy, diagnosis, monitoring, alarms, analysis, interpretation (with supporting clinical data) • Device limitation due to software • Internal software tests and checks • Error and interrupt handling

  44. Software Requirements Spec • Fault detection, tolerance and recovery characteristics • Safety requirements • Timing and memory requirements • Use of off-the-shelf software

  45. Risk/Hazard Analysis Tools • Fault Tree Analysis (FTA) • Used in initial design phase • Failure Modes, Effects and Criticality Analysis (FMECA) • Used in design and development phase • Failure Reporting and Corrective Action System (FRACAS) • Used during product lifecycle

  46. Fault Tree Analysis • Identify a failure or safety hazard, then attempt to identify all possible ways to create that hazard • Answers the question: • How can event X occur? • Used in military and nuclear industry since the 1970s

  47. Fault Tree Analysis: Example Simplified fault tree diagram for an infusion pump

  48. Fault Tree Analysis • Demonstrates that the system will not reach an unsafe state • Identifies areas for improvement • Provides a systematic hazard review
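A fault tree can also be evaluated numerically. The Python sketch below assumes independent basic events, so an AND gate multiplies probabilities and an OR gate computes one minus the product of the complements. The infusion-pump events and their probabilities are invented for illustration and are not taken from the slide's diagram.

```python
from math import prod


def and_gate(*probs: float) -> float:
    """All inputs must occur (independent events)."""
    return prod(probs)


def or_gate(*probs: float) -> float:
    """At least one input occurs (independent events)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical basic events for the top event "over-infusion".
p_pump_runs_fast = 0.01
p_flow_monitor_fails = 0.1
p_wrong_rate_entered = 0.02
p_no_confirmation_step = 0.5

undetected_hardware_fault = and_gate(p_pump_runs_fast, p_flow_monitor_fails)
undetected_entry_error = and_gate(p_wrong_rate_entered, p_no_confirmation_step)
p_over_infusion = or_gate(undetected_hardware_fault, undetected_entry_error)
```

With these made-up numbers the top event probability is about 0.011, dominated by the data-entry branch, which is where the analysis would point for improvement.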

  49. FMEA • Assume a basic defect at the component level, assess the effect and identify potential solutions • Answer the question: • What happens if event X occurs? • Used in Automobile manufacturing

  50. FMEA: Example • Failure Mode and Effects Analysis (FMEA) • Subsystem/Name: DC motor • Model Year/Vehicle(s): 2000/DC motor • P = Probability (chance) of occurrence • S = Seriousness of failure to the vehicle • D = Likelihood that the defect will reach the customer • R = Risk priority measure (P x S x D) • Rating scale: 1 = very low or none, 2 = low or minor, 3 = moderate or significant, 4 = high, 5 = very high or catastrophic
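The risk priority measure R = P x S x D from this slide can be computed and ranked as in the Python sketch below. The failure modes and their ratings are hypothetical examples for a DC motor, not rows from the original worksheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    p: int  # probability of occurrence, 1 to 5
    s: int  # seriousness of failure, 1 to 5
    d: int  # likelihood the defect reaches the customer, 1 to 5

    def __post_init__(self) -> None:
        for label, value in (("p", self.p), ("s", self.s), ("d", self.d)):
            if not 1 <= value <= 5:
                raise ValueError(f"{label} must be between 1 and 5, got {value}")

    @property
    def rpn(self) -> int:
        """Risk priority measure R = P x S x D."""
        return self.p * self.s * self.d


# Hypothetical failure modes and ratings.
modes = [
    FailureMode("brush wear", p=4, s=2, d=2),
    FailureMode("winding short", p=2, s=5, d=3),
    FailureMode("bearing seizure", p=2, s=4, d=1),
]

ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
ranking = [(m.name, m.rpn) for m in ranked]
```

Sorting by R puts "winding short" (R = 30) ahead of "brush wear" (R = 16) and "bearing seizure" (R = 8), which gives the order in which to look for corrective actions.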
