
Fault Tolerant Systems in a Space Environment

Explore error detection techniques like watchdog processor, control flow error detection, fault injection, and types of signatures in fault-tolerant systems deployed in space environments. Learn the importance of fault avoidance and fault tolerance in space missions.

ckaplan


Presentation Transcript


  1. Fault Tolerant Systems in a Space Environment EE585: Fault Tolerance Computing

  2. Overview • Introduction • Error Detection Techniques: * Watchdog Processor * Control Flow Error Detection * Types of Signatures • Fault Injection • Conclusion

  3. Introduction • Tested experimentally by the CRC on the Advanced Research and Global Observation Satellite (ARGOS). • The approach focuses on space missions whose equipment combines the two basic approaches, fault avoidance and fault tolerance. • Errors are detected mainly with software techniques.

  4. Error Detection Techniques • Watchdog Processor: a small processor that sits on the buses, passively observes the bus transactions generated by the main processor, and detects errors by monitoring them.

  5. Watchdog Processor

  6. Control Flow Error Detection • The main goal is to check that instructions execute in the correct sequence. • This is done by signature analysis: a signature is associated with each block of instructions and saved at compile time. At run time, the generated signature is compared with the saved one, and a mismatch indicates an error.
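
The compile-time/run-time comparison described on this slide can be sketched in a few lines. This is only an illustration: the XOR fold used as the signature function, the instruction words, and the names `block_signature`, `PROGRAM`, and `check_block` are assumptions made for the example and do not come from the presentation.

```python
from functools import reduce

def block_signature(instructions):
    """Fold a block's instruction words into one value with XOR (illustrative choice)."""
    return reduce(lambda acc, word: acc ^ word, instructions, 0)

# Hypothetical basic blocks, each a list of 32-bit instruction words.
PROGRAM = {
    "B1": [0x8C010004, 0x20210001, 0xAC010004],
    "B2": [0x00221820, 0x1460FFFD],
}

# "Compile time": save one reference signature per block.
REFERENCE = {name: block_signature(words) for name, words in PROGRAM.items()}

def check_block(name, executed_words):
    """Return True when the signature generated at run time matches the saved one."""
    return block_signature(executed_words) == REFERENCE[name]

# Fault-free execution of block B1.
ok = check_block("B1", PROGRAM["B1"])

# A control flow error makes the processor skip the second instruction of B1.
faulty = check_block("B1", [PROGRAM["B1"][0], PROGRAM["B1"][2]])
```

Here `ok` is `True` and `faulty` is `False`. A real monitor would probably use a stronger compression function than plain XOR, since XOR cannot see reordered instructions.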

  7. Types of Signatures • 1. Path Signature Analysis: * Signatures are computed for sequences of nodes (paths) rather than for single nodes. * Two bits are used to differentiate signatures. * A special tag signals when to compare the computed signature with the embedded one. • 2. Signature Instruction Streams (SIS)

  8. Contd…

  9. Contd… • Paths are grouped into sets, and each set has a signature, called the justifying signature. • Control flow diagram of three basic blocks

  10. 2. Signature Instruction Streams (SIS)

  11. Contd… • To reduce the number of signatures embedded in the code, branch address hashing is used.

  12. Branch Address Hashing
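
The slides do not spell out how branch address hashing works. One common reading is that each branch target is stored already combined with the signature expected at that point, so a wrong run-time signature sends the branch to a wrong address instead of needing its own embedded check. The sketch below assumes XOR as the combining function. The addresses, signature values, and function names are hypothetical.

```python
def hash_branch_target(target, expected_signature):
    """'Compile time': store the branch target XORed with the expected signature."""
    return target ^ expected_signature

def resolve_branch_target(hashed_target, runtime_signature):
    """Run time: recover the target using the signature actually computed."""
    return hashed_target ^ runtime_signature

EXPECTED_SIG = 0x5A3C  # hypothetical signature of the block that ends in the branch
TARGET = 0x0400        # hypothetical intended branch address

stored = hash_branch_target(TARGET, EXPECTED_SIG)

# Correct run-time signature: the branch lands on the intended address.
good_target = resolve_branch_target(stored, EXPECTED_SIG)

# A one-bit signature error: the branch is diverted to a different address.
bad_target = resolve_branch_target(stored, EXPECTED_SIG ^ 0x0010)
```

With the correct signature the branch resolves to `TARGET`. With the corrupted one it resolves elsewhere, so the error carries forward until a later signature check catches it.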

  13. Stutter Step Mode (SSM) • Each group of instructions is executed two or more times and the results are compared. This detects errors missed by other techniques. • Disadvantages: * Lower performance. * Memory overhead.
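
The execute-and-compare idea behind SSM can be mimicked in software as follows. This is a minimal sketch, not the mechanism used on ARGOS: `run_stuttered` and the deliberately faulty adder `make_flaky_add` are hypothetical helpers written for the example.

```python
def run_stuttered(operation, *operands, repeats=2):
    """Execute `operation` `repeats` times and raise if the results disagree."""
    results = [operation(*operands) for _ in range(repeats)]
    if any(r != results[0] for r in results[1:]):
        raise RuntimeError(f"mismatch between executions: {results}")
    return results[0]

def make_flaky_add(fault_on_call):
    """Hypothetical adder that returns a bit-flipped sum on one chosen call."""
    calls = {"n": 0}

    def add(x, y):
        calls["n"] += 1
        total = x + y
        # Model a transient fault by flipping the lowest bit on the chosen call.
        return total ^ 1 if calls["n"] == fault_on_call else total

    return add

# Fault-free case: both executions agree and the result is returned.
clean = run_stuttered(lambda x, y: x + y, 10, 7)

# Transient fault during the second execution: the comparison catches it.
try:
    run_stuttered(make_flaky_add(fault_on_call=2), 10, 7)
    detected = False
except RuntimeError:
    detected = True
```

Each extra execution lengthens run time, which is the performance cost listed among the disadvantages.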

  14. Application of SSM to one instruction • Overhead is 300%.

  15. Contd… • Overhead is reduced by extending duplication to a basic block.
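
The 300% overhead quoted for a single instruction is consistent with a simple accounting in which the instruction is duplicated once and followed by a compare and a branch, giving three extra instructions for one original. That accounting is an assumption and is not stated in the slides. Under it, the sketch below shows why checking once per basic block lowers the relative overhead. The function name and the `check_instructions` parameter are hypothetical.

```python
def ssm_overhead_percent(block_size, check_instructions=2):
    """Extra instructions as a percentage of the original block size.

    Assumes every instruction in the block is duplicated once and that one
    comparison costing `check_instructions` instructions is added per block.
    """
    extra = block_size + check_instructions
    return 100.0 * extra / block_size

single = ssm_overhead_percent(1)   # one instruction checked on its own
block = ssm_overhead_percent(10)   # one check shared by a 10-instruction block
```

Under these assumptions `single` is 300.0 and `block` is 120.0. The duplication cost stays constant, while the cost of the check is spread over the whole block.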

  16. Error Masking in SSM

  17. Contd… • Assume the register values B = 10 and C = 7, so that A = B + C = 17 and D = A / 5 = 3 (integer division of any number from 15 to 19 by 5 gives 3). • If A = 18 instead of 17, D is still 3, so the error is not detected. • Therefore, the error detection technique must be selected carefully.
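
The masking effect in this example can be reproduced directly. The sketch assumes the block computes A = B + C followed by the integer division D = A / 5, which is inferred from the numbers on the slide. The function `compute` and its `inject_error` parameter are hypothetical.

```python
def compute(b, c, inject_error=0):
    """Return (A, D) for A = B + C and D = A // 5; `inject_error` corrupts A."""
    a = b + c + inject_error
    d = a // 5
    return a, d

a_good, d_good = compute(10, 7)                # fault-free run
a_bad, d_bad = compute(10, 7, inject_error=1)  # A becomes 18 instead of 17

# Comparing only the final result D hides the error...
masked_if_only_d_compared = d_good == d_bad

# ...while comparing the intermediate value A exposes it.
caught_if_a_compared = a_good != a_bad
```

Both flags evaluate to `True`. The error in A is invisible when only D is compared at the end of the block, but comparing A itself catches it.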

  18. Fault Injection • One way to validate fault tolerance mechanisms. • Advantages: 1. Flexibility 2. Controllability 3. Predictability • Disadvantages: 1. It is questionable whether the injected faults are a good representation of faults in the real environment.

  19. Contd… • In ARGOS, the system is tested in a space environment. • Different approaches to fault injection in electronic systems: 1. Disturbing the signals on the device pins. 2. Radiation. 3. Power supply disturbance. 4. Logic simulation.
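
None of the four approaches above is shown in code in the presentation. As a rough software-level analogue, the sketch below injects single bit flips into a small memory image and counts how many a toy XOR checksum detects. All names (`flip_bit`, `checksum`, `run_campaign`) and the memory contents are hypothetical. The fixed random seed illustrates the controllability and predictability listed as advantages of fault injection.

```python
import random

def flip_bit(value, bit, width=32):
    """Return `value` with one bit inverted, modelling a single bit upset."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    return value ^ (1 << bit)

def checksum(words):
    """Toy detection mechanism under test: XOR of all words."""
    result = 0
    for w in words:
        result ^= w
    return result

def run_campaign(memory, trials, seed=0):
    """Inject one random bit flip per trial and count the flips the checksum detects."""
    rng = random.Random(seed)  # fixed seed makes the campaign repeatable
    reference = checksum(memory)
    detected = 0
    for _ in range(trials):
        corrupted = list(memory)
        index = rng.randrange(len(corrupted))
        corrupted[index] = flip_bit(corrupted[index], rng.randrange(32))
        if checksum(corrupted) != reference:
            detected += 1
    return detected

memory = [0x12345678, 0x0BADF00D, 0xCAFEBABE, 0x00000000]
detected = run_campaign(memory, trials=100)
```

A single bit flip always changes an XOR checksum, so every injected fault is detected here. Whether such synthetic flips represent real radiation-induced faults is exactly the concern raised under the disadvantages.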

  20. Conclusion • The tradeoffs between fault tolerance and fault avoidance techniques were determined, leading to an efficient blend of suitable techniques. • Both hardware and software fault tolerance techniques were studied.

  21. References • Philip P. Shirvani and Edward J. McCluskey (Stanford University), "Fault Tolerant Systems in a Space Environment." • http://www-crc.stanford.edu/crc_papers/CRC-TR-98-2.pdf

  22. Queries?
