Reliable State Machines

Presentation Transcript


  1. Reliable State Machines
      Dr. Gary R Burke
      California Institute of Technology, Jet Propulsion Laboratory
      Caltech

  2. Outline
      • Background
      • JPL MER example
      • JPL FPGA/ASIC Process
      • Procedure
      • Guidelines
      • State machines
      • Traditional
      • Highly Reliable
      • Comparison

  6. MER Mission example
      • Large number of FPGAs
      • Mostly fuse programmable, but at least one RAM programmable FPGA
      • Several ASICs
      • Many standard parts, e.g. microprocessor, RAM chips

  14. FPGA/ASIC Process
      • JPL needs to ensure the design process is sound
      • A bug in an FPGA/ASIC can halt a billion-dollar mission
      • Tight schedules can result in inadequate testing
      • Inadequate version control can result in the wrong code being used
      • First-pass success is important for ASIC design

  15. FPGA/ASIC Process
      • To ensure a quality product:
        • Requirements are correct and do not change
        • Specification is complete
        • Design will meet the specification and requirements
        • Testing has covered all possible cases

  16. FPGA/ASIC Process
      • Peer reviews by experts to check the design and design approach
      • Formal reviews to ensure the design process is adequate, and to sign off on the design
      • Documentation for review and archiving
      • Check-lists to ensure all problems are fixed

  17. FPGA/ASIC Process
      • Configuration management to ensure correct versions are used
      • Verification matrix, which documents all testing
      • Checking tools, e.g. Lint, DRC; all errors and warnings documented

  18. ASIC PROCESS

  19. FPGA PROCESS

  20. Guidelines
      • Define a set of rules for HDL design
      • Reduce ambiguity
      • Clarify the design so it can be easily checked and reviewed
      • Implement the most reliable design techniques

  21. Fault Tolerant State Machines
      • The state machine needs to be tolerant of single event upsets (SEUs)
      • State machine should not hang
      • State machine should always be in a defined state
      • No asynchronous inputs to state machine
      • Default state must be specified

  22. State Machines
      • A state machine is a sequential machine that, when built into an FPGA or ASIC, controls the sequencing of actions in the digital logic
      • The current state of the machine is held in a state register, which is updated on a clock
      • The next value of the state register (the next state) is derived from the current state and the inputs
      • Outputs from the state machine are decoded from the state register and can also be combined with the inputs
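
The structure described on this slide can be sketched in software. The following Python model is only an illustration: the three states, the `start` and `finished` inputs, and the `busy` and `accept` outputs are hypothetical names, not taken from the presentation. It separates the next-state function from the output decode, and its final `return` gives undefined register values a default state.

```python
from enum import IntEnum


class State(IntEnum):
    # Hypothetical states, chosen only for illustration.
    IDLE = 0
    RUN = 1
    DONE = 2


def next_state(state, start, finished):
    """Next-state logic: a function of the current state and the inputs."""
    if state == State.IDLE:
        return State.RUN if start else State.IDLE
    if state == State.RUN:
        return State.DONE if finished else State.RUN
    if state == State.DONE:
        return State.IDLE
    # Default branch: any undefined register value returns to a known state.
    return State.IDLE


def outputs(state, start):
    """Output decode: `busy` depends on the state only, `accept` also on an input."""
    busy = state == State.RUN
    accept = state == State.IDLE and bool(start)
    return busy, accept


def simulate(inputs, state=State.IDLE):
    """Update the state register once per (start, finished) pair, i.e. per clock."""
    trace = []
    for start, finished in inputs:
        trace.append((state, outputs(state, start)))
        state = next_state(state, start, finished)
    return state, trace
```

Calling `simulate([(1, 0), (0, 0), (0, 1), (0, 0)])` walks the model through IDLE, RUN, RUN and DONE and leaves it back in IDLE.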

  23. State-Machine (SM) Encoding
      • Each distinct state of the SM is represented by a unique code
      • The allocation of these binary codes to states is the encoding
      • The simplest encoding is Binary
      • In Binary encoding each state is given the next available binary number in sequence
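
As a small illustration of Binary encoding, the sketch below assigns consecutive binary numbers to a list of state names. The function name and the state names in the usage note are hypothetical.

```python
def binary_encoding(states):
    """Give each state the next binary number, using the minimum register width."""
    width = max(1, (len(states) - 1).bit_length())
    return {name: format(index, f"0{width}b") for index, name in enumerate(states)}
```

For five states such as `["IDLE", "LOAD", "RUN", "DONE", "ERR"]` this yields 3-bit codes from `000` to `100`, which leaves three codes of the register unassigned.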

  24. Other SM Encoding
      • 1-hot encoding
        • The number of bits in the code is equal to the number of states. Each encoded state has just 1 bit of the encoded word set to 1 (the rest are 0)
        • The advantage is that, when optimized for non-reliable use, the amount of logic needed is less than for Binary encoding, and it can be faster. A single bit change caused by an SEU results in a bad code, which can be detected
        • The disadvantage is that the increased number of bits means more flip-flops and therefore more targets for SEUs. The SEU advantage is lost when the 1-hot encoding is optimized
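
The detection property of un-optimized 1-hot encoding can be checked with a short sketch. The helper names are hypothetical. A single upset either clears the only set bit or sets a second one, so the result is never a legal 1-hot code.

```python
def one_hot(index):
    """1-hot code for state `index`: only bit `index` is set."""
    return 1 << index


def is_valid_one_hot(code, num_states):
    """True if exactly one of the `num_states` register bits is set."""
    in_range = 0 < code < (1 << num_states)
    return in_range and (code & (code - 1)) == 0


def single_bit_upsets(code, num_states):
    """All register values reachable from `code` by flipping one bit."""
    return [code ^ (1 << bit) for bit in range(num_states)]
```

Checking every state of a 16-state machine against every single-bit upset confirms that none of the corrupted values passes `is_valid_one_hot`.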

  25. Other SM Encoding (cont.)
      • Gray code
        • Similar to Binary encoding, except the codes are chosen so that in the main state-machine sequence only 1 bit changes at a time
        • No major advantage over Binary with this code. Decoded outputs from the state register can make use of the nature of the encoding to simplify producing a glitch-free output
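
One common way to generate such codes is the reflected binary Gray code, sketched below. The presentation does not say which Gray sequence was used, so treat this as an assumption.

```python
def gray(index):
    """Reflected binary Gray code of `index`."""
    return index ^ (index >> 1)


def bits_changed(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")
```

For a 16-state sequence, `bits_changed(gray(i), gray(i + 1))` is 1 for every step, including the wrap from state 15 back to state 0.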

  26. Other SM Encoding (cont.)
      • H2 code
        • This variation on Binary encoding uses one extra bit to ensure all codes are separated by a Hamming distance of 2. That is, it takes 2 bit changes in the state register to reach another known state
        • The advantage is that it has fewer bits, and so fewer SEU targets, than 1-hot, but retains the fault tolerance of the un-optimized 1-hot encoding
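
A simple way to obtain a Hamming distance of 2 with one extra bit is to append an even-parity bit to the binary code. The slide does not state how the extra bit is derived, so the parity construction below is an assumption, and the function names are hypothetical.

```python
def parity(value):
    """1 if `value` has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


def h2_encode(index):
    """Binary code of `index` with an even-parity bit appended as the LSB."""
    return (index << 1) | parity(index)


def is_valid_h2(code):
    """Legal codes have even overall parity; any single bit flip breaks this."""
    return parity(code) == 0


def hamming_distance(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")
```

With 16 states this gives 5-bit codes. The minimum pairwise distance is 2, so every single-bit upset produces a value that `is_valid_h2` rejects.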

  27. Other SM Encoding (cont.)
      • H3 code
        • This extension of H2 encoding uses additional bits to ensure all codes are separated by a Hamming distance of 3. That is, it takes 3 bit changes in the state register to reach another known state
        • The advantage is that the SM can be designed such that a single change in the state register has no effect on the state
        • The disadvantage is that it requires more logic to implement
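
To illustrate why a distance of 3 lets a single upset be ignored, the sketch below uses the standard Hamming(7,4) code for a 16-state machine and maps any corrupted word back to the nearest legal code. The presentation does not say which distance-3 code or register width was used, so this choice is an assumption, and the function names are hypothetical.

```python
def h3_encode(index):
    """Hamming(7,4) codeword for a 4-bit state index (minimum distance 3)."""
    d = [(index >> k) & 1 for k in range(4)]  # d[0] is the least significant bit
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    bits = [p1, p2, d[0], p3, d[1], d[2], d[3]]
    return sum(bit << k for k, bit in enumerate(bits))


def hamming_distance(a, b):
    """Number of bit positions in which two words differ."""
    return bin(a ^ b).count("1")


def correct(word, num_states=16):
    """Return the state whose code is within 1 bit of `word`, or None."""
    for index in range(num_states):
        if hamming_distance(word, h3_encode(index)) <= 1:
            return index
    return None
```

Because legal codes are at least 3 bits apart, a word with one flipped bit is within distance 1 of exactly one legal code, so `correct` recovers the original state for every single-bit upset. The brute-force search stands in for the extra decode logic that the slide lists as the cost of this encoding.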

  28. Synthesis
      • To check the overhead of each of the state machines, they were individually synthesized
      • Finite state machine optimization is turned off
      • A clock frequency of 50 MHz is used
      • Target device is a Xilinx Spartan 2, speed grade 6
      • Error injection circuitry is not included

  29. Synthesis Results

  30. Four Bit State Encoding

  31. Eight Bit State Encoding

  32. Twelve Bit State Encoding

  33. Sixteen Bit State Encoding

  34. Twenty-Four Bit State Encoding

  35. Thirty-Two Bit State Encoding

  36. Fault Injection Test
      • A test circuit is generated with an example of each state machine executing the same task, plus a reference state machine
      • The task chosen requires a 16-state state machine, to detect a 16-bit pattern in a serial input stream
      • An error generator injects faults into all state machines except the reference state machine
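
A serial detector for a 16-bit pattern fits in 16 states if each state records how many leading pattern bits have been matched so far. The sketch below builds such a next-state table using prefix matching in the style of the Knuth-Morris-Pratt algorithm. The presentation does not describe how its detector was constructed, so this is only one plausible model, and all names are hypothetical.

```python
def build_detector(pattern):
    """Build table[state][bit] -> (next_state, detect) for a serial detector.

    State s means the last s input bits equal the first s bits of `pattern`.
    """
    n = len(pattern)
    table = []
    for s in range(n):
        row = []
        for bit in (0, 1):
            seen = pattern[:s] + [bit]
            detect = seen == pattern
            # Longest prefix of the pattern (at most n - 1 bits) that ends `seen`.
            k = min(len(seen), n - 1)
            while k > 0 and seen[len(seen) - k:] != pattern[:k]:
                k -= 1
            row.append((k, detect))
        table.append(row)
    return table


def run_detector(table, stream, state=0):
    """Clock the detector over `stream`; return one detect flag per input bit."""
    hits = []
    for bit in stream:
        state, detect = table[state][bit]
        hits.append(detect)
    return hits
```

For a 16-bit pattern, `build_detector` returns a table with 16 rows, matching the 16-state requirement on the slide. In a software model of the test, a fault can be represented by changing `state` between clocks.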

  37. Error Injection Test Continued
      • The outputs of each state machine are compared to the reference output
      • A set of counters tallies the comparison outputs
      • 2 types of failure are logged for each state machine:
        • Failure to detect pattern
        • False detection of pattern (false-positive)
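
The comparison and counting step can be modelled in a few lines. This sketch assumes each state machine produces one detect flag per clock; the function name is hypothetical.

```python
def tally(reference, under_test):
    """Count missed detections and false positives against the reference output."""
    missed = sum(1 for ref, out in zip(reference, under_test) if ref and not out)
    false_positive = sum(1 for ref, out in zip(reference, under_test) if out and not ref)
    return missed, false_positive
```

For example, `tally([0, 1, 0, 1, 0], [0, 0, 0, 1, 1])` returns `(1, 1)`: one pattern was missed on the second clock and one was falsely reported on the last.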

  38. Error Injection Test Continued
      • Non-key patterns are 1 bit different from the key pattern, to increase the likelihood of a false match
      • The error rate can vary; it is set to 1:199 clocks in the example
      • Errors are weighted by distributing them pseudo-randomly over 16 bits. A state machine with a word size of n receives n/16 of the total faults
      • Synchronous fault injection is before the state register
      • Asynchronous fault injection is after the state register
      • All results are from actual implementation of the test circuits in a Spartan 2 FPGA
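
The weighting rule can be made concrete with a little arithmetic. The sketch below assumes that "1:199 clocks" means one fault per 199 clocks and that each fault targets one of 16 bit positions, so a fault only affects a state machine whose register is wide enough to contain that position. Both readings are assumptions, and the names are hypothetical.

```python
from fractions import Fraction

FAULT_BITS = 16         # faults are spread over 16 bit positions (from the slide)
CLOCKS_PER_FAULT = 199  # assumed reading of "1:199 clocks"


def fault_share(width):
    """Fraction of all injected faults that land in a `width`-bit state register."""
    return Fraction(min(width, FAULT_BITS), FAULT_BITS)


def inject(code, width, position):
    """Flip bit `position` of `code` if it lies inside the register, else do nothing."""
    return code ^ (1 << position) if position < width else code


def expected_faults(width, clocks):
    """Expected number of faults hitting the register over `clocks` clock cycles."""
    return fault_share(width) * Fraction(clocks, CLOCKS_PER_FAULT)
```

Under these assumptions a 4-bit Binary register receives a quarter of the faults, while a 16-bit 1-hot register receives all of them, which is the "more targets for SEUs" cost of wider encodings.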

  39. Error Rate: Synchronous Faults

  40. Error Rate: Asynchronous Faults

  41. Error Rate: Asynchronous Pulse Faults

  42. Results: Binary Encoding
      • Lowest resources used
      • Second-fastest speed after One Hot
      • Fastest for a small number of states
      • Second-most sensitive to errors
      • Generates false-positive errors, i.e. reports false pattern matches

  43. Results: One Hot Encoding
      • No false-positive errors (single faults)
      • Fastest speed, except for a small number of states and a large number of states
      • Uses more resources than Binary
      • Inefficient for a large number of states
      • Worst fault tolerance of all encodings tested
      • Has 2x the error rate of Binary encoding

  44. Results: Hamming Distance of 2 (H2) Encoding
      • No false-positive errors (single faults)
      • Better fault tolerance than Binary
      • More resources needed than One Hot, except for a large number of states

  45. Results: Hamming Distance of 3 (H3) Encoding
      • Zero single-fault errors
      • Immune to synchronous and asynchronous errors
      • Lowest double-fault errors
      • Most resources used (*), ~2x Binary encoding
      • Slowest speed (*)
      (*) Except for a large number of states
