
Analytical Approach for Soft Error Rate Estimation of SRAM-Based FPGAs

Analytical Approach for Soft Error Rate Estimation of SRAM-Based FPGAs. Test & Reliability Group (TRG), Department of Electrical & Computer Engineering, Northeastern University. Problem statement: estimating the soft error rate in FPGAs, that is, the probability of system failure due to soft errors.

johnda

Presentation Transcript


  1. Analytical Approach for Soft Error Rate Estimation of SRAM-Based FPGAs • Test & Reliability Group (TRG) • Department of Electrical & Computer Engineering • Northeastern University

  2. Problem Statement • Estimating the soft error rate in FPGAs • The probability of system failure due to soft errors, for a given mapped design • Mean time for a corrupted configuration bit to manifest at primary outputs or flip-flops

  3. Motivation • Need for soft error rate estimation • Exponential growth of vulnerable bits due to Moore’s law • High cost of error-tolerant schemes • To make appropriate cost/reliability trade-offs • Where to put redundancy • Previous work: fault injection • Time-consuming / incomplete / expensive • Needs a physical prototype board • Cannot be used in design phases • Prototype board can be damaged → hard error

  4. Error Models in FPGAs • Memory resources: • User bits • Flip-flops, RAMs, … • Configuration bits • Mux select bits, LUT bits, PIPs, … • User bits → transient errors • Configuration bits → permanent errors

  5. SER Estimation • Traversing structural paths [Asadi04] • From error sites to outputs
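
The traversal idea on this slide can be illustrated with a small reachability sketch. It only finds which outputs are structurally reachable from an error site; it does not reproduce the probability computation of [Asadi04]. The netlist representation (`fanout` dictionary), the function name, and the example nodes are hypothetical.

```python
from collections import deque

def reachable_outputs(fanout, error_site, outputs):
    """Return the outputs structurally reachable from error_site.

    fanout maps each node to the nodes it drives (hypothetical netlist
    format, not taken from the presentation).
    """
    seen = {error_site}
    queue = deque([error_site])
    while queue:
        node = queue.popleft()
        for succ in fanout.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen & set(outputs)

# Hypothetical netlist: a and b drive g1, g1 drives out1, b also drives out2.
fanout = {"a": ["g1"], "b": ["g1", "out2"], "g1": ["out1"]}
print(sorted(reachable_outputs(fanout, "a", ["out1", "out2"])))
print(sorted(reachable_outputs(fanout, "b", ["out1", "out2"])))
```

An error site with no reachable output or flip-flop cannot cause a system failure, so such sites could be skipped before any probability is computed.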

  6. SER Estimation in ASIC Designs • S(n): system failure probability (SFP) vector • Si: SFP given that node i is erroneous • n: total number of fault sites • Experiments on ISCAS89 show that: • Three orders of magnitude faster • Compared to random-input simulation • Accuracy: more than 90%

  7. FPGA vs. ASIC in SER Estimation • ASIC: transient errors • Only requires propagation probability • FPGA: both transient & permanent errors • Transient errors: the same as in ASICs • Permanent errors: need activation as well • More error sites in FPGAs • Routing signals

  8. FPGA vs. ASIC in SER Estimation • Nodes with different error rates in FPGAs • No attenuation in FPGAs • During propagation

  9. SER Estimation of FPGAs: Steps • Compute permanent error rates for all nodes • PRi: the permanent error rate of node i • n: total number of fault sites • Compute netlist failure probability vector • Ni: failure probability given that node i is erroneous • System failure rate vector: S = PR × N (element-wise) • Si = PRi × Ni
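
A minimal sketch of the last step, Si = PRi × Ni, is given here as an element-wise product of two equal-length vectors. The numeric values are made up for illustration. Summing the per-node rates into one total is an assumption (it treats node errors as rare, independent events) and is not stated on the slide.

```python
def system_failure_rates(pr, n):
    """Element-wise product S_i = PR_i * N_i.

    pr: permanent error rate of each node (e.g. in FIT)
    n:  failure probability given that the node is erroneous
    """
    if len(pr) != len(n):
        raise ValueError("PR and N must have one entry per fault site")
    return [pr_i * n_i for pr_i, n_i in zip(pr, n)]

# Hypothetical values for three fault sites.
pr = [0.06, 0.02, 0.01]
n = [0.5, 0.25, 1.0]
s = system_failure_rates(pr, n)

# Assumption: the overall rate is the sum of the per-node rates.
total = sum(s)
print(s, total)
```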

  10. How to Compute Ni? • Open & stuck-at errors: • Ni = SPi × PPi(0) + (1 − SPi) × PPi(1) = PPi • PPi: propagation probability (the method used for ASICs) • SP: signal probability, used as the activation probability • Bridging wired-AND & wired-OR errors (nets i and j): • Ni(Wand) = SPi × (1 − SPj) × PPi(0) + (1 − SPi) × SPj × PPj(0) • Ni(Wor) = SPi × (1 − SPj) × PPj(1) + (1 − SPi) × SPj × PPi(1) • LUT bit-flip: • Ni = activation probability (cell) × propagation probability (LUT output)
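
The formulas on this slide translate directly into small functions. This is a sketch: the function and parameter names are invented here, with `pp0`/`pp1` standing for PPi(0)/PPi(1) and `sp` for the signal probability of the net being 1. Note that the stuck-at expression is a weighted average, so it collapses to PPi whenever PPi(0) = PPi(1).

```python
def n_open_or_stuck(sp, pp0, pp1):
    """N_i = SP_i * PP_i(0) + (1 - SP_i) * PP_i(1)."""
    return sp * pp0 + (1.0 - sp) * pp1

def n_wired_and(sp_i, sp_j, pp_i0, pp_j0):
    """Wired-AND bridge between nets i and j.

    The net that carries 1 while the other carries 0 is pulled to 0.
    """
    return sp_i * (1.0 - sp_j) * pp_i0 + (1.0 - sp_i) * sp_j * pp_j0

def n_wired_or(sp_i, sp_j, pp_i1, pp_j1):
    """Wired-OR bridge between nets i and j.

    The net that carries 0 while the other carries 1 is pulled to 1.
    """
    return sp_i * (1.0 - sp_j) * pp_j1 + (1.0 - sp_i) * sp_j * pp_i1

def n_lut_bit_flip(activation, pp_out):
    """N_i = P(flipped LUT cell is addressed) * P(LUT output propagates)."""
    return activation * pp_out

# Hypothetical probabilities, chosen only to exercise the functions.
print(n_open_or_stuck(0.5, 0.4, 0.4))
print(n_wired_and(0.5, 0.5, 1.0, 1.0))
print(n_wired_or(0.25, 0.5, 0.8, 0.4))
print(n_lut_bit_flip(0.125, 0.5))
```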

  11. How to Compute PRi? • PR(n): permanent error rate vector • PRi = r × f • r: raw error rate of an SRAM cell • f: number of all possible errors at node i • n: total number of error sites • PRAB = 6 × r
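
As a worked instance of PRi = r × f, the sketch below counts the possible errors at a node and scales by the raw rate. The value r = 0.01 FIT/bit comes from the experimental setup slide; the per-type error counts and the function name are hypothetical, chosen so that f = 6 to mirror the PRAB = 6 × r example.

```python
R_FIT_PER_BIT = 0.01  # raw SRAM cell error rate from the experimental setup

def permanent_error_rate(error_counts, r=R_FIT_PER_BIT):
    """PR_i = r * f, where f is the number of possible errors at the node.

    error_counts maps an error type to how many configuration bits can
    cause that error at this node (hypothetical breakdown).
    """
    f = sum(error_counts.values())
    return r * f

# Hypothetical node with six possible errors in total.
node_ab = {"open": 2, "wired_and": 2, "wired_or": 2}
print(permanent_error_rate(node_ab))
```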

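  12. System Failure Rate • For the first clock: (equation not preserved in the transcript) • For c clock cycles: (equation not preserved in the transcript) • The same probability holds for each subsequent clock cycle • c: number of clock cycles over which the state of the circuit is checked after the particle hit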
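
The equations for this slide did not survive in the transcript, so the following is only one plausible reading, not the authors' formula. It assumes that an erroneous node i causes a failure in each clock cycle independently with the same probability Ni (the slide does say the same probability is valid for later cycles), which gives a failure probability of 1 − (1 − Ni)^c within c cycles. The function name is invented.

```python
def failure_prob_within(n_i, c):
    """Probability that an error at node i manifests within c cycles.

    ASSUMPTION (not from the slides): cycles are independent and each
    has the same per-cycle failure probability n_i.
    """
    if not 0.0 <= n_i <= 1.0:
        raise ValueError("n_i must be a probability")
    if c < 0:
        raise ValueError("c must be non-negative")
    return 1.0 - (1.0 - n_i) ** c

# One cycle reduces to n_i itself; more cycles push the value towards 1.
for cycles in (1, 2, 1000):
    print(cycles, failure_prob_within(0.5, cycles))
```

Under this reading, the per-node failure rate after c cycles would be PRi multiplied by this probability, which is again an assumption rather than something the transcript confirms.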

  13. Error List • Mux-open • PIP open • Buffer off • A bit-flip in LUT • Control/clocking bit-flip

  14. Experimental Setup • Xilinx Virtex 300 (XCV300) • Xilinx Design Language (XDL) • Benchmark: some ISCAS89 circuits • r: raw failure rate of an SRAM cell, r = 0.01 FIT/bit • 1000 clock cycles executed for each SEU • Platform: Sun Solaris Ultra-10 • 256 MB main memory
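
To put r = 0.01 FIT/bit in perspective: one FIT is one failure per 10^9 device-hours. The sketch below converts a per-bit rate into a total raw rate and a mean time to failure. The bit count of 1,000,000 is a round, hypothetical number and not the XCV300's actual configuration size; the function names are also invented.

```python
DEVICE_HOURS_PER_FIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours

def raw_fit(bits, r_fit_per_bit):
    """Total raw error rate in FIT for a block of SRAM bits."""
    return bits * r_fit_per_bit

def mttf_hours(fit):
    """Mean time to failure in hours for a given FIT rate."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    return DEVICE_HOURS_PER_FIT / fit

# Hypothetical device with one million sensitive bits.
total_fit = raw_fit(1_000_000, 0.01)
print(total_fit, mttf_hours(total_fit))
```

This is the raw upset rate only; the system failure rate is lower because each upset is weighted by its probability of reaching an output.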

  15. Results: Sensitive Bits • Number of sensitive SRAM bits for each part

  16. Results: SFR & Estimation Time • System failure rate & estimation time • Number of clock cycles: 1000 • SP Time: signal probability computation time • SFR Time: system failure rate computation time

  17. Results: Manifestation Time • Mean Time To Manifest (MTTM) errors to outputs • Results are in terms of clock cycles

  18. Summary & Conclusions • A new approach for SER estimation • For SRAM-based FPGAs • No physical implementation required • Can be used in early design stages • Very fast simulation time • Can cover all possible faults • Mean Time To Manifest errors to outputs: • MTTM(Control/clocking) < MTTM(routing) • MTTM(routing) << MTTM(LUT)

  19. Questions? Thanks
