SEU Mitigation Techniques for Xilinx Virtex-II Pro FPGA

Mandy M. Wang, JPL R&TD Mobility Avionics

Presentation Transcript


  1. SEU Mitigation Techniques for Xilinx Virtex-II Pro FPGA
     Mandy M. Wang, JPL R&TD Mobility Avionics
     D/MAPLD 2004

  2. Agenda
     • Project Background
     • SEU Sensitive Areas and Mitigation Approaches
     • Design Details
     • Conclusion

  3. Project Objective
     The Mobility Avionics project aims to develop an embedded platform for space flight instruments and systems that is scalable, configurable, and capable of withstanding low- to medium-radiation environments.

  4. Multi-Tiered Strategy
     [Figure: multi-tiered strategy diagram; the labels recovered from the slide follow.]
     • Axes: Not Time Critical / Time Critical; Not Mission Critical / Mission Critical
     • Strategies: Simple Strategy, Robust Strategy, Always Available Strategy
     • Applications: Science Data Processor (shown twice), Orbiter Command Data Handler, Image Processor, Micro-Mobility Controller, Motor Control, EDL Controller, Ground Support Equipment
     • Note: Low to medium radiation tolerance is assumed.

  5. Strategies
     • Simple Strategy: A quick-and-dirty approach. It uses less-than-desirable techniques, such as device reset and reconfiguration, as a means of error correction. It may require an external computer for the configuration check.
     • Robust Strategy: A refinement of the simple strategy. It uses an SEU-immune FPGA as a monitoring device for the system board, which is based on a Xilinx FPGA device. As a result, no external computer is needed.

  6. SEU Sensitive Areas
     Xilinx Virtex-II Pro SEU sensitive areas include:
     • PPC405 core registers
     • Configuration memory (LUT equations and routing)
     • Data path registers
     • User memory (Block or Distributed RAMs)
     [Chart: normalized data for the XC2VP20, based on predicted upset rates]

  7. Mitigation Approaches

  8. System Design - Overview
     [Figure: system block diagram; the blocks labeled on the slide follow. FI marks a fault insertion point.]
     • Serial Port Decoder (injects fault signals)
     • PPC405 1, PPC405 2, and C
     • OCM BRAM (8K) with EDC
     • PLB BRAMs (Firmware) (32K) with EDC
     • DDR SDRAM Cntl with EDC, EXT MEM (128MB)
     • EDC Controller, Status BRAMs (4K)
     • PLB ARB, OPB ARB, PLB2OPB Bridge
     • UARTs
     • Crit. INTC, Non-Crit INTC (External Devices)

  9. Dual-processor Comparator
     [Figure: dual-processor comparator block diagram; the labels recovered from the slide follow.]
     • PPC 405 Block 1 and PPC 405 Block 2, each containing: Cache Units, MMU, CPU, Timers and Debug
     • PLB IPIF (one per processor block), C, Arbiter, PLB Bus
     • DDR SDRAM Controller, PC
     • Off Chip Area: External SDRAM
     Note:
     • Yellow lines: PLB master read / write signals for D-Cache
     • Green lines: PLB master read signals for I-Cache
     • FI: Fault insertion point
     • PC: Parity Check
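
The comparator idea on this slide can be illustrated with a minimal behavioral sketch. It assumes both processors run the same code in step, so their PLB master requests should match cycle by cycle, and any difference raises a mismatch. The `PlbRequest` fields, `compare_lockstep`, and `inject_fault` are hypothetical names chosen for illustration; they are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlbRequest:
    """Hypothetical snapshot of one PLB master request in a given cycle."""
    address: int
    write_data: int
    is_write: bool


def compare_lockstep(trace_a, trace_b):
    """Return the first cycle index where the two traces differ, or None."""
    for cycle, (req_a, req_b) in enumerate(zip(trace_a, trace_b)):
        if req_a != req_b:
            return cycle
    return None


def inject_fault(trace, cycle, bit):
    """Return a copy of the trace with one write_data bit flipped (an FI point)."""
    faulty = list(trace)
    req = faulty[cycle]
    faulty[cycle] = PlbRequest(req.address, req.write_data ^ (1 << bit), req.is_write)
    return faulty


# Illustrative traffic only: eight word writes at consecutive addresses.
golden = [PlbRequest(0x1000 + 4 * i, 3 * i, True) for i in range(8)]
mismatch_cycle = compare_lockstep(golden, inject_fault(golden, 5, 0))
```

In the mitigation state machine later in the deck, a CPU mismatch is listed as a cause for a CPU reset, which is how a non-`None` result would be consumed.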

  10. Dual-Processor Voting Simulation

  11. EDAC OCM BRAMs (Read/Write)
     • Hamming code [32,39]: 32 data bits plus 7 parity bits
     • Read-modify-write to support the byte enable feature
     • Error information is stored in a separate memory space
     • A single-bit error triggers a CPU interrupt
     • A double-bit error triggers a CPU reset
     [Figure: EDAC data path between PPC405 #1 / PPC405 #2 and the BRAMs (8KB); the labels recovered from the slide follow.]
     • Blocks: Glue Logic, Parity Encoder, Error Detection Correction, Data Out (discard parity bits)
     • Signals: ENCIN, PARITY_OUT, DECIN, PARITY_IN, DECOUT, ENOUT, EN, W_EN[3:0], ADDR, CLK, FORCE ERROR, ERROR
     • Bus widths shown: 32 (data), 7 (parity)
     Reference: Xilinx XAPP645
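
As a rough illustration of why 32 data bits need 7 parity bits for single-error correction and double-error detection, here is a textbook extended Hamming code in Python. This is a sketch under stated assumptions: the bit ordering and parity equations are the generic textbook ones and may differ from those in Xilinx XAPP645, and the function names are hypothetical.

```python
def check_bit_count(data_bits):
    """Check bits for SEC-DED: r Hamming bits plus one overall parity bit."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1


def encode(data, data_bits):
    """Return the codeword as a list of bits; index 0 is the overall parity bit."""
    r = check_bit_count(data_bits) - 1
    n = data_bits + r
    code = [0] * (n + 1)
    bit = 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):           # not a power of two: data position
            code[pos] = (data >> bit) & 1
            bit += 1
    for i in range(r):
        p = 1 << i                    # power of two: Hamming parity position
        code[p] = sum(code[pos] for pos in range(1, n + 1) if pos & p) % 2
    code[0] = sum(code) % 2           # gives the whole codeword even parity
    return code


def decode(code):
    """Return (data, status) with status in {"ok", "corrected", "double"}."""
    code = list(code)
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos           # XOR of the positions of all set bits
    odd_overall = sum(code) % 2 == 1
    if odd_overall and syndrome <= n:
        code[syndrome] ^= 1           # syndrome 0 means the overall parity bit itself
        status = "corrected"
    elif syndrome == 0 and not odd_overall:
        status = "ok"
    else:
        status = "double"             # detected but not correctable
    data, bit = 0, 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):
            data |= code[pos] << bit
            bit += 1
    return data, status


word = 0xDEADBEEF
codeword = encode(word, 32)           # 39 bits: 32 data + 7 check
```

Following the slide, a `"corrected"` status would correspond to the single-bit case that triggers a CPU interrupt, and `"double"` to the case that triggers a CPU reset. The same functions give 8 check bits for 64-bit data, matching the [64,72] code used for the PLB BRAMs and DDR SDRAM.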

  12. EDAC PLB BRAMs (Read Only)
     • Hamming code [64,72]: 64 data bits plus 8 parity bits
     • Read-modify-write to support the byte enable feature
     • Single-bit error information is stored in a separate memory space
     • A single-bit error triggers a CPU interrupt
     • A double-bit error triggers a device reconfiguration
     [Figure: EDAC data path between the Processor Local Bus and the BRAMs (32KB + 8 KB); the labels recovered from the slide follow.]
     • Blocks: PLB Interface, PLB BRAM Controller, Glue Logic, Parity Encoder, Error Detection Correction, Data Out (discard parity bits)
     • Signals: ENCIN, PARITY_OUT, DECIN, PARITY_IN, DECOUT, ENOUT, EN, W_EN, ADDR, CLK, FORCE ERROR, ERROR
     • Bus widths shown: 64 (data), 8 (parity), 2
     Reference: Xilinx XAPP645

  13. EDAC DDR SDRAM
     • Hamming code [64,72]: 64 data bits plus 8 parity bits
     • Read-modify-write to support the byte enable and burst-of-2-words features
     • Single-error information is stored in a separate memory space
     • A single error triggers a CPU interrupt
     • A double error triggers a device reconfiguration
     [Figure: EDAC data path between the Processor Local Bus and the DDR SDRAM (128MB + 32MB); the labels recovered from the slide follow.]
     • Blocks: PLB interface modules, DDR SDRAM Controller, Glue Logic, Parity Encoder, Error Detection Correction, Mux (two), Demux, Data Out (discard parity bits)
     • Signals: ENCIN, PARITY_OUT, DECIN, PARITY_IN, DECOUT, ENOUT, ADDR, CLK, CLKn, FORCE ERROR, ERROR
     • Bus widths shown: 64, 32, 8, 4, 2
     Reference: Xilinx XAPP645
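
The read-modify-write requirement follows from the check bits covering the whole 64-bit word: a write that enables only some bytes must first read the stored word, merge in the new bytes, and then regenerate the check bits. The sketch below shows only that merge step. It makes two labeled assumptions: `xor_fold_check` is a simple stand-in for the real Hamming encoder, and byte lane 0 is taken to be the least significant byte, which may not match the PLB lane numbering. The class and method names are hypothetical.

```python
def xor_fold_check(word):
    """Stand-in 8-bit check value: XOR of the eight bytes (NOT a Hamming code)."""
    check = 0
    for lane in range(8):
        check ^= (word >> (8 * lane)) & 0xFF
    return check


class ProtectedMemory:
    """64-bit words stored alongside check bits computed over the whole word."""

    def __init__(self, words, encode=xor_fold_check):
        self.encode = encode
        self.data = [0] * words
        self.check = [encode(0)] * words

    def read(self, addr):
        word = self.data[addr]
        if self.encode(word) != self.check[addr]:
            raise ValueError(f"check mismatch at word {addr}")
        return word

    def write(self, addr, value, byte_enable=0xFF):
        old = self.read(addr)                      # read and verify the stored word
        mask = 0
        for lane in range(8):
            if byte_enable & (1 << lane):          # lane 0 = least significant byte (assumed)
                mask |= 0xFF << (8 * lane)
        merged = (old & ~mask) | (value & mask)    # modify only the enabled bytes
        self.data[addr] = merged                   # write the data back
        self.check[addr] = self.encode(merged)     # regenerate check bits for the full word


mem = ProtectedMemory(4)
mem.write(1, 0x1122334455667788)
mem.write(1, 0x00000000AAAA0000, byte_enable=0b00001100)   # update lanes 2 and 3 only
```

Swapping `xor_fold_check` for a real SEC-DED encoder would leave the merge logic unchanged, which is the point of the sketch.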

  14. Self Configuration Checker
     [Figure: self configuration checker flow; the labels recovered from the slide follow.]
     • Digital Design, Implementation
     • top.bit
     • top.ll (contains the frame addresses used for the design)
     • C script: frame address data formatted for BRAMs
     • Frame Address Memory (BRAMs), 4 bytes
     • Read Back Commands (44 bytes), stored in BRAMs
     • ICAP Controller, ICAP, CRC Checker
     • Virtex-II Pro
     Note: This portion can be ported to a radiation-hardened FPGA in the case of the robust strategy.

  15. Self Configuration Checker Design Highlights
     • No external I/O access required
     • Frame-by-frame readback required
     • 32-bit CRC algorithm implemented (a CRC signature is generated after device power-up)
     • No SRL16s or Distributed SelectRAMs used in the design
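
The checking scheme described above (record a CRC signature once after power-up, then re-read the frames and compare) can be sketched in a few lines. This is a software model under assumptions: the slide does not name the CRC polynomial, so `zlib.crc32` (the common CRC-32) stands in for it, and `read_frames` is a hypothetical callable that returns the readback data frame by frame in place of the ICAP controller. Frame sizes here are arbitrary.

```python
import zlib


def crc_signature(frames):
    """Fold every frame, in order, into one running 32-bit CRC."""
    crc = 0
    for frame in frames:
        crc = zlib.crc32(frame, crc)
    return crc


class ConfigChecker:
    """Records a golden signature at start-up and compares later readbacks to it."""

    def __init__(self, read_frames):
        self.read_frames = read_frames                 # hypothetical readback source
        self.golden = crc_signature(read_frames())     # signature taken after power-up

    def check(self):
        """Return True if the current readback matches the golden signature."""
        return crc_signature(self.read_frames()) == self.golden


# Illustrative configuration memory: four 16-byte frames.
config = [bytearray([i] * 16) for i in range(4)]
checker = ConfigChecker(lambda: config)
ok_before = checker.check()
config[2][5] ^= 0x01                                   # emulate a single configuration upset
ok_after = checker.check()
```

A failed check corresponds to the "Configuration check fail" condition that the mitigation state machine lists as a cause for FPGA reconfiguration.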

  16. LabVIEW Fault Injection Panel
     Screenshot of the fault injection emulator that interfaces with the prototype board. Panel elements labeled on the slide:
     • Process Bus Fault Injection Buttons
     • ASCII Command Input window
     • Fault Injection Error Counters
     • Processors Mismatch LED Indicator
     • Fault location map
     Note: The program counter resets to zero when a CPU reset occurs.

  17. XC2VP20 Device Utilization (without TMR)
     • Number of External IOBs: 57 out of 564 (10%)
     • Number of PPC405s: 2 out of 2 (100%)
     • Number of RAMB16s: 30 out of 88 (34%)
     • Number of SLICEs: 4334 out of 9280 (46%)
     • Number of BUFGMUXs: 6 out of 16 (37%)
     • Number of DCMs: 2 out of 8 (25%)
     • Number of ICAPs: 1 out of 1 (100%)
     • Number of JTAGPPCs: 1 out of 1 (100%)

  18. Slice Utilization (without TMR)
     Note: The shaded modules can be replaced by other approaches.

  19. Mitigation State Machine
     [Figure: state machine ordered by mitigation severity, starting from the Normal state; the states and their triggers follow.]
     • CPU Interrupt: 1) OCM BRAM single-bit error 2) PLB BRAM single-bit error 3) DDR SDRAM single-bit error
     • CPU Reset: 1) CPU mismatch 2) CPU watchdog timer 3) OCM EDC double-bit error
     • System Reset: 1) OPB bus error 2) PLB bus error
     • FPGA Reconfiguration: 1) Configuration check fail 2) PLB EDC double-bit error 3) DDR SDRAM double-bit error
     • Transition labels: CPU reset counter == full; System reset counter == full
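
One way to read this state machine is as a lookup from error event to mitigation action, with two counters that escalate repeated resets. The sketch below encodes that reading. It is an interpretation, not the presented design: the slide does not draw the transitions in text form, so treating "CPU reset counter == full" as escalation to System Reset and "System reset counter == full" as escalation to FPGA Reconfiguration is an assumption, as are the counter limits and all event and class names.

```python
from enum import IntEnum


class Action(IntEnum):
    """Mitigation actions in increasing order of severity."""
    CPU_INTERRUPT = 1
    CPU_RESET = 2
    SYSTEM_RESET = 3
    FPGA_RECONFIGURATION = 4


# Event names are hypothetical; the grouping follows the slide.
EVENT_ACTION = {
    "ocm_bram_single_bit": Action.CPU_INTERRUPT,
    "plb_bram_single_bit": Action.CPU_INTERRUPT,
    "ddr_sdram_single_bit": Action.CPU_INTERRUPT,
    "cpu_mismatch": Action.CPU_RESET,
    "cpu_watchdog": Action.CPU_RESET,
    "ocm_edc_double_bit": Action.CPU_RESET,
    "opb_bus_error": Action.SYSTEM_RESET,
    "plb_bus_error": Action.SYSTEM_RESET,
    "config_check_fail": Action.FPGA_RECONFIGURATION,
    "plb_edc_double_bit": Action.FPGA_RECONFIGURATION,
    "ddr_sdram_double_bit": Action.FPGA_RECONFIGURATION,
}


class Mitigator:
    def __init__(self, cpu_reset_limit=3, system_reset_limit=3):
        # Limits are assumed values; the slide only says "counter == full".
        self.cpu_reset_limit = cpu_reset_limit
        self.system_reset_limit = system_reset_limit
        self.cpu_resets = 0
        self.system_resets = 0

    def handle(self, event):
        """Return the action to take for an event, escalating on full counters."""
        action = EVENT_ACTION[event]
        if action is Action.CPU_RESET:
            self.cpu_resets += 1
            if self.cpu_resets >= self.cpu_reset_limit:
                action = Action.SYSTEM_RESET          # assumed escalation path
        if action is Action.SYSTEM_RESET:
            self.cpu_resets = 0
            self.system_resets += 1
            if self.system_resets >= self.system_reset_limit:
                action = Action.FPGA_RECONFIGURATION  # assumed escalation path
        if action is Action.FPGA_RECONFIGURATION:
            self.cpu_resets = 0
            self.system_resets = 0
        return action


mitigator = Mitigator(cpu_reset_limit=2, system_reset_limit=2)
history = [mitigator.handle(e) for e in
           ["ocm_bram_single_bit", "cpu_mismatch", "cpu_mismatch", "plb_bus_error"]]
```

Keeping the event table separate from the escalation logic makes it easy to regroup events if the actual design assigns them differently.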

  20. Conclusion
     • Identified and categorized the error-prone regions of the Virtex-II Pro into four types.
     • Developed mitigation strategies for each region.
     • Radiation testing of the overall system is in progress.

  21. Acronyms
     • SEU: Single Event Upset
     • FPGA: Field Programmable Gate Array
     • LUT: Look-Up Table
     • PLB: Processor Local Bus
     • OPB: On-Chip Peripheral Bus
     • OCM: On-Chip Memory
     • EDAC: Error Detection and Correction
     • ICAP: Internal Configuration Access Point
