IPR: In-Place Reconfiguration for FPGA Fault Tolerance. Zhe Feng¹, Yu Hu¹, Lei He¹ and Rupak Majumdar². ¹Electrical Engineering Department, ²Computer Science Department, University of California, Los Angeles. Presented by Zhe Feng. Address comments to lhe@ee.ucla.edu.
Outline Introduction and motivation Algorithms Experimental Results Conclusions
Soft Error • Soft errors could be caused by cosmic rays or noise upsets • Future devices more vulnerable due to scaling • Special session 1E “Resilient Computing” • Two types of soft errors in FPGA • Single Event Upset (SEU): Modification of the content of memory bits • Single Event Transient (SET): Glitches latched by registers
SEU for FPGA • SEU of block memory can be detected and corrected by row-based CRC and ECC • SEU of configuration memory can be fixed by • Periodic memory scrubbing • Scan-based CRC and ECC • Both may be too late, as the circuit function may already have been changed.
SER (Soft Error Rate) • SER is calculated by Monte Carlo simulation under a single-fault model. • In each run, SER is the percentage of clock cycles with observable errors at the primary outputs for a given test bench. • The overall SER is the average over all runs. • SER ∝ 1/MTTF (mean time to failure)
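The SER definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator: the two-LUT netlist, the names `NETLIST`, `simulate`, `inject_fault` and `soft_error_rate`, and the run and cycle counts are all assumptions made for the example. Each run flips one randomly chosen LUT configuration bit (single-fault model), counts the fraction of random input vectors whose primary output differs from the fault-free circuit, and the overall SER is the average over runs.

```python
import random

# Hypothetical netlist: name -> (input names, configuration bits).
# The bit address is built from the input values, first input = least significant bit.
NETLIST = {
    "n": (("a", "b"), [0, 0, 0, 1]),    # n = a AND b
    "out": (("n", "c"), [0, 1, 1, 1]),  # out = n OR c
}
ORDER = ["n", "out"]                    # topological order of the LUTs
PRIMARY_INPUTS = ["a", "b", "c"]

def simulate(netlist, vector):
    """Evaluate the netlist for one input vector and return the primary output."""
    values = dict(vector)
    for name in ORDER:
        inputs, bits = netlist[name]
        address = sum(values[pin] << k for k, pin in enumerate(inputs))
        values[name] = bits[address]
    return values["out"]

def inject_fault(netlist, lut, bit):
    """Return a copy of the netlist with one configuration bit flipped."""
    faulty = {name: (ins, list(bits)) for name, (ins, bits) in netlist.items()}
    faulty[lut][1][bit] ^= 1
    return faulty

def soft_error_rate(netlist, runs=200, cycles=256, seed=0):
    """Average, over single-fault runs, of the fraction of cycles with an output error."""
    rng = random.Random(seed)
    sites = [(lut, bit) for lut in ORDER for bit in range(len(netlist[lut][1]))]
    per_run = []
    for _ in range(runs):
        faulty = inject_fault(netlist, *rng.choice(sites))
        errors = 0
        for _ in range(cycles):
            vec = {pin: rng.randint(0, 1) for pin in PRIMARY_INPUTS}
            errors += simulate(netlist, vec) != simulate(faulty, vec)
        per_run.append(errors / cycles)
    return sum(per_run) / len(per_run)

print(f"estimated SER: {soft_error_rate(NETLIST):.3f}")
```

The sketch only injects faults into LUT bits; the validation flow described later in the deck also covers interconnect configuration bits.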
Impact of SEU for FPGA • FPGA has a 10x higher SER than ASIC • Due to large configuration memory • SEU is one of the biggest challenges for FPGA-based applications • Most FPGAs are used in deployed systems rather than prototypes • One of the biggest applications is internet routers • FPGA boards returned after two crashes
FPGA Resynthesis • Design flow: RTL Synthesis, Logic Synthesis, Technology Mapping, Resynthesis, Packing, P&R (Source: Andrew Ling, University of Toronto, DAC'05) • Resynthesis • Rewrites the circuit in the logic or physical netlist • Reconfigures the LUTs
ROSE: RObust REsynthesis [ICCAD’08] • ROSE performs iterative logic transformations with explicit stochastic yield rate evaluation • Logic transformation by fault-tolerant Boolean matching • Boolean matching inputs • Template H and Boolean function F for a logic block • Fault rates for the inputs and the SRAM bits of the template • Boolean matching outputs • Either that F cannot be implemented by template H • Or the configuration of H that implements F • Fault-tolerant Boolean matching minimizes the observable faults at the output of the template
Need for In-place Logic Optimization • ROSE, like most existing logic optimization techniques, does not preserve the layout (topology) of a circuit design. • Interconnect dominates in FPGAs. • In-place resynthesis (IPR) leads to faster design closure. • Minimal or no impact on the physical design.
Our Major Contributions • Propose an in-place resynthesis algorithm, IPR • Maximize the yield rate for FPGAs • Preserve the topology of the logic network • Reduce the runtime complexity compared to other SAT-based approaches • IPR reduces the fault rate by 48% and increases MTTF by 1.94X. • Compared to the state-of-the-art academic technology mapper Berkeley ABC. • With the same area and performance.
Outline Background Algorithms Experimental Results Conclusions
IPR: In-place Reconfiguration [Figure: example LUT configurations under a (0 -> 1) fault; fault rate = 37.5% for one configuration and 12.5% for the other] • Maximize identical configuration bits for complementary inputs of an LUT. • Change the functions of multiple LUTs while guaranteeing that the function of the circuit is unchanged.
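To make the first bullet concrete: two configuration bits are addressed by complementary values of an input when their addresses differ only in that input's position. If the two bits hold the same value, a fault on that input cannot change the LUT output for that pair. The helper below is a minimal sketch; the name `identical_pairs` and the bit ordering (input k selects address bit k) are assumptions for illustration, not the authors' code.

```python
def identical_pairs(bits, input_index):
    """Count configuration-bit pairs that differ only in `input_index` and hold equal values."""
    stride = 1 << input_index
    count = 0
    for address in range(len(bits)):
        if (address & stride) == 0:      # visit each pair once, from its lower address
            count += bits[address] == bits[address | stride]
    return count

and2 = [0, 0, 0, 1]   # 2-input AND
xor2 = [0, 1, 1, 0]   # 2-input XOR
for name, bits in (("AND", and2), ("XOR", xor2)):
    print(name, [identical_pairs(bits, k) for k in range(2)])
```

A LUT whose pairs are all identical for some input ignores that input entirely, which is why the count serves as a robustness measure in this sketch.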
IPR algorithm • Circuit Analysis • Initial Full-chip Functional Simulation • Initial Full-chip ODC Mask Calculation • Node Criticality Analysis • Cone Construction • In-place LUT Reconfiguration and Boolean Matching • Localized Update • Localized Truth Table Update • Localized ODC Mask Update
ODC Mask based Node Criticality [Figure: logic network with simulated node values and primary outputs; example ODC mask: 1010 (I. Markov, ICCAD’07)] • The ODC mask quantifies the impact of a node on the primary outputs. • The criticality of a node is defined as the percentage of ones in the ODC mask, and decides the priority of reconfiguration in IPR.
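A small sketch of the criticality computation, assuming that bit k of the ODC mask is 1 when flipping the node's value under simulation pattern k changes a primary output. The function names and the toy circuit (node n = a AND b feeding out = n AND c) are hypothetical and chosen so that the mask reproduces the 1010 example.

```python
def odc_mask(patterns, node_value, output_fn):
    """Bit k is 1 when flipping the node under pattern k changes the primary output."""
    mask = []
    for pattern in patterns:
        value = node_value(pattern)
        mask.append(int(output_fn(pattern, value) != output_fn(pattern, value ^ 1)))
    return mask

def criticality(mask):
    """Fraction of ones in the ODC mask."""
    return sum(mask) / len(mask)

def prioritize(masks):
    """Order node names from most to least critical."""
    return sorted(masks, key=lambda name: criticality(masks[name]), reverse=True)

# Toy circuit: n = a AND b, out = n AND c; patterns are (a, b, c) tuples.
patterns = [(1, 1, 1), (0, 1, 0), (1, 0, 1), (0, 0, 0)]
mask = odc_mask(patterns, lambda p: p[0] & p[1], lambda p, n: n & p[2])
print(mask, criticality(mask))
```

In this toy case the node is observable exactly when c = 1, so half of the patterns are critical.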
Cone Construction • Select a subset SN of first-order fanout LUTs of n • Construct a cone for a selected root LUT • Root LUT is a fanout of SN • Include SN but not its first-order fanins • Cut size of the cone is limited [Figure: example cone with nodes a, b, n, c, d, e and the root LUT]
In-place LUT Reconfiguration • The functions of LUTs in the cone are changed to increase the number of identical configuration pairs • But the functions of the input/output nets and the topology of the internal nets are kept unchanged • No change of circuit function and layout [Figure: example cone with nodes a, b, n, c, d, e and the root LUT]
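The idea can be tried on a toy cone. The sketch below deliberately swaps the SAT-based matching for an exhaustive search over all configurations, which is only feasible for tiny cones. The cone shape (n = LUT(a, b), root = LUT(n, a)), the function names, and the scoring by total identical pairs are assumptions for illustration. Because input a reconverges at the root, some root addresses are never exercised, and those don't-care bits can be reassigned without changing the cone function or the wiring.

```python
from itertools import product

def lut(bits, *inputs):
    """Look up a configuration bit; the first input is the least significant address bit."""
    return bits[sum(v << k for k, v in enumerate(inputs))]

def cone_function(n_bits, root_bits):
    """Root output for every (a, b), with n = LUT_n(a, b) and root = LUT_root(n, a)."""
    return tuple(lut(root_bits, lut(n_bits, a, b), a)
                 for a, b in product((0, 1), repeat=2))

def identical_pairs(bits):
    """Total identical complementary pairs over both inputs of a 2-input LUT."""
    total = 0
    for k in range(2):
        stride = 1 << k
        total += sum(bits[i] == bits[i | stride] for i in range(4) if not i & stride)
    return total

def reconfigure(n_bits, root_bits):
    """Exhaustively search configurations that keep the cone function and maximize the score."""
    target = cone_function(n_bits, root_bits)
    best = (tuple(n_bits), tuple(root_bits))
    best_score = identical_pairs(n_bits) + identical_pairs(root_bits)
    for cand_n in product((0, 1), repeat=4):
        for cand_root in product((0, 1), repeat=4):
            if cone_function(cand_n, cand_root) != target:
                continue                  # the cone function must be preserved
            score = identical_pairs(cand_n) + identical_pairs(cand_root)
            if score > best_score:
                best, best_score = (cand_n, cand_root), score
    return best, best_score

original = ((0, 0, 0, 1), (0, 1, 1, 0))   # n = a AND b, root = n XOR a
best, score = reconfigure(*original)
print("before:", sum(identical_pairs(b) for b in original), "after:", score)
```

The connectivity of the two LUTs never changes during the search; only their configuration bits do, which is what keeps the placement and routing intact.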
In-place Boolean Matching • The problem is formulated in Conjunctive Normal Form (CNF) • The truth table of each LUT is encoded in CNF • The cone is encoded in CNF • A further constraint makes a pair of configuration bits (ci, cj) in LUT L symmetric • Combining all three gives the CNF formulation for in-place Boolean matching (IP-BM). • IP-BM preserves both the logic function and topology of the cone.
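The formulas on this slide did not survive extraction. As a hedged illustration only, here is one standard way such constraints can be written; the notation (inputs x_k, output y, configuration bits c, address index idx) is an assumption and may differ from the authors' formulation. For a K-input LUT, each address a contributes an implication tying the output to the addressed configuration bit, and the symmetry constraint is an equivalence between two configuration bits.

```latex
% K-input LUT L: inputs x_1..x_K, output y, configuration bits c_0..c_{2^K-1}.
% For every address a = (a_1,\dots,a_K) \in \{0,1\}^K:
\Big(\bigwedge_{k=1}^{K} (x_k = a_k)\Big) \rightarrow \big(y \leftrightarrow c_{\mathrm{idx}(a)}\big)

% Expanded into two clauses, with \ell_k^a = \lnot x_k if a_k = 1 and \ell_k^a = x_k otherwise:
\Big(\bigvee_{k=1}^{K} \ell_k^a \;\lor\; \lnot y \;\lor\; c_{\mathrm{idx}(a)}\Big)
\;\land\;
\Big(\bigvee_{k=1}^{K} \ell_k^a \;\lor\; y \;\lor\; \lnot c_{\mathrm{idx}(a)}\Big)

% Symmetry of a configuration-bit pair (c_i, c_j) in L:
(\lnot c_i \lor c_j) \land (c_i \lor \lnot c_j)
```

Under this sketch, the cone constraint would instantiate the LUT clauses once per input pattern of the cone, with the cone outputs fixed to their original values, so that any satisfying assignment of the configuration bits keeps the cone function.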
Outline Background Algorithms Experimental Results Conclusions
Experimental Settings and CAD Flows • Implemented in C++, using MiniSat 2.0 as the SAT solver • Results collected on an Ubuntu workstation with a 2.6GHz Xeon CPU and 2GB memory • QUIP benchmarks are tested • Mapped to 4-LUTs by Berkeley ABC • The following synthesis flows are performed and compared: ABC, IPR, ROSE+IPR
Experimental Settings and CAD Flows (Cont’d) • Fault model • During IPR: uniform soft error rate for all LUT configuration bits; interconnect configuration bits are ignored. • During validation: uniform soft error rate for all configuration bits in LUTs and interconnect. • The fault rate of the chip is calculated by Monte Carlo simulation • Single fault injection over all configuration bits in LUTs and interconnect • 32k random inputs
Full-chip Fault Rate by Monte Carlo Simulation 59% fault rate reduction! ABC vs. IPR vs. ROSE+IPR: 1:0.52:0.51
Area (LUT#) ABC vs. IPR vs. ROSE+IPR: 1 : 1 : 0.81
Estimation of Mean Time To Failure • The best flow in terms of the robustness and area is ROSE+IPR 50x faster!
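As a rough cross-check, the reported fault-rate ratios (ABC : IPR : ROSE+IPR = 1 : 0.52 : 0.51) can be turned into MTTF ratios using the relation SER ∝ 1/MTTF stated earlier. This assumes the full-chip fault rate stands in for the SER and that the proportionality constant is the same for all three flows; how the per-benchmark numbers were averaged is not stated in the slides.

```latex
\mathrm{SER} \propto \frac{1}{\mathrm{MTTF}}
\quad\Longrightarrow\quad
\frac{\mathrm{MTTF}_{\mathrm{IPR}}}{\mathrm{MTTF}_{\mathrm{ABC}}}
\approx \frac{\mathrm{SER}_{\mathrm{ABC}}}{\mathrm{SER}_{\mathrm{IPR}}}
= \frac{1}{0.52} \approx 1.92,
\qquad
\frac{\mathrm{MTTF}_{\mathrm{ROSE+IPR}}}{\mathrm{MTTF}_{\mathrm{ABC}}}
\approx \frac{1}{0.51} \approx 1.96
```

Both values are close to the 1.94X and 2X MTTF improvements quoted elsewhere in the deck; small differences are expected if the ratios were averaged per benchmark.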
Conclusions • We develop an in-place resynthesis algorithm, IPR. • Increases MTTF by 2X over ABC; • Preserves the topology of the logic network for a faster design closure; • Complementary to existing fault-tolerant resynthesis algorithms. • In the future, we will • Run experiments that assume multiple uncorrelated faults as well as given correlations between faults; • Extend IPR with a criticality measure that considers interconnects explicitly.
Thank You! IPR: In-Place Reconfiguration for FPGA Fault Tolerance Zhe Feng, Yu Hu, Lei He and Rupak Majumdar
Criticality for Configuration Bit • Depends on two criteria: • One is a sequence of input vectors for the LUT. • The other is the ODC mask of the LUT. • The criticality of a configuration bit c :
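The formula itself is missing from the transcript. One plausible reading of the two criteria, offered here purely as an assumption, is that a configuration bit is critical in a simulation pattern when the LUT's input vector addresses that bit and the LUT is observable (ODC mask bit is 1) in the same pattern. The function name `bit_criticality` and the normalization by the number of patterns are hypothetical.

```python
def bit_criticality(lut_addresses, odc_mask, num_bits):
    """Per-bit fraction of patterns that address the bit while the LUT is observable.

    lut_addresses[k] is the configuration-bit address selected by pattern k,
    odc_mask[k] is 1 when the LUT output is observable under pattern k.
    """
    hits = [0] * num_bits
    for address, observable in zip(lut_addresses, odc_mask):
        hits[address] += observable
    return [h / len(odc_mask) for h in hits]

# Four patterns on a 2-input LUT: addresses selected and the LUT's ODC mask.
print(bit_criticality([3, 1, 3, 0], [1, 0, 1, 0], num_bits=4))
```

Bits that are never addressed, or only addressed when the LUT is unobservable, get criticality zero under this reading and are natural candidates for reassignment.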
Localized Update • Localized update of the ODC mask reduces runtime • Maximum fanin cone (C_MFI) is affected, but its ODC mask is not updated, to save time. • Reconfigured cone (C_R): the ODC mask is updated. • Maximum fanout cone (C_MFO) is not affected, so its ODC mask does not need to be updated.
Key to stochastic synthesis: Logic Masking • Defects are created equally but not propagated equally • Logic don’t-cares may mask the propagation of defects [Figure: a defect that does not affect the output because of observability don’t-cares with a=1 & b=1] • We can maximize don’t-cares while keeping the logic function.
IPR Enhancement • Iterative (i.e., random) algorithm without greedy procedure based on criticality • Provides different orderings for optimization of gates • Without periodic yield rate evaluation • With periodic yield rate evaluation • Larger cut size • Increases the opportunity to find a feasible cone.
IPR Enhancement (Cont’d) • Extend from MISO (multiple-input, single-output) to MIMO (multiple-input, multiple-output) • Increases the opportunity to try more LUTs