Anthony J. Yu August 15, 2005

Defect Tolerance for Yield Enhancement of FPGA Interconnect Using Fine-grain and Coarse-grain Redundancy Anthony J. Yu August 15, 2005

Outline • Introduction and motivation • Previous works • New architectures • Coarse-grain redundancy (CGR) • Fine-grain redundancy (FGR) • Comparisons • Conclusions

Introduction and Motivation • Scaling introduces new types of defects • Number of defects expected to increase as chip density increases • As a result, chip yield is on the decline • FPGAs are mostly interconnect • To improve yield (and revenue), we must tolerate multiple interconnect defects

General Defect Tolerant Techniques • Defect-tolerant techniques minimize impact (cost) of manufacturing defects • FPGA defect-tolerance can be loosely categorized into three classes: • Software Redundancy – use CAD tools to map around the defects • Hardware Redundancy – incorporate spare resources to assist in defect correction (e.g. spare row/column) • Run-time Redundancy – protection against transient faults such as SEUs (e.g. TMR)
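As a small illustration of the run-time class, triple modular redundancy (TMR) runs three copies of a circuit and takes a majority vote on their outputs. The sketch below is not from the slides; the function name `tmr_vote` and the 8-bit example values are assumptions chosen for illustration.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant outputs.

    Each result bit is 1 when at least two of the three inputs have a 1
    in that position, so a single corrupted copy is outvoted.
    """
    return (a & b) | (a & c) | (b & c)


# One copy suffers a single-bit upset; the vote still returns the good value.
good = 0b10110010
upset = good ^ 0b00001000
print(bin(tmr_vote(good, upset, good)))
```

A voter of this kind masks a transient fault in one copy; it does not address the manufacturing defects that the rest of this talk targets.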

Previous work – 1 – Xilinx • Xilinx’s Defect-Tolerant Approach • Customer (knowingly) purchases “less than perfect” parts • Customer gives Xilinx configuration bitstream • Xilinx tests FPGA devices against bitstream • Sells FPGA parts that “appear” perfect • Defects fall outside the resources the bitstream uses • Limitation: • Chips work only with given bitstream – no changes!
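The screening idea on this slide can be pictured as a set check: a part is sellable for a given bitstream when none of the resources that bitstream uses are defective. This is only a sketch of the concept, not Xilinx's actual test flow; `part_passes` and the resource names are hypothetical.

```python
def part_passes(bitstream_resources, defective_resources):
    """Return True when the bitstream touches no defective resource.

    Both arguments are collections of resource identifiers (hypothetical
    strings here); the part "appears" perfect for this one bitstream only.
    """
    return set(bitstream_resources).isdisjoint(defective_resources)


used = {"switch_3_4", "wire_17"}          # hypothetical resource names
print(part_passes(used, {"wire_99"}))     # defect lies in an unused wire
print(part_passes(used, {"wire_17"}))     # defect lies in a used wire
```

The check also shows the limitation: any change to the bitstream changes the set of used resources, so the part would have to be screened again.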

Previous work – 2 – Altera • Altera’s Defect-Tolerant Approach • Customer purchases “seemingly perfect” parts • Make defective resources inaccessible to user • Coarse-grain architecture • Spare row and column in array (like memories) • Defective row/column must be bypassed • Use the spare row/column instead • Limitation: • Does not scale well (multiple defects)

Objectives • Problem • FPGA yield is on the decline because of aggressive technology scaling • Important objectives to improve yield: • Tolerate interconnect defects (interconnect dominates area) • Tolerate multiple defects (future trend) • Preserve timing (no timing re-verification) • Fast correction time (production use)

Contributions • New fine-grain redundancy architecture • Coarse-grain architecture with multiple spare rows and columns • Detailed evaluation of fine-grain and coarse-grain redundancy • Area, delay, yield estimates • Publications: • Non-redundant architecture paper, at FPT’04 • Fine-grain architecture paper, to appear in FPL’05 • Yield comparison paper, to appear in FPT’05

Non-redundant Interconnect Switch (figure panels: high-level model; old, bidirectional; modern, directional)

Coarse-grain Redundancy (CGR)

So…what’s wrong with it?

Improving yield for CGR – Adding Multiple Global Spares • Add multiple global spares to traditional CGR • Global spares can be used to repair any defective row/column in the array • Wire extensions are now longer
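A rough way to see why extra global spares help is to count how many distinct rows contain a defect and compare that with the number of spares. The sketch below rests on simplifying assumptions that are not from the slides: repair uses spare rows only, defects land on uniformly random rows, and the spares themselves are defect-free. The names `repairable_global` and `estimate_yield` are illustrative.

```python
import random


def repairable_global(defect_rows, num_spares):
    """A chip is repairable if each defective row can be swapped for a spare.

    Several defects in the same row cost only one spare, so count distinct rows.
    """
    return len(set(defect_rows)) <= num_spares


def estimate_yield(num_rows, num_defects, num_spares, trials=10000, seed=0):
    """Monte Carlo estimate of the fraction of repairable chips (assumed model)."""
    rng = random.Random(seed)
    good = 0
    for _ in range(trials):
        rows = [rng.randrange(num_rows) for _ in range(num_defects)]
        if repairable_global(rows, num_spares):
            good += 1
    return good / trials


# Compare spare counts for a hypothetical 32-row array with 3 defects per chip.
for spares in (1, 2, 4):
    print(spares, estimate_yield(32, 3, spares))
```

The numbers this produces illustrate the trend only; they are not the yield results reported in the talk.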

Yield Impact of Multiple Global Spares

Increasing Area + Delay Overhead • More spares → more mux overhead in every switch element (figure panels: no spares; 1 global spare; 2 global spares; 4 global spares, may be impractical)

Fine-grain Redundancy (FGR) – Avoidance by Shifting
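Avoidance by shifting can be pictured as a remapping of logical tracks onto physical tracks when one spare track is available. The sketch below is an assumption about the general idea, not the circuit in the talk: a channel with `num_tracks` logical tracks has `num_tracks + 1` physical tracks, and every signal at or beyond the defective track moves over by one. The function name `shift_around_defect` is hypothetical.

```python
def shift_around_defect(num_tracks, defective_track=None):
    """Map each logical track to a physical track, skipping one defective track.

    Physical tracks are numbered 0..num_tracks (the last one is the spare).
    Signals below the defect stay in place; the rest shift by one position.
    """
    if defective_track is None:
        return list(range(num_tracks))
    if not 0 <= defective_track <= num_tracks:
        raise ValueError("defective_track must be a physical track index")
    return [t if t < defective_track else t + 1 for t in range(num_tracks)]


# Four logical tracks, defect on physical track 2: tracks 2 and 3 shift over.
print(shift_around_defect(4, 2))
```

Because each signal moves at most one track, the detour is short and local, which is consistent with the stated objective of preserving timing.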

Implementation Overview

FGR Switch Element Details (figure labels: defect; downstream switch block; upstream switch block)

FGR Implementation Comparison

FGR Architectural Summary • Several implementations of FGR evaluated: • Implementation with best yield improvement (EM22) • Area +50%, delay +20% • Implementation with lowest yield improvement (EN11) • Area +35%, delay +25% • Perfect chips can be sold as interconnect-enhanced FPGAs • Allow router to use spare routing resources (muxes, tracks) • Gives more routing flexibility • True area and delay overhead are 10-20% and 5-25%

Comparison between FGR and CGR – FGR Tolerates Tens of Defects

Estimated Area overhead at equal yield (80%) * CGR-G1 can only tolerate 1-2 defects

Limitations of Study & Architectures • FGR • Does not tolerate defects in the logic • Cannot tolerate clustered defects • Requires a detailed fault map • CGR • Assumes that all defects can be corrected with a single row/column • Bypass circuitry is approximated

Conclusions • CGR is effective for 1 or 2 defects • FGR meets desired objectives: • Tolerates multiple randomly distributed defects • Defect correction does not perturb timing • Tolerates an increasing number of defects as array size increases • Correction can be applied quickly • FGR is potentially capable of correcting crosstalk faults, but this is not explored in the thesis

Contributions • New fine-grain redundancy architecture • Coarse-grain architecture with multiple spare rows and columns • Detailed evaluation of fine-grain and coarse-grain redundancy • Detailed circuit-level design → improved area, delay estimates • Yield comparison • Publications: • Non-redundant architecture paper, at FPT’04 • Fine-grain architecture paper, to appear in FPL’05 • Yield comparison paper, to appear in FPT’05

Thank you! anthonyy@ece.ubc.ca

Improving yield for CGR – Adding Multiple Local Spares • Divide FPGA into subdivisions • Each subdivision has local spare(s) • Distributes spares across chip • Reduces mux area overhead (of Global scheme) • Limitation: • Spare(s) can only repair defect within the subdivision
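The limitation of local spares can be made concrete with a repairability check per subdivision. As a sketch under assumed simplifications (row repair only, equal-sized subdivisions, defect-free spares), a chip is repairable only when no subdivision has more defective rows than it has local spares. The name `repairable_local` and its parameters are illustrative.

```python
from collections import Counter


def repairable_local(defect_rows, rows_per_subdivision, spares_per_subdivision):
    """Check that every subdivision has enough local spares for its own defects.

    A spare cannot be lent to a neighbouring subdivision, so the count is
    made separately for each subdivision.
    """
    defective_per_subdivision = Counter()
    for row in set(defect_rows):  # several defects in one row need one spare
        defective_per_subdivision[row // rows_per_subdivision] += 1
    return all(count <= spares_per_subdivision
               for count in defective_per_subdivision.values())


# Two defective rows, one spare per 8-row subdivision.
print(repairable_local([1, 9], 8, 1))   # defects in different subdivisions
print(repairable_local([1, 2], 8, 1))   # both defects in the same subdivision
```

The second case fails even though the chip holds enough spares in total, which is the case where a global scheme with the same number of spares would still succeed.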

Yield Impact of Multiple Local Spares (not as good as Global with same # spares)

Summary • As the density of FPGAs increases, they become increasingly susceptible to manufacturing defects • Defect-tolerant techniques alleviate this growing problem • Depending on the desired level of protection, we can apply different techniques • At low defect rates, the coarse-grain spare row and column approach has lower overhead than the fine-grain approach • At the same area overhead, the fine-grain approach can tolerate more defects than the spare row and column approach
