Platform ASICs Reliability. Bob Madge Miguel Vilchis LSI Logic , Milpitas, CA Vish Bhide. Three Aspects of Reliability. Infant Mortality Failures (Latent Defects and active defects that are not screenable with Testing) Environmental induced Failures (SER etc..)
Platform ASICs Reliability
Bob Madge, Miguel Vilchis
LSI Logic, Milpitas, CA
Vish Bhide
Three Aspects of Reliability
• Infant mortality failures (latent defects, and active defects that cannot be screened by testing)
• Environmentally induced failures (SER, etc.)
• Intrinsic failures (stress migration, electromigration, wear-out, performance degradation)
Methods for Optimizing Reliability
• Design for Reliability
  • Design for defect tolerance
  • Design for SER tolerance
  • Design for stress migration tolerance
  • Redundancy
• Process Improvements
  • Defect density reduction
  • Excursion control
• Test Improvements
  • Resistive defect fault coverage
  • Un-modeled fault coverage
  • Statistical testing
Platform ASIC: Infrastructure for Reliability
• Optimized regular logic
• Pre-verified IP
• Optimized memory
• Embedded test and repair
• Process monitor and die traceability
• Configurable I/O
Platform ASIC Design for Reliability
• Allowable antenna ratios have a 3x margin
• Additional protection against plasma-induced damage
• Protection against wear-out mechanisms
  • Stress migration
  • Electromigration
  • Hot carrier injection
  • Time-dependent dielectric breakdown
  • Vt stability
• The rules are considered conservative within the industry
• The rules are tested for fabrication capability, yield, and reliability during qualification
• The libraries and layout tools strictly follow the design rules
• Cells with intentional design rule violations require qualification
• Tools such as relmil and BERT are available to the designers
Continuous Feedback and Improvement
• During development:
  • Library elements are redesigned during the development cycle if design rules change
  • Process/tool changes are made if the libraries cannot change
• During production:
  • Customer issues are continuously fed back
  • Process changes are made if applicable
  • Cell changes are made if necessary
  • Current customers are protected by the MRB/TRB systems
  • Redesigns incorporate the changes if necessary
  • Future customers receive only the revised version
• Identification of issues by one customer benefits all others
Single Event Effects Mitigation
• LSI Logic platform ASICs using a 0.18 µm CMOS commercial process are designed using best practices for SEE mitigation
  • Optimization of n-well spacing and profile
  • Buried layer for latchup mitigation
• The mitigation strategies lead to SEL immunity to an atmospheric neutron fluence of 5E10 n/cm² and an SEU cross section of 4E-14 cm²/bit. This performance is superior to that of a typical commercial-grade product.
• On-chip error correction codes (ECC) for memory devices are available
  • Word interleaving for multi-bit error mitigation
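The interleaving bullet above can be illustrated with a minimal sketch. It assumes a simple modulo mapping of physical columns to logical words, which is a common textbook arrangement and not necessarily the layout used in these memories; the function names and the burst pattern are hypothetical.

```python
def physical_to_logical(column: int, interleave: int) -> tuple[int, int]:
    """Map a physical column to (word index, bit index) under bit interleaving.

    Assumption: adjacent physical columns belong to different logical words,
    cycling through `interleave` words before moving to the next bit position.
    """
    return column % interleave, column // interleave


def errors_per_word(upset_columns, interleave):
    """Count flipped bits per logical word for a set of upset physical columns."""
    counts = {}
    for col in upset_columns:
        word, _bit = physical_to_logical(col, interleave)
        counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical single strike that upsets three adjacent physical cells.
burst = [8, 9, 10]

# Without interleaving, all three flips land in the same word.
no_interleave = errors_per_word(burst, interleave=1)

# With 4-way interleaving, each affected word sees a single flipped bit.
with_interleave = errors_per_word(burst, interleave=4)
```

With one flipped bit per word, a single-error-correcting code on each word can repair the whole burst, which is why interleaving and ECC are usually paired.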
Radiation Performance
• Unlike commercial-grade products, Platform ASICs using a 0.18 µm CMOS modified process have been characterized for Total Ionizing Dose (TID)
  • Functional up to the maximum dose level of 300 krad-Si
  • No noticeable degradation up to a total dose of 90 krad-Si
• Unlike commercial-grade products, Platform ASICs using a 0.18 µm CMOS modified process have been characterized for heavy-ion Single Event Effects (SEE)
  • At an LET of 75 MeV·cm²/mg:
    • Logic and SRAM blocks are immune to single event latchup (SEL)
    • Single Event Upset saturation cross sections are 1E-06 cm²/bit and 7E-07 cm²/flip-flop
• Additionally, Platform ASICs using a 0.115 µm CMOS modified process have been characterized for heavy-ion Single Event Effects (SEE)
  • At an LET of 108 MeV·cm²/mg:
    • Logic and SRAM blocks are immune to single event latchup (SEL)
    • Single Event Upset cross-section data are under evaluation
Efficient Netlist Implementation
• Early and intrinsic failure rates and soft error rates are proportional to the number of used gates and the amount of SRAM
• Platform ASICs implement the RTL with minimum logic-gate overhead and require no configuration SRAM
• Efficient implementation through mask configuration leads to a lower product failure rate than a product that requires more gates to implement the same functionality
Maverick Silicon Screening Procedures
• Lot level
  • Lot yield limits, lot acceptance testing
  → Minimum quality
• Wafer level
  • Wafer yield limits, statistical bin limits
  • Maverick lot control
  → Medium quality
• Die level
  • Iddq, VDD, and Fmax outlier screening
  • Dynamic and enhanced voltage stress testing
  • Adaptive thresholds and limits
  • Neighborhood association or location exclusion
  → Maximum quality
Targeted Defect Coverage
• Reported fault coverage
  • Tool-reported scan (target 99%), Iddq, memory, and delay fault coverage
• Weighted fault coverage (test coverage)
  • Reported fault coverage weighted by area of the chip
• Defect coverage
  • Is a function of: weighted fault coverage, fab defectivity frequency, gate count, and outlier screening efficiency
  • Drives EFR and DPM
  • Target 99.9%
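As a rough illustration of the area weighting described above, the sketch below averages per-block reported fault coverage using block area as the weight. The block list, areas, and coverage values are invented for the example, and the slides do not give the exact weighting formula, so a plain area-weighted mean is an assumption.

```python
def weighted_fault_coverage(blocks):
    """Area-weighted mean of per-block fault coverage.

    `blocks` is a list of (area, coverage) pairs, with coverage in [0, 1].
    Assumption: weights are simply proportional to block area.
    """
    total_area = sum(area for area, _ in blocks)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(area * coverage for area, coverage in blocks) / total_area


# Hypothetical chip: (area in mm^2, tool-reported fault coverage).
blocks = [
    (6.0, 0.99),  # scan-tested logic
    (3.0, 0.98),  # embedded memory
    (1.0, 0.90),  # poorly covered block
]

wfc = weighted_fault_coverage(blocks)  # (5.94 + 2.94 + 0.90) / 10 = 0.978
```

A small block with low coverage pulls the weighted figure down less than a simple average of the three coverage numbers would suggest, which is the point of weighting by area.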
LSI Logic Statistical Post-Processing™ Test Flow
[Figure: test flow diagram. Elements: wafer test with full data collection and inkless wafermaps; data; wafers; post-processor #1: Delta IDDq / MinVDD; post-processor #2: Nearest Neighbor Residual (NNR) and Exclusion (NAE); modify inkless maps (upgrade or downgrade into special bins); inkless assembly; final package test; post-processor #3: Independent Component Analysis (ICA) (final test post-processing); post-processor #4: temperature ratios; burn-in (special bins only); ship]
SPP Value Application: Burn-in Elimination or Reduction
[Figure labels: rejection candidates, burn-in candidates]
Overlap in Current vs. Speed
[Figure labels: count, current, speed, signal]
Statistical Post-Processing™ (SPP) Used to Screen Test Outliers
• SPP Concept #1: Delta Iddq/MinVDD
  • The delta between the Iddq or MinVDD of discrete tests or vectors within a device can be used to distinguish between defective die and defect-free die.
• SPP Concept #2: Nearest Neighbor Residual (NNR)
  • If Iddq/MinVDD for a given die location on a wafer is significantly higher than that of its neighbors, that die can be considered defective.
• SPP Concept #3: Independent Component Analysis (ICA)
  • Identification and elimination of independent sources of parametric variation for defect identification.
• SPP Concept #4: Temperature Ratio Testing (US Patent No. 6,532,431)
  • The ratio of test parameters (Iddq, MinVDD, Fmax) at two different temperatures can be used to distinguish between defective and defect-free die.
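Concept #2 can be sketched in a few lines. The version below is a simplified stand-in, not the production NNR algorithm: it assumes the neighbor estimate is the median of the up-to-eight adjacent die and that a fixed residual limit is applied. The wafer map, values, and limit are hypothetical.

```python
from statistics import median


def nearest_neighbor_residuals(wafer):
    """Residual of each die's measurement against the median of its neighbors.

    `wafer` maps (x, y) die coordinates to a measured value (e.g. MinVDD or Iddq).
    Assumption: neighbors are the up-to-eight adjacent die present on the map.
    """
    residuals = {}
    for (x, y), value in wafer.items():
        neighbors = [
            wafer[(x + dx, y + dy)]
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0) and (x + dx, y + dy) in wafer
        ]
        if neighbors:
            residuals[(x, y)] = value - median(neighbors)
    return residuals


def flag_outliers(residuals, limit):
    """Return die locations whose residual exceeds the screening limit."""
    return sorted(loc for loc, r in residuals.items() if r > limit)


# Hypothetical 3x3 map with a smooth gradient in x and one high center die.
wafer = {(x, y): 10.0 + 0.1 * x for x in range(3) for y in range(3)}
wafer[(1, 1)] = 14.0

residuals = nearest_neighbor_residuals(wafer)
outliers = flag_outliers(residuals, limit=2.0)
```

Because the residual is taken against local neighbors, the smooth across-wafer gradient cancels out and only the die that departs from its surroundings is flagged.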
SPP to Screen MinVDD Outliers: “Nearest Neighbor Analysis (NNR)”
[Figure label: NNR outliers]
Defect Signal Resolution after Post-Processing
[Figure labels: count, residual, SPP limit (two), intrinsic estimate]
Resistive Defect Outlier Screening
[Figure label: resistive unfilled W via]
SPP Screening of Temperature-Dependent Defects
The two RMA parts are feed-forward MinVDD outliers.
Good Die in Bad Neighborhoods: Latent Defect Post-Processing Downgrades
In production on G12 at increasing “stringency”
Location-Based Downgrades
Downgrade “good die” in “at-risk” locations such as the wafer edge
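The neighborhood and location downgrade ideas above can be sketched together. This is an illustrative rule set only: the failing-neighbor threshold, the definition of an edge die as one with a missing neighbor site, and all names are assumptions, not the production criteria.

```python
NEIGHBOR_OFFSETS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def downgrade_candidates(passed, max_bad_fraction=0.5, downgrade_edge=True):
    """Select passing die to downgrade because of their surroundings.

    `passed` maps (x, y) to True (die passed test) or False (die failed).
    A passing die is downgraded when the fraction of failing neighbors exceeds
    `max_bad_fraction`, or (optionally) when it sits on the edge of the map,
    taken here to mean that at least one of its eight neighbor sites is absent.
    """
    downgraded = set()
    for (x, y), ok in passed.items():
        if not ok:
            continue  # failing die are already rejected
        neighbors = [
            passed[(x + dx, y + dy)]
            for dx, dy in NEIGHBOR_OFFSETS
            if (x + dx, y + dy) in passed
        ]
        is_edge = len(neighbors) < len(NEIGHBOR_OFFSETS)
        bad_fraction = neighbors.count(False) / len(neighbors) if neighbors else 1.0
        if bad_fraction > max_bad_fraction or (downgrade_edge and is_edge):
            downgraded.add((x, y))
    return downgraded


# Hypothetical 7x7 map: a cluster of failures surrounding one passing die at (3, 3).
passed = {(x, y): True for x in range(7) for y in range(7)}
for loc in [(2, 2), (2, 3), (2, 4), (3, 2), (3, 4)]:
    passed[loc] = False

neighborhood_only = downgrade_candidates(passed, downgrade_edge=False)
with_edge = downgrade_candidates(passed, downgrade_edge=True)
```

Raising or lowering `max_bad_fraction` is one way to picture the increasing "stringency" mentioned above: a lower threshold downgrades more passing die near failure clusters.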
SPP Application: New Technology Defectivity Control
[Figure: defect density, SPP stringency, and quality plotted against time; quality ramp guarantee; pre-emptive variable threshold control function]
SPP Application: Process Fluctuations and Maverick Lot Control
[Figure: defect density, SPP stringency, and quality plotted against time; feed-forward quality ramp guarantee; feed-forward variable threshold control function; pre-emptive variable threshold control function]
Quality Improvement Due to Statistical Post-Processing™
[Figure: EFR/DPM and DD plotted against months (0 to 39). Series: overall EFR without SPP; overall EFR with SPP; Node 1, Node 2, and Node 3 EFR; Node 1, Node 2, and Node 3 DD]
Overall data show at least 56% effectiveness.
• The EFR estimate for the Bin1 population is data limited and likely to be at least 50% lower
• Therefore the overall effectiveness number is likely to be much higher
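The slides do not define how effectiveness is computed. One plausible reading, assumed here, is the fractional reduction in early failure rate (EFR) with SPP relative to without it. The sketch below uses made-up EFR values in arbitrary units, chosen only to reproduce the 56% figure and to show how a 50% lower Bin1 EFR estimate would move it.

```python
def screening_effectiveness(efr_without: float, efr_with: float) -> float:
    """Fractional EFR reduction attributed to screening.

    Assumption: effectiveness = 1 - EFR_with / EFR_without. This definition is
    not stated in the slides.
    """
    if efr_without <= 0:
        raise ValueError("efr_without must be positive")
    return 1.0 - efr_with / efr_without


# Illustrative numbers only (arbitrary units), not measured data.
baseline = screening_effectiveness(efr_without=100.0, efr_with=44.0)  # about 0.56

# If the shipped (Bin1) EFR estimate were 50% lower, effectiveness would rise.
revised = screening_effectiveness(efr_without=100.0, efr_with=22.0)   # about 0.78
```

Under this reading, the data-limited Bin1 estimate acts as an upper bound on the shipped failure rate, so the quoted 56% is a floor rather than a point estimate.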