Statistical Methods for “Zero Defect” Quality in Nanometer Integrated Circuits

Adit D. Singh, Electrical & Computer Engineering, Auburn University. IEEE EWDTS 2009, Moscow, Sept 20, 2009

Outline
• DPM requirements for ICs: why the push to “Zero Defects”
• Perfect testing of all parts for zero DPM is impossible or cost prohibitive
• Statistical methods and adaptive tests:
  - Test each part differently, only to the extent it needs to be tested
  - Only “suspect” parts are extensively tested
• How suspects can be identified
• Results and conclusions

The IC Quality Challenge
Approximate order-of-magnitude estimates
• Number of parts per typical system: 100
• Acceptable system defect rate: 1% (1 per 100)
• Therefore, required part reliability: 1 defect in 10,000, i.e. 100 Defects Per Million (100 DPM)
Typical commercial requirement: ~100 DPM; ~500 DPM for ASICs

The IC Quality Challenge
New automotive industry target: “zero” defects! Why?
• Electronics are contributing to excessive warranty repair costs
• Offending parts can be identified through full part traceability over the 4-5 year warranty period
  - High unit volumes provide robust statistical data
• US annual sales ~15 million

Required Test Quality
Assume 2 million ICs manufactured with 50% yield
• 1 million GOOD >> shipped
• 1 million BAD >> test escapes cause defective parts to be shipped
• For 100 BAD parts in 1M shipped (DPM = 100), the test must detect 999,900 of the 1,000,000 BAD parts
For 100 DPM: required test coverage = 99.99%
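The coverage requirement on this slide can be written out as a short worked equation. The symbols are notation introduced here for illustration and are not from the slides: C is test coverage, E is the number of test escapes, and N_bad and N_good are the counts of bad and good parts. The slide's approximation is kept, in which escapes are counted against the good parts shipped.

```latex
\begin{align*}
E &= (1 - C)\,N_{\text{bad}} \\
\text{DPM} &\approx \frac{E}{N_{\text{good}}} \times 10^{6} \\
100 &= \frac{(1 - C)\times 10^{6}}{10^{6}} \times 10^{6}
  \;\Rightarrow\; 1 - C = 10^{-4}
  \;\Rightarrow\; C = 99.99\%
\end{align*}
```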

So How Do We Target Zero DPM?
Test better, longer and harder
OR
Learn from airport security screening and test smarter!
After 9/11, the requirement for airport security screening immediately went to “zero” defects
First response: test every passenger perfectly

New Airport Security Check-in Procedures

How to Target “Zero” Defects: Smart Tests
Exhaustively testing every airline passenger is cost prohibitive
- Cannot strip search everyone
Solution: identify a relatively few “suspect” parts to test intensively (or discard)
“Suspects” identified based on:
Profiling: “company you keep”
Outliers: “something doesn’t look right”

What Happens to Suspects?
Extreme cases!
- Suspects are discarded (bumped off the flight: no-fly list)
More generally
- Suspects are tested further
Test optimization: avoids the high cost of exhaustively testing every passenger
Adaptive testing: appropriate tests are applied depending on “what looks different”

Targeting Zero DPM in ICs
Cannot perfectly test all parts: too expensive
Solution: intensively test “suspect” chips, or even discard them without conclusive evidence of a fault
“Suspects” identified based on:
Profiling: “company you keep”
- Die from bad neighborhoods
Outliers: “something doesn’t look right”
- Die that behave differently in some way

Understanding Screening Methods Targeting Zero Defect IC Quality Profiling: “company you keep” - Statistical defect clustering in manufactured lots and wafers – Bad neighborhoods Outliers: “something doesn’t look right” - Abnormal, although within specification, test responses

DPM Depends on Incoming Yield
Test coverage: 99.99% (100 escapes per million defective parts)
• 1 million parts @ 10% yield
  0.1 million GOOD >> shipped
  0.9 million BAD >> 90 test escapes
  DPM = 90 / 0.1 million = 900
• 1 million parts @ 90% yield
  0.9 million GOOD >> shipped
  0.1 million BAD >> 10 test escapes
  DPM = 10 / 0.9 million ≈ 11
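The dependence of DPM on incoming yield can be checked with a few lines of Python. This is a minimal sketch: the function name `shipped_dpm` and its parameters are illustrative and not from the slides, and it uses the slide's approximation that escapes are counted against the good parts shipped.

```python
def shipped_dpm(total_parts, yield_fraction, test_coverage):
    """Return (test_escapes, dpm) for a lot tested at the given coverage.

    Assumption (matches the slide's arithmetic): DPM is computed as
    escapes per million GOOD parts shipped, ignoring the small number
    of escapes in the denominator.
    """
    good = total_parts * yield_fraction
    bad = total_parts - good
    escapes = bad * (1.0 - test_coverage)   # bad parts the test misses
    dpm = escapes / good * 1e6
    return escapes, dpm


if __name__ == "__main__":
    for y in (0.10, 0.90):
        escapes, dpm = shipped_dpm(1_000_000, y, 0.9999)
        print(f"yield={y:.0%}  escapes={escapes:.0f}  DPM={dpm:.0f}")
```

With the same 99.99% coverage, the low-yield lot ships roughly 80 times more defective parts per million than the high-yield lot, which is the motivation for sorting die by expected yield before deciding how much more to test.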

Exploiting Spatial Defect Statistics for Test
• If we can sort die from wafers into bins with different yields:
  - We can get bins with lower DPM for the same test
  - We can optimize further testing for each bin
• Binning is possible because defects on wafers tend to cluster
[Figure: two bins, one with 90% yield and DPM = 11, the other with 10% yield and DPM = 900]

Manufacturing Defects and Die Yield • Two classes of Manufacturing Defects • Gross or area defects • Random Spot Defects • In mature well controlled processes, die yield is mostly limited by random spot defects

Poisson Defect Statistics
• The simplest defect distribution model for semiconductor wafers assumes that random spot defects are uniformly distributed
• Die yield: $Y = e^{-\lambda}$, where $\lambda$ is the average number of defects per die
[Figure: wafer map with uniformly distributed defects marked x]

Defect Clustering on Wafers
• The Poisson model has been found to consistently underestimate yield
• This suggests that defects on semiconductor wafers are not uniformly distributed but clustered
• For a given total number of defects on the wafer, defect clustering results in more die with multiple defects, and therefore more defect-free die (higher yield)
[Figure: wafer map with clustered defects marked x]
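The effect of clustering on yield can be illustrated numerically. The sketch below compares the Poisson yield exp(-λ) with the negative binomial yield (1 + λ/α)^(-α). The negative binomial form is an assumption brought in here as a commonly used clustered-defect model; the slides do not name a specific clustering model. The clustering parameter `alpha` is likewise illustrative (smaller values mean stronger clustering).

```python
import math


def poisson_yield(lam):
    """Die yield when defects are uniformly (Poisson) distributed."""
    return math.exp(-lam)


def clustered_yield(lam, alpha):
    """Die yield under a negative binomial defect model.

    Assumption: this model is not from the slides. alpha is the
    clustering parameter; as alpha grows large the result approaches
    the Poisson yield.
    """
    return (1.0 + lam / alpha) ** (-alpha)


if __name__ == "__main__":
    lam = 1.0  # same average number of defects per die in every case
    print(f"Poisson           yield = {poisson_yield(lam):.3f}")
    for alpha in (0.5, 2.0, 10.0, 1000.0):
        print(f"clustered a={alpha:<6} yield = {clustered_yield(lam, alpha):.3f}")
```

For the same average defect density, every finite `alpha` gives a higher yield than the Poisson estimate, consistent with the observation that the Poisson model underestimates yield.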

Defect Clustering on Wafers • Defect clustering has been observed in virtually every fabrication line in 40 years of semiconductor manufacturing experience • The causes of defect clustering are numerous and varied, and can be related to many different fabrication steps. • The extent of defect clustering can vary based on the product and process technology

Binning Good Die Based on Neighbors
Bins: 0 Bad Neighbors, 1 Bad Neighbor, 2 Bad Neighbors, . . . , 8 Bad Neighbors
[Figure: wafer map with bad die marked X and good die labeled with their number of bad neighbors]
• Best bin has die with the highest “a priori” yield
• => Lowest defect levels from test escapes
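The neighbor-count binning described on this slide can be sketched in a few lines. The wafer-map encoding ('G' for a die that tested good, 'X' for a bad die, any other character for no die) and the function name `bin_good_die` are assumptions made for illustration. Positions off the edge of the map are not counted as bad neighbors, which is one possible convention and not necessarily the one used in the cited work.

```python
def bin_good_die(wafer):
    """Count bad neighbors (0 to 8) for every good die on a wafer map.

    wafer: list of strings, one per row; 'G' = good die, 'X' = bad die,
    anything else = no die at that site (hypothetical encoding).
    Returns {(row, col): number_of_bad_neighbors} for good die only.
    """
    bins = {}
    for r, line in enumerate(wafer):
        for c, die in enumerate(line):
            if die != "G":
                continue
            bad = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the die itself
                    rr, cc = r + dr, c + dc
                    # Sites outside the map are ignored (assumption).
                    if 0 <= rr < len(wafer) and 0 <= cc < len(wafer[rr]):
                        if wafer[rr][cc] == "X":
                            bad += 1
            bins[(r, c)] = bad
    return bins


if __name__ == "__main__":
    wafer = [
        "GGGG",
        "GXXG",
        "GXGG",
        "GGGG",
    ]
    for position, bad_neighbors in sorted(bin_good_die(wafer).items()):
        print(position, "-> bin", bad_neighbors)
```

Good die with a count of 0 would go to the best bin; die with higher counts are the "suspects" that come from bad neighborhoods.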

Burn-In Fail Probability for 77,000 Chips - Barnett, Singh VTS 2002
[Figure: fail probability by bin; bins (0+)1, 2, 3, 4, 5, 6, 7, 8; Bin 1 => Bin 8]

Burn-In Fail Probability for 77,000 Chips - Barnett, Singh VTS 2002
[Figure: burn-in fail probability by bin; bins 1(+0), 2, 3, 4, 5, 6, 7, 8; Bin 1 => Bin 8 (Neighboring Bad Die)]

Binning for Low Defect Levels • Best bin defect levels (DPM) are typically 3-7X better than the lot average • Greatest benefit for high clustering and low yields • For extensively tested die needed for ultra reliable applications - virtually impossible to achieve DPM improvements from additional testing • Further testing can focus on the worse bins to optimize test costs – Adaptive Test

Adaptive Test and DPM Optimization by Binning Based on Local Yield
• Binning die based on local region yield gives die with various degrees of DPM and reliability
• Further testing can focus on bins with die containing repaired defects and/or from defective neighborhoods for test optimization
• Methodology covered by US Patents 7409306 and 7194366: “System and Method for Estimating Reliability of Components for Testing and Quality Optimization” - Inventors: Barnett and Singh

Identifying “Suspects” from Abnormal Response as Compared to Matched Parts
• Something doesn’t “look right” on some analog measurement, e.g. IDDQ, speed, etc.
• Part still tests within functional specifications

Identifying “Suspects” from Abnormal Response from Matched Parts Key Idea: • Analog IC performance measures should be similar for matched parts • Any anomalies, even if within functional specifications, indicate a defect which could be a test escape and fail in the field, or result in a reliability problem
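One way to flag a part whose analog measurement is abnormal relative to matched parts is a robust outlier test. The sketch below uses the modified z-score based on the median and the median absolute deviation (MAD), with the conventional cutoff of 3.5. Both the statistic and the cutoff are assumptions chosen for illustration; the slides do not specify how outliers are detected, and the measurement values are made up.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return indices of values that look abnormal among matched parts.

    Uses the modified z-score 0.6745 * (v - median) / MAD.
    Assumption: this statistic and the 3.5 threshold are illustrative
    choices, not the method used in the presentation.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All parts (or most) are identical; no scale to judge against.
        return []
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical IDDQ readings (uA) for parts from the same neighborhood.
    iddq = [10.1, 9.8, 10.3, 10.0, 9.9, 14.0, 10.2]
    print("suspect part indices:", flag_outliers(iddq))
```

A flagged part may still be within its functional specification; the point is only that it differs from its matched peers and so becomes a candidate for further testing.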

Example: minVDD Timing Tests - Madge et al VTS 2003
• minVDD testing finds the lowest VDD at which the circuit passes a transition delay fault (TDF) test at a given clock speed
• An abnormal minVDD value with respect to the expected value for matched parts indicates a defect that may be a test escape or reliability failure
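The search for the lowest passing VDD can be sketched as a binary search over a voltage grid. This is an assumption-laden illustration: the callable `passes` stands in for running the delay fault test on a tester at a given supply voltage, the search assumes the part passes at every voltage above its minVDD, and working in integer millivolts is a choice made here to avoid floating-point grid errors. None of these details come from the cited work.

```python
def find_min_vdd_mv(passes, low_mv, high_mv, step_mv=10):
    """Return the lowest grid voltage (mV) at which passes(mv) is True.

    passes: hypothetical callable that applies the test at the given
            VDD in millivolts and returns True on pass.
    Assumes pass/fail is monotonic in VDD. Returns None if the part
    fails even at the highest grid voltage.
    """
    n = (high_mv - low_mv) // step_mv        # index of highest grid point
    if not passes(low_mv + n * step_mv):
        return None
    lo, hi = 0, n                            # grid point hi is known to pass
    while lo < hi:
        mid = (lo + hi) // 2
        if passes(low_mv + mid * step_mv):
            hi = mid                         # passing: answer is mid or lower
        else:
            lo = mid + 1                     # failing: answer is above mid
    return low_mv + lo * step_mv


if __name__ == "__main__":
    # Stand-in for a real part whose true minimum VDD is 835 mV.
    def simulated_part(mv):
        return mv >= 835

    print("minVDD (mV):", find_min_vdd_mv(simulated_part, 600, 1200))
```

The binary search needs about log2 of the number of grid points test applications per part, which matters when the measurement is repeated on every die.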

MinVDD vs Device Speed
Two different lots showing min VDD outliers and lot-to-lot intrinsic variation.
[Figure axis label: Ring Oscillator Frequency]

minVDD Testing Minimum VDD results for different functional tests clearly showing min VDD outliers (circled)

Conclusion
• Faulty die are more likely to be found near other faulty die
• Die that test good yet are near many faulty die are greater reliability risks: suspect parts
• Suspect parts can also be identified as outliers from analog test measurements
• Adaptive tests focus on suspect parts to minimize test escapes at affordable costs

Questions?
