Test Economics for Homogeneous Manycore Systems
Lin Huang† and Qiang Xu†‡
†CUhk REliable computing laboratory (CURE), Department of Computer Science & Engineering, The Chinese University of Hong Kong
‡CAS-CUHK Shenzhen Institute of Advanced Integration Technology
Observations on Manufacturing Test Cost
• Manufacturing test is responsible for achieving sufficiently high defect coverage
• As technology advances …
  – Test patterns that target more kinds of errors become essential
  – Accelerated testing methods (e.g., burn-in test) become difficult
• Manufacturing test cost accounts for a large share of production cost
  – In particular, burn-in cost can range from 5% to 40% of production cost
• If we can relax the coverage requirement, manufacturing test cost can be dramatically reduced
Manycore Processor Era Provides Us an Opportunity
• A large number of cores are integrated on a single silicon die
  – Increasingly popular in industry
• Traditional yield-driven redundant cores aim to improve the manufacturing yield
• We propose to introduce a few test cost-driven redundant cores, in addition to yield-driven spares, for test cost reduction
• If the test cost reduction exceeds the manufacturing cost increment, the total production cost can be reduced
Manycore Processor Era Provides Us an Opportunity
• Consider a 16-core processor
• Without spares: to guarantee that all 16 cores work, provided they pass manufacturing test, we need …
  – Very high defect coverage to identify killer defects
  – Sufficient burn-in to weed out chips with latent defects
• With spares: manufacturing test is responsible for 16 out of 20 cores (instead of all 20 cores) working
  – Defect coverage requirement can be lowered
  – Burn-in test can be reduced or eliminated
  – Manufacturing cost increases
Agenda
• Background
• Test Economics with Partial/No Burn-In
• Test Economics with Partial Manufacturing Test
• Experimental Results
• Conclusion
Basics in Yield Modeling
• Defects on chip
  – Negative-binomial distribution
• Defect types
  – Killer defects
  – Latent defects
• Bathtub curve
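As a rough illustration of the negative-binomial yield model named above, the sketch below uses its standard closed form, Y = (1 + A·D/α)^(-α), for the probability that a core has no defects. The core area, defect density, and clustering parameter are hypothetical values, and treating cores as statistically independent is a simplifying assumption made here, not something the slides state.

```python
from math import comb

def core_yield(area, defect_density, alpha):
    """Negative-binomial yield: probability that one core has zero defects."""
    return (1.0 + area * defect_density / alpha) ** (-alpha)

def prob_at_least(m, n, p):
    """P(at least m of n cores are defect-free).

    Assumption: cores are independent, each defect-free with probability p.
    """
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(m, n + 1))

# Hypothetical numbers, for illustration only.
y = core_yield(area=0.1, defect_density=0.5, alpha=2.0)
no_spares = prob_at_least(16, 16, y)    # all 16 of 16 cores must be good
four_spares = prob_at_least(16, 20, y)  # any 16 of 20 cores must be good
print(f"core yield={y:.4f}, 16/16={no_spares:.4f}, 16/20={four_spares:.4f}")
```

The comparison between the two printed chip-level probabilities mirrors the 16-core versus 20-core example from the earlier slide.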
Problem 1 [Partial Burn-In]
• Enable partial/no burn-in test only
• Given the defect coverage requirement, we consider introducing redundant cores into a manycore system that functions if no fewer than the required number of cores are defect-free
• We fabricate the required cores plus the burn-in driven spares on each chip
• Chips on which all cores pass the test are sold
• Ultimately, we need to guarantee that the required number of cores are defect-free at the end of infant mortality
• Determine the number of burn-in driven spares and the burn-in time such that …
  – The production cost per sold chip is minimized
  – The product quality constraint is met
The Impact of Partial Burn-In
• The reliability induced by latent defects follows a Weibull distribution with a decreasing failure rate
• Assume that all latent defects reveal themselves after the full burn-in time
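A minimal sketch of what a Weibull model with decreasing failure rate implies for partial burn-in. The shape parameter below 1 gives the decreasing hazard. Normalizing by the value at the full burn-in time is one possible way (an assumption here) to encode "all latent defects reveal themselves after the full burn-in time". The function names and parameter values are hypothetical.

```python
import math

def weibull_cdf(t, eta, beta):
    """Probability that a latent defect has shown up by time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def hazard(t, eta, beta):
    """Weibull failure rate; decreasing in t when beta < 1."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

def fraction_revealed(t, t_full, eta, beta):
    """Share of latent defects exposed after burn-in time t.

    Assumption: the distribution is renormalized so that the share is
    exactly 1 at the full burn-in time t_full.
    """
    if t <= 0.0:
        return 0.0
    if t >= t_full:
        return 1.0
    return weibull_cdf(t, eta, beta) / weibull_cdf(t_full, eta, beta)

# Hypothetical parameters: scale 1.0, shape 0.5, full burn-in time 1.0.
for frac in (0.0, 0.25, 0.5, 1.0):
    print(frac, round(fraction_revealed(frac, 1.0, 1.0, 0.5), 4))
```

Because the failure rate decreases, the early part of burn-in exposes a disproportionate share of latent defects, which is why shortening burn-in can be attractive.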
Product Quality and Chip Test Yield
• Product quality requirement
  – The probability that a sold chip actually functions at the end of infant mortality should be higher than a threshold
  – The chip functions: no fewer than the required number of cores on the chip are defect-free at the end of infant mortality
  – The chip is sold: all cores on the chip pass manufacturing test after (partial) burn-in
• Chip test yield: the probability that a chip passes manufacturing test
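Product Quality and Chip Test Yield
• Define two events
  – A given number out of all fabricated cores are initially defect-free
  – A given number of cores in that set remain defect-free after the burn-in time
• The product quality and chip test yield follow from these events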
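One way to make these two quantities concrete is the simplified sketch below. It is not the analytical model from the slides. It assumes independent cores, each either clean, carrying a killer defect, or carrying only a latent defect, and it assumes the test catches every killer defect and every latent defect already revealed by burn-in. All names and numbers are hypothetical.

```python
from math import comb

def problem1_metrics(n, m, p_clean, p_latent, revealed):
    """Return (chip test yield, product quality, DPPM).

    n: fabricated cores, m: cores that must work,
    p_clean / p_latent: per-core probabilities (the rest are killer defects),
    revealed: share of latent defects exposed by the chosen burn-in time.
    """
    hidden = p_latent * (1.0 - revealed)   # latent defect not yet exposed
    p_pass = p_clean + hidden              # core passes manufacturing test
    escape = hidden / p_pass               # P(core is bad | it passed)
    # A chip is sold only if all n cores pass, so condition on that event.
    quality = sum(
        comb(n, k) * (1.0 - escape) ** k * escape ** (n - k)
        for k in range(m, n + 1)
    )
    test_yield = p_pass ** n
    return test_yield, quality, (1.0 - quality) * 1e6

# Hypothetical values: 32 required cores, no burn-in, with 0 or 2 spares.
print(problem1_metrics(32, 32, 0.95, 0.02, 0.0))
print(problem1_metrics(34, 32, 0.95, 0.02, 0.0))
```

In this toy setting the spares raise quality at the price of a lower chip test yield, which is the tension the cost model has to resolve.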
Cost Model
• Simple yet effective cost model: captures the key impact of introducing burn-in driven redundancy
• Manufacturing cost: normalized so that the manufacturing cost of each core for manycore chips without redundancy is 1 unit
• ATE cost: a constant cost per fabricated core
• Burn-in cost: the cost of the full burn-in process is normalized to a constant, and burn-in cost is assumed proportional to the burn-in time
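The three cost components can be combined as in the sketch below. Dividing by the chip test yield, so that sold chips carry the cost of discarded ones, is an assumption made here. So are treating the burn-in cost as a per-chip quantity and keeping the per-core manufacturing cost at 1 unit when spares are added. The function name and all numbers are hypothetical.

```python
def cost_per_sold_chip(n_cores, test_yield, ate_cost_per_core,
                       full_burn_in_cost, burn_in_fraction, core_cost=1.0):
    """Production cost per sold chip, in normalized units.

    burn_in_fraction is the chosen burn-in time divided by the full
    burn-in time, so burn-in cost scales linearly with burn-in time.
    """
    if not 0.0 < test_yield <= 1.0:
        raise ValueError("test_yield must be in (0, 1]")
    manufacturing = n_cores * core_cost
    ate = n_cores * ate_cost_per_core
    burn_in = full_burn_in_cost * burn_in_fraction
    # Assumption: the cost of chips that fail the test is spread over sold chips.
    return (manufacturing + ate + burn_in) / test_yield

# Hypothetical comparison: full burn-in without spares vs. no burn-in with 2 spares.
print(cost_per_sold_chip(32, 0.72, 0.5, 10.0, 1.0))
print(cost_per_sold_chip(34, 0.70, 0.5, 10.0, 0.0))
```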
Case Study on Partial/No Burn-In
• Homogeneous manycore system that functions with no fewer than 32 defect-free cores
• Product quality requirement is set to 500 DPPM
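A case study of this kind can be approximated by a small exhaustive search over the number of spares and the burn-in time. The sketch below reuses the 32-core and 500 DPPM figures from the slide. Everything else is assumed for illustration: the per-core probabilities, the Weibull parameters, the cost constants, and the simplified independence-based quality model. It does not reproduce the experiment from the slides.

```python
import math
from math import comb

M_REQUIRED = 32      # cores that must work (from the slide)
DPPM_LIMIT = 500.0   # product quality requirement (from the slide)

# Hypothetical parameters, chosen only for illustration.
P_CLEAN, P_LATENT = 0.99, 0.005         # per-core: defect-free / latent only
ETA, BETA = 1.0, 0.5                    # Weibull scale and shape (BETA < 1)
ATE_PER_CORE, FULL_BURN_IN = 0.5, 10.0  # normalized cost constants

def revealed(frac):
    """Share of latent defects exposed after a fraction of full burn-in."""
    if frac <= 0.0:
        return 0.0
    full = 1.0 - math.exp(-((1.0 / ETA) ** BETA))
    return min(1.0, (1.0 - math.exp(-((frac / ETA) ** BETA))) / full)

def evaluate(n, frac):
    """Return (DPPM, cost per sold chip) for n cores and a burn-in fraction."""
    hidden = P_LATENT * (1.0 - revealed(frac))
    p_pass = P_CLEAN + hidden
    escape = hidden / p_pass
    quality = sum(comb(n, k) * (1.0 - escape) ** k * escape ** (n - k)
                  for k in range(M_REQUIRED, n + 1))
    test_yield = p_pass ** n  # all n cores must pass for the chip to be sold
    cost = (n * 1.0 + n * ATE_PER_CORE + FULL_BURN_IN * frac) / test_yield
    return (1.0 - quality) * 1e6, cost

def search(max_spares=4, steps=10):
    """Cheapest (cost, spares, burn-in fraction, DPPM) meeting the DPPM limit."""
    best = None
    for spares in range(max_spares + 1):
        for i in range(steps + 1):
            frac = i / steps
            dppm, cost = evaluate(M_REQUIRED + spares, frac)
            if dppm <= DPPM_LIMIT and (best is None or cost < best[0]):
                best = (cost, spares, frac, dppm)
    return best

print(search())
```

The baseline of no spares with full burn-in is always part of the search space, so the returned configuration is never more expensive than the baseline under this model.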
Problem 2 [Partial Burn-In & Relaxed Defect Coverage]
• Not only enable partial/no burn-in test but also relax the defect coverage for core tests
• We introduce test cost-driven spares and yield-driven ones
• The chip contains the required cores plus both kinds of spares, all identical
• Chips containing no fewer than a threshold number of cores that pass the test are shipped
• Ultimately, we need to guarantee that the required number of cores are defect-free at the end of infant mortality
Problem 2 [Partial Burn-In & Relaxed Defect Coverage]
• Determine the number of test cost-driven spares, the number of yield-driven spares, the defect coverage for core test, and the burn-in time such that …
  – The production cost per sold chip is minimized
  – The product quality constraint is met
The Impact of Test Decision Criterion
• Ideally, a perfect manufacturing test rejects all bad cores while accepting all defect-free ones
  – No test escapes and no false rejects
• In reality …
  – Test escapes: bad cores that pass the test
  – False rejects: defect-free cores that fail the test
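Product Quality with False Rejects
• Redefine the two events
  – The chip functions: no fewer than the required number of cores on the chip are defect-free at the end of infant mortality
  – The chip is shipped: no fewer than the threshold number of cores among all cores on the chip pass manufacturing test after (partial) burn-in
• The product quality is again the probability that a shipped chip functions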
Product Quality with False Rejects
• Notation
  – A given number out of all cores are initially defect-free
  – Some cores in that set remain defect-free after the burn-in time
  – Among the good cores on a chip, some pass the test
  – Among the bad cores, some pass the test
• These four counts determine the product quality
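The effect of test escapes and false rejects on shipped quality can be sketched by enumerating how many cores are good and how many of the good and bad cores pass. This is a simplified stand-in for the analytical model. It assumes independent cores and lumps unrevealed latent defects together with other bad cores that escape the test. Function and parameter names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def shipped_quality(n, m, k_pass, p_good, false_reject, escape):
    """Return (P(chip is shipped), P(at least m good cores | shipped)).

    n: cores on chip, m: cores that must work, k_pass: passing cores
    required to ship, p_good: P(core is truly good), false_reject:
    P(good core fails test), escape: P(bad core passes test).
    """
    p_ship = 0.0
    p_ship_and_ok = 0.0
    for g in range(n + 1):                 # g = number of truly good cores
        p_g = binom_pmf(g, n, p_good)
        p_enough = 0.0                     # P(at least k_pass cores pass | g)
        for a in range(g + 1):             # a = good cores that pass
            need = max(k_pass - a, 0)      # passes still needed from bad cores
            p_bad = sum(binom_pmf(b, n - g, escape)
                        for b in range(need, n - g + 1))
            p_enough += binom_pmf(a, g, 1.0 - false_reject) * p_bad
        p_ship += p_g * p_enough
        if g >= m:
            p_ship_and_ok += p_g * p_enough
    if p_ship == 0.0:
        return 0.0, 0.0
    return p_ship, p_ship_and_ok / p_ship

# Hypothetical example: 6 cores, 4 must work, ship when at least 4 pass.
print(shipped_quality(6, 4, 4, 0.9, 0.01, 0.2))
```

With a perfect test both error probabilities are zero and every shipped chip functions. Relaxing defect coverage corresponds to a larger escape probability, which the spare cores must then absorb.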
Cost Model
• Total production cost
• ATE cost depends on defect coverage
Experimental Setup
• Homogeneous manycore system that functions with no fewer than 32 defect-free cores
• The best combination of spare counts, defect coverage, and burn-in time in terms of production cost is determined by exploring the solution space
• System parameters
• Product quality requirement is set to 500 DPPM
Tradeoff between Burn-In Cost and ATE Cost under Product Quality Constraint
[Figure: two panels, high defect density and low defect density]
Conclusion
• We propose to introduce spare cores into manycore systems
  – Burn-in test time can be shortened
  – Defect coverage requirement can be relaxed
  – Without sacrificing the quality of the shipped products
• We develop novel analytical models to verify the effectiveness of the proposed strategy
Test Economics for Homogeneous Manycore Systems
Thank you for your attention!