Enhance memory reliability using a combined ECC and redundancy repair scheme, increasing MTTF and reducing impact of defects.
An Integrated ECC and Redundancy Repair Scheme for Memory Reliability Enhancement
Chin-Lung Su, Yi-Ting Yeh, and Cheng-Wen Wu
National Tsing Hua University, Hsinchu, Taiwan
Introduction
• Memory cores are widely used in SOC designs
• They have higher density and occupy a larger area
• They dominate the chip yield
• Their use is increasing in nano-technologies according to ITRS
• Reliability is also an important issue for memory
• ECC and redundancy repair are both widely used fault tolerance techniques
• After production test, there may be some unused redundancy
• Combine ECC and unused redundancy
• Higher yield and greater degree of fault tolerance
IC-DFN/08-06/cww
Chip Area Breakdown
[Figure: chip area breakdown. Source: International Technology Roadmap for Semiconductors (ITRS), 2001-2005]
Typical RAM BIST Architecture
[Figure: block diagram. Labels: BIST Module, Controller, Pattern Generator, Test Collar (MUX), Comparator, Go/No-Go, RAM; implementation options shown: Microprogram, Hardwired, CPU core, IEEE 1149.1, Counter, LUT, LFSR]
Sharing Controller & Sequencer
Typical RAM ECC Architecture
[Figure: block diagram. Labels: RAM, Data Bus, Cb Gen (two instances), Syndrome Gen, Syndrome Decoder, Corrector; bus widths of 16 and 6 bits; outputs Single and Double]
• Mainly for improving reliability
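The bus widths in the ECC figure (16 and 6) are consistent with a SEC/DED code on 16 data bits with 6 check bits. As a minimal sketch under that assumption, the following models an extended Hamming code in software: five Hamming check bits plus one overall parity bit. The bit layout and the names `encode`, `decode`, and `_syndrome` are illustrative choices, not taken from the slides, and the actual check-bit matrix used by the authors may differ.

```python
# Extended Hamming SEC/DED sketch: 16 data bits + 6 check bits = 22-bit codeword.
# Bit 0 holds the overall parity; bits 1..21 follow classic Hamming numbering.
# The layout is an assumption for illustration only.

PARITY_POSITIONS = [1, 2, 4, 8, 16]
DATA_POSITIONS = [p for p in range(1, 22) if p & (p - 1)]  # 16 non-power-of-two positions


def _syndrome(word: int) -> int:
    """XOR of the position numbers (1..21) of all set bits."""
    s = 0
    for pos in range(1, 22):
        if (word >> pos) & 1:
            s ^= pos
    return s


def encode(data: int) -> int:
    """Return the 22-bit codeword for a 16-bit data value."""
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            word |= 1 << pos
    s = _syndrome(word)
    for p in PARITY_POSITIONS:  # set check bits so the syndrome becomes zero
        if s & p:
            word |= 1 << p
    if bin(word).count("1") & 1:  # make the total number of ones even
        word |= 1
    return word


def decode(word: int):
    """Return (data, status); status is 'ok', 'single', or 'double'."""
    s = _syndrome(word)
    odd = bin(word).count("1") & 1
    if s == 0 and not odd:
        status = "ok"
    elif odd and s <= 21:
        status = "single"
        word ^= 1 << s  # s == 0 means the overall parity bit itself flipped
    else:
        return None, "double"  # two (or more) bit errors: detected, not corrected
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        data |= ((word >> pos) & 1) << i
    return data, status
```

In this sketch the `single` and `double` statuses play the role of the Single and Double outputs in the figure: a single flipped bit is corrected, while two flipped bits are only reported.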
RAM Built-In Self-Repair (BISR)
[Figure: block diagram. Labels: RAM, Spare Elements, Reconfiguration Mechanism, Redundancy Analyzer, BIST, MUX, I/O]
• Mainly for improving yield
A Power-On BISR Scheme
[Figure: block diagram. Labels: Main Memory, Spare Memory, Wrapper, BIST, BIRA, MAO, POR; signals Q, D, A]
MAO: mask address output; POR: power-on reset
Source: ITC’03
Proposed Scheme
• Integrated ECC and Redundancy Repair Scheme
  • Hard errors are repaired by physical redundancy in the field
  • Soft-error correction ability is not harmed by hard errors
  • Enhances reliability
• Assumptions
  • During the Error Identification phase, no other faults occur
  • Errors occur far less often than once per system clock cycle (error rate << system clock frequency)
Phases of Proposed Scheme
Error Identification Phase
• Write-back process
  • Write the corrected data back to memory
  • Read data from the same address
  • A soft error is eliminated by this process; a hard fault is not
  • Assume that no other errors occur during this process
• After error identification
  • “Hard repair phase” for a hard error/fault
  • “Fault-free phase” for a soft error
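The write-back check can be sketched as follows. This is a toy model, not the authors' hardware: `FaultyMemory` and `identify_error` are hypothetical names, a hard fault is modeled as a stuck-at bit, and the re-read word is compared directly with the corrected word rather than being passed through the ECC logic again.

```python
class FaultyMemory:
    """Toy word memory: a soft error corrupts stored bits, a hard fault is a stuck-at bit."""

    def __init__(self, size: int):
        self.cells = [0] * size
        self.stuck = {}  # addr -> (mask, value): bits in mask always read as value

    def write(self, addr: int, word: int) -> None:
        self.cells[addr] = word

    def read(self, addr: int) -> int:
        word = self.cells[addr]
        mask, value = self.stuck.get(addr, (0, 0))
        return (word & ~mask) | (value & mask)


def identify_error(mem: FaultyMemory, addr: int, corrected: int) -> str:
    """Write the corrected word back, re-read it, and pick the next phase.

    Assumes, as the slides do, that no new error occurs during this check.
    """
    mem.write(addr, corrected)         # write-back overwrites a soft error
    if mem.read(addr) == corrected:    # the error did not come back
        return "fault-free"
    return "hard-repair"               # the error persists: treat it as a hard fault
```

For example, after `mem.cells[3]` is corrupted by a bit flip, `identify_error(mem, 3, good)` returns `"fault-free"`, whereas an address with an entry in `mem.stuck` that conflicts with the data returns `"hard-repair"`.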
Hard Repair Phase
• Repair this hard fault with a spare
  • Map the faulty word to a redundant word
  • Write the corrected data into the redundant word
• Hard fault location
  • In main memory: follow the above procedure
  • In redundant memory: mark the faulty redundant element
• During this phase, memory cannot be accessed
  • Idle mode
• The hard fault is removed after this phase
  • Reliability and MTTF are increased
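A minimal sketch of the remapping step, under simplifying assumptions: spares are modeled as individual redundant words rather than spare rows and columns, and when the fault lies in a redundant word the sketch marks it as faulty and then tries another free spare, a step the slides do not spell out. The class and attribute names are hypothetical.

```python
class RepairableMemory:
    """Toy main memory with word-level spares and an address remap table."""

    def __init__(self, main_words: int, spare_words: int):
        self.main = [0] * main_words
        self.spare = [0] * spare_words
        self.remap = {}          # faulty main address -> spare index
        self.bad_spares = set()  # spare words found to be faulty

    def _free_spare(self):
        used = set(self.remap.values())
        for i in range(len(self.spare)):
            if i not in used and i not in self.bad_spares:
                return i
        return None

    def hard_repair(self, addr: int, corrected: int) -> bool:
        """Map addr to a spare word and store the corrected data there."""
        if addr in self.remap:
            # The fault is in the redundant word: mark that element as faulty.
            self.bad_spares.add(self.remap.pop(addr))
        spare = self._free_spare()
        if spare is None:
            return False  # no unused redundancy left; ECC alone must cope
        self.remap[addr] = spare
        self.spare[spare] = corrected
        return True

    def write(self, addr: int, word: int) -> None:
        if addr in self.remap:
            self.spare[self.remap[addr]] = word
        else:
            self.main[addr] = word

    def read(self, addr: int) -> int:
        if addr in self.remap:
            return self.spare[self.remap[addr]]
        return self.main[addr]
```

After a successful `hard_repair`, every later access to the repaired address is served from the spare word, so the hard fault no longer consumes the ECC's correction capability for that word.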
Experimental Results
• Technology: TSMC 0.25 µm CMOS process
• The redundant memory consists of eight spare rows and four spare columns
Experimental Results (cont.)
• !ECC: without ECC
• SEC: with SEC/DED ECC
• SECP: proposed scheme
• Area = Memory + ECC + BIST
• Cost = Area / MTTF
Reliability Improvement
[Figure: reliability plot for an 8K × 64 memory with r + c = 12]
Conclusions
• An integrated ECC and redundancy repair scheme is proposed
  • Enhances memory reliability and MTTF
• Low area overhead
  • The ECC controller is integrated with the BIST
• No timing penalty in normal operation
• Cost-effective way to reduce the effect of parametric defects