Radiation Effects Challenges in 90nm Commercial-Density SRAMs: A Comprehensive SEE and TID Study. Jeff Draper, Y. Boulghassoul, M. Bajura, R. Naseer, J. Sondeen and S. Stansberry, University of Southern California Information Sciences Institute.
Radiation Effects Challenges in 90nm Commercial-Density SRAMs: A Comprehensive SEE and TID Study
Jeff Draper, Y. Boulghassoul, M. Bajura, R. Naseer, J. Sondeen and S. Stansberry
University of Southern California, Information Sciences Institute
1st Workshop on Fault-Tolerant Spaceborne Computing Employing New Technologies
CSRI, Sandia National Labs, Albuquerque, NM, May 28-30, 2008
This work was supported by the Defense Advanced Research Projects Agency (DARPA) Microsystems Technology Office under award No. N66001-04-1-8914. Any opinions, findings, and conclusions or recommendations expressed in this presentation are those of the authors and do not necessarily reflect the views of DARPA/MTO or the U.S. Government.
Motivations
• RHBD approach shown to be effective for 90nm designs, within an acceptable “1 process generation” penalty
• Use of RHBD for SRAMs poses bigger challenges
  • SRAM density is achieved through aggressive design rule waivers
  • Cell-level radiation hardening using typical RHBD techniques compounds area/speed/power penalties
• Traditional circuit-based RHBD approach
  • Hardens control structures and individual memory cells
  • SRAM BER largely determined by the raw BER of the memory cell
• Objective: investigate the best rad-hard SRAM performance achievable through a hybrid hardening approach
  • Harden control structures but leave commercial SRAM cell density and technological scaling of individual memory cells intact
  • Mitigate SEUs (SBU/MBU) with device-centric ECCs
  • Leverage intrinsic process hardness for improved reliability
Outline
• SRAM test chips overview
• SEU response
  • Heavy ions
  • Protons
• Latchup response
• TID and temperature annealing
  • 24C, 100C and 150C
• Summary and conclusions
Overview: SRAM Test Chips
• Fabricated 4 SRAMs in 9LP and 9SF processes (1 baseline and 1 hardened in each)
• Key design objectives:
  • Use commercial core memory cells (FP118 and E123)
  • Harden peripheral circuitry using TMR, annular gates, interleaving
RHBD SRAM Approach/Design
[Figure: block diagram of the RHBD SRAM. Labels: cell TID and SEL hardening (annular gates for TID, guard rings for SEL); block SET hardening (TMR: three copies of decoders, ECC and timing logic, #1 to #3, feeding a voter); array SEU hardening (SEC/DED ECC); bit interleaving (MBU mitigation).]
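The two block-level techniques named on this slide, TMR voting and bit interleaving, can be illustrated with a minimal Python sketch. It is not the authors' circuit: the function names and the interleave factor of 4 are hypothetical, and the column mapping shown is one common interleaving layout assumed here for illustration.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant copies (TMR voter)."""
    return (a & b) | (a & c) | (b & c)


def physical_column(word, bit, interleave):
    """Physical column of logical bit `bit` of word `word`, for a row that
    holds `interleave` words with their bits interleaved (assumed layout)."""
    return bit * interleave + word


def words_hit(start_col, width, interleave):
    """Count upset bits per logical word when a single strike flips
    `width` physically adjacent columns starting at `start_col`."""
    counts = {}
    for col in range(start_col, start_col + width):
        word = col % interleave  # inverse of physical_column for the word index
        counts[word] = counts.get(word, 0) + 1
    return counts


# One corrupted copy out of three is outvoted.
print(bin(majority_vote(0b1010, 0b1010, 0b0110)))  # 0b1010

# A 3-cell MBU with 4-way interleaving lands in three different words,
# one bit each, so a single-error-correcting code can repair every word.
print(words_hit(5, 3, 4))  # {1: 1, 2: 1, 3: 1}

# Without interleaving the same strike puts 3 errors in one word.
print(words_hit(5, 3, 1))  # {0: 3}
```

The interleaving example shows why MBU mitigation and SEC/DED are paired: interleaving turns one multi-bit upset into several single-bit upsets that the ECC can handle word by word.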
SEU Raw Cross Sections: Heavy-Ion (HI) Test Results
[Figure: raw SEU cross-section plots for the baseline and hardened SRAMs in the LP and SF processes.]
Memory patterns = {00, 11, 10}; static/dynamic = {s, d}. LP dynamic access rate: 2.2 kHz per bit; SF dynamic access rate: 718 Hz.
Data collected at LBNL 88” Cyclotron, 10 MeV cocktail, core voltage 10% below nominal, 100 MHz tester. Fluence range 1e7-1e5. More than 256 errors per data point.
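A raw per-bit cross section is conventionally the error count divided by the product of fluence and number of bits. The Python sketch below applies that definition and adds a Poisson counting uncertainty, which shows why a floor of 256 errors per point is useful. The array size of 1,000,000 bits is a hypothetical value, and the fluence is assumed to be in particles/cm² since the slide gives no units.

```python
import math


def seu_cross_section(errors, fluence_per_cm2, bits):
    """Return (cross section in cm^2/bit, relative 1-sigma Poisson uncertainty)."""
    if errors < 0 or fluence_per_cm2 <= 0 or bits <= 0:
        raise ValueError("errors must be >= 0; fluence and bits must be > 0")
    sigma = errors / (fluence_per_cm2 * bits)
    # For a Poisson count N the relative standard deviation is 1/sqrt(N).
    rel_unc = 1.0 / math.sqrt(errors) if errors > 0 else math.inf
    return sigma, rel_unc


# Hypothetical data point: 256 errors at 1e7 particles/cm^2 on a 1 Mbit array.
sigma, rel_unc = seu_cross_section(256, 1e7, 1_000_000)
print(f"sigma = {sigma:.3e} cm^2/bit, relative uncertainty = {rel_unc:.4f}")
```

With 256 errors the relative counting uncertainty is 1/16, about 6%, so each plotted point is statistically well constrained.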
SEU Cross Section Calculations: Pre-ECC and Scrubbing
• Weibull fit: $\sigma(x) = a\,\left[1 - e^{-((x - x_0)/w)^{s}}\right]$
• SF cross-section is ~2-3X higher than LP, likely due to lower Vdd
• No cross-section dependence on static vs. dynamic testing
• Minor differences between baseline and hardened suggest little impact of TMR control circuitry
Calculations using CREME96, 100 mil shielding. AP8 model for equatorial orbit.
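The Weibull form on this slide is the four-parameter fit commonly fed to rate-prediction tools: a saturation cross section a, an onset (threshold) LET x0, a width w and a shape exponent s, with x the effective LET. A minimal Python sketch follows; the parameter values in the usage lines are hypothetical placeholders, not the fitted values from this study.

```python
import math


def weibull_cross_section(let, a, x0, w, s):
    """Weibull cross section: a * (1 - exp(-((let - x0) / w) ** s)).

    Returns 0 at or below the onset LET x0, and approaches the
    saturation value `a` at high LET.
    """
    if w <= 0 or s <= 0:
        raise ValueError("width w and shape s must be positive")
    if let <= x0:
        return 0.0
    return a * (1.0 - math.exp(-(((let - x0) / w) ** s)))


# Hypothetical fit parameters, for illustration only.
a, x0, w, s = 1e-8, 0.5, 10.0, 1.5
for let in (0.3, 5.0, 31.0, 117.0):
    print(let, weibull_cross_section(let, a, x0, w, s))
```

At let = x0 + w the result is a * (1 - 1/e) regardless of s, which is a quick sanity check on any fitted curve.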
SEU Model: ECC and Scrubbing*
[Figures: BER reduction vs. ECC and scrub rate; P(error) per scrub vs. ECC and scrub rate; constant 1E-10 BER curves vs. ECC and scrub rate.]
• P(error) depends on the ECC applied and on the ratio of the memory array's scrub rate to its raw BER
• The overall reduction in error rate is relative to the starting physical raw BER
• Example: scrub rate = 100, physical BER = $10^{-6}$; single-bit ECC reduces the BER by a factor of $10^{4}$, giving a new effective BER of $10^{-10}$
• Goal: assuming a scrub rate of once per 10 seconds and a target BER of $10^{-10}$, the device can tolerate up to $10^{-5}$ errors/bit-day with single-bit ECC and $10^{-2}$ errors/bit-day with double-bit ECC
* Figures assume a 22-bit code, from “Models and Algorithmic Limits for an ECC-Based Approach to Hardening Sub-100nm SRAMs”, IEEE TNS Vol. 54, pp. 935-945, Aug. 2007.
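The dependence of P(error) per scrub on ECC strength and scrub rate can be illustrated with a generic binomial model. This is a sketch under stated assumptions, not the model of the cited TNS paper: upsets are assumed independent and Poisson-distributed in time, each bit is assumed to flip at most once per scrub interval, and the function and parameter names are hypothetical.

```python
import math


def p_uncorrectable(raw_ber_per_day, scrubs_per_day, n_bits, t_correct):
    """Probability that one n_bits-wide ECC word collects more than
    t_correct upsets within a single scrub interval."""
    if raw_ber_per_day < 0 or scrubs_per_day <= 0:
        raise ValueError("raw BER must be >= 0 and scrub rate > 0")
    interval_days = 1.0 / scrubs_per_day
    # Probability that a given bit is upset during one scrub interval.
    p = -math.expm1(-raw_ber_per_day * interval_days)
    # Sum the failure terms directly; computing 1 - P(ok) would lose
    # all precision at these tiny probabilities.
    return sum(
        math.comb(n_bits, k) * p**k * (1.0 - p) ** (n_bits - k)
        for k in range(t_correct + 1, n_bits + 1)
    )


# 22-bit word, scrubbed once per 10 seconds (8640 scrubs/day),
# raw BER of 1e-5 errors/bit-day.
print(p_uncorrectable(1e-5, 8640.0, 22, 1))  # single-bit correction
print(p_uncorrectable(1e-5, 8640.0, 22, 2))  # double-bit correction
```

For small per-interval upset probability p, the single-bit-correcting case is dominated by the two-error term, C(22,2)·p², so halving the scrub interval cuts the per-scrub failure probability by roughly four.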
Comparison of ECC Model with SEU Experimental Data
Approximate effective BER equations for a single-bit-correcting 22-bit word and a double-bit-correcting 15-bit word:
Equation A (1-bit ECC): $\text{Effective BER} \approx \dfrac{\text{Raw BER}}{1 + \dfrac{\text{Scrub Rate}/\text{Raw BER}}{300}}$
Equation B (2-bit ECC): $\text{Effective BER} \approx \dfrac{\text{Raw BER}}{1 + \dfrac{(\text{Scrub Rate}/\text{Raw BER})^{2}}{15}}$
• The distribution of observed errors from measurements matches the ECC model very well
• A proper combination of error correction code (ECC) and modest scrubbing rate ensures a BER better than $10^{-10}$ errors/bit-day in all orbital scenarios
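The two approximate equations on this slide can be evaluated with a short Python sketch. It rests on a reading of the flattened slide layout that should be treated as an assumption: Equation A is taken as Raw BER / (1 + (Scrub Rate / Raw BER) / 300) and Equation B as Raw BER / (1 + (Scrub Rate / Raw BER)² / 15). The slide does not state the units of scrub rate, so none are assumed here, and the function names are hypothetical.

```python
def effective_ber_single(raw_ber, scrub_rate):
    """Equation A as read from the slide (single-bit-correcting, 22-bit word)."""
    if raw_ber <= 0 or scrub_rate < 0:
        raise ValueError("raw_ber must be > 0 and scrub_rate >= 0")
    ratio = scrub_rate / raw_ber
    return raw_ber / (1.0 + ratio / 300.0)


def effective_ber_double(raw_ber, scrub_rate):
    """Equation B as read from the slide (double-bit-correcting, 15-bit word)."""
    if raw_ber <= 0 or scrub_rate < 0:
        raise ValueError("raw_ber must be > 0 and scrub_rate >= 0")
    ratio = scrub_rate / raw_ber
    return raw_ber / (1.0 + ratio**2 / 15.0)


# Sweep the scrub rate at a fixed raw BER to see the trend in each equation.
for scrub_rate in (0.0, 1.0, 10.0, 100.0):
    print(scrub_rate,
          effective_ber_single(1e-5, scrub_rate),
          effective_ber_double(1e-5, scrub_rate))
```

Under this reading, both expressions reduce to the raw BER when there is no scrubbing, and the double-bit form falls off with the square of the scrub-rate-to-BER ratio rather than linearly.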
SBU and MBU Analysis vs. Effective LET
[Figure: single- and multi-bit upset distributions vs. effective LET for the 9LP and 9SF SRAMs.]
• Large differences in the SBU/MBU distribution between LP and SF SRAMs for similar LET values
  • Particularly noticeable for LET > ~10 MeV·cm²/mg
• Saturating cross-sections have LET-dependent error distributions
  • LETs of 31 and 117 have comparable cross-sections but different distributions
SEU Proton Testing
• IBM 90nm commercial-density SRAM cells have a very low upset threshold
  • From 3D TCAD simulations, worst-case Qcrit ~1.1 fC
• With an SRAM cell threshold LET < 0.5 MeV·cm²/mg, protons could potentially induce SEUs through direct ionization
  • Possible drastic increase in raw memory cell BER
  • Could flood 1-bit and possibly even 2-bit ECC schemes
Data collected at Indiana University cyclotron facility, 200 MeV line. Proton flux ~$10^{10}$ particles/cm²·s. Max TID for each tested part ~20 krad.
• SRAM saturating cross-section is still well behaved for worst-case 200 MeV proton exposure (~$10^{-14}$ cm²/bit)
• Proton upsets come from nuclear interactions; no direct ionization yet at 90nm
[Figure: saturating cross-section of 9SF and 9LP SRAMs for a 200 MeV proton exposure.]
Latchup in 90nm SRAMs
• SF appears to be SEL immune
• LP appears to be SEL immune if, and only if:
  • At room temperature, for voltages up to 110% Vcore, OR
  • At lowered voltage, for temperatures up to 125 C
• All SELs observed in LP were non-destructive
• LP latchup appeared as a single step-function of ~50 mA
Data collected at LBNL 88” Cyclotron, 10 MeV cocktail. High temperature applied through RTD strapped to PGA package and PID control.
9LP/9SF TID and Room-Temperature Annealing Responses
9LP SRAM:
- ~1000X increase in core leakage current @ 2 Mrad
- 4/4 devices: functional failure between 1000 and 1300 krad
- 4/4 devices fully functional after 7 days annealing
- Leakage ~30X pre-rad after 140 days
9SF SRAM:
- ~50X increase in core leakage current @ 2 Mrad
- Pre-rad leakage 20X higher than LP, but same level as LP @ 2 Mrad
- 4/4 devices: functional failure between 600 and 1000 krad
- 2/4 devices fully functional after 7 days annealing
- Leakage ~8X pre-rad after 140 days
• TID-induced core leakage currents of baseline and hardened SRAMs were identical for a given process
  • TID response of the SRAM core is dominated by memory array leakage
[Figure: 9LP and 9SF SRAM core leakage current dynamics as a function of TID and 24C anneal.]
All devices irradiated @ 200 rad/s, max temp < 30 C, removed ~15 minutes for measurement. LP irradiated under 10 pattern and measured under 01; SF irradiated under 00 pattern and measured under 11.
9LP/9SF TID and Room-Temperature Annealing Responses (cont.)
• TID-induced IO leakage currents showed drastic differences between hardened and unhardened IO pads
• Hardened pads should be used whenever possible
  • Major impact on reliability at negligible performance penalty
[Figure: 9LP and 9SF SRAM IO leakage current dynamics as a function of TID and 24C anneal.]
9LP SRAM:
- Hardened IO pads (MRC design)
- ~1 mA IO leakage up to 2 Mrad
9SF SRAM:
- Unhardened IO pads (Artisan cells)
- ~$10^{4}$X increase in IO leakage @ 2 Mrad
- Leakage ~2000X after 140 days
All devices irradiated @ 200 rad/s, max temp < 30 C, removed ~15 minutes for measurement. LP irradiated under 10 pattern and measured under 01; SF irradiated under 00 pattern and measured under 11.
9LP/9SF TID Responses for 100C and 150C Anneals
• All 9LP and 9SF SRAMs respond very well to temperature annealing
  • At 100C, core leakage currents are within 3X of pre-rad in < 100 hours
  • At 150C, pre-rad core leakage levels are reached within 5 hours
[Figure: 9LP and 9SF SRAM core leakage current variation as a function of annealing temperature.]
• The unhardened IO did not respond well
  • 100C anneal is ineffective: still ~1000X above pre-rad after 200 hours
  • 150C anneal is slightly better, at 10X above pre-rad after 60 hours
TID Radiation Hysteresis
• Successive TID exposure and annealing cycles induced a shift in the SRAM leakage characteristics:
  • Lateral shift: the SRAM starts degrading sooner than in its first irradiation
  • Vertical shift: the current “saturation” level is lowered
• But the true mystery improvement to SRAM reliability is…
All SRAMs exposed a second time NEVER exhibited functional failure up to 2 Mrad
Summary and Conclusions
• Single Event Upsets (SEU) and Bit-Error-Rate (BER)
  • A proper combination of ECC strength, bit interleaving and modest scrubbing rate ensures an SRAM BER better than $10^{-10}$ errors/bit-day in all space environments investigated.
• Single Event Latch-up (SEL)
  • 9SF commercial memory cells are latch-up immune.
  • 9LP commercial memory cells are latch-up free only under high temperature or high voltage, but not both.
  • Voltage scaling is likely to mitigate SEL concerns for core voltages < 1.1V.
• Total Ionizing Dose (TID)
  • 90nm SRAMs proved to be intrinsically resilient up to 300 krad, but a substantial static leakage increase (10X) occurs past 500 krad.
  • TID-hardened IO pads should be used whenever possible.
  • 9LP/9SF SRAMs are very responsive to temperature treatments:
    • All SRAMs regained pre-rad nominal currents within 5 hours of 150°C annealing after 2 Mrad TID exposure.
    • All ICs recovered from catastrophic loss of functionality.
  • Successive exposure and annealing cycles induced hysteresis in the SRAM leakage characteristics:
    • The current degradation starts earlier.
    • However, the maximum leakage at 2 Mrad is lower than for the first irradiation.
    • All ICs that underwent a complete thermal anneal NEVER exhibited functional failure up to 2 Mrad when re-exposed.