Self-* Systems, CSE 598B
Paper title: Dynamic ECC tuning for caches
Presented by: Niranjan Soundararajan
Abstract
• On-chip caches are growing in size.
• Protection is needed to keep the stored data correct.
• ECC is an efficient means of protecting the data.
• ECC has its own overhead:
  • Area: extra space for its logic.
  • Latency: ECC computations take time.
• This work deals with reducing the latency of ECC computation:
  • Track the frequently accessed cache lines.
  • Dynamically turn ECC computation on and off for specific cache lines.
Background
• Information redundancy: data in the processing core is protected by schemes such as RMT (Redundant Multi-Threading) [1][2]. ECC protection is easy to implement for on-chip caches, and the size of the caches prevents them from being replicated [3].
• Current evaluations show raw FIT (Failures In Time) rates for latches and SRAM cells between 0.001 and 0.01 FIT/bit. This value increases with elevation: at 1.5 km the FIT/bit is 3.5x larger, and at 10 km (airplanes) it is 100x larger [4][5][6][7].
Background
• As processor power dissipation becomes more and more important, supply voltages are reduced. This will greatly increase the FIT/bit [8][9].
• “As an example, consider a 32 MB data cache. This cache has 2^22 quad-words. Let us assume that an SRAM cell has an average FIT rate of 0.001. The single-bit FIT rate for the entire cache is 0.001 * 2^22 * 72 = 3.02 * 10^5, i.e. the MTTF is 10^9 / (3.02 * 10^5) = 3311 hours” [3].
• Consider large multiprocessor systems with tens of megabytes of cache. Protection becomes an important issue if the systems are involved in critical computations such as space research and flight control.
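The arithmetic in the quoted example can be reproduced with a short Python sketch. The function names and the `altitude_factor` parameter are illustrative and not from the paper. Reading the factor of 72 as 64 data bits plus 8 check bits per quad-word is an assumption, since the quote does not spell it out.

```python
FIT_HOURS = 1e9  # one FIT is one failure per 10^9 device-hours


def cache_fit(cache_bytes, fit_per_bit=0.001, data_bits=64,
              stored_bits=72, altitude_factor=1.0):
    """Raw single-bit FIT rate of a whole cache.

    stored_bits=72 is assumed to mean 64 data bits + 8 check bits
    per quad-word; altitude_factor scales the per-bit FIT rate.
    """
    quad_words = cache_bytes * 8 // data_bits
    return fit_per_bit * altitude_factor * quad_words * stored_bits


def mttf_hours(total_fit):
    """Mean time to failure in hours for a given total FIT rate."""
    return FIT_HOURS / total_fit


fit = cache_fit(32 * 2**20)   # 32 MB cache, i.e. 2^22 quad-words
print(round(fit))             # about 3.02 * 10^5 FIT
print(round(mttf_hours(fit))) # about 3311 hours
```

Passing `altitude_factor=3.5` or `altitude_factor=100` applies the elevation multipliers quoted earlier for 1.5 km and 10 km; at 100x the same cache would see an MTTF of roughly 33 hours.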
Background
• All of this data indicates that cache data must be protected. ECC is the best way to protect SRAM.
• This work addresses some of the problems of applying ECC to caches that must operate at low latency, such as L1 caches.
Motivation
• ECC overhead [10]:
  • Increase in area due to circuitry: 11% (approx. 15 mm²)
  • Increase in latency: 10% (approx. 5 ns)
• Applications show temporal locality in their cache-line accesses. Dynamically turning ECC on and off for cache lines reduces cache access latency. Because the frequency of operations is high, the time between individual accesses to these lines is short.
Motivation
• The chance of an error affecting the data is low:
  • because of the high frequency of operations;
  • because the cache lines with high temporal locality are few compared to the total number of cache lines.
Self-Tuning Implementation
• Keep track of cache-line accesses. Every 5000 cycles, tune the ECC of the cache lines, turning it on or off.
• Overhead:
  • Keeping track of cache-line accesses: simple, fast counters make the implementation easy.
  • Tuning ECC for lines: a simple average computation, then turning ECC off for lines with more activity than the average and on for the rest.
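The tuning loop described on this slide can be sketched in Python as follows. This is a minimal software model, not the hardware design from the slides: the class name `EccTuner` and its methods are hypothetical. It assumes, following the Motivation and Results slides, that lines with above-average activity are the ones whose ECC check is skipped, and that all counters are reset after each tuning step.

```python
class EccTuner:
    """Per-line access counters with a periodic ECC on/off decision."""

    def __init__(self, num_lines, interval=5000):
        self.interval = interval            # cycles between tuning steps
        self.counts = [0] * num_lines       # one access counter per line
        self.ecc_on = [True] * num_lines    # every line starts protected
        self.cycles = 0

    def access(self, line):
        """Record one access; return True if the ECC check applies."""
        self.counts[line] += 1
        return self.ecc_on[line]

    def tick(self, cycles=1):
        """Advance time and retune once the interval has elapsed."""
        self.cycles += cycles
        if self.cycles >= self.interval:
            self.retune()
            self.cycles = 0

    def retune(self):
        """Compare each counter with the average and reset the counters."""
        avg = sum(self.counts) / len(self.counts)
        # Assumption: hot lines (above average) skip ECC, the rest keep it.
        self.ecc_on = [count <= avg for count in self.counts]
        self.counts = [0] * len(self.counts)


tuner = EccTuner(num_lines=4, interval=10)
for _ in range(10):
    tuner.access(0)   # line 0 is the only hot line
    tuner.tick()
print(tuner.ecc_on)   # [False, True, True, True]
```

The average is recomputed from scratch at every interval, so a line that cools down gets its ECC check back at the next tuning step.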
Implementation
• The implementation is simplified if:
  • counters are maintained per set of cache lines rather than per line;
  • ECC tuning is done at the same granularity.
• The granularity can be 10, 20, … 100 lines.
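A coarser granularity only changes how a line index maps to a counter. The Python sketch below illustrates this with a hypothetical helper, `retune_groups`, which processes a recorded list of accessed line indices for one tuning interval. Grouping consecutive line indices together is an assumption, as is the rule (taken from the Results slide) that groups with above-average activity have ECC turned off.

```python
def retune_groups(access_trace, num_lines, lines_per_group=10):
    """Return one ECC flag per group of lines (True = ECC stays on).

    access_trace is the list of line indices touched during one
    tuning interval; consecutive lines share a counter (assumed).
    """
    num_groups = -(-num_lines // lines_per_group)  # ceiling division
    counts = [0] * num_groups
    for line in access_trace:
        counts[line // lines_per_group] += 1
    avg = sum(counts) / num_groups
    # Assumption: groups with above-average activity skip ECC.
    return [count <= avg for count in counts]


# 40 lines in groups of 10 need only 4 counters instead of 40.
trace = [0, 1, 2, 3] * 50 + [25]
print(retune_groups(trace, num_lines=40, lines_per_group=10))
# [False, True, True, True]
```

Larger groups need fewer counters, at the cost of treating a rarely used line the same as a hot neighbour in its group.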
Self-tuning Results • From the graphs we see the temporal locality. Based on these results, ECC was turned off for the lines with high locality.
Conclusion
• ECC is indispensable as chip reliability decreases and maintaining correct data becomes an issue.
• The processor-memory bottleneck is a perennial issue. Increasing cache latency through ECC protection makes it worse.
• This work tries to reduce the latency of ECC-protected caches with a scheme that dynamically turns ECC on and off.
References
• [1] S. S. Mukherjee, M. Kontz, and S. K. Reinhardt, “Detailed Design and Evaluation of Redundant Multithreading Alternatives,” ISCA, 2002.
• [2] S. S. Mukherjee, C. T. Weaver, J. Emer, S. K. Reinhardt, and T. Austin, “A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor,” MICRO, December 2003.
• [3] S. S. Mukherjee, T. Fossum, J. Emer, and S. K. Reinhardt, “Cache Scrubbing in Microprocessors: Myth or Necessity?” 10th International Symposium on Pacific Rim Dependable Computing (PRDC), Papeete, Tahiti, March 2004.
• [4] J. F. Ziegler, “Terrestrial cosmic rays,” IBM J. of Research and Development, Vol. 40, No. 1, pp. 19–39, Jan. 1996.
• [5] Y. Tosaka, S. Satoh, K. Suzuki, T. Sugii, H. Ehara, G. A. Woffinden, and S. A. Wender, “Impact of Cosmic Ray Neutron Induced Soft Errors on Advanced Submicron CMOS Circuits,” Symposium on VLSI Technology Digest of Technical Papers, 1996.
References
• [6] T. Karnik, B. Bloechel, K. Soumyanath, V. De, and S. Borkar, “Scaling trends of Cosmic Rays induced Soft Errors in static latches beyond 0.18µ,” Symposium on VLSI Circuits Digest of Technical Papers, 2001.
• [7] S. Hareland, J. Maiz, M. Alavi, K. Mistry, S. Walstra, and C. Dai, “Impact of CMOS Scaling and SOI on soft error rates of logic processes,” Symposium on VLSI Technology Digest of Technical Papers, 2001.
• [8] R. Baumann, “Soft Errors in Commercial Semiconductor Technology: Overview and Scaling Trends,” IEEE 2002 Reliability Physics Tutorial Notes, Reliability Fundamentals, pp. 121_01.1 – 121_01.14, April 7, 2002.
• [9] P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi, “Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic,” Dependable Systems and Networks, 2002.
• [10] H. L. Kalter et al., “A 50-ns 16-Mb DRAM with a 10 ns data rate and on-chip ECC,” IEEE J. Solid-State Circuits, vol. 25, pp. 1118–1128, Oct. 1990.