Discussion of: “Terrestrial-based Radiation Upsets: A Cautionary Tale” CprE 583 Tony Kuker 12/06/05
Presentation Objective • The objective of this presentation is to present the information found within the paper entitled “Terrestrial-based Radiation Upsets: A Cautionary Tale” • This paper was presented at the 13th annual IEEE Symposium on Field-Programmable Custom Computing Machines
Paper Objective • Introduce cosmic-radiation-induced soft errors • Discuss • Methods of calculating and estimating Soft Error Rates (SER) • Previous SER testing results • Application of SERs to reconfigurable supercomputers • Introduce mitigation techniques for FPGA designs
Soft Errors • Computing systems are susceptible to soft errors caused by terrestrial radiation • Neutron radiation-induced upsets can result in bit-flips in SRAM • Neutron radiation is often overlooked in system design
Estimating SER • Computed as a function of • Neutron Flux • Cross Section (sigma)
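The relationship on this slide can be sketched as a small calculation: the expected upset rate is the neutron flux multiplied by the per-bit cross section (sigma) and the number of sensitive bits. The flux, cross section, and bit count below are assumed placeholder values chosen only for illustration; they are not figures from the paper, and the function names are hypothetical.

```python
import math

FIT_HOURS = 1e9  # one FIT is one failure per 10^9 device-hours


def upsets_per_hour(flux_n_cm2_hr, sigma_cm2_per_bit, n_bits):
    """Expected upsets per hour = flux * per-bit cross section * bit count."""
    return flux_n_cm2_hr * sigma_cm2_per_bit * n_bits


def fit_rate(rate_per_hour):
    """Convert an hourly upset rate to FIT."""
    return rate_per_hour * FIT_HOURS


def mttu_hours(rate_per_hour):
    """Mean time to upset is the reciprocal of the upset rate."""
    if rate_per_hour <= 0:
        return math.inf
    return 1.0 / rate_per_hour


# All three inputs are assumptions for illustration, not measured data.
flux = 14.0        # neutrons / cm^2 / hour (assumed)
sigma = 1e-14      # cm^2 per bit (hypothetical)
bits = 20_000_000  # sensitive bits in one device (hypothetical)

rate = upsets_per_hour(flux, sigma, bits)
print(f"rate = {rate:.3e} upsets/hour")
print(f"SER  = {fit_rate(rate):.1f} FIT")
print(f"MTTU = {mttu_hours(rate):.1f} hours")
```

Because the rate is a plain product, doubling either the flux (for example, at higher altitude) or the number of sensitive bits doubles the estimated SER.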
FPGA Testing Efforts • Xilinx Rosetta Tests • iRoC Technology Tests • Altera Tests
Xilinx Rosetta Project • 10 x 10 grid of Xilinx XC2V6000 chips • Four of these grids were placed at various altitudes
iRoC Technology Test • Similar to Rosetta • Used the particle accelerator at Los Alamos National Laboratory to perform testing • Included components from Actel, Xilinx and Altera • Reported mean time to upset (MTTU) for a 100-device setup (like Rosetta)
Altera Test • Altera presented soft error rates for their devices in 2004 • The Altera results were marginally better than the Xilinx Rosetta results
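The system-level MTTU quoted for 100-device setups in these tests follows from per-device rates by simple scaling. A minimal sketch, assuming upsets are independent and occur at a constant rate in each device, so that device rates add; the per-device rate used here is a hypothetical value, not a result from any of the tests above.

```python
def system_mttu_hours(device_rates_per_hour):
    """MTTU of a system whose devices upset independently.

    The system upset rate is the sum of the per-device rates, and the
    mean time to upset is the reciprocal of that sum.
    """
    total_rate = sum(device_rates_per_hour)
    if total_rate <= 0:
        raise ValueError("total upset rate must be positive")
    return 1.0 / total_rate


# Hypothetical: each device upsets once per 10,000 hours on average.
per_device_rate = 1.0 / 10_000.0
grid = [per_device_rate] * 100  # a 10 x 10 grid of identical devices

print(f"single device MTTU: {system_mttu_hours([per_device_rate]):.0f} hours")
print(f"100-device MTTU:    {system_mttu_hours(grid):.0f} hours")
```

Under these assumptions a 100-device system sees upsets 100 times as often as a single device, which is why rates that look negligible per chip matter for large installations.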
Memory Cells • Testing results of SRAM and DRAM from iRoC and IBM • Error rate for a single bit is relatively small • As sizes increase, so does the probability of a soft error • Parity, ECC and Chipkill can reduce upset rates
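The two points above, that a small per-bit rate grows with memory size and that parity can catch upsets, can be sketched as follows. The model assumes upsets arrive as a Poisson process with a constant per-bit rate, and the rate itself is a hypothetical number; neither comes from the iRoC or IBM results.

```python
import math


def prob_at_least_one_upset(rate_per_bit_hour, n_bits, hours):
    """P(at least one upset) = 1 - exp(-rate * bits * time), Poisson model."""
    expected_upsets = rate_per_bit_hour * n_bits * hours
    return -math.expm1(-expected_upsets)  # accurate even for tiny values


def even_parity_bit(word):
    """Parity bit that makes the total number of 1 bits even."""
    return bin(word).count("1") % 2


def parity_ok(word, stored_parity):
    """True when the word still matches the parity stored with it."""
    return even_parity_bit(word) == stored_parity


rate = 1e-12  # upsets per bit per hour (hypothetical)
for n_bits in (10**6, 10**9, 10**12):
    p = prob_at_least_one_upset(rate, n_bits, hours=1000)
    print(f"{n_bits:>14} bits: P(upset within 1000 h) = {p:.6f}")

word = 0b10110010
parity = even_parity_bit(word)
print("single flip detected:", not parity_ok(word ^ 0b100, parity))
print("double flip detected:", not parity_ok(word ^ 0b110, parity))
```

Parity only detects an odd number of flipped bits and cannot correct any of them, which is why ECC and Chipkill are listed alongside it as stronger options.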
Microprocessors • Vulnerable to upsets in L1, L2 and L3 cache • Most modern processors have soft error protection on caches • Register files are also vulnerable within the microprocessor
Cray XD1 • Reconfigurable supercomputer combining AMD Opteron processors and FPGAs • Integrates Xilinx Virtex-4 FPGAs into the computing environment via software APIs
Mitigation Schemes • Support Logic Methods • SEU Controller • Selective Triple Modular Redundancy • Partial Configuration Methods • Single Frame Correction • Processor-based Detection of Critical Upsets • Scrubbing with CRC Check
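Two of the ideas listed above can be illustrated in software: the majority vote at the heart of triple modular redundancy, and a scrub pass that uses a CRC to find a corrupted configuration frame and rewrite it from a known-good copy. This is a sketch only and not the paper's implementation. Real devices do this in hardware with device-specific frame formats and checksums; `zlib.crc32`, the frame layout, and the function names here are stand-ins assumed for illustration.

```python
import zlib
from typing import List, Sequence


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def scrub(frames: List[bytearray], golden: Sequence[bytes],
          golden_crcs: Sequence[int]) -> List[int]:
    """Check every frame against its stored CRC and repair mismatches.

    Returns the indices of the frames that were rewritten from `golden`.
    """
    repaired = []
    for i, frame in enumerate(frames):
        if zlib.crc32(frame) != golden_crcs[i]:
            frame[:] = golden[i]  # rewrite only the corrupted frame
            repaired.append(i)
    return repaired


# TMR: one of three redundant copies has a flipped bit, the vote masks it.
good = 0b1010
upset = good ^ 0b1000
print(bin(majority_vote(good, good, upset)))

# Scrubbing: hypothetical configuration frames with one injected bit flip.
golden = [bytes([0x00, 0x01]), bytes([0xFF, 0xEE]), bytes([0x12, 0x34])]
crcs = [zlib.crc32(g) for g in golden]
live = [bytearray(g) for g in golden]
live[1][0] ^= 0x04  # simulate a single-event upset in frame 1

print("repaired frames:", scrub(live, golden, crcs))
print("second pass:    ", scrub(live, golden, crcs))
```

The vote masks an upset immediately but leaves the faulty copy in place, while scrubbing removes the fault later. That difference is one reason the two approaches are often used together.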
Relevance • As computing systems become more complex, terrestrial-based radiation upsets become an increasingly significant reliability concern • Awareness of the issue and mitigation techniques may improve reliability of reconfigurable systems
Comments • Good summary of FPGA testing results • Explained calculations effectively • Could have used more discussion of memory soft error mitigation techniques • Flash-based and antifuse FPGAs
References • Terrestrial-based Radiation Upsets: A Cautionary Tale, Heather Quinn & Paul Graham, 2005 • Cray XD1 Supercomputer Overview http://www.cray.com