A Cost-effective Substantial-impact-filter Based Method to Tolerate Voltage Emergencies
Songjun Pan 1,2, Yu Hu 1, Xing Hu 1,2, and Xiaowei Li 1
1 Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences
2 Graduate University of Chinese Academy of Sciences

Outline
• Background and Motivation
• Voltage Emergency Analysis
• Substantial-impact-filter Based Method
• Experimental Results
• Conclusions

Background
• Shrinking feature size is affecting transistor behavior
• Variations
  • Static: process
  • Dynamic: voltage, temperature

Voltage Emergencies
• Voltage emergencies (VE)
  • Slow down logic operation
  • Cause timing violations and affect system reliability
• Traditional tolerance techniques
  • Set a conservative timing margin [2]
  • Trigger a program rollback when a VE occurs [5]
[Figure: supply voltage waveform showing the nominal voltage, the operating margin, Vth, and voltage emergencies]
[2] N. James, et al., “Comparison of Split- Versus Connected-Core Supplies in the POWER6™ Microprocessor,” in ISSCC, 2007.
[5] M. Gupta, et al., “DéCoR: A Delayed Commit and Rollback Mechanism for Handling Inductive Noise in Processors,” in HPCA, 2008.

Motivation
• Key observation: not all voltage emergencies affect program execution
• Basic idea: handle only the voltage emergencies that adversely affect program execution (those with substantial impact)

Voltage Emergency Analysis
• Voltage emergencies cause intermittent timing faults
• Substantial-impact faults
  • Propagate to storage cells (a cell captures wrong data)
  • Change architecturally correct execution (ACE) bits

Voltage Emergency Analysis
• Quantitative analysis
  • IVF: Intermittent Vulnerability Factor, extended from [12]
  • Percentage of substantial-impact VEs in different structures

$$\mathrm{IVF}_{itf} = \frac{P_{num} - (N_{dead} + N_{un\text{-}ACE})}{NUM_{total}}$$

• NUM_total: total number of VEs
• P_num: number of VEs that propagate to storage structures
• N_dead: number of VEs that affect only dead values (masked)
• N_un-ACE: number of VEs that do not change ACE bits (masked)
[12] S. Pan, Y. Hu, and X. Li, “IVF: Characterizing the Vulnerability of Microprocessor Structures to Intermittent Faults,” in DATE, 2010.

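The IVF definition above can be evaluated directly from the four counts. The following is a minimal sketch; the function name `ivf_itf` and the example counts are hypothetical and are not taken from the slides.

```python
def ivf_itf(num_total: int, p_num: int, n_dead: int, n_un_ace: int) -> float:
    """Return IVF_itf = (P_num - (N_dead + N_un-ACE)) / NUM_total.

    num_total: total number of voltage emergencies (VEs)
    p_num:     VEs that propagate to storage structures
    n_dead:    propagated VEs that only affect dead values (masked)
    n_un_ace:  propagated VEs that do not change ACE bits (masked)
    """
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    masked = n_dead + n_un_ace
    if not (0 <= masked <= p_num <= num_total):
        raise ValueError("expected 0 <= n_dead + n_un_ace <= p_num <= num_total")
    return (p_num - masked) / num_total


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = ivf_itf(num_total=1000, p_num=400, n_dead=60, n_un_ace=40)
print(f"{example:.1%}")  # (400 - 100) / 1000 -> 30.0%
```

Setting `n_dead` and `n_un_ace` to zero gives the ratio of propagated VEs alone, which is never smaller than the value obtained once masked VEs are subtracted.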
Substantial-impact-filter Based Method
• Floorplan of our method
  • Delay sensor: has a VE occurred?
  • Fault filter: is it a substantial-impact VE?

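Fault Filter
• Filter structure
• Architecture-level masking
[Figure: fault filter circuit with signals WRITE, VE, and ROLLBACK, an inequality (≠) comparator, and a sampling offset of Δt = 1/2 cycle]
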
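The slide names only the filter's signals (WRITE, VE, ROLLBACK), a ≠ comparator, and Δt = 1/2 cycle. One way these could combine is sketched below. The decision rule is an assumption, not the authors' circuit: it supposes that a rollback is raised only when the delay sensor flags a VE, the storage cell is being written, and the value captured at the clock edge differs from a second sample taken half a cycle later. All names (`CycleSample`, `needs_rollback`, the field names) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CycleSample:
    ve_detected: bool   # VE: delay sensor flagged a voltage emergency this cycle
    write_enable: bool  # WRITE: the storage cell is written this cycle
    edge_value: int     # value captured at the clock edge
    late_value: int     # value sampled half a cycle later (assumed meaning of the 1/2-cycle offset)


def needs_rollback(sample: CycleSample) -> bool:
    """Assumed filter rule: ROLLBACK = VE and WRITE and (edge_value != late_value)."""
    if not sample.ve_detected:
        return False  # no emergency, nothing to filter
    if not sample.write_enable:
        return False  # nothing is written, so the VE cannot reach this cell
    return sample.edge_value != sample.late_value  # the comparator marked as "not equal"


# A VE during a write that captured the same data in both samples is filtered out.
print(needs_rollback(CycleSample(True, True, 0x2A, 0x2A)))  # False
# A VE during a write that captured differing data triggers a rollback.
print(needs_rollback(CycleSample(True, True, 0x2A, 0x2B)))  # True
```

Under this assumed rule, VEs that occur while no write is in progress, or that do not corrupt the written data, never reach the rollback logic.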
Experimental Setup
• Wattch: power estimation
• Matlab: models the power delivery subsystem (implements a second-order linear model)
• Synopsys Design Compiler: area overhead analysis
• Alpha-power model [21]: computes path delay
• Workload
  • 16 SPEC2000 benchmarks (10 INT, 6 FP)
  • 100M instructions simulated with SimPoint
[21] T. Sakurai, et al., “Alpha-power law MOSFET model and its applications to CMOS inverter delay and other formulas,” Journal of Solid-State Circuits, 1990.

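To illustrate how the alpha-power model links a supply droop to path delay, the sketch below uses the proportionality from the alpha-power law, delay ∝ Vdd / (Vdd − Vth)^α, normalized to the nominal supply. The parameter values (1.0 V nominal supply, 0.3 V threshold, α = 1.3, 5% timing margin) and the function names are illustrative assumptions, not the settings used in the experiments.

```python
def relative_delay(vdd: float, vdd_nominal: float = 1.0,
                   vth: float = 0.3, alpha: float = 1.3) -> float:
    """Path delay at vdd relative to the delay at vdd_nominal.

    Uses the alpha-power-law proportionality delay ~ Vdd / (Vdd - Vth)**alpha.
    The default parameter values are assumptions chosen for illustration.
    """
    if vdd <= vth or vdd_nominal <= vth:
        raise ValueError("supply voltage must exceed the threshold voltage")

    def delay(v: float) -> float:
        return v / (v - vth) ** alpha

    return delay(vdd) / delay(vdd_nominal)


def violates_timing(vdd: float, margin: float = 0.05) -> bool:
    """True if the slowdown at vdd exceeds the (assumed) timing margin."""
    return relative_delay(vdd) > 1.0 + margin


for v in (1.00, 0.95, 0.90, 0.80):
    print(f"Vdd={v:.2f} V  relative delay={relative_delay(v):.3f}  "
          f"violation={violates_timing(v)}")
```

A larger margin tolerates deeper droops at the cost of a slower clock, which is the trade-off behind the conservative-margin approach listed among the traditional techniques.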
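Experimental Results
• IVF for load/store queue and register file

$$\mathrm{IVF}_{itf} = \frac{P_{num} - (N_{dead} + N_{un\text{-}ACE})}{NUM_{total}}$$

[Figure: IVF of the load/store queue and register file. Series labels: "Exclude Ndead and Nun-ACE", "Include Ndead and Nun-ACE", "Upper bound IVF", "Refined IVF". Annotated values: 36.4%, 16.6%, 31.7%, 14.8%]
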
Experimental Results
• Comparison of three methods
  • Once-occur-then-rollback method
  • DéCoR method [5]
  • Our proposed method
[Figure: performance comparison of the three methods, annotated with 57%]
[5] M. Gupta, et al., “DéCoR: A Delayed Commit and Rollback Mechanism for Handling Inductive Noise in Processors,” in HPCA, 2008.

Conclusions
• We observe that fewer than 40% of voltage emergencies affect program execution
  • IVF: quantitative analysis
• We propose a substantial-impact-filter based method to tolerate voltage emergencies
  • Structure independent
  • Reduces performance overhead significantly
  • Recovers 57% of the performance loss

Thank You for Your Attention Questions?