Speculative instruction validation for performance-reliability trade-off

Sumeet Kumar, SUNY Binghamton, Binghamton, NY 13902, skumar1@binghamton.edu
Aneesh Aggarwal, SUNY Binghamton, Binghamton, NY 13902, aneesh@binghamton.edu
caps.cs.binghamton.edu

What are Soft Errors?
[Figure labels: Cosmic/Alpha Radiation; CLK; Latch; Logic; Latch; Soft Error; bit patterns 0001, 0000, 0000]

Micro-architectural Techniques to Detect Soft Errors
• Execute multiple copies of a program
  • Redundant Multi Threading (RMT)
• Probabilistic fault detection techniques
  • Errors are flagged if the program behavior is out of the ordinary (i.e., unpredictable)
  • Probabilistic techniques may have high false-alarm rates, e.g. when instructions do not have predictable behavior

Simultaneous and Redundantly Threaded (SRT)
• SRT is an implementation of RMT in an SMT environment
• Two copies of a program run simultaneously on a single core
• Slack is provided between the two copies for better performance
• The thread running ahead is known as the Main thread; the one running behind is known as the Redundant thread
• Provides complete fault coverage
• Has a considerable performance impact (our experiments show a 25% performance impact)

Schematic Diagram of SRT
[Figure labels: Fetch, Decode, Rename, Writeback, Compare, Commit; Fetch Buffer, Map Table (one per thread), Issue Queue, FU, Register File, Arch Register Files, ROB, LSQ, LVQ, SVQ, Data Cache]
M – Main Thread
R – Redundant Thread
LVQ – Load Value Queue
SVQ – Store Value Queue
Some structures are marked as ECC Protected

Performance-Reliability Trade-off in RMT
• Reducing redundancy by reacting to processor state
  • Avoiding redundancy in high-IPC phases (PER-IRTR)
  • RMT toggling
• Reducing redundancy by exploiting instruction properties
  • Instruction Reuse concept (DIE-IRB)
  • Removing backward slices of silent stores and dead values (SS-mod) or of predictable stores (SlicK)

SpecIV (Speculative Instruction Validation)

Basic Idea
• An instruction validator (similar to a data value predictor) stores the expected result values of the main thread's instructions
• Instructions that produce values matching the stored value are known as successfully validated instructions
• Successfully validated instructions are not redundantly executed
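The basic idea can be illustrated with a minimal sketch. It assumes a direct-mapped table indexed by instruction address that remembers the last result seen; the class name `InstructionValidator`, the indexing scheme, and the always-update training policy are hypothetical choices for illustration, not details taken from the slides. The 4K-entry default mirrors the validator size used in the performance results.

```python
class InstructionValidator:
    """Toy validator: remembers the last result seen per table entry.

    Assumption: direct-mapped, indexed by PC modulo the table size,
    and trained on every main-thread result.
    """

    def __init__(self, entries=4096):
        self.entries = entries
        self.table = {}  # index -> expected result value

    def validate(self, pc, result):
        """Return True if the main-thread result matches the stored value."""
        index = pc % self.entries
        expected = self.table.get(index)
        self.table[index] = result  # train on the latest result
        return expected is not None and expected == result


def needs_redundant_execution(validator, pc, result):
    """Successfully validated instructions skip redundant execution."""
    return not validator.validate(pc, result)


validator = InstructionValidator()
# Hypothetical (pc, result) trace of main-thread instructions.
trace = [(0x40, 7), (0x40, 7), (0x44, 3), (0x40, 8)]
decisions = [needs_redundant_execution(validator, pc, r) for pc, r in trace]
print(decisions)  # [True, False, True, True]
```

Only the second instance of the instruction at 0x40 is dropped from the redundant thread here, because it is the only one whose result matches the value already stored in the validator.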

Schematic Diagram for SpecIV
[Figure labels: Fetch, Decode, Rename, Compare, Commit; Fetch Buffer, Issue Queue, Physical Register File, Arch Register Files, LVQ, SVQ, CVQ, Instruction Validator, OFB, Re-execute bit-vector (0 1 0)]
OFB – Operand Forward Buffer
CVQ – Commit Value Queue
Instruction categories shown:
• Dependent on non-executing redundant instruction
• Dependent on executing redundant instruction
• Redundant instruction dropped

Undetected Errors in SpecIV

Case               Instruction   Correct Value   Erroneous Value   Validator Value
Undetected Error   Inst X        10              11                11
Error Detected     Inst X        10              11                ≠11
Undetected Error   Inst X        10              11                10 11 (erroneous values)

Only interested in single event upsets
• Errors in OFB and CVQ will be detected, as they are used by the redundant thread only
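The first two cases on this slide can be expressed as a small decision function. This is a sketch under two assumptions not stated on the slide: that a validated instruction is never redundantly executed, and that under a single event upset the redundant copy, when it does execute, produces the correct value so the comparison catches the mismatch. The function name `classify` is hypothetical.

```python
def classify(correct, produced, validator_value):
    """Classify one main-thread result against the validator's stored value.

    correct         -- value a fault-free execution would produce
    produced        -- value the main thread actually produced
    validator_value -- expected value stored in the instruction validator
    """
    validated = (produced == validator_value)
    if produced == correct:
        return "no error"
    if validated:
        # The erroneous value happens to match the validator, so the
        # redundant copy is dropped and nothing compares the results.
        return "undetected error"
    # Validation fails, the redundant copy executes, and the compare
    # stage sees the mismatch (assumed single event upset).
    return "error detected"


print(classify(10, 11, 11))  # undetected error
print(classify(10, 11, 10))  # error detected
print(classify(10, 10, 10))  # no error
```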

Fault Injection to Measure Vulnerability
[Figure labels: Fetch, Decode, Rename, Issue Queue, FU, LSQ, Writeback, Commit; Source Architectural Register, Source Physical Register, Operand Value, Result Value, Map Table, ROB, Register File, Arch. Register File, Register File Decoder, Arch. Register File Decoder, Rename Table Decoder]

Hardware Setup for Experimental Results
• ROB – 164 Entries
• Physical Register File – 128 Int / 128 Float
• Fetch/Decode/Commit Width – 8
• Issue Width – 5 Int / 3 Float
• Issue Queue – 48 Int / 32 Float
• Branch Predictor – Bimodal, 4K entries

Performance Results for SpecIV
Instruction Validator Size – 4K Entries
[Figure: IPC]

Instruction Redundancy Reduction
[Figure: instruction redundancy reduction, including the average]
Reliability Results for SpecIV
[Figure: reliability results, including the average]

Sensitivity to Validator Size
[Figure panels: Performance Impact Reduction; Error Rates]

Performance-Reliability Trade-Off Exploration with SpecIV
[Figure: Performance – Reliability Trade-Offs, split into Performance Trade-Off for Better Reliability (Low Performance Impact, High Performance Impact) and Reliability Trade-Off for Better Performance (Low Reliability Impact, High Reliability Impact)]
Schemes shown in the figure:
• Avoiding Redundancy for Producers of Successful Validations
• Avoiding Low Confidence Validations
• Multi-Value Validator
• Result Width & Stride Width Validation
• Partial Result Validation

Avoiding Low Confidence Validations (Low Performance Impact)
• Stopping validations for entries with no stride reduces the total error rate from 0.45% to 0.23%, with negligible performance impact
• No additional hardware is required to implement this technique
[Figure labels: Non-Control Instructions; Average]
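A rough sketch of this filter follows. The slide does not define "entries with no stride"; the code below assumes, purely for illustration, that it means a validator entry whose stored stride is zero (including a freshly allocated entry). The class `StrideValidator`, its per-PC table, and the `skip_no_stride` flag are hypothetical names, and the stride-based entry layout is itself an assumption.

```python
class StrideValidator:
    """Toy stride-based validator: expected value = last value + stride.

    Assumption: "no stride" is read literally as a stored stride of 0.
    With skip_no_stride=True such entries are never used for validation,
    so the instruction is redundantly executed instead.
    """

    def __init__(self, skip_no_stride=False):
        self.skip_no_stride = skip_no_stride
        self.table = {}  # pc -> (last_value, stride)

    def validate(self, pc, result):
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = (result, 0)  # allocate with no stride
            return False
        last, stride = entry
        self.table[pc] = (result, result - last)  # train on the new result
        if self.skip_no_stride and stride == 0:
            return False  # low-confidence entry: do not validate
        return result == last + stride


def run(validator, trace):
    """Return the validation outcome for each (pc, result) pair."""
    return [validator.validate(pc, result) for pc, result in trace]


# Hypothetical trace: a strided value at 0x10 and a constant at 0x20.
trace = [(0x10, 4), (0x10, 8), (0x10, 12), (0x10, 16),
         (0x20, 5), (0x20, 5), (0x20, 5)]
baseline = run(StrideValidator(), trace)
selective = run(StrideValidator(skip_no_stride=True), trace)
print(baseline)   # [False, False, True, True, False, True, True]
print(selective)  # [False, False, True, True, False, False, False]
```

Under this reading the filter reuses the stride field already held in each entry, which is at least consistent with the statement that no additional hardware is required.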

Avoiding Redundancy for Producers of Successfully Validated Instructions (Low Reliability Impact)
[Figure labels: RBIT; Inst A, R3 op IMM, R1, values 2 and 1, Validation Unsuccessful; Inst B, R1 op IMM, R30, values 30 and 31, Validation Successful; Re-execute Bit Vector 1 0 0]
• Redundant execution reduced by 69%
• Performance impact reduction increases to 58%
• Undetected error rate increases to 0.5%
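One possible reading of this scheme is sketched below: when a consumer instruction is successfully validated, the re-execute bit of the instruction that produced each of its source registers is cleared, so that producer is also dropped from the redundant thread. The `Inst` record, the `re_execute_bits` function, the handling of direct producers only, and the third instruction in the example window are all assumptions made for illustration; the slide does not specify them, and the output is not meant to reproduce the bit vector drawn on the slide.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    name: str
    dest: str        # destination register
    srcs: tuple      # source registers
    validated: bool  # outcome of the instruction validator


def re_execute_bits(window):
    """Return one re-execute bit per instruction (1 = redundantly execute).

    Start from "re-execute unless validated", then clear the bit of the
    most recent producer of every source of a validated instruction.
    """
    bits = [not inst.validated for inst in window]
    last_writer = {}  # register -> index of its most recent producer
    for i, inst in enumerate(window):
        if inst.validated:
            for src in inst.srcs:
                producer = last_writer.get(src)
                if producer is not None:
                    bits[producer] = False
        last_writer[inst.dest] = i
    return [int(b) for b in bits]


window = [
    Inst("A", dest="R1", srcs=("R3",), validated=False),
    Inst("B", dest="R30", srcs=("R1",), validated=True),
    Inst("C", dest="R5", srcs=("R7",), validated=False),  # hypothetical
]
print(re_execute_bits(window))  # [0, 0, 1]
```

Inst A fails validation but feeds the successfully validated Inst B, so its bit is cleared. The sketch ignores any other consumers of R1, which is one way the scheme could give up some coverage in exchange for less redundant execution.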

Conclusion
• We propose SpecIV as an effective scheme for achieving a performance-reliability trade-off
• SpecIV significantly reduces redundant execution, which improves the performance of the SRT technique
• SpecIV has a very small undetected error rate
• We also explore the performance-reliability trade-off design space with schemes based on SpecIV, obtaining further performance as well as reliability gains

Thank You
Sumeet Kumar, skumar1@binghamton.edu
Aneesh Aggarwal, aneesh@binghamton.edu
