
Dynamic Prediction of Architectural Vulnerability from Microarchitectural State

Kristen R. Walcott. Masters Presentation, May 2, 2007. To be presented at ISCA, June 2007.


Presentation Transcript


  1. Dynamic Prediction of Architectural Vulnerability from Microarchitectural State • Kristen R. Walcott • Masters Presentation, May 2, 2007 • To be presented at ISCA, June 2007

  2. Energetic Particles Attack! • [Figure: a neutron striking an n-channel transistor (gate G, drain D, source S, body B; n+ regions in a p substrate), leaving a track of + - charge pairs]

  3. Transient Faults • $SER = \mathrm{Constant} \times \mathrm{Flux} \times \mathrm{Area} \times e^{-Q_{crit}/Q_{coll}}$ • $Q_{crit}$: least amount of collected charge necessary to produce an upset • $Q_{coll}$: charge collection efficiency = (Collected Charge / Generated Charge) • Voltage decreases -> $Q_{crit}$ decreases • When $Q_{crit}$ is low enough, the likelihood of transient faults increases
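The relation on this slide can be explored numerically. The following is a minimal Python sketch, assuming the exponential form SER = Constant * Flux * Area * exp(-Qcrit / Qcoll); the function name and every numeric value are illustrative assumptions, not data from the presentation.

```python
import math


def soft_error_rate(constant, flux, area, q_crit, q_coll):
    """Assumed model: SER = constant * flux * area * exp(-q_crit / q_coll)."""
    return constant * flux * area * math.exp(-q_crit / q_coll)


# Illustrative, unitless values (assumptions, not measured data).
high_voltage = soft_error_rate(1.0, 1.0, 1.0, q_crit=10.0, q_coll=2.0)
low_voltage = soft_error_rate(1.0, 1.0, 1.0, q_crit=5.0, q_coll=2.0)

# Halving Qcrit from 10 to 5 raises SER by a factor of exp(2.5).
ratio = low_voltage / high_voltage
```

Under this model, a lower supply voltage (and thus a lower Qcrit) raises the soft error rate exponentially, which is the trend the slide describes.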

  4. Transient Faults: A Rising Problem • Originally considered a risk only for large memory storage • Latches • Combinatorial logic • Other processor core structures • Processors are becoming more susceptible to transient faults • Lowered supply voltages • Increasing clock frequencies

  5. Protecting the Processor Core • Parity and error-correcting codes • Large area • Long delay • High power usage • Redundancy • Hardware • Redundant Multithreading (RMT)

  6. Redundant Multithreading • Successful in protection • Performance degraded: • 30% on single-threaded workloads • 32% on multithreaded workloads • [Figure: Input Replicator, Output Comparator, Rest of the System]

  7. Architectural Vulnerability • Is protection always needed? • Architectural Vulnerability Factor (AVF) • Probability that an internal fault in the structure would result in an externally visible error • Tracking instantaneous AVF expensive

  8. Why Track AVF? • If only we knew… • Balance vulnerability and performance • Balance vulnerability and power • Control application specific RMT • Perform AVF-profiling • and probably much more!

  9. Main Contributions • A demonstration that AVF varies over time and across SPEC2000 benchmarks • A rigorous characterization of AVF based on a thorough statistical analysis • An a posteriori analysis of our AVF predictor quantitatively validates our approach • Proof-of-concept RMT implementation

  10. Architectural Vulnerability Factor • The probability that a fault in a structure will result in an architecturally visible incorrect execution • Fraction of time that a bit is in a state of Architecturally Correct Execution (ACE) • Little’s Law estimates AVF as: $AVF = \frac{B_{ace} \times L_{ace}}{\#B}$ • $B_{ace}$: average bandwidth of the ACE bits into the structure • $L_{ace}$: average residence time of ACE bits in the structure • $\#B$: number of bits in the structure
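A short worked example of the Little's Law estimate may help. This Python sketch assumes the denominator is the total number of bits in the structure; the function name and the sample numbers are hypothetical.

```python
def avf_littles_law(ace_bandwidth, ace_latency, total_bits):
    """Estimate AVF as (B_ace * L_ace) / total_bits.

    ace_bandwidth: average ACE bits entering the structure per cycle
    ace_latency:   average number of cycles an ACE bit stays in the structure
    total_bits:    number of bits in the structure (assumed denominator)
    """
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    return (ace_bandwidth * ace_latency) / total_bits


# Hypothetical structure: 64 entries of 32 bits each = 2048 bits.
avf = avf_littles_law(ace_bandwidth=16.0, ace_latency=32.0, total_bits=2048)
```

With these made-up inputs, on average 16 * 32 = 512 of the 2048 bits hold ACE state, so the estimated AVF is 0.25.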

  11. Past Redundancy Techniques • Hardware replication (Bernick et al. 2005, Slegel et al. 1999) • Redundant Multithreading (RMT) (Mukherjee et al. 2002, Reinhardt and Mukherjee 2000, Reis et al. 2005) • Less hardware necessary • Better performance by dynamically partitioning resources • Partial RMT • Reduce power consumption • Increase performance

  12. Partial RMT Techniques • Performance-Efficient RMT • Exploiting instruction reuse (Parashar et al., ISCA 2004) • Opportunistic DIE-IRB (Gomaa et al., ISCA 2005) • Using speculation as a form of redundancy (Parashar et al., ASPLOS 2006) • Relaxing input replication (Smolens et al., MICRO 2006) • Power-Efficient RMT (Madan and Balasubramonian, SELSE 2006; Rashid et al., ISCA 2005) • Problem: RMT schemes are oblivious to the actual vulnerability

  13. Experiment Design • SimpleScalar 3.0 • Modified to track AVF • Load/Store Queue (LSQ) • Issue Queue (ISQ) • Reorder Buffer and Physical Register File (RUU) • Modified to perform RMT • Based on Simultaneous and Redundantly Threaded (SRT) design

  14. Experiment Design • 26 SPEC2000 benchmarks • Each benchmark is run for multiple 100-million instruction SimPoints • Checkpoint the entire microarchitectural state (160 variables) and calculate AVF every 4 million instructions (25 “snapshots” per SimPoint)

  15. Experimental Results • [Figure: RUU AVF over time for perlbmk, bzip2, and art; 2 SimPoints measuring RUU AVF] • AVF can vary significantly over time and across applications

  16. Experimental Results • [Figure: RUU AVF and IPC for perlbmk, bzip2, and art] • A single metric is likely not enough. More sophisticated prediction mechanism needed!

  17. Detecting Correlations • Visual trends • Too many metrics to consider • Correlation may not hold in all cases • Principal Component Analysis • Regression techniques • Linear regression • Quadratic regression

  18. Principal Component Analysis • Optimal linear transformation for projecting data into a new coordinate system • Identified 69 principal components • Four components had substantially higher singular values • $R^2$ values: • 0.61 RUU • 0.47 ISQ • 0.55 LSQ • Good results, but spanning eigenvectors are linear combinations of the original 160 performance variables

  19. Searching for Correlations Identify strong correlations between structural AVF values and a small set of easily measurable processor metrics.
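One simple way to start such a search is to rank candidate metrics by the strength of their linear correlation with AVF. The sketch below is an assumption about how that could be done, not the presentation's actual tooling; the metric names and the snapshot values are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_metrics(metrics, avf):
    """Return metric names sorted by |correlation with AVF|, strongest first."""
    scores = {name: pearson(values, avf) for name, values in metrics.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)


# Made-up snapshot data: one AVF value and one value per metric per snapshot.
avf = [10.0, 20.0, 30.0, 40.0]
metrics = {
    "ruu_occupancy": [1.0, 2.0, 3.0, 4.0],  # moves in lockstep with AVF
    "ipc": [2.0, 1.0, 2.5, 1.5],            # unrelated to AVF in this toy data
}
ranking = rank_metrics(metrics, avf)
```

A ranking like this only captures one metric at a time; combining several metrics requires a regression.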

  20. Practical AVF Prediction • Regression techniques • Gives a function to predict the value of a dependent variable • Problem: Find the optimal representation of $y_i$ as a weighted combination of basis functions, $y_i = \beta_0 + \beta_1 f_{i1} + \beta_2 f_{i2} + \cdots + \beta_k f_{ik} + \epsilon_i$, where each $f_{ij}$ depends only on the values of a subset of the predictor variables (minimize the sum of squared errors $\sum_i \epsilon_i^2$) • $f_{i1}, \ldots, f_{ik}$: functions of the predictor variables • $y_i$: response variable (AVF)

  21. Linear Regression • Include the single variable with the highest correlation • Consider all remaining 159 variables in turn • Select the best and repeat, until all variables are used • Select subset for predictor
      Quadratic Regression • For each pair $(x, y)$ we perform a nonlinear least-squares fit using the basis $\{1, x, y, xy, x^2, y^2\}$ • $\binom{160}{2} = 12720$ pairs were considered
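The quadratic fit for one variable pair can be sketched as follows. The model is linear in its six coefficients, so this example uses ordinary linear least squares through the normal equations; that solver choice, the helper names, and the synthetic data are all assumptions, since the presentation does not say which fitting tool was used.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def quad_basis(x, y):
    """Basis {1, x, y, xy, x^2, y^2} for one pair of predictor variables."""
    return [1.0, x, y, x * y, x * x, y * y]


def fit_quadratic(points, targets):
    """Least-squares coefficients for the quadratic basis (normal equations)."""
    rows = [quad_basis(x, y) for x, y in points]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(ata, atb)


def r_squared(points, targets, coeffs):
    """Coefficient of determination of the fitted model."""
    preds = [sum(c * f for c, f in zip(coeffs, quad_basis(x, y))) for x, y in points]
    mean = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, preds))
    ss_tot = sum((t - mean) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Number of variable pairs to try when there are 160 candidate variables.
n_pairs = math.comb(160, 2)

# Synthetic, noise-free data generated from known coefficients.
true_coeffs = [1.0, 2.0, -1.0, 0.5, 0.25, -0.1]
points = [(float(x), float(y)) for x in range(4) for y in range(4)]
targets = [sum(c * f for c, f in zip(true_coeffs, quad_basis(x, y))) for x, y in points]

coeffs = fit_quadratic(points, targets)
r2 = r_squared(points, targets, coeffs)
```

Repeating `fit_quadratic` for every pair and keeping the pair with the highest `r2` would mirror the exhaustive search over pairs described on the slide.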

  22. Resulting Predictors • P1: Multi-variable linear predictor • P2: Quadratic predictor • P3: Linear predictor whose terms are those variables with the highest individual correlation to AVF
      P1 equations:
      RUU ($R^2 = 0.93$): $-2.27 + (0.71\,R_o) + (0.017 \times IPB)$
      ISQ ($R^2 = 0.74$): $(0.02\,R_l) + (0.55\,R_o) - (27.38 \times IPC) + (2.73\,S_{BW})$
      LSQ ($R^2 = 0.94$): $-1.07 - (5.03\mathrm{e}{-9}\,L_c) + (1.35\,L_o) - (5.65\mathrm{e}{-9}\,R_c) + (0.01 \times IPB) + (5.77\mathrm{e}{-9}\,SC)$

  23. Comparison of Predictors • [Figure: three panels: Multivariable Linear, Quadratic, Simple Linear] • All 3 predictors are able to capture noisy behavior

  24. AVF-Aware RMT • Every two million cycles, estimate the AVF for LSQ, ISQ, and RUU • Add together for total AVF • If AVF > threshold, generate a processor interrupt and enable RMT • After a fixed number of cycles (10 million), disable RMT
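The toggling policy above can be sketched as a small state machine. This Python example is a hypothetical model only: the class and method names are invented, the processor interrupt is not modeled, and re-checking the threshold in the same sample where the window expires is an assumption the slide does not settle.

```python
SAMPLE_INTERVAL = 2_000_000  # cycles between AVF estimates (from the slide)
RMT_WINDOW = 10_000_000      # cycles RMT stays enabled (from the slide)


class AvfAwareRmt:
    """Hypothetical controller; names are illustrative, not from the source."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.rmt_enabled = False
        self.cycles_left = 0

    def on_sample(self, lsq_avf, isq_avf, ruu_avf):
        """Call once per SAMPLE_INTERVAL; returns whether RMT is on afterwards."""
        if self.rmt_enabled:
            self.cycles_left -= SAMPLE_INTERVAL
            if self.cycles_left <= 0:
                self.rmt_enabled = False  # fixed window has elapsed
        total_avf = lsq_avf + isq_avf + ruu_avf
        if not self.rmt_enabled and total_avf > self.threshold:
            self.rmt_enabled = True
            self.cycles_left = RMT_WINDOW
        return self.rmt_enabled


# Made-up AVF estimates: one low sample, one high sample, then five low ones.
controller = AvfAwareRmt(threshold=50.0)
samples = [(10, 10, 10), (30, 20, 10)] + [(5, 5, 5)] * 5
states = [controller.on_sample(*s) for s in samples]
```

In this trace RMT turns on at the second sample, stays on for five sampling intervals (10 million cycles), and turns off at the last sample because the total AVF is back below the threshold.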

  25. IPC Variations for twolf • [Figure: IPC for twolf at different AVF threshold values] • Between 14% and 55% performance improvement

  26. Conclusion • AVF does vary and is correlated with the microarchitectural state • AVF can be modeled • High accuracy • Inexpensive quadratic or linear model • Based on a few easily-tracked processor performance variables • Can be used as an online predictor

  27. Future Work • Develop sophisticated toggling scheme • Adaptive AVF thresholds • Non-constant RMT window sizes • Examine tradeoffs • Performance • Power • Fault-tolerance

  28. Questions?

  29. Resulting Predictors
      Pred. | Struct. | Equation | $R^2$
      P1 | RUU | $-2.27 + (0.71\,R_o) + (0.017 \times IPB)$ | 0.93
      P1 | ISQ | $(0.02\,R_l) + (0.55\,R_o) - (27.38 \times IPC) + (2.73\,S_{BW})$ | 0.74
      P1 | LSQ | $-1.07 - (5.03\mathrm{e}{-9}\,L_c) + (1.35\,L_o) - (5.65\mathrm{e}{-9}\,R_c) + (0.01 \times IPB) + (5.77\mathrm{e}{-9}\,SC)$ | 0.94
      P2 | RUU | $1.14 + (0.001\,R_l) - (1.47\mathrm{e}{-6}\,R_l^2) + (0.10 \times IPC) + (0.73\,R_l \times IPC) - (0.09 \times IPC^2)$ | 0.96
      P2 | ISQ | $-25.48 - (0.02\,ASC) + (1.41\mathrm{e}{-6}\,ASC^2) + (16.42 \times IPC) + (0.51\,ASC \times IPC) - (3.47 \times IPC^2)$ | 0.81
      P2 | LSQ | $-1.49 + (0.03\,L_l) - (0.00003\,L_l^2) + (2.92 \times IPC) + (1.41\,L_l \times IPC) - (0.80 \times IPC^2)$ | 0.94
      P3 | RUU | $-3.06 + (0.73\,R_o)$ | 0.92
      P3 | ISQ | $-0.90 - (4.83\,I_o) + (0.74\,L_o) + (0.81\,R_o)$ | 0.69
      P3 | LSQ | $-1.23 + (1.41\,L_o)$ | 0.90
      Var | Description
      $R_c$ | RUU count
      $R_l$ | RUU latency
      $R_o$ | RUU occupancy
      $S_{BW}$ | Total instructions executed (speculative and committed)
      $L_c$ | LSQ count
      $L_l$ | LSQ latency
      $L_o$ | LSQ occupancy
      $SC$ | Total number of "slip" cycles (from issue to retirement)
      $ASC$ | Average number of slip cycles
      $I_o$ | IFQ occupancy

  30. Comparison of Predictors • No one predictor consistently does best • P1 (multivariable linear predictor) gives a good compromise between: • Cost of implementation • Accuracy

  31. More Related Work • Fu, X., Poe, J., Li, T., and Fortes, J. A. 2006. Characterizing Microarchitecture Soft Error Vulnerability Phase Behavior. In Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation (September 11-14, 2006). MASCOTS. IEEE Computer Society, Washington, DC, 147-155. • Show that AVF varies for microarchitecture structures over time and across applications • Attempt to find correlation with 5 individual metrics • Goal to link vulnerability with phase behavior • Consider each metric by itself • Metrics were few in number and exhibited inconsistent correlation to AVF -> Gave up the idea of a predictive approach
