CS 7810 Lecture 13 Pipeline Gating: Speculation Control For Energy Reduction S. Manne, A. Klauser, D. Grunwald Proceedings of ISCA-25 June 1998
Cost of Speculation
[Figure: mispredict rates: 9.9, 12.2, 23.9, 10.4, 6.9, 4.6, 11.3, 1.7]
Pipeline Gating
• Low-confidence branches throttle instruction fetch until they are resolved
• Pipeline gating usually lasts for fewer than five cycles
Metrics
• SPEC (specificity): fraction of all mispredicted branches detected as low-confidence by the confidence estimator (coverage)
• PVN (predictive value of a negative test): probability that a low-confidence branch is incorrectly predicted (accuracy)
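The two metrics can be computed from simple branch counts. The sketch below is illustrative only: the function name, argument names, and example counts are assumptions, not taken from the paper.

```python
def spec_and_pvn(low_conf_mispredicted, low_conf_correct, high_conf_mispredicted):
    """Return (SPEC, PVN) computed from branch counts.

    SPEC = low-confidence mispredicted / all mispredicted branches (coverage)
    PVN  = low-confidence mispredicted / all low-confidence branches (accuracy)
    """
    all_mispredicted = low_conf_mispredicted + high_conf_mispredicted
    all_low_conf = low_conf_mispredicted + low_conf_correct
    spec = low_conf_mispredicted / all_mispredicted if all_mispredicted else 0.0
    pvn = low_conf_mispredicted / all_low_conf if all_low_conf else 0.0
    return spec, pvn


# Hypothetical counts: 100 low-confidence branches, 30 of them mispredicted,
# plus 10 mispredicted branches that the estimator marked high-confidence.
spec, pvn = spec_and_pvn(30, 70, 10)
print(f"SPEC = {spec:.2f}, PVN = {pvn:.2f}")
```

With these made-up counts the estimator catches most mispredictions (high SPEC) while most of its low-confidence flags are false alarms (low PVN).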
Confidence Estimators
• Perfect: to gauge potential benefits
• Static: branches that have low prediction rates
• JRS: if a branch has yielded N successive correct predictions, it has high confidence
• Saturating counters: an unbiased counter value, or disagreement between two predictors, indicates low confidence
• Distance: mispredictions are clustered, hence the first 4 branches after a mispredict have low confidence
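A minimal sketch of the JRS and distance estimators as the bullets describe them. The class names, the table size, indexing by PC modulo table size, and updating immediately at prediction time are simplifying assumptions for illustration; they are not the paper's hardware design.

```python
class JRSEstimator:
    """High confidence after `threshold` successive correct predictions."""

    def __init__(self, threshold=4, table_size=1024):
        self.threshold = threshold
        self.counters = [0] * table_size  # one resetting counter per entry

    def _index(self, pc):
        # Assumption: index by PC only; real designs may also hash in history.
        return pc % len(self.counters)

    def is_high_confidence(self, pc):
        return self.counters[self._index(pc)] >= self.threshold

    def update(self, pc, prediction_correct):
        i = self._index(pc)
        if prediction_correct:
            # Saturate at the threshold.
            self.counters[i] = min(self.counters[i] + 1, self.threshold)
        else:
            # Any mispredict resets the run of correct predictions.
            self.counters[i] = 0


class DistanceEstimator:
    """The first `window` branches after a mispredict are low confidence."""

    def __init__(self, window=4):
        self.window = window
        self.branches_since_mispredict = window  # start out high-confidence

    def is_high_confidence(self):
        return self.branches_since_mispredict >= self.window

    def update(self, prediction_correct):
        if prediction_correct:
            self.branches_since_mispredict += 1
        else:
            self.branches_since_mispredict = 0
```

Both estimators are queried at fetch and updated when the branch outcome is known; the sketch ignores the delay between the two.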
SPEC and PVN
• SPEC (coverage): fraction of mispredicted branches flagged as low-confidence by the estimator
• PVN (accuracy): fraction of low-confidence branches that are actually mispredicted
• It is easier to achieve a high SPEC value than a high PVN value
• A higher PVN can be achieved by requiring N low-confidence branches to invoke gating: if PVN is 30%, redefining low-confidence as two low-confidence branches increases PVN to 51% (1 − 0.7 × 0.7 = 0.51, treating the two branches as independent)
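The 30% to 51% figure follows from asking how likely it is that at least one of N low-confidence branches is mispredicted. The sketch below assumes the branches are independent, which is a simplification; the function name is hypothetical.

```python
def pvn_for_n_branches(pvn_single, n):
    """Probability that at least one of n low-confidence branches is
    mispredicted, assuming each is mispredicted independently with
    probability pvn_single."""
    return 1.0 - (1.0 - pvn_single) ** n


for n in (1, 2, 3):
    print(n, round(pvn_for_n_branches(0.30, n), 3))
```

For n = 2 this reproduces the slide's value: 1 − 0.7² = 0.51.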
Results
• Can gating improve performance? Only if cache pollution is significant
• Less than 1% performance loss and up to 38% reduction in extra work
• Energy consumption could go up: some work is independent of the number of executed instructions (e.g., clock distribution), so increased execution time can increase energy
• Pipeline gating should reduce power consumption
CS 7810 Lecture 13 Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power S. Kaxiras, Z. Hu, M. Martonosi Proceedings of ISCA-28 July 2001
Leakage Power Trends
• Circuit delay ∝ 1/(V – Vth)
• Leakage ∝ number of transistors (increasing), supply voltage (decreasing), and, exponentially, a low threshold voltage (increasing)
• L1 and L2 caches are the biggest contributors (high transistor budgets)
Vdd-Gating
• Leakage can be reduced by gating off the supply voltage to the circuit
• When applied to a cache, the contents of the SRAM cell are lost
• Cache decay: apply Vdd-gating when you do not care about the cache contents
Overheads
• Hardware to determine when to decay
• Introduces additional cache misses
• Normalized cache leakage power =
    active ratio (fraction of cache that is powered on)
    + (counter overhead : leakage) × activity
    + (L2 access energy : leakage) × number of misses
• Increased execution time (< 0.7%)
• L2 access : leakage ratio is ~9
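The normalized leakage expression can be evaluated directly. In the sketch below the function and parameter names are assumptions, and every input except the L2 access : leakage ratio of about 9 is a made-up value; activity and the miss count are assumed to be already normalized to the same baseline as the leakage term.

```python
def normalized_leakage(active_ratio, counter_to_leak, activity,
                       l2_to_leak, extra_misses):
    """Normalized cache leakage power as the sum of three terms:
    powered-on fraction, counter overhead, and extra L2 accesses."""
    return (active_ratio
            + counter_to_leak * activity
            + l2_to_leak * extra_misses)


# Hypothetical inputs: half the cache powered on, a small counter overhead,
# and a normalized extra-miss rate of 0.01; l2_to_leak = 9 is from the slide.
value = normalized_leakage(active_ratio=0.5, counter_to_leak=0.1,
                           activity=0.2, l2_to_leak=9.0, extra_misses=0.01)
print(round(value, 2))
```

Because the L2 term is multiplied by roughly 9, even a small number of decay-induced misses can cancel much of the saving from a low active ratio.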
Skier’s Dilemma
New skis: $400; ski rental: $20 per trip
Heuristic: buy skis once the total rental cost equals the purchase price

Ski trips:   5      10     15     20     25     50
Optimal:     $100   $200   $300   $400   $400   $400
Heuristic:   $100   $200   $300   $800   $800   $800

Likewise, decay a cache line when the cost of an additional miss equals the leakage dissipated so far
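The table can be reproduced with a few lines of code. This sketch follows the slide's numbers, so the heuristic buys at the moment the accumulated rental cost reaches the purchase price (which is why 20 trips already cost $800); the function names are hypothetical.

```python
def optimal_cost(trips, rent=20, buy=400):
    """Cost with perfect knowledge of the number of trips."""
    return min(trips * rent, buy)


def heuristic_cost(trips, rent=20, buy=400):
    """Rent until the total rental cost equals the purchase price, then buy."""
    rented = 0
    for _ in range(trips):
        rented += rent
        if rented >= buy:
            return rented + buy  # skis are bought; no further rental cost
    return rented


for n in (5, 10, 15, 20, 25, 50):
    print(n, optimal_cost(n), heuristic_cost(n))
```

The heuristic never pays more than twice the optimal cost, which is the property that makes it attractive for choosing a decay interval without knowing future accesses.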
Tracking Dead Time
• Each line has a 2-bit counter that gets reset on every access and gets incremented every 2500 cycles through a global signal (negligible overhead)
• After 10,000 clock cycles, the counter reaches the max value and triggers a decay
• Adaptive decay: start with a short decay period; if you have a quick miss, double the period; if there is no miss, halve the period
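A minimal software model of the per-line counter and the adaptive rule. It is a sketch under assumptions: the class and function names are hypothetical, the 10,000-cycle decay is modeled as four global ticks of 2,500 cycles without an access, and the interval bounds in the adaptive rule are made-up values.

```python
class DecayLine:
    """One cache line with a small decay counter."""

    TICKS_TO_DECAY = 4  # 4 ticks x 2,500 cycles = 10,000 cycles

    def __init__(self):
        self.ticks = 0
        self.powered = True

    def access(self):
        """Reset the counter; return True if the line had decayed
        (the access is then an extra miss that refills the line)."""
        was_decayed = not self.powered
        self.ticks = 0
        self.powered = True
        return was_decayed

    def global_tick(self):
        """Called once every 2,500 cycles for every line."""
        if self.powered:
            self.ticks += 1
            if self.ticks >= self.TICKS_TO_DECAY:
                self.powered = False  # gate off Vdd; contents are lost


def adapt_decay_interval(interval, quick_miss,
                         min_interval=1_000, max_interval=512_000):
    """Double the interval after a quick miss, halve it otherwise.
    The bounds are hypothetical and only keep the value in a sane range."""
    if quick_miss:
        return min(interval * 2, max_interval)
    return max(interval // 2, min_interval)
```

A line that is touched at least once every four ticks never decays; a line left idle for 10,000 cycles is switched off and costs one extra miss if it is needed again.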
Other Results
• The L2 cache is equally suitable for decay techniques: lifetimes are scaled by a factor of 10, but an extra miss also costs a lot more
• In their experiments, there is little interference from multiprogramming
• Some instructions can easily be identified as last touches to a cache block, which offers potential for early cache decay
• Can this apply to the branch predictor or the register file?