Increasing the Energy Efficiency of TLS Systems Using Intermediate Checkpointing
Salman Khan (1), Nikolas Ioannou (2), Polychronis Xekalakis (3) and Marcelo Cintra (2)
(1) University of Manchester, (2) University of Edinburgh, (3) Intel Labs Barcelona - UPC
Introduction
• Power efficiency, complexity and time-to-market pressures have led to chip multiprocessors (CMPs)
• Problem:
  • No benefit for sequential applications
  • Even for mostly parallel applications, Amdahl's Law limits the performance gains from many cores
• Solution: Thread Level Speculation (TLS)
• But the performance gained through TLS costs energy
Can we reduce the wastefulness of re-execution due to misspeculation without losing performance?
HiPC 2011
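For reference, the Amdahl's Law limit mentioned above can be written in its standard form. The symbols p (parallelizable fraction of execution time) and N (number of cores) are notation introduced here, not taken from the slides, and the numbers are purely illustrative.

```latex
% Standard form of Amdahl's Law; p and N are notation assumed for this note.
\[
  S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}
\]
% Illustrative values only: p = 0.9 on a four-core CMP.
\[
  S(4) = \frac{1}{0.1 + 0.225} \approx 3.08,
  \qquad
  \lim_{N \to \infty} S(N) = \frac{1}{1 - p} = 10
\]
```

Even with 90% of the work parallelizable, the sequential remainder caps the speedup at 10x no matter how many cores are added. That sequential remainder is what TLS targets.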
Key Contributions
• Propose checkpointing to improve the efficiency of speculative execution
• Evaluate dependence prediction techniques to guide checkpoint placement
• Our approach yields an energy saving of up to 14%, and 7% on average, over normal TLS execution, with no significant effect on speedup
Outline
• Introduction
• Checkpointing
• Dependence Predictors
• Checkpointing Policy
• Experimental Setup and Results
• Conclusions
Thread Level Speculation
Placing Checkpoints
• Stride
• Dependence prediction
  • Address based
  • Program counter based
  • Hybrid
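The two placement families listed above can be contrasted with a small sketch. The slides do not describe the predictor internals, so everything here is an assumption made for illustration: the class name `PCDependencePredictor`, the 2-bit saturating counters, the table size and the threshold are hypothetical, not the authors' design.

```python
from dataclasses import dataclass, field


def stride_checkpoints(task_length: int, stride: int) -> list[int]:
    """Instruction indices at which a fixed-stride policy would checkpoint."""
    return list(range(stride, task_length, stride))


@dataclass
class PCDependencePredictor:
    """Hypothetical PC-indexed predictor using 2-bit saturating counters.

    Assumption: a load whose PC caused violations before is likely to
    cause one again, so it is a candidate checkpoint location.
    """
    table_size: int = 1024   # hypothetical number of table entries
    threshold: int = 2       # counter value at which we predict "dependent"
    counters: dict = field(default_factory=dict)

    def _index(self, pc: int) -> int:
        # Simple modulo indexing; a real table would hash or use PC bits.
        return pc % self.table_size

    def predict(self, pc: int) -> bool:
        """True if the load at this PC is predicted to cause a violation."""
        return self.counters.get(self._index(pc), 0) >= self.threshold

    def train(self, pc: int, violated: bool) -> None:
        """Strengthen the entry on a violation, weaken it otherwise."""
        i = self._index(pc)
        c = self.counters.get(i, 0)
        self.counters[i] = min(c + 1, 3) if violated else max(c - 1, 0)
```

A stride policy ignores program behaviour entirely, while the predictor only flags loads that have violated before. An address-based variant would index the same kind of table by the load's data address instead of its PC.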
Dependence Prediction
Hybrid Dependence Predictor
Placing Checkpoints
• The number of checkpoints is limited
• Placing a checkpoint has a cost
• Checkpointing on every positive prediction results in too many checkpoints
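The three constraints above suggest a filter between the predictor and the checkpoint mechanism. The slides do not give the actual policy, so the sketch below is only one plausible reading: `CheckpointPolicy`, `max_checkpoints` and `min_distance` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckpointPolicy:
    """Hypothetical filter deciding whether a positive prediction is acted on."""
    max_checkpoints: int = 4   # assumed per-task limit on checkpoints
    min_distance: int = 50     # assumed minimum instructions between checkpoints
    used: int = 0              # checkpoints taken so far in this task
    last_at: int = 0           # instruction index of the last checkpoint (0 = task start)

    def should_checkpoint(self, instr_index: int, predicted_dependent: bool) -> bool:
        # Only positive predictions are candidates at all.
        if not predicted_dependent:
            return False
        # The number of checkpoints is limited.
        if self.used >= self.max_checkpoints:
            return False
        # Each checkpoint has a cost, so skip ones that would save little work.
        if instr_index - self.last_at < self.min_distance:
            return False
        self.used += 1
        self.last_at = instr_index
        return True
```

For example, with `CheckpointPolicy(max_checkpoints=2, min_distance=10)`, positive predictions at instructions 12 and 30 are accepted, one at 15 is rejected as too close to the previous checkpoint, and any later one is rejected because the budget is spent.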
Setup
• Simulator, compiler and benchmarks:
  • SESC (http://sesc.sourceforge.net/)
  • POSH (Liu et al., PPoPP '06)
  • SPEC 2000 Int
• Architecture:
  • Four-way CMP, 4-issue cores
  • 16KB L1 data (multi-versioned) and instruction caches
  • 1MB unified L2 caches
  • Cycles from violation to kill/restart: 12
  • Cycles to spawn: 12
Measuring Dependence Prediction
Wasted Instructions: Unnecessarily squashed instructions.
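One way to make this metric concrete is sketched below, under an assumed interpretation: instructions executed before the violating load did not depend on the violated value, so squashing them is unnecessary, and a checkpoint lets the task restart from the latest checkpoint at or before that load. The function name and this exact accounting are assumptions for illustration, not the paper's definition.

```python
import bisect


def wasted_instructions(violating_index: int, checkpoints: list[int]) -> int:
    """Count instructions squashed although they precede the violating load.

    violating_index: position in the task of the load that caused the violation.
    checkpoints: sorted instruction indices where checkpoints were taken;
                 the task start (index 0) is always an implicit restart point.
    """
    # Find the latest checkpoint at or before the violating load.
    pos = bisect.bisect_right(checkpoints, violating_index)
    restart = checkpoints[pos - 1] if pos > 0 else 0
    # Everything between the restart point and the violating load is
    # re-executed even though it was not affected by the violation.
    return violating_index - restart
```

With no checkpoints, a violation at instruction 120 wastes all 120 preceding instructions. With checkpoints at 50, 100 and 150, the task restarts at 100 and only 20 are wasted.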
Conclusions
• Effective checkpointing improves the efficiency of TLS
• Placing checkpoints by stride is not sufficient to reduce waste significantly
• Checkpointing using dependence prediction obtains an energy saving of up to 14%, and 7% on average, over normal TLS execution, with no significant effect on speedup
Read the paper for…
• Complete results
• Microarchitectural issues that arise from checkpointing running tasks
• The modified squash/restart mechanism needed to avoid performance degradation from checkpointing