Compiler Optimization-Space Exploration • Authors: Spyridon Triantafyllis, Manish Vachharajani, Neil Vachharajani, David I. August • Presented by: Adrian Pop, IDA/PELAB, adrpo@ida.liu.se
Outline • Introduction • The Problem: • Predictive Heuristics and A Priori Evaluation • Some Solutions: • Iterative Compilation and A Posteriori Evaluation • Our Solution • Optimization-Space Exploration • Evaluation • Conclusion
Introduction • Processors • become more complex • incorporate additional computational resources • Consequence • Compilers • become more complex • use aggressive optimizations • have to use predictive heuristics to decide where and to what extent optimizations should be applied
The Problem: Predictive Heuristics • Predictive Heuristics • try to determine a priori the benefits of certain optimizations • are tuned to give the highest average performance • The Result • significant performance gains are unrealized!
Some Solutions: Iterative Compilation • Iterative Compilation • optimize the programs in many ways • choose a posteriori the best code version • Pitfall of current schemes • prohibitive compilation times! • limitation to specific architectures • embedded systems • limited to specific optimizations
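The a posteriori selection described on this slide can be sketched as a simple loop. This is a minimal illustration, not the scheme of any particular iterative compiler: `compile_fn` and `measure_fn` are hypothetical callables standing in for a real compiler invocation and a real timing run.

```python
def iterative_compile(configs, compile_fn, measure_fn):
    """Compile once per configuration and keep the version with the lowest cost.

    compile_fn(config) -> code and measure_fn(code) -> cost are hypothetical
    stand-ins for invoking a compiler and timing the result.
    """
    best = None
    for cfg in configs:
        code = compile_fn(cfg)      # one full compilation per configuration
        cost = measure_fn(code)     # a posteriori: judge the code actually produced
        if best is None or cost < best[2]:
            best = (cfg, code, cost)
    if best is None:
        raise ValueError("no configurations to try")
    return best


# Toy usage: the "configuration" is an unroll factor, and factor 2 happens to be ideal.
cfg, code, cost = iterative_compile(
    [1, 2, 4],
    compile_fn=lambda u: ("code", u),
    measure_fn=lambda c: abs(c[1] - 2),
)
```

Every configuration costs one full compile plus one measurement, which is where the prohibitive compilation times come from.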
Our Solution: Optimization-Space Exploration • OSE Compiler (Practical Iterative Compilation) • explores the space of optimization configurations through multiple compilations • uses the experience of the compiler writer to prune the number of configurations that must be explored • uses a performance estimator instead of executing the code to evaluate it • selects a custom configuration for each code segment • selects the next optimization configuration by examining the characteristics of the previous configurations
OSE – Limiting the Search Space • Optimization Space • derived from a set of optimization parameters • Optimization Parameters • Optimization level • High Level Optimization (HLO) level • Micro-architecture type • Coalesce adjacent loads and stores • HLO phase order • Loop unroll limit • Update dependencies after unrolling • Perform software pipelining
OSE – Limiting the Search Space • Optimization Parameters (continued) • Heuristic to disable software pipelining • Allow control speculation during software pipelining • Software pipeline outer loops • Enable if-conversion heuristic for software pipelining • Software pipeline loops with early exits • Enable if-conversion • Enable non-standard predication • Enable pre-scheduling • Scheduler ready criterion
OSE – Limiting the Search Space • Compiler Construction-time Pruning • limit the total number of configurations that will be considered at compile time • construct a set S with at most N configurations • S is chosen by measuring the impact on a representative set of code segments C, as follows: • S’ = the default configuration + configurations with one non-default parameter • a) run C compiled with each configuration in S’ on real hardware and retain in S’ only the valuable configurations • b) form S’’ from the combinations of configurations in S’; repeat a) for S’’ and retain only the best N configurations • repeat b) until no new configurations can be generated or the speedup does not improve
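The pruning loop above might look roughly like the sketch below. Several details are assumptions made only for illustration: a configuration is modelled as a frozenset of `(parameter, value)` pairs that differ from the default, `perf(segment, config)` is a hypothetical function returning a measured speedup, "valuable" is read as "in the top N by total speedup", and a set's speedup is the average of each segment's best configuration.

```python
from itertools import combinations


def merge(a, b):
    """Union two configurations; None if they set one parameter differently."""
    merged = dict(a)
    for param, value in b:
        if merged.get(param, value) != value:
            return None
        merged[param] = value
    return frozenset(merged.items())


def set_speedup(configs, segments, perf):
    """Average speedup when every segment uses its best configuration in the set."""
    return sum(max(perf(s, c) for c in configs) for s in segments) / len(segments)


def prune(params, segments, perf, n):
    """Build a set S of at most n configurations (construction-time pruning)."""
    def best_n(cands):
        ranked = sorted(cands, key=lambda c: sum(perf(s, c) for s in segments),
                        reverse=True)
        return set(ranked[:n])

    # S': the default plus every configuration with one non-default parameter.
    start = {frozenset()} | {frozenset({(p, v)})
                             for p, values in params.items() for v in values}
    kept = best_n(start)                      # step a)
    best = set_speedup(kept, segments, perf)
    while True:
        new = set()
        for a, b in combinations(kept, 2):    # step b): combine surviving configs
            m = merge(a, b)
            if m is not None and m not in kept:
                new.add(m)
        if not new:
            break                             # nothing new can be generated
        trial = best_n(kept | new)
        score = set_speedup(trial, segments, perf)
        if score <= best:
            break                             # speedup stopped improving
        kept, best = trial, score
    return kept


# Toy usage with made-up, additive gains per parameter setting.
GAIN = {("unroll", 2): 0.1, ("unroll", 4): 0.3, ("swp", True): 0.2}
toy_perf = lambda seg, cfg: 1.0 + sum(GAIN[x] for x in cfg)
S = prune({"unroll": [2, 4], "swp": [True]}, ["loop_a"], toy_perf, n=2)
```

In the real flow the `perf` numbers come from running C on hardware, which is affordable because this step happens once, when the compiler is built.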
OSE – Limiting the Search Space • Characterizing Configuration Correlations • build an optimization configuration tree • critical configurations = configurations at the same level • 1. Construct O = the set of the m most important configurations in S over all code segments in C • 2. Make every o_i in O a successor of the root node • 3. For each configuration o_i in O: • 4. Construct C_i = {c_j : argmax_{k=1…m} p_{j,k} = i} • 5. Repeat steps 3 and 4 to find the successors of o_i, limiting the code segments to C_i and the configurations to S\O
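A recursive sketch of the tree construction follows. It assumes that p_{j,k} is the measured performance of code segment c_j under configuration o_k (higher is better), stored here as a hypothetical nested dict `perf[segment][config]`, and that "most important" means highest total performance over the segments; the slide does not spell out either point.

```python
def build_tree(configs, segments, perf, m):
    """Return {config: subtree} for the successors of one tree node."""
    if not configs or not segments:
        return {}
    # Step 1: O = the m configurations with the highest total performance (assumed metric).
    ranked = sorted(configs, key=lambda o: sum(perf[c][o] for c in segments),
                    reverse=True)
    top = ranked[:m]
    rest = [o for o in configs if o not in top]       # S \ O
    tree = {}
    for o in top:                                     # Steps 2 and 3
        # Step 4: C_i = segments whose best configuration within O is o.
        mine = [c for c in segments
                if max(top, key=lambda k: perf[c][k]) == o]
        # Step 5: recurse on C_i with the remaining configurations.
        tree[o] = build_tree(rest, mine, perf, m)
    return tree


# Toy usage with made-up performance numbers.
toy_perf = {
    "s1": {"A": 3, "B": 1, "C": 2, "D": 0},
    "s2": {"A": 1, "B": 3, "C": 0, "D": 2},
}
tree = build_tree(["A", "B", "C", "D"], ["s1", "s2"], toy_perf, m=2)
```

Each subtree is built only from the segments that preferred its parent, so a node's successors are the configurations worth trying next once the parent has turned out best.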
OSE – Limiting the Search Space • Compile-time search • do a breadth-first search on the optimization configuration tree • choose the configuration that yields the best estimated performance
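One possible reading of this search, sketched under assumptions: the tree is a nested dict `{config: subtree}`, `estimate(config)` is a hypothetical function returning estimated cycles (lower is better), and at each level all sibling configurations are estimated before descending into the best one.

```python
def search(tree, estimate):
    """Walk the configuration tree level by level; return (best_config, best_cost)."""
    best_cfg, best_cost = None, float("inf")
    level = tree
    while level:
        # Estimate every configuration at this level and pick the cheapest.
        cfg = min(level, key=estimate)
        cost = estimate(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
        # Only the successors of the level winner are explored next.
        level = level[cfg]
    return best_cfg, best_cost


# Toy usage: "E" is never estimated because its parent "B" loses to "A".
toy_tree = {"A": {"C": {}, "D": {}}, "B": {"E": {}}}
toy_cost = {"A": 10, "B": 12, "C": 9, "D": 11, "E": 1}
result = search(toy_tree, toy_cost.__getitem__)
```

The toy run shows the trade-off: pruning by the tree skips "E" even though it would have been cheapest, in exchange for estimating four configurations instead of five.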
OSE – Limiting the Search Space • Limit the OSE application • to hot code segments • hot code segments are identified through profiling or hardware performance counters during a program run
Evaluation • OSE Compiler Algorithm 1. Profile the code 2. For each Function: 3. Compile to the high level IR 4. Optimize using HLO 5. For each Function: 6. If the function is hot: 7. Perform OSE on second HLO and CG 8. Emit the function using the best configuration 9. If the function is not hot use the standard configuration
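The second loop of the algorithm (steps 5 to 9) can be sketched as a small driver. All names are hypothetical: `optimize(ir, config)` stands in for the second HLO pass plus code generation, `estimate(code)` for the compile-time performance estimator (lower is better), and `is_hot` for the profile lookup. For brevity the sketch tries every configuration instead of walking the configuration tree.

```python
def ose_compile(functions, is_hot, configs, default, optimize, estimate):
    """Return {function name: emitted code}, applying OSE only to hot functions."""
    emitted = {}
    for name, ir in functions.items():
        if is_hot(name):
            # Hot function: compile under several configurations, keep the best estimate.
            candidates = [optimize(ir, cfg) for cfg in configs]
            emitted[name] = min(candidates, key=estimate)
        else:
            # Cold function: a single compilation with the standard configuration.
            emitted[name] = optimize(ir, default)
    return emitted


# Toy usage: "code" is an (ir, unroll) pair; unroll factor 4 is estimated cheapest.
out = ose_compile(
    functions={"kernel": "ir_kernel", "init": "ir_init"},
    is_hot=lambda name: name == "kernel",
    configs=[1, 2, 4],
    default=1,
    optimize=lambda ir, cfg: (ir, cfg),
    estimate=lambda code: 100 / code[1],
)
```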
Compile-time Performance Estimation • Model Based on: • Ideal Cycle Count – T • Data cache performance, Lambda, L • Instruction cache performance, I • Branch misprediction, B
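The slide lists the terms of the model but not how they are combined. The sketch below assumes the simplest combination, a plain sum of the ideal cycle count and the three penalty terms, purely for illustration; the field names and units (cycles) are also assumptions.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    ideal_cycles: float      # T: cycle count assuming no stalls
    dcache_penalty: float    # L: cycles attributed to data cache behaviour
    icache_penalty: float    # I: cycles attributed to instruction cache behaviour
    branch_penalty: float    # B: cycles attributed to branch mispredictions

    def total(self) -> float:
        """Assumed combination: T + L + I + B (not confirmed by the slide)."""
        return (self.ideal_cycles + self.dcache_penalty
                + self.icache_penalty + self.branch_penalty)


# Toy comparison of two code versions for the same segment.
a = Estimate(ideal_cycles=100, dcache_penalty=20, icache_penalty=5, branch_penalty=10)
b = Estimate(ideal_cycles=90, dcache_penalty=40, icache_penalty=15, branch_penalty=10)
better = min((a, b), key=Estimate.total)
```

The toy numbers show why the extra terms matter: version `b` has the lower ideal cycle count but loses once the cache penalties are included.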