460 likes | 560 Views
John Cavazos Architecture and Language Implementation Lab Thesis Seminar University of Massachusetts, Amherst. Learning for Optimizing Compilers. Compiler writers have a difficult task optimizations are NP-hard computer architectures are complex computer architects need rapid evaluation
E N D
John Cavazos Architecture and Language Implementation Lab Thesis Seminar University of Massachusetts, Amherst Learning for Optimizing Compilers
Compiler writers have a difficult task optimizations are NP-hard computer architectures are complex computer architects need rapid evaluation Generating heuristics manually is slow, complicated, and ad hoc. Motivation
Propose Supervised Learning • Induces heuristics automatically • Training examples • a,b,c,…,z label • a,b,c,…z : properties of problem • label : proper decision to make • Two objectives: • Minimize error • Prefer less complicated function • LOCO (Learning for Optimizing COmpilers)
Benefits of Supervised Learning • Heuristic construction sped up • Determines relative importance of features • Effective heuristics • Comparable to hand-tuned heuristics • Theoretically sound • Traditional approach ad hoc
What Order to Apply Optimizations Phase-ordering heuristics When to Optimize Filters Which Optimization Algorithm to Apply Hybrid Optimizations How to Optimize Priority Functions Taxonomy of Compiler Heuristics
The LOCO Methodology • Determine class of heuristic • Generate raw data • Instrument compiler • Process raw data • Thresholds • Generates training data • Induce heuristic • Integrate into compiler
The LOCO Methodology LOCO Training Set Instrumented Compiler Supervised Learning Production Compiler Generate raw learning data Ruleinduction Processrawdata (Thresholding) Inducesheuristic
Experimental Setup • Java JIT compiler • Jikes RVM 2.0.2 • PowerPC 533 MHz G4, model 7410 • Case Study 1: SPEC JVM benchmarks • Case Study 2: Scientific benchmarks • Scheduling improves by 4% or more
Case Study 1 Hybrid Register Allocation
Motivation • Register Allocation: important • Effective use of registers • Different Algorithms to choose from • Graph coloring: possibly expensive • Linear scan: not always effective • Which algorithm to apply?
Solution • Features predict which algorithm to use • Heuristic function controls allocator • Reduces cost significantly • Retains most benefit • Successful with simple features • Applicable to other optimizations
Inducing Heuristic Controller • For each block generate raw training data • Features of method • Additional spills incurred • Cost of allocation algorithms • Process raw data to generate training set • Leave-one-out cross-validation • Output of LOCO = heuristic controller
Labeling Training Instances • Two factors: • Cost of register allocation • Spill benefit of different allocators • Prefer graph coloring • If benefit above threshold • Prefer linear scan • If graph coloring cost above threshold • No spill benefit
Motivation for Threshold Technique • Noise reduction technique • Simplifies learning • Removes cases of fine distinction • Separation by a threshold gap • For example: • T=10% model estimates improvement by 10%
Thresholding Linear Scan Graph Coloring No Instance Spill Threshold(8192) Cost Threshold (0.5)
Labeling Training Instances If (LS_Spill – GC_Spill > Spill_Threshold) Print “GC”; Else If (LS_Cost/GC_Cost > Cost_Threshold) Print “LS”; Else if (LS_Spill – GC_Spill <= 0) Print “LS”; Else { // No Label } High Spill Benefit High Cost No Spill Benefit Skip Training Instance
Significantly reduce register allocation time Reduced allocation time by 60% Preserve benefit of graph coloring Achieved 93% of graph coloring benefit LOCO effective for this heuristic Hybrid Register Allocation is Successful
Case Study 2: Instruction Scheduling Filters
Motivation • Instruction scheduling: important • Improvements over 15% • But: • Expensive • Frequently not beneficial • Problem: Can we predict which blocks benefitfrom scheduling?
Solution • Features of block predict when to schedule • Heuristic controls scheduling • Reduces cost of scheduling • Retains benefit of scheduling • Successful with simple features • Filter for applying scheduler
Construct cheap-to-compute features of a block Obtain training instances that include: Features of the block Labels (Scheduling benefit to block) Induce a filter using LOCO We used rule induction Use the filter to control when compiler schedules Inducing a Filter
Block Timing Estimator • Estimate of cycles to execute block • Simple model of real machine • Determines cost of block in isolation • Relative cycle differences important • Not absolute cycle counts
Significantly reduce scheduling time Reduced scheduling time by 75% Preserve benefit of scheduling Achieved 93% of scheduling benefit LOCO effective for this heuristic Filters are Successful
Supervised learning Loop-unrolling and tiling Genetic algorithms Hyperblocks, reg allocation, prefetching (MIT) Application-specific compilation strategy (Rice) Reinforcement learning Used to induce heuristic for scheduling (UMass) We argue LOCO is better Related Work
More work on filters Inlining and SSA-based opts More work on hybrid optimizations Garbage collection More work on priority functions Register allocation spill heuristic Use LOCO anywhere a heuristic is used Future Work
LOCO effective at constructing heuristics Faster than most alternatives LOCO can lead to insights More readable than other alternatives LOCO heuristics competitive Comparable to hand-tuned heuristics LOCO easier to use Conclusion