PREDICTING UNROLL FACTORS USING SUPERVISED LEARNING
Mark Stephenson & Saman Amarasinghe
Massachusetts Institute of Technology
Computer Science and Artificial Intelligence Lab
INTRODUCTION & MOTIVATION
• Compiler heuristics rely on detailed knowledge of the system
• Interactions between compiler passes are not well understood
• Architectures are complex
HEURISTIC DESIGN
• The current approach to heuristic development is somewhat ad hoc
• Can compiler writers learn anything from baseball?
• Is it feasible to deal with empirical data?
• Can we use statistics and machine learning to build heuristics?
CASE STUDY: LOOP UNROLLING
• Code expansion can degrade performance
  • Increased live ranges, register pressure
• A myriad of interactions with other passes
• Requires categorization into multiple classes
  • i.e., what is the unroll factor?
ORC’S HEURISTIC (UNKNOWN TRIP COUNT)

if (trip_count_tn == NULL) {
  UINT32 ntimes = MAX(1, OPT_unroll_times - 1);
  INT32 body_len = BB_length(head);
  while (ntimes > 1 &&
         ntimes * body_len > CG_LOOP_unrolled_size_max)
    ntimes--;
  Set_unroll_factor(ntimes);
} else {
  …
}
ORC’S HEURISTIC (KNOWN TRIP COUNT)

} else {
  BOOL const_trip = TN_is_constant(trip_count_tn);
  INT32 const_trip_count = const_trip ? TN_value(trip_count_tn) : 0;
  INT32 body_len = BB_length(head);
  CG_LOOP_unroll_min_trip = MAX(CG_LOOP_unroll_min_trip, 1);
  if (const_trip && CG_LOOP_unroll_fully &&
      (body_len * const_trip_count <= CG_LOOP_unrolled_size_max ||
       (CG_LOOP_unrolled_size_max == 0 &&
        CG_LOOP_unroll_times_max >= const_trip_count))) {
    Set_unroll_fully();
    Set_unroll_factor(const_trip_count);
  } else {
    UINT32 ntimes = OPT_unroll_times;
    ntimes = MIN(ntimes, CG_LOOP_unroll_times_max);
    if (!is_power_of_two(ntimes)) {
      ntimes = 1 << log2(ntimes);
    }
    while (ntimes > 1 &&
           ntimes * body_len > CG_LOOP_unrolled_size_max)
      ntimes /= 2;
    if (const_trip) {
      while (ntimes > 1 && const_trip_count < 2 * ntimes)
        ntimes /= 2;
    }
    Set_unroll_factor(ntimes);
  }
}
SUPERVISED LEARNING
• Supervised learning algorithms try to find a function F(X) → Y
  • X: a vector of characteristics that define a loop
  • Y: the empirically found best unroll factor
[Figure: F(X) maps each loop to an unroll factor in {1, …, 8}]
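To make the mapping concrete, here is a minimal C++ sketch (the type and field names are illustrative, not from the paper) of what one labeled training example looks like:

#include <vector>

// Hypothetical representation of one training example:
// X is the loop's feature vector, Y is its label.
struct LoopExample {
    std::vector<double> features;  // X: characteristics of the loop
    int best_unroll;               // Y: empirically best unroll factor (1-8)
};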
EXTRACTING THE DATA
• Extract features
  • Most features readily available in ORC
  • Kitchen sink approach
• Finding the labels (best unroll factors)
  • Added an instrumentation pass
  • Assembly instructions inserted to time loops
  • Calls to a library at all exit points
  • Compile and run at all unroll factors (1..8)
  • For each loop, choose the best factor as the label (sketched below)
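A minimal sketch of the labeling step, assuming the instrumentation yields one runtime measurement per unroll factor (the function name is hypothetical):

#include <cstddef>
#include <vector>

// Hypothetical labeling step: given the measured runtimes of one loop
// compiled at unroll factors 1..8, the label is the fastest factor.
int best_unroll_factor(const std::vector<double>& runtimes) {
    std::size_t best = 0;  // runtimes[i] holds the time at factor i + 1
    for (std::size_t i = 1; i < runtimes.size(); ++i)
        if (runtimes[i] < runtimes[best])
            best = i;
    return static_cast<int>(best) + 1;  // unroll factors are 1-based
}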
LEARNING ALGORITHMS
• Prototyped in Matlab
• Two learning algorithms classified our data set well
  • Near neighbors
  • Support vector machine (SVM)
• Both algorithms classify quickly
  • Train "at the factory"
  • No increase in compilation time
NEAR NEIGHBORS
[Figure: loops plotted by # branches (x-axis) vs. # FP operations (y-axis), each labeled "unroll" or "don't unroll"]
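A minimal 1-nearest-neighbor sketch of the idea in the figure, reusing the hypothetical LoopExample type from above; the paper's exact distance metric and neighbor count are not shown here, so squared Euclidean distance and k = 1 are assumptions:

#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 1-NN classifier: predict the unroll factor of the
// training loop whose feature vector is closest to the query loop.
int predict_nearest(const std::vector<LoopExample>& train,
                    const std::vector<double>& query) {
    double best_dist = std::numeric_limits<double>::infinity();
    int label = 1;
    for (const LoopExample& ex : train) {
        double d = 0.0;
        for (std::size_t i = 0; i < query.size(); ++i) {
            double diff = ex.features[i] - query[i];
            d += diff * diff;  // squared Euclidean distance
        }
        if (d < best_dist) {
            best_dist = d;
            label = ex.best_unroll;
        }
    }
    return label;
}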
SUPPORT VECTOR MACHINES
• Map the original feature space into a higher-dimensional space (using a kernel)
• Find a hyperplane that maximally separates the data
[Figure: the same loops plotted in the original (# branches, # FP operations) space and, after the kernel mapping, in a (# branches², # FP operations) space, where a hyperplane separates "unroll" from "don't unroll"]
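The slides do not include an SVM implementation; below is a generic sketch of the decision side of an already-trained kernel SVM for the two-class illustration, with a degree-2 polynomial kernel standing in for the mapping in the figure (all names are illustrative):

#include <cstddef>
#include <vector>

// Hypothetical trained two-class SVM: the support vectors, their signed
// weights (alpha_i * y_i), and the bias come out of training.
struct TrainedSVM {
    std::vector<std::vector<double>> support_vectors;
    std::vector<double> weights;  // alpha_i * y_i per support vector
    double bias;
};

// Degree-2 polynomial kernel: implicitly maps the features into a
// higher-dimensional space (e.g., one containing a # branches^2 term).
double poly2_kernel(const std::vector<double>& a,
                    const std::vector<double>& b) {
    double dot = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        dot += a[i] * b[i];
    return (dot + 1.0) * (dot + 1.0);
}

// Decision rule: the sign of the kernelized distance to the hyperplane
// (+1 = unroll, -1 = don't unroll in the two-class illustration).
int svm_classify(const TrainedSVM& svm, const std::vector<double>& x) {
    double score = svm.bias;
    for (std::size_t i = 0; i < svm.support_vectors.size(); ++i)
        score += svm.weights[i] * poly2_kernel(svm.support_vectors[i], x);
    return score >= 0.0 ? +1 : -1;
}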
PREDICTION ACCURACY
• Leave-one-out cross validation
• Filter out ambiguous training examples
  • Only keep examples where the best unroll factor is obviously better (at least 1.05x faster)
  • Throw away obviously noisy examples
(Both steps are sketched below.)
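A sketch of both steps, reusing the hypothetical LoopExample type and predict_nearest classifier from above; the 1.05x threshold is from the slide, and the filter below is one plausible reading of it:

#include <cstddef>
#include <vector>

// Hypothetical filter: keep a loop only if its best unroll factor is at
// least 1.05x faster than not unrolling (runtimes[0] = factor 1).
bool is_unambiguous(const std::vector<double>& runtimes) {
    double best = runtimes[0];
    for (double t : runtimes)
        if (t < best) best = t;
    return runtimes[0] / best >= 1.05;
}

// Leave-one-out cross validation: train on all examples but one, test on
// the held-out example, and report the fraction classified correctly.
double leave_one_out_accuracy(const std::vector<LoopExample>& data) {
    int correct = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        std::vector<LoopExample> train;
        for (std::size_t j = 0; j < data.size(); ++j)
            if (j != i) train.push_back(data[j]);
        if (predict_nearest(train, data[i].features) == data[i].best_unroll)
            ++correct;
    }
    return static_cast<double>(correct) / data.size();
}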
FEATURE SELECTION
• Feature selection is a way to identify the best features
• Start with loads of features
• Small feature sets are better
  • Learning algorithms run faster
  • Less prone to overfitting the training data
  • Useless features can confuse learning algorithms
FEATURE SELECTION CONT.: MUTUAL INFORMATION SCORE
• Measures the reduction of uncertainty in one variable given knowledge of another variable
• Does not tell us how features interact with each other
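For reference, the mutual information between a discretized feature X and the label Y is I(X;Y) = Σ p(x,y) log₂ [ p(x,y) / (p(x)p(y)) ]. A minimal sketch, assuming features have already been binned into integers:

#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Mutual information (in bits) between a discretized feature x and the
// label y, estimated from their empirical joint distribution.
double mutual_information(const std::vector<int>& x,
                          const std::vector<int>& y) {
    std::map<int, double> px, py;
    std::map<std::pair<int, int>, double> pxy;
    const double n = static_cast<double>(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        px[x[i]] += 1.0 / n;
        py[y[i]] += 1.0 / n;
        pxy[{x[i], y[i]}] += 1.0 / n;
    }
    double mi = 0.0;
    for (const auto& cell : pxy) {
        double joint = cell.second;
        double indep = px[cell.first.first] * py[cell.first.second];
        mi += joint * std::log2(joint / indep);
    }
    return mi;  // how much knowing x reduces uncertainty about y
}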
FEATURE SELECTION CONT.: GREEDY FEATURE SELECTION
• Choose the single best feature
• Choose another feature that, together with the best feature, improves classification accuracy most
• Repeat, greedily adding one feature at a time (sketched below)
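A minimal sketch of this greedy loop, scoring each candidate feature subset with the leave-one-out accuracy function from the sketch above (the projection helper and all names are illustrative):

#include <cstddef>
#include <vector>

// Hypothetical helper: LOOCV accuracy using only the features in `subset`.
double accuracy_with(const std::vector<LoopExample>& data,
                     const std::vector<int>& subset) {
    std::vector<LoopExample> projected;
    for (const LoopExample& ex : data) {
        LoopExample p;
        p.best_unroll = ex.best_unroll;
        for (int f : subset)
            p.features.push_back(ex.features[f]);
        projected.push_back(p);
    }
    return leave_one_out_accuracy(projected);  // LOOCV sketch from above
}

// Greedy forward selection: repeatedly add the single feature that most
// improves classification accuracy when joined with those chosen so far.
std::vector<int> greedy_select(const std::vector<LoopExample>& data,
                               int num_features, int num_to_pick) {
    std::vector<int> chosen;
    std::vector<bool> used(num_features, false);
    for (int round = 0; round < num_to_pick; ++round) {
        int best_f = -1;
        double best_acc = -1.0;
        for (int f = 0; f < num_features; ++f) {
            if (used[f]) continue;
            std::vector<int> trial = chosen;
            trial.push_back(f);
            double acc = accuracy_with(data, trial);
            if (acc > best_acc) { best_acc = acc; best_f = f; }
        }
        if (best_f < 0) break;  // no unused features left
        used[best_f] = true;
        chosen.push_back(best_f);
    }
    return chosen;
}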
RELATED WORK
• Monsifrot et al., "A Machine Learning Approach to Automatic Production of Compiler Heuristics." 2002
• Calder et al., "Evidence-Based Static Branch Prediction Using Machine Learning." 1997
• Cavazos et al., "Inducing Heuristics to Decide Whether to Schedule." 2004
• Moss et al., "Learning to Schedule Straight-Line Code." 1997
• Cooper et al., "Optimizing for Reduced Code Space Using Genetic Algorithms." 1999
• Puppin et al., "Adapting Convergent Scheduling Using Machine Learning." 2003
• Stephenson et al., "Meta Optimization: Improving Compiler Heuristics with Machine Learning." 2003
CONCLUSION
• Supervised classification can effectively find good heuristics
  • Even for multi-class problems
  • SVM and near neighbors perform well
• Potentially a big impact
  • We spent very little time tuning the learning parameters
  • We let a machine learning algorithm tell us which features are best
THE END
SOFTWARE PIPELINING
• ORC has been tuned with SWP in mind
• Every major release of ORC has had a different unrolling heuristic for SWP
  • Currently 205 lines long
• Can we learn a heuristic that outperforms ORC's SWP unrolling heuristic?
HURDLES
• The compiler writer must extract features
• Acquiring labels takes time
  • Instrumentation library
  • ~2 weeks to collect data
• Predictions are confined to the training labels
• The learning algorithms have to be tweaked
• Noise