Constraint Programming and Backtracking Search Algorithms • Peter van Beek • University of Waterloo
Acknowledgements • Joint work with: Alejandro López-Ortiz, Abid Malik, Jim McInnes, Claude-Guy Quimper, John Tromp, Kent Wilken, Huayue Wu • Funding: NSERC, IBM Canada
Outline • Introduction • basic-block scheduling • constraint programming • randomization and restarts • Worst-case performance • bounds on expected runtime • bounds on tail probability • Practical performance • parameterizing universal strategies • estimating optimal parameters • experiments • Conclusions
Basic-block instruction scheduling • Schedule a basic block • straight-line sequence of code with a single entry and a single exit • Multiple-issue pipelined processors • multiple instructions can begin execution each clock cycle • delay or latency before results are available • Find a minimum-length schedule • Classic problem • has received much attention in the literature
Example: evaluate (a + b) + c • instructions: A: r1 ← a; B: r2 ← b; C: r3 ← c; D: r1 ← r1 + r2; E: r1 ← r1 + r3 • dependency DAG: A → D (latency 3), B → D (latency 3), C → E (latency 3), D → E (latency 1)
Example: evaluate (a + b) + c • same dependency DAG: A → D (3), B → D (3), C → E (3), D → E (1) • optimal schedule (single-issue): A: r1 ← a; B: r2 ← b; C: r3 ← c; nop; D: r1 ← r1 + r2; E: r1 ← r1 + r3
Constraint programming methodology • Model problem • specify in terms of constraints on acceptable solutions • constraint model: variables, domains, constraints • Solve model • backtracking search • many improvements: constraint propagation, restarts, …
Constraint model (for the dependency DAG above) • variables: A, B, C, D, E • domains: {1, …, m} • constraints: D ≥ A + 3, D ≥ B + 3, E ≥ C + 3, E ≥ D + 1, gcc(A, B, C, D, E, width)
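As a concrete illustration (my own sketch, not the talk's code), the model above can be checked with a tiny brute-force solver for the (a + b) + c example, with issue width 1 so the gcc constraint reduces to "at most one instruction per cycle":

```python
from itertools import product

# Latency constraints from the slide: D >= A + 3, D >= B + 3,
# E >= C + 3, E >= D + 1. Names and structure follow the example.
LATENCY = [("A", "D", 3), ("B", "D", 3), ("C", "E", 3), ("D", "E", 1)]
VARS = ["A", "B", "C", "D", "E"]

def feasible(assign, width=1):
    # gcc: at most `width` instructions may be issued in any cycle
    times = list(assign.values())
    if any(times.count(t) > width for t in set(times)):
        return False
    return all(assign[v] >= assign[u] + lat for u, v, lat in LATENCY)

def min_schedule(max_len=10):
    # try horizons m = 1, 2, ...; brute force is fine at this size
    for m in range(1, max_len + 1):
        for times in product(range(1, m + 1), repeat=len(VARS)):
            assign = dict(zip(VARS, times))
            if feasible(assign) and max(times) == m:
                return assign
    return None

print(min_schedule())  # -> {'A': 1, 'B': 2, 'C': 3, 'D': 5, 'E': 6}
```

The minimum schedule length is 6, matching the optimal schedule on the previous slide: cycle 4 holds the nop.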
Solving instances of the model • search over the dependency DAG • initial domains: A, B, C, D, E all [1, 6]
Constraint propagation: Bounds consistency

constraints: D ≥ A + 3, D ≥ B + 3, E ≥ C + 3, E ≥ D + 1, gcc(A, B, C, D, E, 1)

variable                      A      B      C      D      E
initial domain                [1,6]  [1,6]  [1,6]  [1,6]  [1,6]
after latency constraints     [1,2]  [1,2]  [1,3]  [4,5]  [5,6]
after gcc (width 1)           [1,2]  [1,2]  [3,3]  [4,5]  [6,6]
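A minimal sketch of this propagation step (my own code, not the authors'): the latency constraints are tightened to a fixpoint, and the gcc with width 1 is treated as alldifferent, pruned with a naive Hall-interval test:

```python
# Bounds-consistency propagation for the slide's example.
# dom maps each variable to a (lo, hi) pair; latency holds (u, v, lat)
# triples meaning v >= u + lat.
def propagate(dom, latency):
    changed = True
    while changed:
        changed = False
        # latency constraints: v >= u + lat, hence u <= v - lat
        for u, v, lat in latency:
            lo, hi = dom[v]
            if dom[u][0] + lat > lo:
                dom[v] = (dom[u][0] + lat, hi); changed = True
            lo, hi = dom[u]
            if dom[v][1] - lat < hi:
                dom[u] = (lo, dom[v][1] - lat); changed = True
        # alldifferent: an interval [a, b] filled by exactly b - a + 1
        # variables (a Hall interval) can be removed from all others
        points = sorted({p for lo, hi in dom.values() for p in (lo, hi)})
        for a in points:
            for b in points:
                if b < a:
                    continue
                inside = [x for x, (lo, hi) in dom.items()
                          if lo >= a and hi <= b]
                if len(inside) == b - a + 1:
                    for x, (lo, hi) in dom.items():
                        if x in inside:
                            continue
                        if a <= lo <= b:
                            dom[x] = (b + 1, hi); changed = True
                        if a <= dom[x][1] <= b:
                            dom[x] = (dom[x][0], a - 1); changed = True
    return dom

doms = {v: (1, 6) for v in "ABCDE"}
lat = [("A", "D", 3), ("B", "D", 3), ("C", "E", 3), ("D", "E", 1)]
print(propagate(doms, lat))
# -> A, B: (1, 2); C: (3, 3); D: (4, 5); E: (6, 6)
```

The Hall interval [1, 2] is saturated by A and B, forcing C = 3 and in turn E = 6, exactly the domains shown on the next slide.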
Solving instances of the model • after bounds-consistency propagation: A [1,2], B [1,2], C [3,3], D [4,5], E [6,6]
Solving instances of the model • branch: assign A = 1 • domains: A [1,1], B [1,2], C [3,3], D [4,5], E [6,6]
Solving instances of the model • after propagating A = 1: A [1,1], B [2,2], C [3,3], D [5,5], E [6,6]; this is a complete schedule (cycle 4 is a nop)
Restart strategies • Observation: backtracking algorithms can be brittle on some instances • small changes to a heuristic can lead to great differences in running time • Randomization and restarts have been proposed to improve performance (Luby et al., 1993; Harvey, 1995; Gomes et al., 1997, 2000) • A restart strategy (t1, t2, t3, …) is a sequence of cutoffs • idea: a randomized backtracking algorithm is run for t1 steps; if no solution is found within that cutoff, the algorithm is restarted and run for t2 steps, and so on
Restart strategies • Let f(t) be the probability that a randomized backtracking algorithm A on instance x stops after taking exactly t steps • f(t) is called the runtime distribution of algorithm A on instance x • Given the runtime distribution of an instance, the optimal restart strategy for that instance is (t*, t*, t*, …), for some fixed cutoff t* (Luby, Sinclair, Zuckerman, 1993) • A fixed-cutoff strategy is an example of a non-universal strategy: designed to work on a particular instance
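The optimal fixed cutoff can be computed directly from the runtime distribution. The sketch below (my own, under the standard analysis of Luby, Sinclair and Zuckerman) uses the fact that the fixed-cutoff strategy (t, t, t, …) has expected runtime E[min(T, t)] / F(t), where F is the cumulative runtime distribution; the toy distribution here is invented for illustration:

```python
def expected_runtime(F, t):
    # F[s] = P(T <= s); E[min(T, t)] = sum_{s=0}^{t-1} (1 - F[s]).
    # Runs are i.i.d. and each succeeds with probability F[t], so by
    # Wald's identity the expected total work is E[min(T, t)] / F[t].
    truncated_mean = sum(1.0 - F[s] for s in range(t))
    return truncated_mean / F[t] if F[t] > 0 else float("inf")

def optimal_cutoff(F):
    horizon = len(F) - 1
    return min(range(1, horizon + 1), key=lambda t: expected_runtime(F, t))

# toy bimodal runtime distribution: half the runs finish at step 3,
# the other half take 100 steps
pmf = {3: 0.5, 100: 0.5}
F = [sum(p for s, p in pmf.items() if s <= x) for x in range(101)]
t_star = optimal_cutoff(F)
print(t_star, expected_runtime(F, t_star))  # -> 3 6.0
```

Restarting at t* = 3 gives an expected runtime of 6 steps, versus 51.5 steps for running to completion: the strategy simply resamples until it draws an easy run.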
Universal restart strategies • In contrast to non-universal strategies, universal strategies are designed to be used on any instance • Luby strategy (Luby, Sinclair, Zuckerman, 1993): (1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, 1, …); cutoffs grow linearly • Walsh strategy (Walsh, 1999): (1, r, r², r³, …), r > 1; cutoffs grow exponentially
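Both sequences are easy to generate; a sketch (my own implementation of the standard definitions):

```python
def luby(i):
    # Luby et al.'s sequence: t_i = 2^(k-1) if i = 2^k - 1,
    # otherwise t_i = t_{i - 2^(k-1) + 1} for 2^(k-1) <= i < 2^k - 1
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

def walsh(i, r=2.0):
    # Walsh's geometric sequence (1, r, r^2, ...), r > 1
    return r ** (i - 1)

print([luby(i) for i in range(1, 17)])
# -> [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, 1]
```

Note the contrast: the largest cutoff among the first n Luby terms grows roughly linearly in n, while Walsh's cutoffs grow exponentially, which is what drives the worst-case results in the next sections.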
Pitfalls of non-universal restart strategies • Non-universal strategies are open to catastrophic failure • a strategy can provably fail on an instance • failure occurs when all of the cutoffs are too small • Non-universal strategies learned by previous proposals can be unboundedly worse than performing no restarts at all • this pitfall is likely to arise whenever some instances are inherently harder to solve than others
Outline • Introduction • basic-block scheduling • constraint programming • randomization and restarts • Worst-case performance • bounds on expected runtime • bounds on tail probability • Practical performance • parameterizing universal strategies • estimating optimal parameters • experiments • Conclusions
Worst-case performance of universal strategies • For universal strategies, two worst-case bounds are of interest: • worst-case bounds on the expected runtime of a strategy • worst-case bounds on the tail probability of a strategy • Luby strategy has been thoroughly characterized (Luby, Sinclair, Zuckerman, 1993) • Walsh strategy has not been characterized
Worst-case bounds on expected runtime • The expected runtime of the Luby strategy is within a log factor of optimal (Luby, Sinclair, Zuckerman, 1993) • We show: the expected runtime of the Walsh strategy (1, r, r², …), r > 1, can be unboundedly worse than optimal
Worst-case bounds on tail probability (I) • Tail probability: the probability that an algorithm or restart strategy runs for more than t steps, for some given t, e.g. P(T > 4000) • The tail probability of the Luby strategy decays superpolynomially as a function of t, no matter what the runtime distribution of the original algorithm is (Luby, Sinclair, Zuckerman, 1993)
Worst-case bounds on tail probability (II) • Pareto heavy-tailed distributions can be a good fit to the runtime distributions of randomized backtracking algorithms (Gomes et al., 1997, 2000) • We show: If the runtime distribution of the original algorithm is Pareto heavy-tailed, the tail probability of the Walsh strategy decays superpolynomially
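The effect is easy to see in simulation. The sketch below (my own, with an invented Pareto tail P(T > t) ≈ t^(-0.5) whose mean is infinite without restarts) runs the Luby strategy over such a heavy-tailed solver and observes a modest finite average:

```python
import random

def pareto_runtime(rng, alpha=0.5):
    # inverse-CDF sample of a Pareto tail P(T > t) ~ t^(-alpha), T >= 1;
    # for alpha <= 1 the expected runtime of a single run is infinite
    return int((1.0 - rng.random()) ** (-1.0 / alpha))

def luby(i):
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    return 1 << (k - 1) if (1 << k) - 1 == i else luby(i - (1 << (k - 1)) + 1)

def restarted_runtime(rng):
    # run the Luby strategy: each restart draws a fresh runtime
    total, i = 0, 1
    while True:
        t, run = luby(i), pareto_runtime(rng)
        if run <= t:
            return total + run
        total, i = total + t, i + 1

rng = random.Random(0)
mean = sum(restarted_runtime(rng) for _ in range(10000)) / 10000
print(mean)  # a small finite average despite the heavy-tailed solver
```

This is the empirical face of the theorem: restarting truncates the heavy tail, so long total runs become superpolynomially rare.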
Outline • Introduction • basic-block scheduling • constraint programming • randomization and restarts • Worst-case performance • bounds on expected runtime • bounds on tail probability • Practical performance • parameterizing universal strategies • estimating optimal parameters • experiments • Conclusions
Practical performance of universal strategies • Previous empirical evaluations have reported that the universal strategies can perform poorly in practice (Gomes et al., 2000; Kautz et al., 2002; Ruan et al., 2002, 2003; Zhan, 2001) • We show: the performance of the universal strategies can be improved by • parameterizing the strategies • estimating the optimal settings for these parameters from a small sample of instances
Motivation • Setting: a sequence of instances is to be solved over time • e.g., in staff rostering, a similar problem must be solved at regular intervals on the calendar • e.g., in instruction scheduling, thousands of instances arise each time a compiler is invoked on a software project • It is useful to learn a good portfolio, in an offline manner, from a training set
Parameterizing the universal strategies • Two parameters: • scale s • geometric factor r • Parameterized Luby strategy with, e.g., s = 2, r = 3: (2, 2, 2, 6, 2, 2, 2, 6, 2, 2, 2, 6, 18, …) • Parameterized Walsh strategy: (s, sr, sr², sr³, …) • Advantage: improves performance while retaining the theoretical guarantees
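A sketch of the two parameterized families (my own construction, assuming the natural base-r generalization of the Luby sequence, where each level repeats the previous level r times before the next power of r):

```python
def param_luby(n, s=1, r=2):
    # S_1 = (1); S_k = r copies of S_{k-1} followed by r^(k-1);
    # scale every cutoff by s and return the first n terms
    seq = [1]
    k = 1
    while len(seq) < n:
        seq = seq * r + [r ** k]
        k += 1
    return [s * t for t in seq[:n]]

def param_walsh(n, s=1, r=2.0):
    # geometric strategy (s, sr, sr^2, ...)
    return [s * r ** i for i in range(n)]

print(param_luby(13, s=2, r=3))
# -> [2, 2, 2, 6, 2, 2, 2, 6, 2, 2, 2, 6, 18]
```

With s = 2 and r = 3 this reproduces the sequence on the slide, and s = 1, r = 2 recovers the original Luby sequence.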
Estimating the optimal parameter settings • Discretize the scale s into orders of magnitude: 10⁻¹, …, 10⁵ • Discretize the geometric factor r: • 2, 3, …, 10 (Luby) • 1.1, 1.2, …, 2.0 (Walsh) • Choose the values that minimize the performance measure on the training set
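The estimation step amounts to a grid search. The sketch below is purely illustrative (a hypothetical replay over recorded runtimes, not the paper's procedure): each candidate (s, r) Walsh-style strategy is scored on a toy training set and the minimizer is returned:

```python
from itertools import cycle

def run_strategy(cutoffs, runtimes):
    # replay recorded runtimes, one per (re)start, against the cutoffs;
    # return the total work until a run fits within its cutoff
    total = 0
    for t, run in zip(cutoffs, runtimes):
        if run <= t:
            return total + run
        total += t
    return total  # unsolved within the strategy prefix

def grid_search(scales, factors, runtimes):
    best, best_score = None, float("inf")
    for s in scales:
        for r in factors:
            cutoffs = [s * r ** i for i in range(30)]  # Walsh-style
            score = sum(run_strategy(cutoffs, cycle(runtimes[i:]))
                        for i in range(len(runtimes)))
            if score < best_score:
                best, best_score = (s, r), score
    return best

# toy "training set" of recorded runtimes: mostly easy, a few very hard
data = [3, 3, 3, 200, 3, 3, 200, 3]
print(grid_search([1, 10, 100], [1.5, 2.0], data))
```

The same loop applies to the parameterized Luby strategy; in the experiments the score is the chosen performance measure (instances solved, or expected solution time) over the training instances.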
Experimental setup • Instruction scheduling problems for multiple-issue pipelined processors • hard instances from SPEC 2000 and MediaBench suites • gathered censored runtime distributions (10 minute time limit per instance) • training set: 927 instances • test set: 5,450 instances • Solve using backtracking search algorithm • randomized dynamic variable ordering heuristic • capable of performing three levels of constraint propagation: • Level = 0 Bounds consistency • Level = 1 Singleton consistency using bounds consistency • Level = 2 Singleton consistency using singleton consistency
Experiment 1: Time limit • Time limit: 10 minutes per instance • Performance measure: Number of instances solved • Learn parameter settings from training set, evaluate on test set
Experiment 2: No time limit • No time limit: run to completion • Performance measure: expected time to solve the instances • In our experimental runtime data, timeouts were replaced by values sampled from the tail of a Pareto distribution • Learn parameter settings from the training set, evaluate on the test set
Conclusions • Restart strategies • Theoretical performance: worst-case analysis of Walsh universal strategy • Practical performance: approach for learning good universal restart strategies • Bigger picture • Application driven research: Instruction scheduling in compilers • Can now solve optimally almost all instances that arise in practice