SATzilla: Portfolio-based Algorithm Selection for SAT
Lin Xu, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown
Department of Computer Science, University of British Columbia
Mainly based on [Xu, Hutter, Hoos, Leyton-Brown, JAIR, 2008]
SATzilla
• Excellent performance in SAT competitions [http://www.satcompetition.org/]
  • Two second places and one third place in the 2003 SAT competition [Leyton-Brown, Nudelman, Andrew, McFadden & Shoham, 2003]
  • 3 Gold, 1 Silver, 1 Bronze in the 2007 SAT competition [Xu, Hutter, Hoos & Leyton-Brown, 2007]
  • 3 Gold, 2 Silver in the 2009 SAT competition [Xu, Hutter, Hoos & Leyton-Brown, 2009]
    • One gold in every category: Industrial/Handmade/Random
• Domain independent
  • MIPzilla, CSP, Planning, …
[Diagram: instance → feature extractor → algorithm selector → solvers]
Impact on the SAT community
Encouraged research on portfolio-based solvers for SAT
[Figure: final 2011 SAT competition results, http://www.satcompetition.org/]
Outline
• Introduction
• Related work:
  • Algorithm selection
• SATzilla overview:
  • Building SATzilla
  • Running SATzilla
• SATzilla07 and beyond
• Results
• SATzilla since 2008
• Conclusion
Related work: algorithm selection
Select the best solver for each instance [Rice, 1976]
• Regression-based
  • Linear regression (SATzilla) [Leyton-Brown, Nudelman, Andrew, McFadden & Shoham, 2003; Xu, Hutter, Hoos, Leyton-Brown, 2007+]
  • Matchbox [Stern et al., 2010]
• Classification-based
  • Logistic regression [Samulowitz & Memisevic, 2007]
  • k-NN [O'Mahony & Hebrard, 2008]
  • Decision trees [Guerri & Milano, 2004]
What is SATzilla?
• Portfolio-based algorithm selection based on empirical hardness models [Leyton-Brown et al., 2002; 2009]
  • Use regression models to predict each algorithm's performance
  • Select the predicted best
• Many new techniques for improving robustness
  • Pre-solvers
  • Backup solver
  • Solver subset selection
SATzilla: prerequisites
• Identify a target distribution of instances
• Select a set of candidate solvers, and measure their runtimes on a training set
• Identify a set of features based on domain knowledge and collect feature values
  • Cheap to compute
  • Informative for prediction
SATzilla core: predictive models
Goal: accurately predict solver performance based on cheaply computable features
Supervised machine learning: learn a function f: features → performance
• Input: <features, performance> pairs
• Output: f
For a linear model, f_w(x) = w^T x
[Table: feature vectors x paired with measured performance y, e.g. 23.34, 7.21, …]
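To make this concrete, here is a minimal sketch in Python of the learning step, assuming ridge regression on log runtimes (SATzilla's actual models are linear in a basis-function expansion of the features; the data below is synthetic stand-in data):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic stand-in data: one feature vector per instance and one
# measured runtime per (instance, solver) pair.
rng = np.random.default_rng(0)
X = rng.random((200, 10))                        # 200 instances, 10 features
runtimes = {"solver_a": rng.uniform(0.01, 3600.0, 200),
            "solver_b": rng.uniform(0.01, 3600.0, 200)}

# One linear model per solver; predicting log runtime keeps a few very
# hard instances from dominating the fit.
models = {}
for solver, y in runtimes.items():
    model = Ridge(alpha=1.0)
    model.fit(X, np.log10(y))
    models[solver] = model
```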
SATzilla: selection based on models
• Predict each solver's performance
• Pick a solver based on the predictions
[Diagram: instance → compute features → make prediction → run on selected solver]
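Continuing the sketch above, selection is then just an argmin over the per-solver predictions (real SATzilla additionally accounts for feature-computation time):

```python
def select_solver(models, features):
    """Predict each solver's log runtime and return the predicted best."""
    predictions = {solver: model.predict(features.reshape(1, -1))[0]
                   for solver, model in models.items()}
    return min(predictions, key=predictions.get)

best = select_solver(models, X[0])   # e.g. 'solver_a' or 'solver_b'
```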
Improving robustness
Dealing with imperfect predictions:
• Solver subset selection
  • More candidate solvers can hurt
  • Select the subset of solvers that achieves the best overall performance
• Pre-solvers
  • Solve some instances quickly, without the need for feature computation
  • Increase model robustness
• Backup solver
  • In case something goes wrong
Time to run SATzilla!
Step 1: pre-solving
Run pre-solvers for a fixed, short amount of time
• If solved, done
Step 2: compute features
Run the feature extractor to get a set of feature values
• If solved, done
• If anything fails, simply run the backup solver
Step 3: performance prediction
Predict the performance of all solvers and order them according to the runtime predictions
Step 4: run top-ranked solver
Run the top-ranked solver. If a run fails, run the next-ranked solver.
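Put together, the four steps amount to a dispatch routine along the following lines; this is a sketch only, and `run_presolver`, `compute_features`, and `run_solver` are hypothetical helpers standing in for the real components:

```python
def satzilla_run(instance, presolvers, models, backup_solver,
                 presolve_cutoff=5.0):          # cutoff value is illustrative
    # Step 1: run pre-solvers for a fixed, short amount of time.
    for presolver in presolvers:
        result = run_presolver(presolver, instance, cutoff=presolve_cutoff)
        if result is not None:                  # solved -> done
            return result

    # Step 2: compute features; if anything fails, run the backup solver.
    try:
        features = compute_features(instance)
    except Exception:
        return run_solver(backup_solver, instance)

    # Step 3: order solvers by predicted runtime, best first.
    ranking = sorted(models, key=lambda s:
                     models[s].predict(features.reshape(1, -1))[0])

    # Step 4: run the top-ranked solver; on failure, try the next one.
    for solver in ranking:
        result = run_solver(solver, instance)
        if result is not None:
            return result
    return run_solver(backup_solver, instance)
```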
SATzilla07: construction
• Pre-solvers: manually selected
• Backup solver: overall best
• Model building:
  • Predict runtime
  • Hierarchical hardness models [Xu, Hoos, Leyton-Brown, 2007]
  • Models handle "censored data" [Schmee & Hahn, 1979]
• Solver subset selection: exhaustive search (sketched below)
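For small portfolios, the exhaustive subset search can be written directly; a sketch, assuming a hypothetical `validation_score` function that evaluates the portfolio restricted to a given subset (e.g. mean runtime of the selected solvers on validation instances):

```python
from itertools import combinations

def best_subset(solvers, validation_score):
    """Try every non-empty subset of solvers and keep the one with the
    best (lowest) validation score."""
    best, best_score = None, float("inf")
    for k in range(1, len(solvers) + 1):
        for subset in combinations(solvers, k):
            score = validation_score(subset)
            if score < best_score:
                best, best_score = subset, score
    return best
```

With n solvers this examines 2^n − 1 subsets, which is exactly why the improved version below switches to local search.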
Improving SATzilla07 (1)
Problem: manual pre-solver selection is not ideal
Solution: automatic pre-solver selection based on validation performance
Improving SATzilla07 (2–5)
Problem: the overall best solver is not necessarily the best backup solver
Solution: choose the best solver for instances not solved by pre-solvers and with problematic features
Problem: the objective is not always runtime
Solution: predict and optimize the true objective directly; this also enables cleaner use of incomplete solvers
Problem: exhaustive search for solver subset selection is infeasible for large portfolios
Solution: local search for solver subset selection (see the sketch below)
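A minimal local-search variant of the subset selection, again assuming the hypothetical `validation_score` from above; it hill-climbs by toggling one solver in or out at a time, with random restarts:

```python
import random

def local_search_subset(solvers, validation_score, restarts=10):
    """Hill-climbing over solver subsets: accept any single add/remove
    move that improves the validation score; restart to escape local
    optima. `solvers` should be a list."""
    best, best_score = None, float("inf")
    for _ in range(restarts):
        current = set(random.sample(solvers, random.randint(1, len(solvers))))
        score = validation_score(frozenset(current))
        improved = True
        while improved:
            improved = False
            for solver in solvers:
                neighbour = current ^ {solver}   # toggle solver in/out
                if not neighbour:                # keep at least one solver
                    continue
                neighbour_score = validation_score(frozenset(neighbour))
                if neighbour_score < score:
                    current, score, improved = neighbour, neighbour_score, True
        if score < best_score:
            best, best_score = set(current), score
    return best
```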
SATzilla vs. candidate solvers
[Plots: runtime on HANDMADE instances for each candidate solver, successively overlaid with the Oracle (perfect selector), SATzilla07 (including feature computation and pre-solving time), and the improved SATzilla07.]
SATzilla09 [Xu et al., 2009]
New techniques:
• New features: clause learning, …
• Prediction of feature computation cost
[Diagram: instance → feature time predictor → minimal-cost feature extractor → algorithm selector]
Results: 3 Gold, 2 Silver in the 2009 SAT competition
• Gold for the Industrial (SAT) category
• Gold for the Handmade (UNSAT) category
• Gold for the Random (SAT+UNSAT) category
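One way to realize the feature-cost prediction is to gate each expensive feature group behind a cheap regression model of its computation time; this is an interpretation sketched under assumptions, with hypothetical `cost_models` (one regressor per feature group, trained offline) and `extractors` (the functions that actually compute each group):

```python
import numpy as np

def gated_features(instance, cheap_features, cost_models, extractors,
                   budget=30.0):                 # per-group budget, illustrative
    """Always keep the cheap features; compute an expensive feature group
    only if its predicted computation time fits within the budget."""
    feats = dict(cheap_features)
    x = np.array(list(cheap_features.values())).reshape(1, -1)
    for group, extract in extractors.items():
        predicted_cost = cost_models[group].predict(x)[0]
        if predicted_cost <= budget:
            feats.update(extract(instance))
        # otherwise the group's features stay at their default values
    return feats
```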
New results: cost-sensitive classification
Goal: directly optimize the objective [Xu et al., 2011]
• Cost-sensitive classification for every pair of candidate solvers
• Vote to decide the order of candidates (a sketch follows below)
• Improved performance over regression models, particularly on heterogeneous data sets
The new SATzilla on the 2011 SAT competition closed between 33% and 80% of the gap to the Oracle across the Random, Handmade, and Industrial categories.
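A sketch of the pairwise scheme, using plain decision trees with sample weights as a stand-in for the cost-sensitive models used in practice; `X` holds instance features and `runtimes` maps each solver to its measured runtimes, as in the earlier sketches:

```python
from itertools import combinations
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_pairwise(X, runtimes):
    """One cost-sensitive classifier per solver pair: the label records
    which solver was faster, and the sample weight is the runtime
    difference, so instances where the choice matters most count most."""
    classifiers = {}
    for a, b in combinations(runtimes, 2):
        labels = (runtimes[a] < runtimes[b]).astype(int)   # 1 => a faster
        weights = np.abs(runtimes[a] - runtimes[b])
        clf = DecisionTreeClassifier(max_depth=4)
        clf.fit(X, labels, sample_weight=weights)
        classifiers[(a, b)] = clf
    return classifiers

def vote(classifiers, features):
    """Each pairwise classifier casts one vote; run the solver with the
    most votes."""
    votes = {}
    for (a, b), clf in classifiers.items():
        winner = a if clf.predict(features.reshape(1, -1))[0] == 1 else b
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)
```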
Recent work: combining algorithm selection and algorithm configuration
Problem: what if you only have one strong solver, but it is highly parameterized?
"Hydra": combine algorithm selection and algorithm configuration [Xu et al., 2010; 2011]
• Automatically find strong, uncorrelated parameter settings
• Construct a portfolio from these settings
Results: clear improvements for both SAT and MIP
• 2× to 31× faster for MIP
• 1.2× to 73× faster for SAT
Conclusion
• Portfolio-based algorithm selection can improve performance and robustness
• SATzilla-like portfolio approaches are the state of the art for many problems
• New directions since our JAIR paper:
  • Better predictive models
  • Automatic portfolio construction ("Hydra")