Hierarchical Hardness Models for SAT

Hierarchical Hardness Models for SAT Lin Xu, Holger H. Hoos and Kevin Leyton-Brown University of British Columbia {xulin730, hoos, kevinlb}@cs.ubc.ca

One Application of Hierarchical Hardness Models --- SATzilla-07 • Old SATzilla [Nudelman, Devkar, et. al, 2003 ] • 2nd Random • 2nd Handmade (SAT) • 3rd Handmade • SATzilla-07 [Xu, et.al, 2007] • 1st Handmade • 1st Handmade (UNSAT) • 1st Random • 2nd Handmade (SAT) • 3rd Random (UNSAT) 2

Outline • Introduction • Motivation • Predicting the Satisfiability of SAT Instances • Hierarchical Hardness Models • Conclusions and Futures Work

Introduction

Introduction • NP-hardness and solving big problems • Using empirical hardness model to predict algorithm’s runtime • Identification of factors underlying algorithm’s performance • Induce hard problem distribution • Automated algorithm tuning [Hutter, et al. 2006] • Automatic algorithm selection [Xu, et al. 2007] • …

Empirical Hardness Model (EHM) • Predicted algorithm’s runtime based on poly-time computable features • Features: anything can characterize the problem instance and can be presented by a real number • 9 category features [Nudelman, et al, 2004] • Prediction: any machine learning technique can return prediction of a continuous value • Linear basis function regression [Leyton-Brown et al, 2002; Nudelman, et al, 2004]

fw() = wT • 23.34 • 7.21 • … • … Features () Runtime (y) Linear Basis Function Regression

Motivation

Moracular: always use good model Muncond.: Trained on mixture of instance Motivation • If we train EHM for SAT/UNSAT separately (Msat, Munsat), • Models are more accurate and much simpler [Nudelman, et al, 2004] Solver: satelite; Dataset: Quasi-group completion problem

Msat: only use SAT model Munsat: only use UNSAT model Motivation • If we train EHM for SAT/UNSAT separately (Msat, Munsat), • Models are more accurate and much simpler • Using wrong model can cause huge error Solver: satelite; Dataset: Quasi-group completion problem

Attempt to guess Satisfiability of given instance! Motivation Question: How to select the right model for a given problem?

Predicting Satisfiability of SAT Instances

Predict Satisfiability of SAT Instances • NP-complete No poly-time method has 100% accuracy of classification • BUT • For many types of SAT instances, classification with poly-time features is possible • Classification accuracies are very high

Performance of Classification • Classifier: SMLR • Features: same as regression • Datasets • rand3sat-var • rand3sat-fix • QCP • SW-GCP [Krishnapuram, et al. 2005]

Performance of Classification (QCP) 15

Performance of Classification For all datasets: • Much higher accuracies than random guessing • Classifier output correlates with classification accuracies • High accuracies with few features 16

Hierarchical Hardness Models

Back to Question Question: How to select the right model for a given problem? • Can we just use the output of classifier? NO, need to consider the error distribution • Note: best performance would be achieved by model selection Oracle! Question: How to approximate model selection oracle based on features 18

Hierarchical Hardness Models Z Model selection Oracle Mixture of experts problem with FIXED experts, use EM to find the parameters for z [Murphy, 2001] , s Features & Classifier output s: P(sat) y Runtime 19

A B Importance of Classifier’s Output Two ways: A: Using classifier’s output as a feature B: Using classifier’s output for EM initialization A+B

Big Picture of HHM Performance

Example for rand3-var (satz) Left: unconditional model Right: hierarchical model

Example for rand3-fix (satz) Left: unconditional model Right: hierarchical model

Example for QCP (satelite) Left: unconditional model Right: hierarchical model

Example for SW-GCP (zchaff) Left: unconditional model Right: hierarchical model

Correlation Between Prediction Error and Classifier’s Confidence Right: Classifier’s output vs RMSE Left: Classifier’s output vs runtime prediction error Solver: satelite; Dataset: QCP

Conclusions and Futures Work

Conclusions • Models conditioned on SAT/UNSAT have much better runtime prediction accuracies. • A classifier can be used to distinguish SAT/UNSAT with high accuracy. • Conditional models can be combined into hierarchical model with better performance. • Classifier’s confidence correlates with prediction error. 28

Future Work • Better features for SW-GCP • Test on more real world problems • Extend underlying experts beyond satisfiability

Hierarchical Hardness Models for SAT