320 likes | 475 Views
Hierarchical Hardness Models for SAT. Lin Xu, Holger H. Hoos and Kevin Leyton-Brown University of British Columbia {xulin730, hoos, kevinlb}@cs.ubc.ca. One Application of Hierarchical Hardness Models --- SATzilla-07. Old SATzilla [ Nudelman, Devkar, et. al, 2003 ] 2nd Random
E N D
Hierarchical Hardness Models for SAT Lin Xu, Holger H. Hoos and Kevin Leyton-Brown University of British Columbia {xulin730, hoos, kevinlb}@cs.ubc.ca
One Application of Hierarchical Hardness Models --- SATzilla-07 • Old SATzilla [Nudelman, Devkar, et. al, 2003 ] • 2nd Random • 2nd Handmade (SAT) • 3rd Handmade • SATzilla-07 [Xu, et.al, 2007] • 1st Handmade • 1st Handmade (UNSAT) • 1st Random • 2nd Handmade (SAT) • 3rd Random (UNSAT) 2
Outline • Introduction • Motivation • Predicting the Satisfiability of SAT Instances • Hierarchical Hardness Models • Conclusions and Futures Work
Introduction • NP-hardness and solving big problems • Using empirical hardness model to predict algorithm’s runtime • Identification of factors underlying algorithm’s performance • Induce hard problem distribution • Automated algorithm tuning [Hutter, et al. 2006] • Automatic algorithm selection [Xu, et al. 2007] • …
Empirical Hardness Model (EHM) • Predicted algorithm’s runtime based on poly-time computable features • Features: anything can characterize the problem instance and can be presented by a real number • 9 category features [Nudelman, et al, 2004] • Prediction: any machine learning technique can return prediction of a continuous value • Linear basis function regression [Leyton-Brown et al, 2002; Nudelman, et al, 2004]
fw() = wT • 23.34 • 7.21 • … • … Features () Runtime (y) Linear Basis Function Regression
Moracular: always use good model Muncond.: Trained on mixture of instance Motivation • If we train EHM for SAT/UNSAT separately (Msat, Munsat), • Models are more accurate and much simpler [Nudelman, et al, 2004] Solver: satelite; Dataset: Quasi-group completion problem
Msat: only use SAT model Munsat: only use UNSAT model Motivation • If we train EHM for SAT/UNSAT separately (Msat, Munsat), • Models are more accurate and much simpler • Using wrong model can cause huge error Solver: satelite; Dataset: Quasi-group completion problem
Attempt to guess Satisfiability of given instance! Motivation Question: How to select the right model for a given problem?
Predict Satisfiability of SAT Instances • NP-complete No poly-time method has 100% accuracy of classification • BUT • For many types of SAT instances, classification with poly-time features is possible • Classification accuracies are very high
Performance of Classification • Classifier: SMLR • Features: same as regression • Datasets • rand3sat-var • rand3sat-fix • QCP • SW-GCP [Krishnapuram, et al. 2005]
Performance of Classification For all datasets: • Much higher accuracies than random guessing • Classifier output correlates with classification accuracies • High accuracies with few features 16
Back to Question Question: How to select the right model for a given problem? • Can we just use the output of classifier? NO, need to consider the error distribution • Note: best performance would be achieved by model selection Oracle! Question: How to approximate model selection oracle based on features 18
Hierarchical Hardness Models Z Model selection Oracle Mixture of experts problem with FIXED experts, use EM to find the parameters for z [Murphy, 2001] , s Features & Classifier output s: P(sat) y Runtime 19
A B Importance of Classifier’s Output Two ways: A: Using classifier’s output as a feature B: Using classifier’s output for EM initialization A+B
Example for rand3-var (satz) Left: unconditional model Right: hierarchical model
Example for rand3-fix (satz) Left: unconditional model Right: hierarchical model
Example for QCP (satelite) Left: unconditional model Right: hierarchical model
Example for SW-GCP (zchaff) Left: unconditional model Right: hierarchical model
Correlation Between Prediction Error and Classifier’s Confidence Right: Classifier’s output vs RMSE Left: Classifier’s output vs runtime prediction error Solver: satelite; Dataset: QCP
Conclusions • Models conditioned on SAT/UNSAT have much better runtime prediction accuracies. • A classifier can be used to distinguish SAT/UNSAT with high accuracy. • Conditional models can be combined into hierarchical model with better performance. • Classifier’s confidence correlates with prediction error. 28
Future Work • Better features for SW-GCP • Test on more real world problems • Extend underlying experts beyond satisfiability