Modeling Boolean Formula Difficulty and Unsatisfiable Core Prediction

Fitting a Function to the Difficulty of Boolean Formulas
Greg Dennis
NMM Final Project

Motivation
• Difficulty of boolean formulas varies greatly
  • difficulty = # decisions by solver = time to solve
  • difficulty varies between formulas of same size
• Difficulty depends on many factors
  • size of formula, clause : variable ratio, algorithm, luck
• What factors influence the difficulty and to what degree?
• Can we fit a function to the difficulty using these features?

CNF and 3-SAT
• Conjunctive Normal Form (CNF):
  • literal = variable or negated variable
  • clause = disjunction of literals
  • CNF = conjunction of clauses
• k-SAT = CNF with exactly k literals / clause
  • no two literals in a clause have the same variable
  • no two identical clauses
  • k-SAT is NP-complete for k ≥ 3 (e.g. 3-SAT)
• clausal density = # clauses / # variables
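The definitions on this slide can be made concrete with a small sketch. It assumes a DIMACS-style encoding that the slides do not specify: a literal is a nonzero integer (`+v` for variable `v`, `-v` for its negation) and a clause is a tuple of literals. The function names are illustrative only.

```python
def is_k_sat(clauses, k=3):
    """Check the k-SAT rules: exactly k literals per clause, all on
    distinct variables, and no two identical clauses."""
    seen = set()
    for clause in clauses:
        variables = {abs(lit) for lit in clause}
        if len(clause) != k or len(variables) != k:
            return False
        key = frozenset(clause)  # clause identity ignores literal order
        if key in seen:
            return False
        seen.add(key)
    return True


def clausal_density(clauses, num_vars):
    """Clausal density = number of clauses / number of variables."""
    return len(clauses) / num_vars


def evaluate(clauses, assignment):
    """A CNF is true when every clause contains at least one true literal.
    `assignment` maps each variable number to a bool."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# Example: (x1 or not x2 or x3) and (not x1 or x2 or x4)
formula = [(1, -2, 3), (-1, 2, 4)]
```

For this example, `is_k_sat(formula)` holds, and the density is 2 clauses over 4 variables.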

Phase Transition Scatter Plot

Phase Transition Curve

Unsatisfiable Core
• subset of the clauses that is unsatisfiable
• "proof" or "reason" for unsatisfiability
• very hard to obtain a minimal core
• ZChaff SAT solver: iterative technique, empirically close to minimal
• use unsat core as feature in function fit
• larger the core → more clauses solver visits
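To illustrate what a minimal core is, here is a sketch of deletion-based minimization with a brute-force satisfiability check. This is not the ZChaff iterative technique from the slide; it is a simpler, swapped-in method that is only practical for tiny formulas, and all names are hypothetical. Literals are assumed to be nonzero integers (`-v` is the negation of `v`).

```python
from itertools import product


def is_satisfiable(clauses):
    """Brute-force check: try every assignment (exponential, tiny inputs only)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return True
    return False


def minimal_unsat_core(clauses):
    """Drop clauses one at a time, keeping a deletion only if the rest
    is still unsatisfiable. The result is a minimal core: removing any
    remaining clause makes it satisfiable."""
    if is_satisfiable(clauses):
        raise ValueError("formula is satisfiable, so it has no unsat core")
    core = list(clauses)
    i = 0
    while i < len(core):
        candidate = core[:i] + core[i + 1:]
        if not is_satisfiable(candidate):
            core = candidate  # clause i was not needed
        else:
            i += 1            # clause i is essential, keep it
    return core


# x1 and (not x1) contradict each other; the other clauses are irrelevant.
example = [(1,), (-1,), (1, 2), (-2, 3)]
core = minimal_unsat_core(example)
```

Here the core shrinks to the two contradictory unit clauses, so the core fraction is 2 of 4 clauses. The result is minimal but not necessarily the smallest possible core, which is one reason an exact answer is expensive.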

What I Did
• examined only unsatisfiable 3-SAT formulas
• generated 2550 random unsat 3-SAT formulas
• ran BerkMin SAT solver on each
• ran ZChaff unsat core technique on each
• recorded clauses, vars, unsat core, time
• fit data with Gaussian kernel
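The generation step might look like the sketch below. The slides do not describe the generator, so this is an assumption: clauses are drawn uniformly at random while respecting the 3-SAT rules stated earlier (three distinct variables per clause, no repeated clauses). The function name and parameters are hypothetical.

```python
import math
import random


def random_3sat(num_vars, num_clauses, seed=None):
    """Draw distinct random 3-literal clauses over variables 1..num_vars.
    Literals are nonzero ints; -v is the negation of v."""
    if num_vars < 3:
        raise ValueError("need at least 3 variables for a 3-literal clause")
    # 8 sign patterns for each choice of 3 variables
    max_clauses = 8 * math.comb(num_vars, 3)
    if num_clauses > max_clauses:
        raise ValueError("more clauses requested than distinct clauses exist")

    rng = random.Random(seed)
    seen = set()
    clauses = []
    while len(clauses) < num_clauses:
        variables = rng.sample(range(1, num_vars + 1), 3)  # distinct variables
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        key = frozenset(clause)
        if key not in seen:  # reject duplicates regardless of literal order
            seen.add(key)
            clauses.append(clause)
    return clauses


# Hypothetical usage: 20 variables at clausal density 4.5
formula = random_3sat(num_vars=20, num_clauses=90, seed=0)
```

A random formula is not guaranteed to be unsatisfiable. Presumably each generated formula would be passed to a solver and kept only if reported unsatisfiable, but the slides do not say how the 2550 instances were selected.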

time vs density

time vs unsat core

Bad Results . . .
• biggest outliers when unsat core = clauses

Better Results . . .
• leaving out all formulas with 100% unsatisfiable core
• biggest outliers when unsat core = clauses - 1

Observations
• difficulty extremely volatile even with fixed clauses, vars, and unsat core
• especially volatile at the phase transition
• unsat core helps explain some difficulty, but does not tell the whole story

Questions Remain
• curve not useful to predict the difficulty
  • takes longer to find unsat core than to solve
  • Q: could we predict unsat core if we already have the difficulty?
• most applications don't generate random CNFs
  • Q: how well does the function predict the behavior of non-random formulas?

Another Experiment
• performed regression again, but with time as a feature and unsat core as the value
• obtained 10 CNFs generated by The Alloy Analyzer and converted them to 3-SAT
• predicted the percentage of clauses in the unsat core and compared to the actual percentage
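The slides mention fitting with a Gaussian kernel but do not say which estimator was used. One common choice is Nadaraya-Watson kernel regression, sketched below for a single feature. The estimator, the bandwidth parameter, and the sample numbers are all assumptions for illustration and are not the project's data.

```python
import math


def gaussian_kernel_regression(xs, ys, query, bandwidth):
    """Nadaraya-Watson estimate: a weighted average of the observed ys,
    where each weight is a Gaussian of the distance from query to x."""
    weights = [
        math.exp(-0.5 * ((query - x) / bandwidth) ** 2)
        for x in xs
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all samples for this bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Made-up samples: feature = solve time, value = unsat core fraction.
times = [0.0, 2.0]
core_fractions = [0.0, 10.0]
midpoint = gaussian_kernel_regression(times, core_fractions, query=1.0, bandwidth=1.0)
```

A query halfway between two samples weights them equally, so `midpoint` is the average of the two values. A smaller bandwidth follows the data more closely, while a larger one smooths over the volatility noted in the observations.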

Unsat Core Predictions
• very hard to predict . . .

For Future Students
• lots of engineering completed
  • generation of random CNFs
  • read/write of CNFs to appropriate file format
  • interface with command line SAT solver
  • implement fixed-point technique
  • conversion to 3-SAT
  • regression with kernel
• code, write-up, and this presentation available on my NMM page
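Of the engineering pieces listed, conversion to 3-SAT is the easiest to show in a few lines. The sketch below uses the textbook construction with fresh auxiliary variables; the slides do not say which conversion the project used, so treat this as one possible approach. The function name is hypothetical, and literals are assumed to be nonzero integers (`-v` is the negation of `v`).

```python
def to_3sat(clauses, num_vars):
    """Return (new_clauses, new_num_vars) for an equisatisfiable formula
    in which every clause has exactly 3 literals on distinct variables."""
    next_var = num_vars

    def fresh():
        nonlocal next_var
        next_var += 1
        return next_var

    out = []
    for clause in clauses:
        lits = list(dict.fromkeys(clause))   # drop repeated literals
        if any(-lit in lits for lit in lits):
            continue                         # tautology, always true
        if not lits:
            raise ValueError("empty clause: formula is trivially unsatisfiable")
        if len(lits) == 1:                   # pad with two fresh variables
            a, y, z = lits[0], fresh(), fresh()
            out += [(a, y, z), (a, y, -z), (a, -y, z), (a, -y, -z)]
        elif len(lits) == 2:                 # pad with one fresh variable
            (a, b), y = lits, fresh()
            out += [(a, b, y), (a, b, -y)]
        elif len(lits) == 3:
            out.append(tuple(lits))
        else:                                # split a long clause into a chain
            y = fresh()
            out.append((lits[0], lits[1], y))
            for lit in lits[2:-2]:
                y_next = fresh()
                out.append((-y, lit, y_next))
                y = y_next
            out.append((-y, lits[-2], lits[-1]))
    return out, next_var


# One 5-literal clause, one unit clause, one 2-literal clause over 5 variables
converted, total_vars = to_3sat([(1, 2, 3, 4, 5), (-1,), (-2, 3)], num_vars=5)
```

This sketch does not remove duplicate clauses already present in the input. The converted formula has more clauses and variables than the original, which changes its clausal density.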
