Artificial Intelligence

Artificial Intelligence Search: 4 Search Heuristics Ian Gent ipg@cs.st-and.ac.uk

Artificial Intelligence Search 4 Part I : Depth first search for SAT Part II: Davis-Putnam Algorithm Part III: Heuristics for SAT

Search: the story so far • Example Search problems, SAT, TSP, Games • Search states, Search trees • Don’t store whole search trees, just the frontier • Depth first, breadth first, iterative deepening • Best First • Heuristics for Eights Puzzle • A*, Branch & Bound

Example Search Problem: SAT • We need to define problems and solutions • Propositional Satisfiability (SAT) • really a logical problem -- I’ll present as a letters game • Problem is a list of words • contains upper and lower case letters (order unimportant) • e.g. ABC, ABc, AbC, Abc, aBC, abC, abc • Solution is choice of upper/lower case letter • one choice per letter • each word to contain at least one of our choices • e.g. AbC is unique solution to above problem.

Example Search Problem: SAT • We need to define problems and solutions • Propositional Satisfiability (SAT) • Now present it as a logical problem • Problem is a list of clauses • each clause contains literals • each literal a positive or negative variable • literals are e.g. +A, -B, +C, …. • Solution is choice of true or false for each variable • one choice per variable • each clause to contain at least one of our choices • i.e. +A matches A = true, -A matches A = false

It’s the same thing • Variables = letters • literal = upper or lower case letter • Positive = True = Upper case • Negative = False = Lower case • clause = word • problem = problem • I reserve the right to use either or both versions confusingly
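
The correspondence above can be made concrete in a few lines. The following is a minimal Python sketch, not part of the original lecture: it assumes one-letter variable names, stores each word (clause) as a set of characters, and uses the hypothetical helper names `parse`, `satisfies` and `all_solutions`. It brute-forces the seven-word example from the letters game.

```python
from itertools import product

def parse(words):
    """Each word becomes a clause: a frozenset of one-character literals."""
    return [frozenset(word) for word in words]

def satisfies(choice, clauses):
    """True if every clause contains at least one chosen letter.

    `choice` holds one case per letter, e.g. "AbC" means A=true, B=false, C=true.
    """
    chosen = set(choice)
    return all(clause & chosen for clause in clauses)

def all_solutions(clauses):
    """Brute force: try both cases of every letter (2^n candidates)."""
    letters = sorted({lit.upper() for clause in clauses for lit in clause})
    for choice in product(*[(l, l.lower()) for l in letters]):
        if satisfies(choice, clauses):
            yield "".join(choice)

problem = parse(["ABC", "ABc", "AbC", "Abc", "aBC", "abC", "abc"])
print(list(all_solutions(problem)))  # ['AbC']
```

Brute force like this is only feasible for a handful of letters, which is why the rest of the lecture is about searching more cleverly.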

Depth First Search for SAT • What heuristics should we use? • We need two kinds • variable ordering • e.g. set A before B • value ordering • e.g. set True before False • In Eights, only need value • variable ordering irrelevant • In SAT, variable ordering vital • value ordering less important

Unit Propagation • One heuristic in SAT is vital to success • When we have a unit clause … • e.g. +A • we must set A = true • if we set A = false the clause is unsatisfied, so is the whole problem • A unit clause might be in the original problem • or arise when a clause contains only one unset variable after simplification • e.g. clauses (aBC), (abc) • set A = upper case, B = lower case • what unit clause remains?

Unit Propagation • e.g. clauses (aBC), (abc), • set A = upper case, B = lower case • what unit clause remains? • A = upper gives (BC), (bc) • B = lower case satisfies (bc) • reduces (BC) to (C) • The unit clause is (C) • We should set C = upper case • irrespective of other clauses in the problem • setting one unit clause can create a new one … • leading to a cascade/chain reaction called unit propagation
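
The cascade described above can be sketched in Python. This is an illustrative sketch under assumptions, not the lecture's own code: clauses are frozensets of one-character literals (upper case = true, lower case = false), and `simplify` and `unit_propagate` are hypothetical helper names.

```python
def simplify(clauses, literal):
    """Set `literal` true: drop satisfied clauses, remove the opposite literal elsewhere."""
    opposite = literal.swapcase()
    return [clause - {opposite} for clause in clauses if literal not in clause]

def unit_propagate(clauses):
    """Keep setting unit-clause literals until none remain or an empty clause appears.

    Returns the simplified clauses and the list of literals that were forced.
    """
    forced = []
    while frozenset() not in clauses:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        (literal,) = unit
        forced.append(literal)
        clauses = simplify(clauses, literal)
    return clauses, forced

# The example from the slide: set A = upper case, then B = lower case.
clauses = [frozenset("aBC"), frozenset("abc")]
clauses = simplify(simplify(clauses, "A"), "b")
print(clauses)                  # [frozenset({'C'})]
print(unit_propagate(clauses))  # ([], ['C'])
```

With clauses (A), (aB), (bC), the same function forces A, then B, then C, which is the chain reaction the slide refers to.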

Depth First + Unit Propagation • Unit propagation is vital in SAT • Whenever there is a not-yet-satisfied unit clause • set the corresponding variable to True if the literal is positive • False if the literal is negative • Use this to override all other heuristics • Later in the lecture we will think about other heuristics to use as well • Next we will look at another algorithm

Davis-Putnam • The best complete algorithm for SAT is Davis-Putnam • first work by Davis and Putnam, 1960 • current version by Davis, Logemann and Loveland, 1962 • variously called DP/DLL/DPLL or just Davis-Putnam • I will present a slight variant omitting the “Pure literal” rule • A recursive algorithm • Two stopping cases • an empty set of clauses is trivially satisfiable • an empty clause is trivially unsatisfiable • there is no way to satisfy the clause

Algorithm DPLL (clauses) • 1. If clauses is empty clause set, Succeed • 2. If clauses contains an empty clause, Fail • 3. If clauses contains a unit clause (literal) • return result of DPLL(clauses[literal]) • clauses[literal] means simplify clauses with value of literal • 4. Else heuristically choose a variable u • heuristically choose a value v • 4.a. If DPLL(clauses[u:=v]) succeeds, Succeed • 4.b. Else return result of DPLL(clauses[u:= not v])
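
The four steps above translate almost directly into Python. The sketch below is one possible rendering, not a reference implementation: it assumes clauses are frozensets of one-character literals (upper case = true, lower case = false), and the heuristic in step 4 is a placeholder assumption (smallest literal of the first clause). The function names are hypothetical.

```python
def simplify(clauses, literal):
    """clauses[literal]: drop satisfied clauses, remove the opposite literal elsewhere."""
    opposite = literal.swapcase()
    return [clause - {opposite} for clause in clauses if literal not in clause]

def dpll(clauses):
    """Return the list of literals set to true, or None if unsatisfiable."""
    if not clauses:                        # 1. empty clause set: succeed
        return []
    if frozenset() in clauses:             # 2. empty clause: fail
        return None
    unit = next((c for c in clauses if len(c) == 1), None)
    if unit is not None:                   # 3. unit clause: only one value to try
        choices = list(unit)
    else:                                  # 4. heuristic choice (placeholder)
        literal = min(clauses[0])
        choices = [literal, literal.swapcase()]   # 4.a then 4.b
    for literal in choices:
        rest = dpll(simplify(clauses, literal))
        if rest is not None:
            return [literal] + rest
    return None

words = ["ABC", "ABc", "AbC", "Abc", "aBC", "abC", "abc"]
print(dpll([frozenset(w) for w in words]))            # ['A', 'b', 'C']
print(dpll([frozenset(w) for w in words + ["aBc"]]))  # None
```

Swapping in a different heuristic only changes the line marked as a placeholder, which is why the later slides can treat heuristics separately from the algorithm.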

DPLL success • About 40 years old, DPLL is still the most successful complete algorithm for SAT • Intensive research on variants of DPLL in the 90s • mostly very close to the 1962 version • Implementation can be very efficient • Most work on finding good heuristics • Good heuristics should find solution quickly • or work out quickly that there is no solution

It’s the same thing (again) • DPLL is just depth first search + unit propagation • We’ve now got three presentations of the same thing • search trees • algorithm based on lists • DPLL • Shows the general importance of depth first search

Heuristics for DPLL • We need variable ordering heuristics • can easily make the difference between success/failure • Tradeoff between simplicity and effectiveness • Three very simple variable ordering heuristics • lexicographic: choose A before B before C before … • random: choose a random variable • first occurrence: choose first variable in first clause • Pros: all very easy to implement • Cons: ineffective except on very small or easy problems
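
To show how little code these three heuristics need, here is a hedged Python sketch. It assumes clauses are given as strings of one-character literals so that the order of letters within a clause is preserved; the function names are hypothetical and the clause list is assumed to be non-empty.

```python
import random

def variables(clauses):
    """All variables in the problem, as sorted upper-case letters."""
    return sorted({lit.upper() for clause in clauses for lit in clause})

def lexicographic(clauses):
    """Choose A before B before C before ..."""
    return variables(clauses)[0]

def random_variable(clauses, rng=random):
    """Choose a variable uniformly at random."""
    return rng.choice(variables(clauses))

def first_occurrence(clauses):
    """Choose the first variable in the first clause."""
    return clauses[0][0].upper()

clauses = ["cB", "aC", "bA"]
print(lexicographic(clauses))     # A
print(first_occurrence(clauses))  # C
```

None of these looks at the structure of the problem, which matches the slide's point that they only work on very small or easy instances.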

How can we design better heuristics? • All the basic heuristics listed are unlikely to make the best choice except by good luck • We want to choose variables likely to finish search quickly • How can we design heuristics to do this? • Pick variables occurring in lots of clauses? • Prefer short clauses (AB) or long clauses (ABCDEFG)? • Pick variables occurring more often positively? • We need some design principles underlying our search

Three Design Principles • The Constrainedness Hypothesis • Choose variables which are more constrained than other variables (e.g. pack suits before toothbrush for interview trip) • Motivation: Most constrained first • attack the most difficult part of the problem • it should either fail or succeed and make the rest easy • The Satisfaction Hypothesis • Try to choose variables which seem likely to come closest to satisfying the problem • Motivation: we want to find a solution, so choose the variable which comes as close to that as possible

Three Design Principles • The Simplification Hypothesis • Try to choose variables which will simplify the problem as much as possible via unit propagation • Motivation: search is exponential in the size of the problem so making the problem small quickly minimizes search • Let’s look at 3 heuristics based on these principles • not wildly different from each other • often different principles give similar heuristics

Most Constrained First • Short clauses are most constraining • (A B) rules out 1/4 of all solutions • (A B C D E) only rules out 1/32 of all solutions • Take account only of shortest clauses • e.g. shortest clause in a problem may be of length 2 • Several variants on this idea • first occurrence in shortest clause • most occurrences in shortest clauses (usually many such) • first occurrence in an all-positive shortest clause
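
As an illustration of the "most occurrences in shortest clauses" variant, here is a small Python sketch. It is an assumption-laden example rather than the lecture's implementation: clauses are strings of one-character literals, ties are broken alphabetically (a choice made here, not in the slides), and `most_constrained_variable` is a hypothetical name.

```python
from collections import Counter

def most_constrained_variable(clauses):
    """Pick the variable occurring most often in the shortest clauses."""
    shortest = min(len(clause) for clause in clauses)
    counts = Counter(
        lit.upper()
        for clause in clauses
        if len(clause) == shortest
        for lit in clause
    )
    best = max(counts.values())
    # Tie-break alphabetically so the result is deterministic.
    return min(var for var, n in counts.items() if n == best)

clauses = ["AB", "bC", "Bc", "ACDE"]
print(most_constrained_variable(clauses))  # B
```

Here the length-4 clause is ignored entirely, and B wins because it appears in all three clauses of length 2.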

Satisfaction Hypothesis • Try to satisfy as much as possible with next literal • Take account of different lengths • clause of length i rules out a fraction $2^{-i}$ of all solutions • weight each clause by the number $2^{-i}$ • For each literal, calculate weighted sum • add the weight of each clause the literal appears in • the larger this sum, the more difficulties are eliminated • This is the Jeroslow-Wang Heuristic • Variable and value ordering
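
The weighted sum described above is short enough to write out. The following Python sketch follows the slide's description (one score per literal, so it gives both a variable and a value); it assumes clauses are strings of one-character literals with upper case = true, breaks ties alphabetically as an arbitrary choice, and uses the hypothetical name `jeroslow_wang`.

```python
from collections import defaultdict

def jeroslow_wang(clauses):
    """Return the literal with the largest sum of 2^(-length) over its clauses."""
    score = defaultdict(float)
    for clause in clauses:
        weight = 2.0 ** -len(clause)
        for literal in set(clause):
            score[literal] += weight
    # Iterate in sorted order so ties are broken deterministically.
    return max(sorted(score), key=score.get)

clauses = ["AB", "Ac", "aBC", "bcd"]
print(jeroslow_wang(clauses))  # A
```

In this example A scores 1/4 + 1/4 = 1/2, ahead of B and c at 3/8 each, so the heuristic says to set A to upper case (true) first.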

Simplification Hypothesis • We want to simplify problem as much as possible • i.e. get biggest possible cascade of unit propagation • One approach is to suck it and see • make an assignment, see how much unit propagation occurs • after testing all assignments, choose the one which caused the biggest cascade • exhaustive version is expensive (2n probes necessary: two values for each of n variables) • Successful variants probe a small number of promising variables (e.g. from most constrained heuristic)
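
A "suck it and see" probe can be sketched as follows. This is a simplified illustration under stated assumptions: clauses are frozensets of one-character literals (upper case = true), the cascade is measured as the number of literals forced by unit propagation, and a probe that runs into an empty clause simply stops counting. The names `cascade_size` and `probe` are hypothetical.

```python
def simplify(clauses, literal):
    """Set `literal` true: drop satisfied clauses, remove the opposite literal elsewhere."""
    opposite = literal.swapcase()
    return [clause - {opposite} for clause in clauses if literal not in clause]

def cascade_size(clauses):
    """Count literals forced by unit propagation (stops at an empty clause)."""
    forced = 0
    while frozenset() not in clauses:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        (literal,) = unit
        clauses = simplify(clauses, literal)
        forced += 1
    return forced

def probe(clauses, candidates=None):
    """Try each candidate literal and return the one with the biggest cascade.

    With no candidates given, probe both values of every variable (2n probes).
    """
    if candidates is None:
        variables = sorted({lit.upper() for c in clauses for lit in c})
        candidates = [lit for v in variables for lit in (v, v.lower())]
    return max(candidates, key=lambda lit: cascade_size(simplify(clauses, lit)))

clauses = [frozenset(w) for w in ["aB", "aC", "bD", "cE"]]
print(probe(clauses))  # A
```

Setting A to upper case forces B, C, D and E in turn, a cascade of four. Passing a short `candidates` list corresponds to the cheaper variants that probe only a few promising variables.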

Conclusions • Unit propagation vital to SAT • Davis-Putnam (DP/DLL/DPLL) successful • = depth first + unit propagation • Need heuristics, especially variable ordering • Three design principles help • Not yet clear which is the best • Heuristic design is still a black art
