Time-Space Tradeoffs in Resolution: Superpolynomial Lower Bounds for Superlinear Space

Chris Beck, Princeton University. Joint work with Paul Beame & Russell Impagliazzo.

SAT • The satisfiability problem is a central problem in computer science, in theory and in practice. • Terminology: • A Clause is a boolean formula which is an OR of variables and negations of variables. • A CNF formula is an AND of clauses. • Object of study for this talk: CNF-SAT: Given a CNF formula, is it satisfiable?

Resolution Proof System • Lines are clauses; there is one simple proof step (resolution) • A proof is a sequence of clauses, each of which is • an original clause, or • follows from previous ones via a resolution step • A CNF is UNSAT iff the empty clause ⊥ can be derived
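
The resolution step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the talk: the DIMACS-style encoding (literal `+v` for a variable, `-v` for its negation) and the name `resolve` are assumptions made here.

```python
from typing import FrozenSet, Optional

Clause = FrozenSet[int]  # assumed encoding: +v is variable v, -v is its negation


def resolve(c1: Clause, c2: Clause, var: int) -> Optional[Clause]:
    """Resolve c1 (containing var) with c2 (containing -var) on var.

    Returns None when the step does not apply.
    """
    if var not in c1 or -var not in c2:
        return None
    return frozenset((c1 - {var}) | (c2 - {-var}))


# Refuting the UNSAT CNF (x1) AND (NOT x1 OR x2) AND (NOT x2):
c_a = frozenset({1})
c_b = frozenset({-1, 2})
c_c = frozenset({-2})
step1 = resolve(c_a, c_b, 1)    # derives (x2)
empty = resolve(step1, c_c, 2)  # derives the empty clause
```

Here the proof is the sequence `c_a, c_b, step1, c_c, empty`; every line is either an input clause or a resolvent of two earlier lines.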

Proof DAG General resolution: Arbitrary DAG Tree-like resolution: DAG is a tree

SAT Solvers • Well-known connection between Resolution and SAT solvers based on backtracking • These algorithms are very powerful • Sometimes they can quickly handle CNFs with millions of variables. • On UNSAT formulas, the computation history yields a Resolution proof. • Tree-like Resolution ≈ DPLL algorithm • General Resolution ≿ DPLL + “Clause Learning” • Best current SAT solvers use this approach

SAT Solvers • DPLL algorithm: backtracking search for a satisfying assignment • Given a formula which is not constant, guess a value for one of its variables 𝑥, and recurse on the simplified formula. • If we don’t find an assignment, set 𝑥 the other way and recurse again. • If one of the recursive calls yields a satisfying assignment, adjoin the good value of 𝑥 and return; otherwise report fail.
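
The backtracking search described above might look as follows. This is a bare-bones sketch under assumed conventions (clauses as frozensets of `+v` / `-v` literals; the helper names `simplify` and `dpll` are made up here), with no unit propagation or clause learning.

```python
from typing import Dict, FrozenSet, List, Optional

Clause = FrozenSet[int]  # assumed encoding: +v / -v literals


def simplify(cnf: List[Clause], lit: int) -> List[Clause]:
    """Set literal `lit` to true: drop satisfied clauses, shrink the rest."""
    return [c - {-lit} for c in cnf if lit not in c]


def dpll(cnf: List[Clause]) -> Optional[Dict[int, bool]]:
    """Plain backtracking search for a satisfying assignment."""
    if not cnf:                          # constant true: nothing left to satisfy
        return {}
    if any(len(c) == 0 for c in cnf):    # constant false: an empty clause
        return None
    x = abs(next(iter(cnf[0])))          # pick a variable from the first clause
    for value in (True, False):          # try one value, then the other
        lit = x if value else -x
        result = dpll(simplify(cnf, lit))
        if result is not None:
            result[x] = value            # adjoin the good value of x
            return result
    return None                          # both branches failed
```

The only state kept is the recursion stack, which is why plain DPLL needs so little memory.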

SAT Solvers • DPLL requires very little memory • Clause learning adds a new clause to the input CNF every time the search backtracks • Uses lots of memory to try to beat DPLL. • In practice, must use heuristics to guess which clauses are “important” and store only those. Hard to do well! Memory becomes a bottleneck. • Question: Is this inherent? Or can the right heuristics avoid the memory bottleneck?

Proof Complexity & SAT Solvers • Proof Size ≤ Time for Ideal SAT Solver • Proof Space ≤ Memory for Ideal SAT Solver • Many explicit hard UNSAT examples known with exponential lower bounds for Resolution Proof Size. • Question: Is this also true for Proof Space?

Space in Resolution • Clause space [Esteban, Torán ‘99]: the maximum number of clauses that must be in memory at any time step. [Figure: clauses in memory plotted against time step]
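
As a small illustration of the clause space measure, the sketch below represents a proof as a list of memory snapshots and takes the largest one. The snapshot representation and the name `clause_space` are assumptions for illustration only.

```python
from typing import FrozenSet, List, Set

Clause = FrozenSet[int]  # assumed encoding: +v / -v literals


def clause_space(configurations: List[Set[Clause]]) -> int:
    """Largest number of clauses held in memory at any single time step."""
    return max(len(memory) for memory in configurations)


# Refuting (x1) AND (NOT x1 OR x2) AND (NOT x2), one memory snapshot per step.
x1, imp, not_x2 = frozenset({1}), frozenset({-1, 2}), frozenset({-2})
x2, bottom = frozenset({2}), frozenset()
history = [
    {x1},                   # download an input clause
    {x1, imp},              # download another
    {x1, imp, x2},          # infer (x2) by resolution
    {x2},                   # erase clauses no longer needed
    {x2, not_x2},           # download the last input clause
    {x2, not_x2, bottom},   # infer the empty clause
]
```

For this history, `clause_space(history)` is 3.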

Lower Bounds on Space? • Considering Space alone, tight lower and upper bounds of Θ(𝑛), for explicit tautologies of size 𝑛. • Lower Bounds: [ET’99, ABRW’00, T’01, AD’03] • Upper Bound: All UNSAT formulas on 𝑛 variables have a tree-like refutation of space ≤ 𝑛. [Esteban, Torán ‘99] • For a tree-like proof, Space is at most the height of the tree, which is the stack height of the DPLL search • But these tree-like proofs are typically too large to be practical, so we should instead ask if both small space and time are simultaneously feasible.

Size-Space Tradeoffs for Resolution • [Ben-Sasson ‘01] Pebbling formulas with linear size refutations, but for which all proofs have Space · log Size = Ω(𝑛/log 𝑛). • [Ben-Sasson, Nordström ‘10] Pebbling formulas which can be refuted in Size O(𝑛), Space O(𝑛), but Space O(𝑛/log 𝑛) ⇒ Size exp(𝑛^{Ω(1)}). • But these are all for Space < 𝑛, and SAT solvers generally can afford to store the input formula in memory. Can we break the linear space barrier?

Size-Space Tradeoffs • Informal Question: Can we formally show that memory rather than time can be a real bottleneck for resolution proofs and SAT solvers? • Formal Question (Ben-Sasson): “Does there exist a 𝑐 such that any CNF with a refutation of size 𝑇 also has a refutation of size 𝑇^𝑐 in space O(𝑛)?” • Our results: Families of formulas of size 𝑛 having refutations in Time, Space 𝑛^{log 𝑛}, but all resolution refutations have 𝑇 > (𝑛^{0.58 log 𝑛}/𝑆)^{loglog 𝑛/logloglog 𝑛}. Not even close.

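Tseitin Tautologies • Given an undirected graph 𝐺 = (𝑉, 𝐸) and 𝜒: 𝑉 → 𝔽2, define a CSP: • Boolean variables: 𝑥_𝑒 for each edge 𝑒 ∈ 𝐸 • Parity constraints (linear equations): for each vertex 𝑣, the sum of 𝑥_𝑒 over the edges 𝑒 incident to 𝑣 equals 𝜒(𝑣) (mod 2) • When 𝜒 has odd total parity, the CSP is UNSAT.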
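
To make the CSP concrete, here is a hedged sketch that writes each vertex's parity constraint as clauses. The function name `tseitin_cnf`, the name `chi` for the vertex labelling, and the numbering of edge variables (edge `i` becomes variable `i + 1`) are all assumptions chosen for this example.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[int, int]


def tseitin_cnf(edges: List[Edge], chi: Dict[int, int]) -> List[FrozenSet[int]]:
    """CNF for: at each vertex v, XOR of incident edge variables = chi[v].

    Edge number i (0-based) becomes variable i + 1; literals are +v / -v.
    """
    incident: Dict[int, List[int]] = {v: [] for v in chi}
    for i, (u, w) in enumerate(edges):
        incident[u].append(i + 1)
        incident[w].append(i + 1)
    cnf = []
    for v, variables in incident.items():
        # Forbid every assignment to the incident edges with the wrong parity.
        for bits in product((0, 1), repeat=len(variables)):
            if sum(bits) % 2 != chi[v]:
                cnf.append(frozenset(x if b == 0 else -x
                                     for x, b in zip(variables, bits)))
    return cnf


# Triangle with one odd vertex: odd total parity, so the CNF is UNSAT.
triangle = [(0, 1), (1, 2), (0, 2)]
odd = tseitin_cnf(triangle, {0: 1, 1: 0, 2: 0})
```

A vertex of degree d contributes 2^(d-1) clauses, so the encoding is only practical for graphs of modest degree.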

Tseitin Tautologies • When 𝜒 is odd and 𝐺 is connected, the corresponding CNF is called a Tseitin tautology. [Tseitin ‘68] • Only the total parity of 𝜒 matters • Hard when 𝐺 is a constant-degree expander: [Urquhart ‘87]: Resolution size = 2^{Ω(𝐸)}; [Torán ‘99]: Resolution space = Ω(𝐸) • This work: Tradeoffs on the 𝑛 × 𝑙 grid, 𝑙 ≫ 𝑛, and similar graphs, using isoperimetry.

Complexity Measure • To measure progress as it occurs in the proof, want to define a complexity measure, which assigns a value to each clause in the proof. • Wish list: • Input clauses have small value • Final clause has large value • Doesn’t grow quickly in any one resolution step

Complexity Measure for Tseitin • Say an assignment to an (unsat) CNF is critical if it violates only one constraint. • For 𝜋 critical to Tseitin formula 𝜏, “𝜋’s vertex”: 𝑣_𝜏(𝜋) ≔ vertex of that constraint • For any clause 𝐶, define the “critical vertex set”: 𝑐𝑟𝑖𝑡_𝜏(𝐶) ≔ { 𝑣_𝜏(𝜋) : 𝜋 critical to 𝜏, 𝜋 falsifies 𝐶 }
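
For very small graphs the critical vertex set can be computed by brute force, which may help in checking the definition. The sketch below is illustrative only: the name `critical_set`, the name `chi` for the vertex labelling, and the literal encoding (edge `i` is variable `i + 1`) are assumptions, and the enumeration is exponential in the number of edges.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[int, int]


def critical_set(edges: List[Edge], chi: Dict[int, int],
                 clause: FrozenSet[int]) -> Set[int]:
    """Vertices v such that some critical assignment falsifying `clause`
    violates exactly the parity constraint at v."""
    crit: Set[int] = set()
    for bits in product((0, 1), repeat=len(edges)):
        # pi falsifies the clause iff every literal of the clause is false.
        if any(bits[abs(l) - 1] == (1 if l > 0 else 0) for l in clause):
            continue
        violated = [v for v in chi
                    if sum(bits[i] for i, e in enumerate(edges) if v in e) % 2 != chi[v]]
        if len(violated) == 1:        # pi is critical
            crit.add(violated[0])
    return crit


# Path 0 - 1 - 2 - 3 with the single odd vertex at 0.
path = [(0, 1), (1, 2), (2, 3)]
chi = {0: 1, 1: 0, 2: 0, 3: 0}
```

On this path, the empty clause has critical set {0, 1, 2, 3}, while the clause (x2), which assigns the middle edge, has critical set {0, 1}: only the component with the parity mismatch stays critical.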

Critical Set Examples [Figure: 𝑛 × 𝑙 grid; blue = 0, red = 1] • 𝜒 function: one odd vertex in the corner • For the empty assignment, the critical set is everything. • In these examples, the graph is a grid.

Critical Set Examples [Figure: 𝑛 × 𝑙 grid; blue = 0, red = 1] • For a clause that doesn’t cut the graph, the critical set is … still everything.

Critical Set Examples [Figure: 𝑛 × 𝑙 grid; blue = 0, red = 1] • For this clause, there are several components. The parity mismatch is in only one of them: the upper left is critical.

Complexity Measure • Define 𝜇(𝐶) ≔ |𝑐𝑟𝑖𝑡_𝜏(𝐶)|. Then 𝜇 is a sub-additive complexity measure: • 𝜇(clause of 𝜏) = 1, • 𝜇(⊥) = number of vertices, • 𝜇(𝐶) ≤ 𝜇(𝐶1) + 𝜇(𝐶2), when 𝐶1, 𝐶2 ⊢ 𝐶. • Useful property: Every edge on the boundary of 𝑐𝑟𝑖𝑡_𝜏(𝐶) is assigned by 𝐶.
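
The sub-additivity bullet follows from a short argument, sketched here for a single resolution step. The names A, B and x are introduced only for this derivation: write C1 = A ∨ x, C2 = B ∨ ¬x and C = A ∨ B, and recall that a critical assignment π is total, so π(x) is defined.

```latex
\begin{align*}
\pi \text{ falsifies } C = A \lor B
  &\;\Longrightarrow\; \pi \text{ falsifies } A \lor x \ (\text{if } \pi(x)=0)
   \ \text{ or } \ \pi \text{ falsifies } B \lor \lnot x \ (\text{if } \pi(x)=1) \\
  &\;\Longrightarrow\; \mathrm{crit}_\tau(C) \subseteq \mathrm{crit}_\tau(C_1) \cup \mathrm{crit}_\tau(C_2) \\
  &\;\Longrightarrow\; \mu(C) \le \mu(C_1) + \mu(C_2).
\end{align*}
```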

Graph for our result • Take as our graph 𝐺 ≔ 𝐾𝑛 ⊗ 𝑃𝑙 and form the Tseitin tautology, 𝜏. • For now we’ll take 𝑙 = 2𝑛^4, but it’s only important that it is ≫ 𝑛, and not too large. • Formula size […]. [Figure: the graph, with dimensions 𝑛 and 𝑙 labeled]

A Refutation • A Tseitin formula can be viewed as a system of inconsistent 𝔽2-linear equations. If we add them all together, we get 1 = 0, a contradiction. • If we order the vertices (equations) intelligently (column-wise), then when we add them one at a time, we never have more than 𝑛^2 variables at once. • Resolution can simulate this, with some blowup: • yields Size, Space ≈ 2^{𝑛^2} (for a […]-sized formula)
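
The column-wise summation can be simulated directly. The sketch below rests on an assumption about the graph that the slides do not spell out: every vertex in column j is joined to every vertex in column j + 1, giving n² edges between adjacent columns. The function name `column_sweep` and the placement of the single odd vertex at (0, 0) are also choices made here.

```python
from typing import FrozenSet, Set, Tuple

Vertex = Tuple[int, int]          # (row, column)
Edge = FrozenSet[Vertex]


def column_sweep(n: int, l: int) -> Tuple[int, int]:
    """Add the parity equations of an n-by-l graph one vertex at a time.

    Assumed graph: all n*n edges between each pair of adjacent columns.
    chi is 1 at vertex (0, 0) and 0 elsewhere.
    Returns (max number of live variables, final right-hand side).
    """
    live: Set[Edge] = set()       # edge variables present in the running sum
    rhs = 0
    widest = 0
    for col in range(l):          # column-wise vertex order
        for row in range(n):
            incident = {frozenset({(row, col), (r, c)})
                        for c in (col - 1, col + 1) if 0 <= c < l
                        for r in range(n)}
            live ^= incident      # adding equations over F_2 is symmetric difference
            rhs ^= 1 if (row, col) == (0, 0) else 0
            widest = max(widest, len(live))
    assert not live               # each edge variable occurs in exactly two equations
    return widest, rhs
```

Under that assumption, `column_sweep(3, 5)` returns `(9, 1)`: the running equation never has more than n² = 9 variables, and the final sum reads 0 = 1. A parity constraint on w variables expands into 2^(w-1) clauses, which is where the exponential blowup in the resolution simulation comes from.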

A Different Refutation • Can also do a “divide & conquer” DPLL refutation. Idea is to repeatedly bisect the graph and branch on the variables in the cut. • Once we split the CNF into two pieces, can discard one of them based on the parity of the assignment to the cut. • After about 𝑛^2 log 𝑛 queries, we’ve found a violated clause; the idea yields a tree-like proof with Space O(𝑛^2 log 𝑛), Size 2^{𝑛^2 log 𝑛}. (Savitch-like savings in Space)

Complexity vs. Time • Consider the time ordering of any proof, and plot the complexity of clauses in memory vs. time • Only constraints: start low, end high, and because of sub-additivity, cannot skip over any [𝑡, 2𝑡] window of 𝜇-values on the way up. [Figure: 𝜇 plotted against time, rising from the input clauses to ⊥]

Medium complexity clauses • Fix 𝑡0 = 𝑛^4 and say that clause 𝐶 has complexity level 𝑖 iff 2^𝑖 𝑡0 ≤ 𝜇(𝐶) < 2^{𝑖+1} 𝑡0 • Medium complexity clause: complexity level between 0 and log 𝑛 − 1 • By choice of parameters, 𝑛^4 ≤ 𝜇(𝐶) < |𝑉|/2 • By subadditivity, can’t skip any level going up.
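
The level definition is easy to compute exactly on integers. In this sketch the function names are made up, and log 𝑛 is taken to be the base-2 logarithm rounded down, which is an assumption about the slide's convention.

```python
from typing import Optional


def complexity_level(mu: int, n: int) -> Optional[int]:
    """Level i with 2**i * t0 <= mu < 2**(i+1) * t0, where t0 = n**4.

    Returns None for low-complexity clauses (mu < t0).
    """
    t0 = n ** 4
    if mu < t0:
        return None
    return (mu // t0).bit_length() - 1   # floor(log2(mu / t0)), exact on integers


def is_medium(mu: int, n: int) -> bool:
    """Medium: level between 0 and log n - 1 (log base 2, rounded down: assumed)."""
    level = complexity_level(mu, n)
    return level is not None and level < n.bit_length() - 1
```

For n = 4 this gives t0 = 256 and medium range 256 ≤ μ < 1024, which matches 𝑛^4 ≤ 𝜇(𝐶) < |𝑉|/2 when |𝑉| = 𝑛 · 𝑙 = 2𝑛^5.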

Complexity vs. Time • Consider the time ordering of any proof, and divide time into 𝑚 equal epochs (fix 𝑚 later) [Figure: clause complexity (Low / Med / Hi) plotted against time]

Two Possibilities • Either a clause of medium complexity appears in memory for at least one of the breakpoints between epochs, [Figure: clause complexity (Low / Med / Hi) plotted against time]

Two Possibilities • Or all breakpoints only have Hi and Low. Then there must be an epoch which starts Low and ends Hi, and so has clauses of all log 𝑛 Medium levels. [Figure: clause complexity (Low / Med / Hi) plotted against time]

Don’t know how to use this directly • Not much to go on with formula for original graph • However, a Tseitin formula for graph G contains Tseitin formulas for all subgraphs of G • Just set edge variables to 0,1 to delete them. • Refutation for G contains refutations for all subgraphs of G

Idea: Random Restrictions • A restriction 𝜌 is a partial assignment of truth values to variables, simplifying formulas. • 𝐹 a CNF, yields restricted formula 𝐹|𝜌 • Π a proof of 𝐹, yields restricted refutation Π|𝜌 of 𝐹|𝜌; size and space don’t increase. • Idea: Choose 𝜌 randomly, and see which of the two possibilities happens in the restricted proof.
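
Applying a restriction to a CNF can be sketched as below; the same operation applied to every line of a proof gives the restricted proof. The function name `restrict` and the `+v` / `-v` literal encoding are assumptions for illustration.

```python
from typing import Dict, FrozenSet, List

Clause = FrozenSet[int]  # assumed encoding: +v / -v literals


def restrict(cnf: List[Clause], rho: Dict[int, bool]) -> List[Clause]:
    """Apply the partial assignment rho (variable -> value) to a CNF."""
    restricted = []
    for clause in cnf:
        if any(abs(l) in rho and rho[abs(l)] == (l > 0) for l in clause):
            continue                      # clause satisfied by rho: it disappears
        # Literals falsified by rho are removed; unset literals stay.
        restricted.append(frozenset(l for l in clause if abs(l) not in rho))
    return restricted


formula = [frozenset({1, 2}), frozenset({-1, 3}), frozenset({-2, -3})]
after = restrict(formula, {1: True})
```

Clauses are only deleted or shortened, never added, which is why neither size nor space can increase under a restriction.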

Medium Sets have large Boundary • Claim: For any medium set 𝑆 ⊆ 𝑉, |𝛿(𝑆)| ≥ 𝑛^2. • Lemma: If we set each edge to 0 or 1, or leave it unset, each with probability 1/3, then w.h.p. 𝐺 stays connected, and the probability that a clause becomes medium complexity is at most exp(−𝑛^2).
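
The random restriction in the lemma is simple to sample, and one can check empirically whether the graph on the unset edges stays connected. This is only an experimental sketch: the helper names, the use of `None` for an unset edge, and the seeded generator are assumptions, and it says nothing about the probability bound itself.

```python
import random
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]


def random_restriction(edges: List[Edge],
                       rng: random.Random) -> Dict[Edge, Optional[int]]:
    """Each edge independently becomes 0, 1, or unset (None), each with prob. 1/3."""
    return {e: rng.choice((0, 1, None)) for e in edges}


def is_connected(vertices: List[int], edges: List[Edge]) -> bool:
    """Depth-first search connectivity check."""
    if not vertices:
        return True
    adj: Dict[int, List[int]] = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


# Example: restrict a 4-cycle and test the graph formed by the unset edges.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
rho = random_restriction(cycle, random.Random(0))
surviving = [e for e, value in rho.items() if value is None]
still_connected = is_connected([0, 1, 2, 3], surviving)
```

A sparse graph like this 4-cycle will usually fall apart; the lemma relies on the much denser graph used in the talk.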

Intuition • Time divided into 𝑚 epochs • If we apply a random restriction, and 𝑚𝑆 is smaller than exp(𝑛^2), then typically the first scenario will not occur in the restricted proof. [Figure: clause complexity (Low / Med / Hi) plotted against time]

Intuition • Time divided into 𝑚 epochs • If 𝑇/𝑚 is also small, we show that the second scenario is also unlikely in the restricted proof. [Figure: clause complexity (Low / Med / Hi) plotted against time]

Extended Isoperimetric Inequality • Lemma: For 𝑆1, 𝑆2 ⊆ 𝑉, both medium, with 2|𝑆1| < |𝑆2|, we have |𝛿(𝑆1) ∪ 𝛿(𝑆2)| ≥ 2𝑛^2. • Two medium sets of very different sizes ⇒ the boundary guarantee doubles. • Also: 𝑘 medium sets of super-increasing sizes ⇒ the boundary guarantee goes up by a factor of 𝑘.

Consequence of Extended Isoperimetric Inequality • General Lemma: If we set each edge to 0 or 1, or leave it unset, each with probability 1/3, then for any fixed set of 𝑘 clauses, the probability of ending up with 𝑘 distinct medium complexity levels is at most exp(−𝑘𝑛^2).

A tradeoff • Back to the second scenario: Now set 𝑘 = log 𝑛. There are at most (𝑇/𝑚)^𝑘 𝑘-tuples of clauses from any epoch, and each has all distinct complexity levels with probability at most exp(−𝑘𝑛^2). • So if […], w.h.p. no epoch has […] levels. • Both scenarios can’t be unlikely! Optimizing gives […] and a nontrivial but small tradeoff. [Figure: clause complexity (Low / Med / Hi) plotted against time]

An even better tradeoff • Don’t just divide into epochs once • Recursively divide the proof into epochs and sub-epochs, where each epoch contains 𝑚 sub-epochs of the next smaller size • Prove that if a lot of progress happens in some epoch, and the breakpoints of its sub-epochs don’t contain many different levels, then a lot of progress happens in some sub-epoch

An even better tradeoff • If an epoch contains a clause at the end of level […], but every clause at the start is level […] (so the epoch makes progress), • and the breakpoints of its children epochs contain together […] complexity levels, • then some child epoch makes progress.

An even better tradeoff • Depth of recursion will be 𝑘; choose 𝑘 so that […] • If for each epoch the breakpoints have only […] levels, the previous slide shows that some 𝑖’th level epoch makes progress, for each depth 𝑖. • In particular some leaf, of size […], contains […] levels. Choose […] so […], so all events are rare together. Finally, a union bound yields […]

Final Tradeoff • Tseitin formulas on 𝐾𝑛 ⊗ 𝑃𝑙 for 𝑙 = 2𝑛^4 • are of size […] • have resolution refutations in Size, Space ≈ 2^{𝑛^2} • have O(𝑛^2 log 𝑛) space refutations of Size 2^{𝑛^2 log 𝑛} • and any resolution refutation of Size 𝑇 and Space 𝑆 requires 𝑇 > (2^{0.58 𝑛^2}/𝑆)^{loglog 𝑛/logloglog 𝑛} • If space is at most 2^{𝑛^2/2} then size blows up by a super-polynomial amount

Proof DAG • “Regular”: On every root-to-leaf path, no variable is resolved more than once.

Tradeoffs for Regular Resolution • Theorem: For any 𝑘, there are 4-CNF formulas (Tseitin formulas on long and skinny grid graphs) of size 𝑛 with • Regular resolution refutations in Size 𝑛^{𝑘+1}, Space 𝑛^𝑘. • But with Space only 𝑛^{𝑘−𝜖}, for any 𝜖 > 0, any regular resolution refutation requires size at least 𝑛^{log log 𝑛 / log log log 𝑛}.

Regular Resolution • Can define partial information more precisely • Complexity is monotonic with respect to proof DAG edges. This part uses the regularity assumption, and simplifies arguments with the complexity plot. • Random Adversary selects random assignments based on the proof • No random restrictions: conceptually clean, and we don’t lose constant factors here and there.

Open Questions • More than quasi-polynomial separations? • For Tseitin formulas, the upper bound for small space is only a log 𝑛 power of the unrestricted size • Candidate formulas? • Are these even possible? • Other proof systems? • Other cases for separating search paradigms: “dynamic programming” vs. “binary search”?

Thanks!
