Understanding Problem Hardness: Recent Developments and Directions Bart Selman Cornell University
Introduction & Motivation • Computational Challenges in Planning, Reasoning, Learning, and Adaptation. • What are the characteristics of challenging computational problems?
A Few Examples • Reasoning • many forms of deduction • abduction / diagnosis (e.g. de Kleer 1989) • default reasoning (e.g. Kautz and Selman 1989) • Bayesian inference (e.g. Dagum and Luby 1993) • Planning • domain-dependent and independent (STRIPS) (e.g. Chapman 1987; Gupta and Nau 1991; Bylander 1994) • Learning • neural net “loading” problem (e.g. Blum and Rivest 1989) • Bayesian net learning • decision tree learning
An abundance of negative complexity results for many interesting tasks. • Results often apply to very restricted formalisms, and also to finding approximate solutions. • But these are worst-case results; what about the average case? • Sometimes “surprising” results. • A closer look leads to new insights, algorithms, and solution strategies.
Outline • A --- “Early” results: phase transitions & computational hardness • B --- Current focus: • --- problem mixtures (tractable / intractable) • --- adding global structure • C --- Future directions and prospects • --- modeling resource constraints • --- adaptive computing • --- deeper theoretical understanding
A. “Early” Results • (‘90-’95)
Example Domain: Satisfiability • SAT: Given a formula in propositional calculus, is there an assignment to its variables making it true? • We consider clausal form, e.g.: (a ∨ b ∨ c) ∧ (b ∨ d) ∧ (b ∨ c ∨ e) ∧ . . . • The canonical NP-complete problem (“exponential search space”).
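To make the clausal form and the "exponential search space" concrete, here is a minimal sketch of a brute-force satisfiability check. The integer encoding of literals (positive k for variable k, negative k for its negation) and the function names are assumptions made for illustration, not something taken from the slides.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Return True if every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, num_vars):
    """Enumerate all 2^num_vars assignments; return a satisfying one or None."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {v + 1: bits[v] for v in range(num_vars)}
        if satisfies(clauses, assignment):
            return assignment
    return None
```

For example, `brute_force_sat([[1, -2], [2]], 2)` encodes (x1 ∨ ¬x2) ∧ x2. The loop over `product` is exactly the exponential enumeration that practical solvers try to avoid.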
Generating Hard Random Formulas • Key: Use fixed-clause-length model. • (Mitchell, Selman, and Levesque 1992; Kirkpatrick and Selman 1994) • Critical parameter: ratio of the number of clauses to the number of variables. • Hardest 3SAT problems at ratio = 4.25
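A sketch of how instances might be drawn under the fixed-clause-length model: each clause picks k distinct variables uniformly at random and negates each with probability 1/2. The function name, the signed-integer literal encoding, and the rounding of the clause count are illustrative assumptions.

```python
import random

def random_ksat(num_vars, ratio, k=3, rng=None):
    """Generate a random k-SAT formula with round(ratio * num_vars) clauses.

    Literals are signed integers: +v for variable v, -v for its negation.
    """
    rng = rng or random.Random()
    num_clauses = round(ratio * num_vars)
    clauses = []
    for _ in range(num_clauses):
        # k distinct variables per clause, each negated with probability 1/2
        variables = rng.sample(range(1, num_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses
```

Calling `random_ksat(200, 4.25)` would produce an instance near the ratio the slide identifies as hardest for 3SAT; sweeping `ratio` from low to high is how the easy-hard-easy pattern is usually observed.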
Intuition • At low ratios: • few clauses (constraints) • many assignments • easily found • At high ratios: • many clauses • inconsistencies easily detected
Theoretical Status of Threshold • Very challenging problem ... • Current status: 3SAT threshold lies between 3.003 and 4.6. (Motwani et al. 1994; Broder et al. 1992; Frieze and Suen 1996; Dubois 1990, 1997; Kirousis et al. 1995; Friedgut 1997; Achlioptas et al. 1999 / related work: Beame, Karp, Pitassi, and Saks 1998; Bollobas, Borgs, Chayes, Han Kim, and Wilson 1999)
The study of phase transitions in combinatorial problems is an active research area with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures). • Also, a close interaction between experimental and theoretical work. (Experimental findings are quite often confirmed by formal analysis within months to a few years.) • Finally, relevance to applications via algorithmic advances and the notion of “critically constrained problems”.
Consequences for Algorithm Design • Work on phase transition instances led to improvements in algorithms: • --- local search methods (e.g., GSAT / Walksat) (Selman et al. 1992; 1996; Min Li 1996; Hoos 1998, etc.) • --- backtrack-style methods (Davis-Putnam and variants / complete) (Crawford 1993; Dubois 1994; Bayardo 1997; Zane 1998, etc.)
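As an illustration of the local search family mentioned above, here is a simplified sketch in the spirit of Walksat. It is not the published algorithm: the greedy step scores a flip by the total number of unsatisfied clauses rather than by a break count, and the parameter names (`max_flips`, `noise`) and the signed-integer literal encoding are assumptions.

```python
import random

def walksat(clauses, num_vars, max_flips=10000, noise=0.5, rng=None):
    """Simplified Walksat-style local search; returns an assignment or None."""
    rng = rng or random.Random()
    assign = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}

    def is_true(lit):
        return assign[abs(lit)] == (lit > 0)

    def num_unsat():
        return sum(not any(is_true(l) for l in c) for c in clauses)

    def cost_after_flip(v):
        # Number of unsatisfied clauses if variable v were flipped
        assign[v] = not assign[v]
        n = num_unsat()
        assign[v] = not assign[v]
        return n

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(is_true(l) for l in c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)  # focus on one violated clause
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random-walk step
        else:
            var = min((abs(l) for l in clause), key=cost_after_flip)  # greedy step
        assign[var] = not assign[var]
    return None  # incomplete method: failure does not prove unsatisfiability
```

Returning `None` only means the flip budget ran out, which is the usual trade-off between incomplete local search and the complete backtrack-style methods listed alongside it.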
Progress • Propositional reasoning and search (SAT): • 1990: 100 variables / 200 clauses (constraints) • 1998: 10,000 - 100,000 variables / 10^6 clauses • Novel applications: • e.g. in planning (Kautz & Selman), • program debugging (Jackson), • protocol verification (Clarke), and • machine learning (Resende).
B. Current Focus • --- mixtures of problem classes, e.g., 2-SAT and 3-SAT (“moving between P and NP”): the 2+p-SAT model • --- structured instances: perturbed quasigroup completion problems
Focus --- 1) mixtures: 2+p-SAT problem • mixture of binary and ternary clauses • p = fraction ternary • p = 0.0 --- 2-SAT / p = 1.0 --- 3-SAT • What happens in-between? (Monasson, Zecchina, Kirkpatrick, Selman, and Troyansky, Nature, to appear)
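The mixture described above can be sketched as a generator that makes a fraction p of the clauses ternary and the rest binary. The function name, the rounding of the ternary count, and the signed-integer literal encoding are assumptions for illustration.

```python
import random

def random_2p_sat(num_vars, num_clauses, p, rng=None):
    """Random 2+p-SAT: round(p * num_clauses) ternary clauses, the rest binary."""
    rng = rng or random.Random()
    num_ternary = round(p * num_clauses)
    clauses = []
    for i in range(num_clauses):
        k = 3 if i < num_ternary else 2
        variables = rng.sample(range(1, num_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    rng.shuffle(clauses)  # interleave binary and ternary clauses
    return clauses
```

With `p=0.0` this reduces to random 2-SAT and with `p=1.0` to random 3-SAT, matching the two endpoints on the slide.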
Results for 2+p-SAT • p < ~0.41 --- model essentially behaves as 2-SAT • search procedure “sees” only binary constraints • smooth, continuous phase transition • p > ~0.41 --- behaves as 3-SAT (exponential scaling) • abrupt, discontinuous scaling • Many new, rigorous results (including scaling) by Achlioptas, Bollobas, Borgs, Chayes, Han Kim, and Wilson. (Next talk.)
Consequences for Algorithm Design • 1) Strategies that exploit tractable substructure with propagation are most effective. (Consistent with the best empirically discovered methods.) • 2) In addition, use early branching on critically constrained variables. (The “backbone variables” / suggests use of clustering and statistical learning methods.) (Boyan and Moore 1998)
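One way to read point 1 is as backtracking search with unit propagation. Below is a minimal sketch of a Davis-Putnam-style procedure. Branching on a literal from a shortest clause is only a crude stand-in for "critically constrained variables", and that choice, the function names, and the signed-integer literal encoding are assumptions. Input clauses are assumed to be non-empty.

```python
def simplify(clauses, lit):
    """Assume `lit` is true: drop satisfied clauses, remove -lit from the rest.

    Returns None if some clause becomes empty (a conflict).
    """
    result = []
    for clause in clauses:
        if lit in clause:
            continue
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None
        result.append(reduced)
    return result

def dpll(clauses, assignment=None):
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    # Unit propagation: literals forced by one-literal clauses
    while clauses:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
    if not clauses:
        return assignment
    # Branch on a literal from a shortest remaining clause
    lit = min(clauses, key=len)[0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if result is not None:
                return result
    return None
```

Variables missing from the returned dictionary are unconstrained and may take either value.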
Focus --- 2) Structure • Proposal: study the influence of global structure on problem hardness. (Gomes and Selman 1997; 1998)
Quasigroups • Defn.: a pair (Q, *) where Q is a set, and * is a binary operation on Q such that a * x = b ; y * a = b are uniquely solvable for every pair of elements a, b in Q. • The multiplication table of its binary operation defines a latin square (i.e., each element of Q appears exactly once in each row/column). • Example: Quasigroup of order 4
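The latin square property can be checked directly. The sketch below also builds one simple quasigroup of order n, the addition table of the integers mod n; this is an assumed example, not necessarily the order-4 example shown on the original slide. Symbols are taken to be 0..n-1.

```python
def is_latin_square(table):
    """True if every symbol 0..n-1 appears exactly once in each row and column."""
    n = len(table)
    symbols = set(range(n))
    if not all(len(row) == n and set(row) == symbols for row in table):
        return False
    return all({table[r][c] for r in range(n)} == symbols for c in range(n))

def cyclic_quasigroup(n):
    """Multiplication table of (Z_n, +): entry (a, b) is (a + b) mod n."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]
```

Unique solvability of a * x = b corresponds to b appearing exactly once in row a, and of y * a = b to b appearing exactly once in column a.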
Quasigroup Completion Problem (QCP) Given a partial latin square, can it be completed? Example:
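A minimal backtracking sketch for the completion question. Empty cells are represented by `None`, symbols by 0..n-1, and the search fills the most constrained empty cell first; these representation and heuristic choices are assumptions. The pre-filled cells are assumed to be mutually consistent.

```python
def complete_latin_square(grid):
    """Complete a partial latin square; return the filled grid or None."""
    n = len(grid)
    grid = [row[:] for row in grid]  # work on a copy

    def candidates(r, c):
        used = set(grid[r]) | {grid[i][c] for i in range(n)}
        return [s for s in range(n) if s not in used]

    def solve():
        empties = [(r, c) for r in range(n) for c in range(n)
                   if grid[r][c] is None]
        if not empties:
            return True
        # Most constrained cell first: fewest remaining candidate symbols
        r, c = min(empties, key=lambda rc: len(candidates(*rc)))
        for s in candidates(r, c):
            grid[r][c] = s
            if solve():
                return True
            grid[r][c] = None  # undo and try the next symbol
        return False

    return grid if solve() else None
```

Because the problem is NP-complete, this search can take exponential time on hard instances.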
Quasigroup Completion Problem: A Framework for Studying Search • NP-complete (Colbourn 1983, 1984; Anderson 1985). • Has a regular global structure not found in random instances. • Leads to interesting search problems when structure is perturbed. • Similar to, e.g., structure found in the channel assignment problem for cellular networks.
Consequences for Algorithm Design • On these structured problems, backtrack search methods show so-called heavy-tailed probability distributions. (Gomes, Selman & Crato 1997, 1998) • Both very short and very long runs occur much more frequently than one would expect.
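One common way to formalize "heavy-tailed" is a power-law (Pareto-type) decay of the tail; this formalization is assumed here and may differ in detail from the one used in the cited work. With X the run time of the search (for example, the number of backtracks) and C > 0 a constant:

```latex
\[
  \Pr[X > x] \;\sim\; C\,x^{-\alpha}
  \qquad (x \to \infty), \qquad 0 < \alpha < 2 .
\]
```

For such tails the variance of X is infinite, and for α ≤ 1 the mean is infinite as well, which is why extremely long runs are far more likely than under an exponentially decaying tail.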
Algorithmic Strategy: • Rapid Random Restarts. • Order of magnitude speedup. (Gomes et al. 1998; 1999) • Related: • Algorithm portfolios (Huberman 1998; Gomes 1998) • Universal strategies (Ertel and Luby 1993; Alt et al. 1996)
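A sketch of a restart wrapper around a randomized search procedure. The cutoff schedule used here is the well-known Luby sequence (1, 1, 2, 1, 1, 2, 4, ...), taken as an assumed example of a universal strategy; the slides do not specify a schedule. The `solve_once` callback and the `unit` parameter are hypothetical names.

```python
def luby(i):
    """i-th term (1-indexed) of the Luby sequence 1, 1, 2, 1, 1, 2, 4, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

def run_with_restarts(solve_once, max_restarts, unit=1):
    """Call solve_once(cutoff) with cutoffs unit * luby(1), unit * luby(2), ...

    solve_once is a hypothetical randomized solver that returns a result,
    or None if it hit the cutoff; each call is an independent fresh run.
    """
    for i in range(1, max_restarts + 1):
        result = solve_once(unit * luby(i))
        if result is not None:
            return result
    return None
```

Short cutoffs abandon runs that have wandered into the heavy tail, and the occasional longer cutoff keeps the procedure able to finish instances that genuinely need more search.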
Portfolio for heavy-tailed search procedures (2-20 processors)
C. Future directions and prospects • Modeling resource constraints & user requirements / utility • should be possible to identify optimal restart strategies, possibly adaptive • --- may need a way of “measuring progress” (Horvitz and Klein 1995; Gomes and Selman 1999)
Adaptive Computing • Combine statistical learning methods with combinatorial search techniques. • First success: STAGE system for local search. (Boyan and Moore 1998) • Extension: train a planner on small instances. (Selman, Kautz, Huang 1999) • Deeper theoretical understanding, with continued interactions with experiments and applications.
Summary • During the past few years, we have obtained a much better understanding of the nature of computationally hard problems. • Rich interactions between physics, computer science and mathematics, and between theory, experiments, and applications. • Clear algorithmic progress with room for future improvements (possibly another level of scaling: 10^6 Boolean variables, 10^8 constraints; further applications).