This module discusses the use of randomization in complete tree search algorithms: the success of randomized strategies in local search, how to introduce randomness into variable and value selection without losing completeness, the limitations of local search methods, and heavy-tailed distributions and their use in modeling real-world phenomena.
CS 4700: Foundations of Artificial Intelligence. Carla P. Gomes, gomes@cs.cornell.edu. Module: Randomization in Complete Tree Search Algorithms. Wrap-up of Search!
Randomization in Local Search • Randomized strategies are very successful in the area of local search: • Randomized hill climbing • Simulated annealing • Genetic algorithms • Tabu search • GSAT and variants. • Key limitation? The inherently incomplete nature of local search methods.
Randomization in Tree Search • Introduce randomness in a tree search method, e.g., by randomly breaking ties in variable and/or value selection. • Why would we do that? Can we also add a stochastic element to a systematic (tree search) procedure without losing completeness? (See the sketch below.)
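To make this concrete, here is a minimal sketch (not the lecture's code) of a complete backtrack search over a toy binary CSP in which both variable selection and value ordering break ties at random; the `domains`/`constraints` representation is an assumption for illustration only:

```python
import random

def consistent(u, v, allowed, assignment):
    """A pair constraint is satisfied until both endpoints are assigned."""
    if u in assignment and v in assignment:
        return (assignment[u], assignment[v]) in allowed
    return True

def solve(assignment, domains, constraints):
    """Complete backtrack search with randomized tie-breaking.

    `domains` maps variables to lists of values; `constraints` is a list
    of (u, v, allowed) triples, where `allowed` is the set of value
    pairs the two variables may jointly take (a toy binary CSP)."""
    if len(assignment) == len(domains):
        return dict(assignment)                       # full solution found
    unassigned = [v for v in domains if v not in assignment]
    fewest = min(len(domains[v]) for v in unassigned)
    # Tie-breaking in variable selection is random ...
    var = random.choice([v for v in unassigned if len(domains[v]) == fewest])
    # ... and so is the value ordering.
    values = list(domains[var])
    random.shuffle(values)
    for val in values:
        assignment[var] = val
        if all(consistent(u, v, allowed, assignment)
               for (u, v, allowed) in constraints):
            result = solve(assignment, domains, constraints)
            if result is not None:
                return result
        del assignment[var]                           # backtrack
    return None                                       # subtree exhausted
```

Because backtracking still exhausts every subtree before giving up, randomizing the tie-breaks changes which execution we get, never whether the search is complete.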
Backtrack Search (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c)
Backtrack Search: Two Different Executions (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c)
The fringe of the search space
Latin Square Completion: Randomized Backtrack Search — easy instance, 15% pre-assigned cells (Gomes et al. 97). Times of six runs on the same instance: 7, 11, 30, (*), (*), (*) — (*) no solution found, run reached the cutoff of 2000.
Erratic Mean Behavior — the sample mean of the run time keeps fluctuating (500, 2000, even 3500!) as the number of runs on the same instance grows, while the median is 1!
Runtime distribution — proportion of cases solved, F(x), against the number of backtracks: 75% of the runs need ≤ 30 backtracks, yet 5% need > 100,000.
Run Time Distributions • The runtime distributions of some of the instances reveal interesting properties: • (I) Erratic behavior of the mean. • (II) The distributions have "heavy tails".
Heavy-Tailed Distributions • … infinite variance, … infinite mean. • Introduced by Pareto in the 1920s as a "probabilistic curiosity." • Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. • Examples: stock market, earthquakes, weather, ...
Decay of Distributions • Standard — exponential decay, e.g. Normal: Pr[X > x] ≤ C·e^(−λx) for some constants C, λ > 0. • Heavy-tailed — power-law decay, e.g. Pareto–Lévy: Pr[X > x] ≈ C·x^(−α), with 0 < α < 2.
Figure: Normal, Cauchy, and Lévy densities — the Normal shows exponential decay, while the Cauchy and Lévy show power-law decay.
Fat-Tailed Distributions • Kurtosis = (fourth central moment) / (second central moment)², i.e. κ = μ₄ / μ₂², where μ₂ is the variance. • The Normal distribution has kurtosis 3; a distribution is fat-tailed when its kurtosis is > 3 (e.g., exponential, lognormal).
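A quick numerical illustration of this definition (a sketch; the sample size and seed are arbitrary choices): the normal sample's kurtosis lands near 3, while the exponential and lognormal samples come out well above it.

```python
import numpy as np

def kurtosis(x):
    """Kurtosis = fourth central moment / (second central moment)^2."""
    x = np.asarray(x, dtype=float)
    m2 = np.mean((x - x.mean()) ** 2)    # second central moment (variance)
    m4 = np.mean((x - x.mean()) ** 4)    # fourth central moment
    return m4 / m2 ** 2

rng = np.random.default_rng(0)
n = 1_000_000
print("normal     :", kurtosis(rng.normal(size=n)))       # ~3
print("exponential:", kurtosis(rng.exponential(size=n)))  # ~9: fat-tailed
print("lognormal  :", kurtosis(rng.lognormal(size=n)))    # >> 3: fat-tailed
```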
Fat- and Heavy-Tailed Distributions • Exponential decay for standard distributions, e.g. Normal, lognormal, exponential: Pr[X > x] ≤ C·e^(−λx). • Heavy-tailed — power-law decay, e.g. Pareto–Lévy: Pr[X > x] ≈ C·x^(−α), 0 < α < 2.
Pareto Distribution • with shape parameter α > 0: • Density function: f(x) = α / x^(α+1) for x ≥ 1. • Distribution function: F(x) = P[X ≤ x] = 1 − 1/x^α for x ≥ 1. • Survival function (tail probability): S(x) = 1 − F(x) = P[X > x] = 1/x^α for x ≥ 1.
Pareto Distribution • Moments: E(Xⁿ) = α / (α − n) if n < α; E(Xⁿ) = ∞ if n ≥ α. • Mean: E(X) = α / (α − 1) if α > 1; E(X) = ∞ if α ≤ 1. • Variance: var(X) = α / [(α − 1)²(α − 2)] if α > 2; var(X) = ∞ if α ≤ 2.
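As a worked instance of these formulas (the value of α is my own example):

```latex
E(X) \;=\; \frac{\alpha}{\alpha - 1} \;=\; \frac{3/2}{3/2 - 1} \;=\; 3
\quad \text{for } \alpha = \tfrac{3}{2},
\qquad
\operatorname{var}(X) \;=\; \infty \quad \text{since } \alpha \le 2 .
```

So a Pareto with α = 3/2 has a perfectly ordinary mean but no finite variance.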
How to Check for "Heavy Tails"? • Power-law decay of the tail: a log-log plot of the tail of the distribution (the survival function 1 − F(x); e.g., for the Pareto, S(x) = 1/x^α for x ≥ 1) should be approximately linear. • The slope gives the value of α: • 0 < α ≤ 1: infinite mean and infinite variance. • 1 < α ≤ 2: infinite variance.
Figure: densities f(x) of Lognormal(1,1) and Pareto(α = 1) — the Pareto(1) has infinite mean and infinite variance.
How to Visually Check for Heavy-Tailed Behavior Log-log plot of tail of distribution exhibits linear behavior.
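A sketch of this visual check (assuming a synthetic Pareto sample drawn by inverse-transform sampling; the plot labels are mine):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
alpha = 1.0                      # shape; alpha <= 1 means infinite mean
n = 100_000
# Inverse transform: if U ~ Uniform(0,1], then U**(-1/alpha) ~ Pareto(alpha).
u = 1.0 - rng.uniform(size=n)    # shift to (0, 1] to avoid division by zero
x = u ** (-1.0 / alpha)

xs = np.sort(x)
survival = 1.0 - np.arange(1, n + 1) / n   # empirical 1 - F(x)

plt.loglog(xs[:-1], survival[:-1])         # drop the final zero
plt.xlabel("x (log scale)")
plt.ylabel("1 - F(x) (log scale)")
plt.title("Pareto tail: approximately linear, slope -alpha")
plt.show()
```

Repeating this with, say, a normal sample makes the contrast obvious: its tail bends sharply downward instead of staying straight.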
Example of a Heavy-Tailed Model • Random walk: • Start at position 0. • Toss a fair coin: with each head take a step up (+1); with each tail take a step down (−1). • X = the number of steps the random walk takes to return to position 0. (A simulation sketch follows below.)
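A simulation sketch of this model (the run count and step cap are arbitrary choices): the return time has tail P[X > x] on the order of x^(−1/2), so the median is tiny while the mean is infinite.

```python
import random

def return_time(max_steps=10**6):
    """Steps a fair +/-1 walk takes to first return to 0 (None if capped)."""
    pos, steps = 0, 0
    while True:
        pos += random.choice((-1, 1))
        steps += 1
        if pos == 0:
            return steps
        if steps >= max_steps:
            return None            # censored: an extremely long excursion

random.seed(0)
times = [return_time() for _ in range(10_000)]
finite = sorted(t for t in times if t is not None)
print("median return time  :", finite[len(finite) // 2])   # typically 2
print("longest observed    :", finite[-1])
print("runs hitting the cap:", times.count(None))
```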
Figure: the record of 10,000 tosses of an ideal coin (Feller) — note the long periods without a zero crossing.
Heavy Tails vs. Non-Heavy Tails — plot of the unsolved fraction 1 − F(x) against X, the number of steps the walk takes to return to zero (log scale): the random walk has median 2, yet its tail decays so slowly that even Normal(2, 1000000) — let alone Normal(2, 1) — drops off far faster, with only 0.1% of its mass beyond 200,000.
Heavy-Tailed Behavior in the Latin Square Completion Problem — log-log plot of the unsolved fraction 1 − F(x) against the number of backtracks: the tails are approximately linear, from instances with 18% unsolved down to 0.002% unsolved, which implies an infinite mean.
How Toby Walsh Fried His PC (Graph Coloring) — Walsh 99
To Be or Not To Be Heavy-Tailed
Random Binary CSP Models • Model E ⟨N, D, p⟩: N – number of variables; D – size of the domains; p – proportion of forbidden pairs (out of D²·N(N−1)/2); N from 15 to 50 (Achlioptas et al. 2000). A generator sketch follows below.
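A generator sketch for Model E as stated above (the function and parameter names are mine; for simplicity the conflicts are drawn without repetition, a small deviation from sampling with repetition):

```python
import itertools
import random

def model_e(N, D, p, seed=None):
    """Random binary CSP, Model E <N, D, p>: forbid a proportion p of
    the D^2 * N*(N-1)/2 possible (variable pair, value pair) combos,
    chosen uniformly at random.

    Returns a dict mapping each variable pair (i, j), i < j, to the
    set of forbidden value pairs (a, b)."""
    rng = random.Random(seed)
    all_conflicts = [((i, j), (a, b))
                     for i, j in itertools.combinations(range(N), 2)
                     for a in range(D) for b in range(D)]
    k = round(p * len(all_conflicts))
    forbidden = {}
    for pair, vals in rng.sample(all_conflicts, k):
        forbidden.setdefault(pair, set()).add(vals)
    return forbidden

# Example: 15 variables, domain size 3, 5% of the pairs forbidden.
instance = model_e(15, 3, 0.05, seed=0)
print(sum(len(s) for s in instance.values()), "forbidden value pairs")
```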
Typical-Case Analysis: Model E — phase transition phenomenon discriminating "easy" vs. "hard" instances: the % of solvable instances and the mean computational cost, plotted against constrainedness (Hogg et al. 96).
Formal Models of Heavy and Fat Tails in Combinatorial Search • Heavy/fat tails mean a wide range of solution times: very short and very long runtimes. How to explain the short runs? • Backdoors: hidden tractable substructure in real-world problems — a subset of the "critical" variables such that, once they are assigned values, the instance simplifies to a tractable class. This has practical consequences.
Logistics Planning – instances with O(log(n)) backdoors. Figure: constraint graph of a logistics planning formula (843 vars, 7,301 constraints, 16 backdoor variables) — the initial constraint graph, after setting 5 backdoor vars, and after setting 12 backdoor vars (visualization by Anand Kapur, 4701 project).
Algorithms • Three kinds of strategies for dealing with backdoors: • A complete deterministic backtrack-search algorithm. • A complete randomized backtrack-search algorithm — provably better performance than the deterministic one. • A heuristically guided complete randomized backtrack-search algorithm — assumes the existence of a good heuristic for choosing variables to branch on; we believe this is close to what happens in practice. (Williams, Gomes, Selman 03/04)
Generalized Iterative Deepening — Level 1: all possible trees of depth 1 (branching on x1 = 0/1, x2 = 0/1, …, xn = 0/1).
Generalized Iterative Deepening — Level 2: all possible trees of depth 2 (e.g., branch on x1 = 0/1, then x2 = 0/1).
Generalized Iterative Deepening — Level 2 continued: all possible trees of depth 2 (…, branch on xn−1 = 0/1, then xn = 0/1); then level 3, level 4, and so on …
Randomized Generalized Iterative Deepening • Assumption: there exists a backdoor whose size is bounded by a function of n (call it B(n)). • Idea: repeatedly choose random subsets of variables slightly larger than B(n), searching these subsets for the backdoor (see the sketch below).
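A schematic sketch of that idea (the `subsolver` interface, the bound `B`, and the retry budget are all assumptions, not the paper's code):

```python
import itertools
import random

def randomized_backdoor_search(variables, B, subsolver, tries=1000, seed=None):
    """Repeatedly guess a candidate backdoor set and brute-force it.

    `subsolver(assignment)` is assumed to decide the simplified
    instance in polynomial time, returning a full solution or None."""
    rng = random.Random(seed)
    k = B + 1                                 # slightly larger than the bound
    for _ in range(tries):
        subset = rng.sample(variables, k)     # random candidate backdoor
        for values in itertools.product((0, 1), repeat=k):  # all 2^k settings
            solution = subsolver(dict(zip(subset, values)))
            if solution is not None:
                return solution
    return None                               # budget exhausted
```

If the random subset happens to contain the true backdoor, one of the 2^k settings simplifies the instance enough for the sub-solver to finish — which is where the provable speedup over the deterministic strategy comes from.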
Deterministic Versus Randomized — suppose variables have 2 possible values (e.g., SAT); for B(n) = n/k, the algorithm's runtime is c^n, and the figure plots the base c against k for the deterministic and the randomized strategy. The deterministic algorithm outperforms brute-force search for k > 4.2.
Complete Randomized Depth-First Search with a Heuristic • Assume we have the following: • DFS, a generic randomized depth-first backtrack-search solver with: • a (polytime) sub-solver A; • a heuristic H that (randomly) chooses variables to branch on, in polynomial time; • H has probability 1/h of choosing a backdoor variable (h is a fixed constant). • Call this ensemble (DFS, H, A).
Polytime Restart Strategy for (DFS, H, A) • Essentially: if there is a small backdoor, then (DFS, H, A) has a restart strategy that runs in polytime.
Runtime Table for the (DFS, H, A) Algorithms • B(n) = upper bound on the size of a backdoor, given n variables. • When the backdoor is a constant fraction of n, there is an exponential improvement of the randomized algorithm over the deterministic one. (Williams, Gomes, Selman 03/04)
How to avoid the long runs in practice? Use restarts or parallel/interleaved runs to exploit the extreme variance in performance. Restarts provably eliminate heavy-tailed behavior. (A restart-wrapper sketch follows below.)
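A restart-wrapper sketch (the cutoff value and the `randomized_solver` interface are assumptions): cap every randomized run at a fixed number of backtracks and retry until one run succeeds.

```python
def solve_with_restarts(randomized_solver, cutoff, max_restarts=10_000):
    """Run a randomized backtrack solver under a fixed backtrack cutoff.

    `randomized_solver(cutoff)` is assumed to return a solution, or
    None once it has spent `cutoff` backtracks. Fresh random
    tie-breaking makes the runs independent trials."""
    for restart in range(max_restarts):
        solution = randomized_solver(cutoff)
        if solution is not None:
            return solution, restart          # solved after `restart` restarts
    return None, max_restarts
```

If a single run solves the instance within the cutoff with probability q, the number of restarts is geometric with mean 1/q, so the tail of the total runtime decays exponentially rather than as a power law.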
Restarts — log plot of the unsolved fraction 1 − F(x) against the number of backtracks: with no restarts, 70% of runs are unsolved; restarting every 4 backtracks leaves only 0.001% unsolved, with solutions found within about 250 backtracks (62 restarts).
Example of Rapid Restart Speedup (planning) — log-log plot of the total number of backtracks against the restart cutoff: with a large cutoff (~10 restarts) the search takes about 100,000 backtracks, while a cutoff of 20 (~100 restarts) solves the instance in about 2,000.