Distributions of Randomized Backtrack Search • Key Properties: • I Erratic behavior of mean • II Distributions have “heavy tails”.
Erratic Behavior of Search Cost: Quasigroup Completion Problem [Figure: sample mean of the search cost vs. number of runs; annotations: sample mean 3500, Median = 1.]
[Figure: proportion of cases solved vs. number of backtracks; 75% of runs need <= 30 backtracks, 5% need > 100000.]
Heavy-Tailed Distributions • … infinite variance … infinite mean • Introduced by Pareto in the 1920s • --- “probabilistic curiosity.” • Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. • Examples: stock market, earthquakes, weather, ...
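Decay of Distributions • Standard --- Exponential Decay • e.g. Normal: $\Pr[X > x] \le C e^{-x^2/2}$ for some constant $C > 0$ (standard normal) • Heavy-Tailed --- Power Law Decay • e.g. Pareto-Levy: $\Pr[X > x] \sim C x^{-\alpha}$, $0 < \alpha < 2$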
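The gap between the two kinds of decay is easiest to see numerically. The sketch below compares the tail probability of a standard normal variable with that of a Pareto variable; the tail index `alpha = 1.0`, the scale `x_min = 1.0`, and the function names are illustrative assumptions, not values from the slides.

```python
import math

def normal_tail(x: float) -> float:
    """Pr[X > x] for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pareto_tail(x: float, alpha: float = 1.0, x_min: float = 1.0) -> float:
    """Pr[X > x] for a Pareto variable (alpha and x_min are assumed, illustrative values)."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

# The normal tail collapses much faster than the power law as x grows.
for x in (2, 5, 10, 20):
    print(f"x={x:>3}  normal={normal_tail(x):.3e}  pareto={pareto_tail(x):.3e}")
```

At x = 10 the Pareto tail is still 0.1, while the normal tail is already smaller than 1e-20.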
[Figure: power law decay vs. exponential decay; the standard distribution (finite mean & variance) decays exponentially.]
Normal, Cauchy, and Levy [Figure: Normal --- exponential decay; Cauchy --- power law decay; Levy --- power law decay.]
Example of Heavy-Tailed Model (Random Walk) • Random Walk: • Start at position 0 • Toss a fair coin: • with each head take a step up (+1) • with each tail take a step down (-1) • X --- number of steps the random walk takes to return to position 0.
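The random walk above can be simulated directly. The following is a minimal sketch; the cap `MAX_STEPS`, the seed, and the sample size are assumptions added here so that the simulation terminates, since a heavy-tailed return time can be arbitrarily long.

```python
import random
from typing import Optional

def return_time(rng: random.Random, max_steps: int) -> Optional[int]:
    """Steps until the walk first returns to 0, or None if it has not returned within max_steps."""
    position = 0
    for step in range(1, max_steps + 1):
        position += 1 if rng.random() < 0.5 else -1  # fair coin: +1 for heads, -1 for tails
        if position == 0:
            return step
    return None

rng = random.Random(0)   # fixed seed, chosen arbitrarily
MAX_STEPS = 10_000       # truncation cap (assumption of this sketch)
samples = [return_time(rng, MAX_STEPS) for _ in range(2_000)]
returned = [s for s in samples if s is not None]

frac_two = sum(1 for s in samples if s == 2) / len(samples)
frac_long = sum(1 for s in samples if s is None or s > 1_000) / len(samples)
print(f"returned in exactly 2 steps: {frac_two:.1%}")
print(f"needed more than 1000 steps: {frac_long:.1%}")
print(f"longest observed return:     {max(returned)} steps")
```

About half of the walks return after two steps, yet a visible fraction needs thousands of steps: short typical runs combined with rare, very long ones.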
[Figure: the record of 10,000 tosses of an ideal coin (Feller), showing zero crossings and long periods without a zero crossing.]
Heavy-Tails vs. Non-Heavy-Tails [Figure: unsolved fraction 1-F(x) vs. X, the number of steps the walk takes to return to zero (log scale). Curves: random walk (median = 2, i.e. 50% at X = 2; 0.1% > 200000), Normal(2, 1000000), Normal(2, 1).]
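How to Check for “Heavy Tails”? • Log-log plot of the tail of the distribution, 1-F(x), should be approximately linear. • The slope gives the value of the tail index $\alpha$ (slope $= -\alpha$): • $\alpha \le 1$: infinite mean and infinite variance • $1 < \alpha < 2$: infinite variance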
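The log-log check can be sketched in a few lines. The example below draws synthetic Pareto samples with a known tail index and recovers it from the slope of the empirical tail; the sample size, the seed, the choice to fit only the top 10% of the data, and the helper names are all assumptions of this sketch. A least-squares fit on a log-log plot is a rough diagnostic, not a rigorous estimator.

```python
import math
import random

def empirical_tail(samples):
    """Return (x, fraction of samples > x) pairs for the sorted sample values."""
    ordered = sorted(samples)
    n = len(ordered)
    # Drop the largest value: its tail fraction is 0 and has no logarithm.
    return [(x, 1.0 - (i + 1) / n) for i, x in enumerate(ordered[:-1])]

def loglog_slope(points):
    """Least-squares slope of log(1 - F(x)) against log(x)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(p) for _, p in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

rng = random.Random(1)
true_alpha = 0.5  # illustrative tail index (infinite mean and variance)
samples = [rng.paretovariate(true_alpha) for _ in range(20_000)]

# Fit only the tail: points whose unsolved fraction is at most 10%.
tail = [pt for pt in empirical_tail(samples) if pt[1] <= 0.1]
alpha_hat = -loglog_slope(tail)
print(f"true alpha = {true_alpha}, estimated alpha = {alpha_hat:.2f}")
```

An estimate at or below 1 points to an infinite mean, and one between 1 and 2 to a finite mean with infinite variance.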
Heavy-Tailed Behavior in QCP Domain [Figure: log-log plot of the unsolved fraction 1-F(x) vs. number of backtracks; annotations: 18% unsolved, 0.002% unsolved, => infinite mean.]
Formal Models of Heavy-Tailed Behavior in Combinatorial Search Chen, Gomes, Selman 2001
Motivation • Research on heavy tails has been largely based on empirical studies of run time distributions. • Goal: to provide a formal characterization of tree search models and show under what conditions heavy-tailed distributions can arise. • Intuition: heavy-tailed behavior arises when: • wrong branching decisions lead the procedure to explore an exponentially large subtree of the search space that contains no solutions; • the time to find a solution varies widely between runs, which leads to highly different trees from run to run.
Balanced vs. Imbalanced Tree Model • Balanced Tree Model: • chronological backtrack search model; • fixed variable ordering; • random child selection with no propagation mechanisms; (show demo)
The run time distribution of chronological backtrack search on a complete balanced tree is uniform (therefore not heavy-tailed). Both the expected run time and the variance scale exponentially.
Balanced Tree Model • The expected run time and variance scale exponentially in the height of the search tree (number of variables); • The run time distribution is uniform (not heavy-tailed); • Backtrack search on the balanced tree model has no restart strategy with expected polynomial time. Chen, Gomes & Selman 01
How can we improve on the balanced search tree model? • A very clever search heuristic that leads quickly to the solution node - but that is hard in general; • A combination of pruning, propagation, and dynamic variable ordering that prunes subtrees that do not contain the solution, allowing for runs that are short. • ---> resulting trees may vary dramatically from run to run.
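The uniform run time distribution can be checked with a small simulation. This sketch assumes a complete binary tree with exactly one solution leaf and measures run time as the number of leaves visited; rather than walking the tree, it uses the shortcut that a wrong first choice at a node costs the whole sibling subtree. The height, seed, and function name are illustrative.

```python
import random
from collections import Counter

def leaves_visited(height: int, rng: random.Random) -> int:
    """Leaves visited by chronological backtracking on a complete binary tree
    of the given height with exactly one solution leaf (assumed model)."""
    visited = 1  # the solution leaf itself
    for remaining in range(height, 0, -1):
        # With probability 1/2 the wrong child is tried first; its subtree has
        # 2**(remaining - 1) leaves, and all of them are visited before backtracking.
        if rng.random() < 0.5:
            visited += 2 ** (remaining - 1)
    return visited

rng = random.Random(3)
HEIGHT = 4
counts = Counter(leaves_visited(HEIGHT, rng) for _ in range(16_000))
for t in sorted(counts):
    print(f"T = {t:>2}: {counts[t]} runs")
```

Every value from 1 to 2^HEIGHT appears about equally often, so the mean is close to 2^HEIGHT / 2: exponential in the height, with no heavy tail.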
Formal Model Yielding Heavy-Tailed Behavior • T - the number of leaf nodes visited up to and including the successful node; b - branching factor (show demo) b = 2
Expected Run Time • (infinite expected time) • Variance • (infinite variance) • Tail • (heavy-tailed)
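The three results can be sketched as follows, assuming the imbalanced tree model in which the search visits $T = b^i$ leaf nodes with probability $(1-p)\,p^i$. The symbol $p$ (the probability of a wrong branching decision at each level) is an assumption introduced here; it does not appear in the slide text.

```latex
\begin{align*}
\Pr[T = b^i] &= (1-p)\,p^i, \qquad i \ge 0 \\[4pt]
E[T] &= \sum_{i \ge 0} (1-p)\,p^i\,b^i = (1-p)\sum_{i \ge 0} (pb)^i
      = \begin{cases}
          \dfrac{1-p}{1-pb}, & p < 1/b \\[6pt]
          \infty, & p \ge 1/b
        \end{cases} \\[4pt]
E[T^2] &= (1-p)\sum_{i \ge 0} (p\,b^2)^i = \infty
      \quad \text{for } p \ge 1/b^2 \\[4pt]
\Pr[T > b^k] &= \sum_{i > k} (1-p)\,p^i = p^{k+1}
      = p\,\bigl(b^k\bigr)^{-\alpha}, \qquad \alpha = \log_b(1/p)
\end{align*}
```

Under these assumptions the tail follows a power law at the points $L = b^k$, with $\alpha < 2$ exactly when $p > 1/b^2$. For $b = 2$, the expected run time is infinite once $p \ge 1/2$ and the variance is infinite once $p \ge 1/4$.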
Bounded Heavy-Tailed Behavior (show demo)
Small-World Vs. Heavy-Tailed Behavior • Does a Small-World topology (Watts & Strogatz) induce heavy-tail behavior? The constraint graph of a quasigroup exhibits a small-world topology (Walsh 99)
Exploiting Heavy-Tailed Behavior • Heavy-tailed behavior has been observed in several domains: QCP, Graph Coloring, Planning, Scheduling, Circuit synthesis, Decoding, etc. • Consequence for algorithm design: • Use restarts or parallel / interleaved runs to exploit the extreme variance in performance. Restarts provably eliminate heavy-tailed behavior. (Gomes et al. 97, Hoos 99, Horvitz 99, Huberman, Lukose and Hogg 97, Karp et al. 96, Luby et al. 93, Rish et al. 97, Walsh 99)
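The effect of restarts can be illustrated with a toy simulation. In the sketch below a single run's cost is a stand-in draw from a Pareto distribution (tail index 0.5), not a real solver; the cutoff of 4 backtracks, the seed, and the function names are assumptions chosen for illustration.

```python
import random

def run_cost(rng: random.Random) -> float:
    """Backtracks used by one randomized run: a heavy-tailed stand-in (Pareto, alpha = 0.5)."""
    return rng.paretovariate(0.5)

def cost_with_restarts(rng: random.Random, cutoff: float) -> float:
    """Total backtracks when every run is abandoned after `cutoff` backtracks and restarted."""
    total = 0.0
    while True:
        cost = run_cost(rng)
        if cost <= cutoff:
            return total + cost   # this run finished within the cutoff
        total += cutoff           # pay for the abandoned run, then restart

rng = random.Random(2)
plain = [run_cost(rng) for _ in range(10_000)]
restarted = [cost_with_restarts(rng, cutoff=4.0) for _ in range(10_000)]
print(f"largest cost without restarts: {max(plain):.3g}")
print(f"largest cost with restarts:    {max(restarted):.3g}")
```

With this stand-in distribution half of all runs finish within the cutoff, so the number of restarts is geometric and the worst case over many trials stays small, while the worst run without restarts is larger by orders of magnitude.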
Super-linear Speedups • Five runs fail after 10 seconds each (X); the next run solves the problem in 1 second. • Sequential: 50 + 1 = 51 seconds • Parallel: 10 machines --- 1 second (51x speedup) • Interleaved (1 machine): 10 x 1 = 10 seconds (5x speedup)
Restarts [Figure: unsolved fraction 1-F(x) vs. number of backtracks (log). No restarts: 70% unsolved. Restart every 4 backtracks: 0.001% unsolved at 250 backtracks (62 restarts).]
Example of Rapid Restart Speedup (planning) [Figure: number of backtracks (log) vs. cutoff (log); axis ticks 20, 2000, 100000; annotations: ~10 restarts, ~100 restarts.]
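The arithmetic on this slide can be spelled out as a short worked example. It assumes one reading of the slide: ten runs in total, the first five each abandoned after 10 seconds, and the sixth solving the problem in 1 second. The variable names are hypothetical.

```python
CUTOFF = 10        # seconds spent on each failed run before moving on
N_RUNS = 10        # number of runs, equal to the number of machines in the parallel case
FAILED_FIRST = 5   # runs tried before reaching the one that succeeds (assumed reading)
SOLVE_TIME = 1     # seconds needed by the successful run

sequential = FAILED_FIRST * CUTOFF + SOLVE_TIME   # try runs one after another
parallel = SOLVE_TIME                             # all runs start at once on separate machines
interleaved = N_RUNS * SOLVE_TIME                 # one machine, runs advanced in lockstep

print(f"sequential:  {sequential} s")
print(f"parallel:    {parallel} s  ({sequential / parallel:.0f}x speedup on {N_RUNS} machines)")
print(f"interleaved: {interleaved} s  ({sequential / interleaved:.1f}x speedup on 1 machine)")
```

The parallel speedup of 51x on 10 machines is super-linear, and interleaving recovers about a 5x speedup on a single machine.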
Sketch of proof of elimination of heavy tails • Let’s truncate the search procedure after $m$ backtracks. • Probability of solving the problem with the truncated version: $p_m = \Pr[X \le m]$, where $X$ is the number of backtracks of a single run. • Run the truncated procedure and restart it repeatedly.
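The remaining step of the argument can be sketched as follows. The symbols $N$ (number of truncated runs until success) and $Y$ (total number of backtracks of the restarted procedure) are introduced here for illustration; the sketch assumes independent runs and $p_m = \Pr[X \le m] > 0$.

```latex
\begin{align*}
\Pr[N > k] &= (1 - p_m)^k
  && \text{(independent runs, each succeeding with probability } p_m\text{)} \\
Y &\le m\,N
  && \text{(each run uses at most } m \text{ backtracks)} \\
\Pr[Y > L] &\le \Pr\bigl[N > \lfloor L/m \rfloor\bigr]
  = (1 - p_m)^{\lfloor L/m \rfloor}
\end{align*}
```

The bound decays exponentially in $L$, so the restarted procedure has finite mean and variance and is not heavy-tailed.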