Pattern Databases

Robert Holte, University of Alberta
November 6, 2002

Pattern Database Successes (1)
Joe Culberson & Jonathan Schaeffer (1994)
• 15-puzzle (10^13 states)
• 2 hand-crafted patterns: “fringe” (FR) and “corner” (CO)
• Each PDB contains over 500 million entries (< 10^9 abstract states)
• Used symmetries to compress the PDBs and enhance their use
• Used in conjunction with Manhattan Distance (MD)
Reduction in size of search tree (tree size with MD alone, as a multiple of the tree size with each combination):
• MD = 346 × max(MD, FR)
• MD = 437 × max(MD, CO)
• MD = 1038 × max(MD, interleave(FR, CO))

Pattern Database Successes (2)
Rich Korf (1997)
• Rubik’s Cube (10^19 states)
• 3 hand-crafted patterns, all used together (max)
• Each PDB contains over 42 million entries
• Took 1 hour to build all the PDBs
Results:
• First time random instances had been solved optimally
• Hardest (solution length 18) took 17 days
• Best known MD-like heuristic would have taken a century

Pattern Database Successes (3)
Stefan Edelkamp (2001)
• Planning benchmarks, e.g. logistics, blocks world
• Automatically generated PDBs (not domain abstraction)
• Additive pattern databases (in some cases)
Results:
• PDB competitive with the best planners
• Logistics domain (weighted A*): PDB run-time 100 times smaller than FF heuristic

Pattern Database Successes (4)
Istvan Hernadvolgyi (2001)
• Macro-operators are concatenated to very quickly construct suboptimal solutions
• For Rubik’s Cube hundreds of macro-operators are needed
• Each macro is found by searching in the Rubik’s Cube state space with a macro-specific “subgoal” and start state
• For every one of these searches, a PDB was generated automatically (domain abstraction) so that an optimal-length macro could be found quickly
Results:
• Optimal-length macros for all subgoals found for the first time
• So quick that it permitted subgoals to be merged
• This shortened solutions from 90 moves to 50 (optimal is ~18)

Fundamental Questions
How to invent effective heuristics?
• Create a simplified version of your problem.
• Use the exact distances in the simplified version as heuristic estimates in the original.
How to use memory to speed up search?
• Precompute all distances-to-goal in the simplified version of the problem and store them in a lookup table (pattern database).

Example: 8-puzzle
181,440 states
Domain = blank 1 2 3 4 5 6 7 8

“Patterns” created by domain mapping
[Figure: an original state and its corresponding pattern]
Domain = blank 1 2 3 4 5 6 7 8
Abstract = blank
This mapping produces 9 patterns.

Pattern Database
[Figure: table pairing each of the 9 patterns with its distance to goal; the pattern images are not recoverable]
Distances to goal listed: 0, 1, 1, 2, 2, 2, 3, 3, 4

Calculating h(s)
Given a state in the original problem, compute the corresponding pattern and look up the abstract distance-to-goal (2 in the pictured example).
Heuristics defined by PDBs are consistent, not just admissible.
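The build-and-look-up procedure can be sketched in Python for the 8-puzzle. This is a minimal sketch, not the implementation behind the slides: the tuple state encoding, the goal with the blank in the top-left corner, and the names `abstract`, `build_pdb`, `h`, `BLANK_ONLY` and `PDB` are all assumptions made for illustration. It also assumes the mapping leaves the blank unique, and it relies on sliding-tile moves being reversible, so a breadth-first search outward from the abstract goal yields distances-to-goal.

```python
from collections import deque

BLANK = 0
GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)   # assumed goal: blank in the top-left corner

# Cells are numbered 0..8 row by row; two cells are adjacent when their
# Manhattan distance on the 3x3 board is 1.
NEIGHBOURS = {
    i: [j for j in range(9)
        if abs(i // 3 - j // 3) + abs(i % 3 - j % 3) == 1]
    for i in range(9)
}

def abstract(state, mapping):
    """Apply a domain mapping (tile -> abstract symbol) to every cell."""
    return tuple(mapping[tile] for tile in state)

def build_pdb(goal, mapping):
    """Breadth-first search from the abstract goal over the abstract space.

    Assumes the blank keeps a unique symbol under `mapping`.
    Returns a dict: pattern -> abstract distance-to-goal.
    """
    blank = mapping[BLANK]
    start = abstract(goal, mapping)
    pdb = {start: 0}
    queue = deque([start])
    while queue:
        pattern = queue.popleft()
        b = pattern.index(blank)
        for n in NEIGHBOURS[b]:
            cells = list(pattern)
            cells[b], cells[n] = cells[n], cells[b]   # slide a tile into the blank
            successor = tuple(cells)
            if successor not in pdb:
                pdb[successor] = pdb[pattern] + 1
                queue.append(successor)
    return pdb

def h(state, pdb, mapping):
    """Heuristic value: abstract the state, then look it up in the table."""
    return pdb[abstract(state, mapping)]

# Most abstract mapping with the blank left unique: every tile becomes "x".
BLANK_ONLY = {BLANK: BLANK, **{tile: "x" for tile in range(1, 9)}}
PDB = build_pdb(GOAL, BLANK_ONLY)
```

With this mapping the table has 9 entries whose distances are 0, 1, 1, 2, 2, 2, 3, 3, 4, and `h((1, 0, 2, 3, 4, 5, 6, 7, 8), PDB, BLANK_ONLY)` evaluates to 1. A dictionary is used here for readability; a real PDB would more likely be a flat array indexed by a ranking of the pattern, since memory is the limiting factor.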

Abstract Space

Efficiency Time for the preprocessing to create a PDB is usually negligible compared to the time to solve one problem-instance with no heuristic. Memory is the limiting factor.

“Pattern” = leave some tiles unique
Domain = blank 1 2 3 4 5 6 7 8
Abstract = blank 6 7 8
3024 patterns

Domain Abstraction
Domain = blank 1 2 3 4 5 6 7 8
Abstract = blank 6 7 8
30,240 patterns

8-puzzle PDB sizes (with the blank left unique):
9, 72, 252, 504, 630, 1512, 2520, 3024, 3780, 5040, 7560, 10080, 15120, 15120, 22680, 30240, 45360, 60480, 90720, 181440
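Sizes of this kind can be reproduced by counting arrangements. A hedged sketch: if a domain abstraction merges tiles into classes of indistinguishable symbols, the number of distinct arrangements over the 9 cells is 9! divided by the factorial of each class size. The function name `pattern_count` and the example mappings are illustrative assumptions; in particular, the grouping used for the 30,240 case (tiles 1, 2, 3 merged and tiles 4, 5 merged) is a guess that merely reproduces that count.

```python
from collections import Counter
from math import factorial

def pattern_count(mapping):
    """Count distinct arrangements of the abstract symbols over all cells.

    `mapping` sends each original symbol (blank and tiles) to an abstract
    symbol. This counts arrangements only; it does not check reachability.
    """
    count = factorial(len(mapping))
    for class_size in Counter(mapping.values()).values():
        count //= factorial(class_size)
    return count

TILES = range(1, 9)
blank_only = {0: "blank", **{t: "x" for t in TILES}}
keep_678 = {0: "blank", **{t: (t if t >= 6 else "x") for t in TILES}}
# Hypothetical grouping chosen only because it matches the 30,240 figure.
two_classes = {0: "blank", 1: "a", 2: "a", 3: "a", 4: "b", 5: "b",
               6: 6, 7: 7, 8: 8}
identity = {t: t for t in range(9)}

print(pattern_count(blank_only))    # 9
print(pattern_count(keep_678))      # 3024
print(pattern_count(two_classes))   # 30240
print(pattern_count(identity))      # 362880
```

The identity mapping gives 362,880 arrangements, but only half of them, 181,440, are reachable in the 8-puzzle, so the formula overcounts when every tile stays distinct.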

Automatic Creation of Domain Abstractions
Domain = blank 1 2 3 4 5 6 7 8
Abstract = blank
• Easy to enumerate all possible domain abstractions
• They form a lattice, e.g. the abstraction below is “more abstract” than the domain abstraction above:
Domain = blank 1 2 3 4 5 6 7 8
Abstract = blank

Problem: Non-surjectivity
Domain = blank 1 2
Abstract = blank 1 blank
[Figure annotation: ??]

Pattern Database Experiments
Aim: To understand how search performance using PDBs is related to easily measurable characteristics of the PDBs, e.g. size, average value.
Basic Method:
• Choose a variety of state spaces.
• For each state space, generate thousands of PDBs.
• For each PDB, measure its characteristics and the performance of A* (IDA*, etc.) using it.

8-puzzle: A* vs. PDB size
[Figure: # nodes expanded (A*) plotted against pattern database size (# of abstract states)]

Korf & Reid (1998)
• When the depth bound is d, node n at level j will be expanded by IDA* iff
  [a] parent(n) was expanded
  [b] g(n) + h(n) ≤ d, in other words h(n) ≤ d − j
• [b] ⇒ [a] if the heuristic is consistent
• Total nodes expanded = Σ N(j) · P(j, d − j), summed over levels j = 0, …, d
• N(j) = # nodes at level j in the brute-force tree
• P(j, x) = fraction of nodes at level j with h() ≤ x
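The node count is a sum over levels j = 0..d of N(j) · P(j, d − j), which can be written directly in code. A minimal sketch with made-up numbers: the function name `predicted_expansions`, the toy tree sizes `N` and the toy cumulative distribution are assumptions for illustration, not data from Korf & Reid. Here P is taken to be the same at every level and is expressed as a fraction between 0 and 1.

```python
def predicted_expansions(N, P, d):
    """Sum N(j) * P(j, d - j) over levels j = 0..d.

    N: sequence where N[j] is the number of nodes at level j of the
       brute-force tree.
    P: function (j, x) -> fraction of level-j nodes with h <= x.
    d: IDA* depth bound.
    """
    return sum(N[j] * P(j, d - j) for j in range(d + 1))

# Toy inputs (invented): a binary brute-force tree and a level-independent
# cumulative heuristic distribution, cumulative[x] = fraction with h <= x.
N = [1, 2, 4, 8]
cumulative = [0.1, 0.4, 0.8, 1.0]

def P(j, x):
    # The level j is ignored in this toy distribution.
    return cumulative[min(x, len(cumulative) - 1)]

total = predicted_expansions(N, P, d=3)
print(total)
```

For these toy inputs the sum is 1·1.0 + 2·0.8 + 4·0.4 + 8·0.1 = 5 expansions at depth bound 3, up to floating-point rounding.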

Korf & Reid – experiment
In their 8-puzzle experiment:
• Use exact N(j)
• Approximate P(j,x) by EQ(x) = limit (j → ∞) of P(j,x)
• IDA*, but complete enumeration of last level
• Run all 181,440 start states to all depths

Korf & Reid – results Seems the ideal tool for choosing which of two PDBs is better…

Korf & Reid – stopping at goal For choosing which of two PDBs is better in a practical setting, adaptations are needed.

Using Multiple Abstractions
• Given 2 consistent heuristics, max(h1(s),h2(s)) is also consistent.
• In some circumstances, can add them.
• How good is max?
• Hope it is at least 2x better, because it takes 2x the space
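The claim that the max of two consistent heuristics is consistent can be checked mechanically on a small example. A hedged sketch: the five-state path graph with unit edge costs, the two hand-written heuristics, and the helper `is_consistent` are all invented for illustration. Consistency is tested as h(goal) = 0 and h(s) ≤ 1 + h(t) for every edge (s, t).

```python
def is_consistent(h, edges, goal):
    """Unit-cost consistency: h(goal) == 0 and h(s) <= 1 + h(t) on every edge."""
    return h[goal] == 0 and all(h[s] <= 1 + h[t] for s, t in edges)

# Path graph 0 - 1 - 2 - 3 - 4 with goal 0; edges listed in both directions.
GOAL = 0
EDGES = [(s, s + 1) for s in range(4)] + [(s + 1, s) for s in range(4)]

h1 = [0, 0, 1, 2, 3]   # weak near the goal
h2 = [0, 1, 2, 2, 2]   # weak far from the goal
h_max = [max(a, b) for a, b in zip(h1, h2)]

print(is_consistent(h1, EDGES, GOAL), is_consistent(h2, EDGES, GOAL))
print(h_max, is_consistent(h_max, EDGES, GOAL))
```

Here `h_max` is `[0, 1, 2, 2, 3]`: it takes the better value from each heuristic in every state and still passes the consistency check.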

Max of 2 random PDBs
[Figure annotation: max(h1,h2) worse than h1]

Instead of max: interleave
[Figure: each state is labelled with the single PDB it consults, “use PDB1” or “use PDB2”]

Interleaved Pattern Databases
• The hope: almost as good as max, but only half the memory.
• Intuitively, strict alternation between PDBs is expected to be almost as good as max.
• How to generalize this to any abstraction of any space?

2 random PDBs interleaved
[Figure legend: Max, Interleave, h1]
• 93 random pairs (with non-trivial LCA)
• 4 had Max(h1,h2) > h1
• 17 others had Interleave(h1,h2) > h1
• The remaining 72 were “normal”

Relative Performance
[Figure legend: Max, Interleave, h1]

Current Research
• Istvan Hernadvolgyi (Ph.D. student, U. Ottawa)
  • automatic creation of good pattern databases
  • adaptation to weighted graphs
Project Students (U of A)
• Jack Newton
  • max of two pattern databases
  • interleaved pattern databases
• Daniel Neilson: additive abstractions
• Ajit Singh: predicting IDA* performance

Future Research
• compression of pattern databases
• understand & avoid non-surjectivity
• alternative methods of abstraction
• projection
