BDDs in Planning and General Game Playing

Peter Kissmann and Stefan Edelkamp
Graph Search Engineering, Schloss Dagstuhl 2009

Structure
• BDDs
  • Symbolic Search
• BDDs in Planning
  • Sequential Optimal Planning
  • Net-Benefit Planning
  • Conclusion
• BDDs in General Game Playing
  • Solving Single-Player Games
  • Solving Two-Player Games
  • Results
  • Conclusion

BDDs and Symbolic Search

Binary Decision Diagrams (BDDs)
• good variable ordering crucial

Symbolic Search
• uses (Reduced Ordered) Binary Decision Diagrams ((RO)BDDs)
• set-based search: sets of states and transitions represented as relations
• unique representation
  • no duplicate elimination within a set required
• layered exploration (e.g., BFS): duplicate elimination w.r.t. previous layers
• advantages due to compressed representation:
  • save RAM
  • might save time
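The layered exploration can be illustrated with a minimal sketch. It assumes plain Python sets as a stand-in for BDDs, so it shows only the set-based control flow, not the compressed representation. The function name `bfs_layers` and the toy transition relation are illustrative and not taken from the slides.

```python
def bfs_layers(init, transitions):
    """Return the BFS layers reachable from `init`.

    `transitions` is a set of (state, successor) pairs, i.e. the
    transition relation. Python sets stand in for BDDs here.
    """
    reached = {init}   # union of all layers found so far
    layer = {init}     # current frontier
    layers = []
    while layer:
        layers.append(layer)
        # expand the whole layer at once (set-based successor computation)
        successors = {t for (s, t) in transitions if s in layer}
        # duplicate elimination w.r.t. previous layers
        layer = successors - reached
        reached |= layer
    return layers


toy = {(0, 1), (0, 2), (1, 3), (2, 3), (3, 0)}
layers = bfs_layers(0, toy)
```

No duplicate check is needed inside a layer because a set (like a BDD) holds each state once; only the subtraction of `reached` is required.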

Symbolic Search
• two sets of variables
  • S for current states
  • S’ for successor states
• expansion of state sets (not single states) as relation
• calculation of successors:
• calculation of predecessors:
• predecessors with at least one successor in states:
• predecessors with all successors in states:
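The formulas for these operations did not survive in this transcript. The definitions commonly used in symbolic search are given here as a hedged reconstruction; the names image, preImage, strongPreImage and trans are assumptions chosen to match the terms used later in the deck.

```latex
\begin{align*}
\mathit{image}(\mathit{states}) &= \exists S.\ \bigl(\mathit{trans}(S, S') \wedge \mathit{states}(S)\bigr) \\
\mathit{preImage}(\mathit{states}) &= \exists S'.\ \bigl(\mathit{trans}(S, S') \wedge \mathit{states}(S')\bigr) \\
\mathit{strongPreImage}(\mathit{states}) &= \forall S'.\ \bigl(\mathit{trans}(S, S') \rightarrow \mathit{states}(S')\bigr)
  = \neg\,\mathit{preImage}(\neg\,\mathit{states})
\end{align*}
```

The image is expressed over S’ and has to be renamed back to S before the next step. The pre-image yields predecessors with at least one successor in the set, the strong pre-image those with all successors in the set. As written, the strong pre-image also contains states without any successor, so an implementation would typically exclude terminal states explicitly.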

BDDs in Planning

Structure
• Sequential Optimal Planning
  • Symbolic Algorithms
  • Competition Results (IPC-6)
• Net-Benefit Planning
  • Symbolic Algorithms
  • Competition Results (IPC-6)
• Conclusion

Sequential Optimal Planning
• Given: problem <S, O, I, T, c> with
  • S: set of states
  • O ⊆ S × S: operators (actions)
  • I ∈ S: initial state
  • T ⊆ S: terminal states
  • c: O → {1, …, C}: action costs
• Aim: find a plan from the initial state to one of the terminal states
  • no action costs: shortest plan (minimal plan length) → Symbolic (Bidirectional) BFS
  • with action costs: Symbolic A* (BDDA*)

BDDA*
[Figure: BDDA* search, axes h and g]

Competition Results (IPC-6)
• Extension (of Gamer(comp) to Gamer):
  • use of hashmap instead of matrix for large action costs
  • matrix became too large while being sparse

Net-Benefit
• challenge at IPC-6
• total plan net-benefit = total achieved goal rewards − total action cost
• transformation: goal rewards → costs for violating soft constraints
• net-benefit = total violation cost + total action cost
  • to be minimized
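The step from the maximized to the minimized form can be spelled out. The notation is an assumption introduced for illustration: r_i is the reward of soft goal i, A(π) the set of soft goals achieved by plan π, and c(π) its total action cost.

```latex
\begin{align*}
\text{net-benefit}(\pi)
  &= \sum_{i \in A(\pi)} r_i \;-\; c(\pi) \\
  &= \sum_{i} r_i \;-\; \Bigl(\sum_{i \notin A(\pi)} r_i \;+\; c(\pi)\Bigr)
\end{align*}
```

Since the sum over all r_i is a constant of the problem, maximizing the net-benefit is equivalent to minimizing the bracketed term, i.e. total violation cost plus total action cost.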

Symbolic Branch-and-Bound Search
• Symbolic Breadth-First Branch-and-Bound
  • by Jensen et al. 2006
  • cost-optimal BFS → ignores action-costs
  • improves upper bound U
    • initially: sum of costs for violating all soft constraints + 1
    • can be represented by a BDD: disjunction of all values from 0 to U
• Symbolic Cost-First Branch-and-Bound
  • expansion according to action-costs, not BFS-layers
  • action-costs still not part of objective function

Symbolic Net-Benefit
• adds total action-costs to objective function
  • net-benefit = (total action-cost f) + (sum of costs for violated soft constraints)
• total-cost not bounded
  • no BDD representation
  • but: can use cost-first search's buckets
• also stores current best net-benefit V
  • initialized to ∞

Symbolic Net-Benefit
• Algorithm:
  1. start with initial state
  2. check if goals are within current states
     • take only goals with cost < U
     • find goal with minimal cost U' (and U' + f < V) and calculate plan
     • set U = U', V = U' + f
  3. calculate successors (image)
  4. sort successors into corresponding buckets (f + 1, …, f + C)
  5. repeat from 2. until no new states found (or all soft constraints satisfied or total action cost ≥ V)
  6. return last generated plan
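The control flow of these steps can be sketched with explicit state sets. This is a minimal illustration under stated assumptions, not the Gamer implementation: Python sets replace BDDs, plan reconstruction is omitted (only the best goal state and its costs are returned), and the names `net_benefit_search`, `succ`, `violation` and `max_violation` are hypothetical.

```python
from collections import defaultdict


def net_benefit_search(init, succ, goals, violation, max_violation):
    """Cost-first search over buckets indexed by total action cost f.

    succ:      dict state -> list of (action_cost, successor), costs >= 1
    goals:     set of goal states
    violation: dict state -> cost of the soft constraints it violates
    Returns (goal_state, action_cost, violation_cost) or None.
    """
    U = max_violation + 1          # bound on violation cost
    V = float("inf")               # best net-benefit found so far
    best = None
    buckets = defaultdict(set)
    buckets[0].add(init)
    closed = set()
    f = 0
    # stop when no states are left, all soft constraints are satisfied
    # (U == 0), or the action cost alone reaches the best net-benefit
    while buckets and U > 0 and f < V:
        layer = buckets.pop(f, set()) - closed
        closed |= layer
        candidates = [s for s in layer if s in goals and violation[s] < U]
        if candidates:
            g = min(candidates, key=lambda s: violation[s])
            if violation[g] + f < V:
                U, V = violation[g], violation[g] + f
                best = (g, f, violation[g])
        # image of the layer, sorted into buckets f + 1, ..., f + C
        for s in layer:
            for cost, t in succ.get(s, ()):
                if t not in closed:
                    buckets[f + cost].add(t)
        f += 1
    return best


succ = {"a": [(1, "b"), (5, "c")], "b": [(2, "c")]}
violation = {"a": 5, "b": 4, "c": 1}
result = net_benefit_search("a", succ, {"b", "c"}, violation, max_violation=5)
```

In the toy instance, goal `b` is found first with net-benefit 1 + 4 = 5, and is then replaced by goal `c` with 3 + 1 = 4.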

Competition Results (IPC-6)
hsp*p: enumerates all possible soft constraint violations and runs an ordinary planner on each sub-instance
Mips XXL: external-memory algorithms

Conclusion and Additional Remarks
• new set-based algorithm for computing optimal net-benefit
• covers cost-optimal search and over-subscribed planning with preferences
• Gamer can handle 0-cost actions
  • additional BFS for 0-cost fixpoint calculation
• extension to partial initial states

BDDs in General Game Playing

Structure
• Solving Single-Player Games
• Solving Two-Player Games
  • Zero-Sum Games
  • General Two-Player Turn-Taking Games
• Results
• Conclusion

General Game Playing - Games
• Given a description of a game that is
  • finite
  • discrete
  • deterministic
  • full information
• Games can be
  • single-player or multi-player
  • simultaneous or turn-taking

Solving Games
• In General Game Playing, rewards for all players
  • range from 0 to 100 (higher = better)
  • only in goal states
• Solving: find rewards for all states (in case of optimal play)
• Solving to
  • analyze players
  • play optimally
  • use as endgame database (if not complete)

Solving Single-Player Games
• might use Planning technology, but
  • in Planning (as in General Game Playing) interested in searching only necessary states
  • here: solve all states
• approach:
  • calculate reachable states
  • start at goal states giving reward 100
  • apply backward BFS
  • remove all found states from reachable states
  • go to goal states giving reward 99 and repeat steps
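The approach can be sketched with explicit sets in place of BDDs. This is an illustration under assumptions, not the authors' code: the reachable set is taken as given, terminal states are assumed to have no outgoing moves, and the name `solve_single_player` and the toy game are hypothetical.

```python
def solve_single_player(reachable, transitions, goal_reward):
    """Assign each solvable reachable state its best achievable reward.

    transitions: dict state -> set of successor states
    goal_reward: dict terminal state -> reward in 0..100
    """
    # invert the transition relation once for the backward search
    predecessors = {}
    for s, succs in transitions.items():
        for t in succs:
            predecessors.setdefault(t, set()).add(s)

    terminals = set(goal_reward)
    unsolved = set(reachable)
    value = {}
    for reward in range(100, -1, -1):          # 100, 99, ..., 0
        frontier = {s for s in unsolved if goal_reward.get(s) == reward}
        while frontier:                        # backward BFS
            for s in frontier:
                value[s] = reward
            unsolved -= frontier               # remove found states
            pre = set()
            for s in frontier:
                pre |= predecessors.get(s, set())
            # keep only unsolved non-terminal predecessors
            frontier = (pre & unsolved) - terminals
    return value


transitions = {"a": {"b", "c"}, "b": {"d"}, "c": {"e"}}
goal_reward = {"d": 100, "e": 50}
values = solve_single_player({"a", "b", "c", "d", "e"}, transitions, goal_reward)
```

Processing rewards from high to low is what makes a plain backward BFS sufficient: a state claimed at reward 100 is never revisited at a lower reward.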

Solving 2-Player Zero-Sum Games
[Figure legend: player 0's turn, player 1's turn, lost for player 0, lost for player 1]
• two backward searches (one for each player j ∈ {0, 1}):
  • start with goal states lost for player j
  • find all lost predecessors using two steps:
    • find preceding states where the opponent could move to a state lost for j (pre-image)
    • find preceding states where each of j's moves results in a state lost for j (strong pre-image)
  • repeat double-step until no new states are found
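The double-step for one player can be sketched as a fixpoint over explicit sets. The sketch assumes a dictionary `succ` that lists only non-terminal states with their successors and a dictionary `turn` giving the player to move; these names, the function `lost_states` and the toy game are hypothetical stand-ins for the BDD-based computation.

```python
def lost_states(j, succ, turn, lost_goals):
    """Return all states that are lost for player j under optimal play.

    succ:       dict non-terminal state -> set of successor states
    turn:       dict non-terminal state -> player to move (0 or 1)
    lost_goals: goal states that are lost for player j
    """
    lost = set(lost_goals)
    while True:
        # pre-image: opponent to move, at least one move into a lost state
        weak = {s for s, ts in succ.items()
                if turn[s] != j and any(t in lost for t in ts)}
        # strong pre-image: j to move, every move leads into a lost state
        target = lost | weak
        strong = {s for s, ts in succ.items()
                  if turn[s] == j and ts and all(t in target for t in ts)}
        new = (weak | strong) - lost
        if not new:
            return lost
        lost |= new


# player 0 moves at s0, player 1 at s1 and s2; ta is lost for 0, tb for 1
succ = {"s0": {"s1", "s2"}, "s1": {"ta", "tb"}, "s2": {"tb"}}
turn = {"s0": 0, "s1": 1, "s2": 1}
lost_for_0 = lost_states(0, succ, turn, {"ta"})
lost_for_1 = lost_states(1, succ, turn, {"tb"})
```

In the toy game player 0 wins from `s0` by moving to `s2`, so `s0` ends up in the set lost for player 1.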

Solving General 2-Player Turn-Taking Games
• 101 × 101 matrix of BDDs
  • BDD at (i, j) represents states achieving reward i for player 0 and j for player 1 (in case of optimal play)
• only 1 backward search
  • alternating between players within loop

Algorithm-Outline
• find all reachable states
• initialize reward matrix with goal states
• solved states: all states within matrix
• while (not all states solved) do
  • for each player j ∈ {0, 1} do
    • find all solvable states of j (strongPreImage(solved))
    • solve these states (pre-image from matrix's buckets)

Order to classify states
[Figure: two reward matrices with axes "own" and "opponent", each ranging from 0 to 100]
• problem in general case: order to classify states
  • maximize own reward (and minimize opponent's)?
  • or maximize difference to opponent's reward?
• might change during one competition
• we chose second case for all examples
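A minimal sketch of the backward solving loop with the second ordering (maximize the difference to the opponent's reward) might look as follows. It assumes explicit sets and a dictionary of reward pairs instead of the 101 × 101 matrix of BDDs, assumes every non-terminal state has at least one move, and leaves tie-breaking between equally good moves unspecified. The names `solve_general`, `succ` and `turn` are hypothetical.

```python
def solve_general(succ, turn, goal_rewards):
    """Solve a two-player turn-taking game with general rewards.

    succ:         dict non-terminal state -> set of successor states
    turn:         dict non-terminal state -> player to move (0 or 1)
    goal_rewards: dict goal state -> (reward player 0, reward player 1)
    Returns a dict state -> (reward player 0, reward player 1).
    """
    value = dict(goal_rewards)               # solved states
    unsolved = set(succ) - set(value)
    while unsolved:
        progress = False
        for j in (0, 1):
            # solvable: j to move and all successors already solved
            solvable = {s for s in unsolved
                        if turn[s] == j and all(t in value for t in succ[s])}
            for s in solvable:
                # assumed ordering: maximize own minus opponent's reward
                value[s] = max((value[t] for t in succ[s]),
                               key=lambda r: r[j] - r[1 - j])
            unsolved -= solvable
            progress = progress or bool(solvable)
        if not progress:                     # remaining states unsolvable
            break
    return value


succ = {"s0": {"s1", "s2"}, "s1": {"ta", "tb"}, "s2": {"tc"}}
turn = {"s0": 0, "s1": 1, "s2": 1}
goal_rewards = {"ta": (30, 70), "tb": (60, 50), "tc": (50, 10)}
solution = solve_general(succ, turn, goal_rewards)
```

Swapping the `key` function is all it takes to switch to the first ordering, which is why the choice can be changed between games.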

Example
[Figure: worked example with rewards 0 to 3, showing a reward matrix (player 0 vs. player 1) and a game tree with player 0's and player 1's turns, annotated with reward pairs such as 0/1, 2/0, 0/3 and 3/1]

Results (Reachability Analyses)
• Single-Player Games:
• Two-Player Games:

Results (Peg Solitaire)
• total #reachable: 375,110,246

Results (Connect Four)
• 85 bits to represent one state
  • 2 bits per cell (blank, red, yellow); 42 cells
  • 1 bit for active player
• originally solved by Allis (’88)
  • estimate on total #states: 70,728,639,995,483 ≈ 70 × 10^12
• complete reachability analysis using BDDs
  • 12 GB RAM
  • 2.67 GHz CPU
  • total time: 5:15 h
  • total #states: 4,531,985,219,092 ≈ 4.5 × 10^12
  • explicit representation: ≈ 43.5 TB
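The state encoding and the size of an explicit representation can be checked with a few lines of arithmetic. The slide does not say how the storage figure was rounded or which unit convention it uses, so the binary-unit conversion below is an assumption; the variable names are illustrative.

```python
CELLS = 42                 # 7 columns x 6 rows
BITS_PER_CELL = 2          # blank, red, yellow
TURN_BITS = 1              # active player

bits_per_state = CELLS * BITS_PER_CELL + TURN_BITS

reachable_states = 4_531_985_219_092
total_bytes = reachable_states * bits_per_state / 8

# assumption: binary units (1 TiB = 2**40 bytes)
explicit_tib = total_bytes / 2**40
print(bits_per_state, round(explicit_tib, 1))
```

Under these assumptions the explicit representation comes out slightly below 44 TiB, in the same range as the figure quoted on the slide and far beyond the 12 GB of RAM used for the BDD-based analysis.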

Results (Two-Player Games)

Conclusion
• Solving single-player games and two-player zero-sum games fairly easy
• Solving general two-player games involved
  • first approach (Planning & Games Workshop 2007) very slow
  • current one needs linear number of pre-images
• for playing still too slow
  • UCT to get good estimates faster
  • UCT works well with endgame databases
• BDDs for complete state space can be used as perfect hash-functions
