790 likes | 1.13k Views
Game Playing. Human Game Playing Intellectual Activity Skill Comparison. Computer Game Playing Testbed for AI Limitations. General Game Playing. General Game Players are systems able to play arbitrary games effectively based solely on formal descriptions supplied at “runtime”.
E N D
Game Playing Human Game Playing • Intellectual Activity • Skill Comparison Computer Game Playing • Testbed for AI • Limitations
General Game Playing General Game Players are systems able to play arbitrary games effectively based solely on formal descriptions supplied at “runtime”. Translation: They don’t know the rules until the game starts. Additional Complexities: uncertainty (moves of other players) resource bounds (game clock)
Technology Unlike specialized game players (e.g. Deep Blue), they do not use algorithms designed in advance for specific games. Artificial Intelligence Technologies knowledge representation reasoning, reformulation, learning rational behavior w/ uncertainty, resource bounds
Annual AAAI GGP Competition Annual AAAI Competition Held at AAAI conference Administered by Stanford (Stanford folks not eligible to participate) Reward 2005 - Jim Clune 2006 - Stephan Schiffel and Michael Thielscher 2007 - Yngvi Bjornsson and Hilmar Finsson 2008 - Yngvi Bjornsson and Hilmar Finsson
Carbon versus Silicon Come cheer on your favorite. The battle continues at AAAI-2007.
Single Player Game as a State Machine a b s2 s5 s8 a a b a a b d b b s1 s3 s6 s9 s11 d a c a a c b b a s7 s4 s10
Composite Actions for Multiple Players aa ab s2 s5 s8 aa bb ab aa aa ab bb ab ab s1 s3 s6 s9 s11 ba aa ba ba ab bb bb aa aa s4 s7 s10
Direct Description Since all of the games that we are considering are finite, it is possible in principle to communicate game information in the form of transition graphs. Problem:Size of description. Even though everything is finite, the graphs can be large. Ideas: 1. Different Conceptualization 2. Compact Encoding
s1 Propositions Decompose states into “propositions”. Benefit - n propositions can represent 2n states. p q s r
Propositional Net Components Propositions Connectives Transitions p q r
p r q Propositional Net
x31 o11 x21 o21 x11 o31 o32 x22 x12 x32 o22 o12 or1 xr2 xr1 or3 xr3 or2 o33 x13 o23 x23 x33 o13 Propositional Net Fragment
Propositional GDL row1o :- row1x :- true(11o), true(11x), true(12o), true(12x), true(13o) true(13x) row2o :- row1x :- true(21o), true(21x), true(22o), true(22x), true(23o) true(23x) row3o :- row1x :- true(31o), true(31x), true(32o), true(32x), true(33o) true(33x)
Relational GDL • row(M,P) :- • true(cell(M,1,P)) & • true(cell(M,2,P)) & • true(cell(M,3,P))
Tic-Tac-Toe init(cell(1,1,b)) init(cell(1,2,b)) init(cell(1,3,b)) init(cell(2,1,b)) init(cell(2,2,b)) init(cell(2,3,b)) init(cell(3,1,b)) init(cell(3,2,b)) init(cell(3,3,b)) init(control(x)) legal(P,mark(X,Y)) :- true(cell(X,Y,b)) & true(control(P)) legal(x,noop) :- true(control(o)) legal(o,noop) :- true(control(x)) next(cell(M,N,P)) :- does(P,mark(M,N)) next(cell(M,N,Z)) :- does(P,mark(M,N)) & true(cell(M,N,Z)) & Z#b next(cell(M,N,b)) :- does(P,mark(J,K)) & true(cell(M,N,b)) & (M#J | N#K) next(control(x)) :- true(control(o)) next(control(o)) :- true(control(x)) terminal:- line(P) terminal:- ~open goal(x,100) :- line(x) goal(x,50) :- draw goal(x,0) :- line(o) goal(o,100) :- line(o) goal(o,50) :- draw goal(o,0) :- line(x) row(M,P) :- true(cell(M,1,P)) & true(cell(M,2,P)) & true(cell(M,3,P)) column(N,P) :- true(cell(1,N,P)) & true(cell(2,N,P)) & true(cell(3,N,P)) diagonal(P) :- true(cell(1,1,P)) & true(cell(2,2,P)) & true(cell(3,3,P)) diagonal(P) :- true(cell(1,3,P)) & true(cell(2,2,P)) & true(cell(3,1,P)) line(P) :- row(M,P) line(P) :- column(N,P) line(P) :- diagonal(P) open :-true(cell(M,N,b)) draw :- ~line(x) & ~line(o)
No Built-in Assumptions • What we see: • next(cell(M,N,x)) :- • does(white,mark(M,N)) & • true(cell(M,N,b)) • What the player sees: • next(welcoul(M,N,himenoing)) :- • does(himenoing,dukepse(M,N)) & • true(welcoul(M,N,lorenchise))
Logic Possible to use logical reasoning for everything Computing legality of moves Computing consequences of actions Computing goal achievement Computing termination Easy to convert from logic to other representations Simplicity of logical formulation Simple, widely published algorithms 3-4 orders or magnitude speedup no asymptotic change
X O X State Representation cell(1,1,x) cell(1,2,b) cell(1,3,b) cell(2,1,b) cell(2,2,o) cell(2,3,b) cell(3,1,b) cell(3,2,b) cell(3,3,x) control(black)
Legality Computation legal(W,mark(X,Y)) :- cell(1,1,x) true(cell(X,Y,b)) cell(1,2,b) true(control(W)) cell(1,3,b) cell(2,1,b) legal(white,noop) :- cell(2,2,o) true(cell(X,Y,b)) cell(2,3,b) true(control(black)) cell(3,1,b) cell(3,2,b) legal(black,noop) :- cell(3,3,x) true(cell(X,Y,b)) control(black) true(control(white)) Conclusions: legal(white,noop) legal(black,mark(1,2)) … legal(black,mark(3,2))
Update Computation cell(1,1,x) cell(1,1,x) cell(1,2,b) cell(1,2,b) cell(1,3,b) cell(1,3,o) cell(2,1,b) black cell(2,1,b) cell(2,2,o) cell(2,2,o) cell(2,3,b) mark(1,3) cell(2,3,b) cell(3,1,b) cell(3,1,b) cell(3,2,b) cell(3,2,b) cell(3,3,x) cell(3,3,x) control(black) control(white) next(cell(M,N,o)) :- does(black,mark(M,N)) & true(cell(M,N,b))
X O X O O X X X X O X O X O X X X O O O O X O X O X X Game Tree Expansion
X O X O X X X X X O X O X O X O O X X X X X O O X O O O X X O X O O X X X X O X O X O X X X O O O O X O X O X X Game Tree Search
Resource Limitations Large state spaces ~5000 states in Tic-Tac-Toe ~1044 states in Chess Limited Resources Memory Time (start clock, move clock)
Incremental Search Incremental Search Expand Game Graph incrementally As much as time allows Minimax/etc. evaluation on non-terminal states using an evaluation function of some sort But how do we evaluate non-terminal states? In traditional game playing, the rules are known in advance, and the programmer can invent an evaluation function. Not possible with GGP.
Non-Guaranteed Evaluation Functions Ideas Goal proximity (everyone) Maximize mobility (Barney Pell) Minimize opponent’s mobility (Jim Clune) Depth Charge (random probe) <insert your good idea here>
Monte Carlo Method / Depth Charge Basic Idea (1) Explore game graph to some level storing generated states (2) Beyond this, explore to end of game making random choices for moves of all players not storing states (to limit space growth) (3) Assign expected utilities to states by summing utilities and dividing by number of trials Features Fast because no search Small space because nothing stored
Monte Carlo 100 0 0 0 0 100 100 0 0 0 0 0 100 0 100 100
Monte Carlo 25 50 75 0 100 0 0 0 0 100 100 0 0 0 0 0 100 0 100 100
Definition Metagaming is the process of analyzing and/or modifying a game offline (e.g. during the startclock or between moves or in a separate thread) so as to improve performance online (once the game begins, recommences, etc.). Examples of Metagaming Techniques: Headstart End Game Book Factoring Dead State Removal And so forth
Hodgepodge Hodgepodge = Chess + Othello Branching factor: a Branching factor: b Analysis of joint game: Branching factor as given to players: a*b Fringe of tree at depth n as given: (a*b)n Fringe of tree at depth n factored: an+bn
Game Reformulation Examples Symmetry, e.g. Tic-Tac-Toe Independence, e.g Hodgepodge Bottlenecks, e.g. Triathalon Techniques: Graph Analysis (factoring, branching factor, …) Proofs (esp those using mathematical induction) Trade-off - cost of finding structure vs savings Sometimes cost varies with size of description rather than size of expanded game graph Use of startclock versus playclock
p1 p2 a1 a2 r1 r2 q1 q2 s1 n1 n2 s2 t1 t2 Example l n3 n4 l1 l2 o1 a3 o2 a4 g
Game Factoring Method: Compute factors Use factors to generate “subplans” Reassemble overall plan from subplans Patterns: Disjunctive Factors * Conjunctive Factors * Interleaved Factors Sequential Factors Overall factors versus conditional factors
Conjunctive Factoring Conjunctive Goals Delete conjunctive goal Make conjuncts goals for each factor Conjunctive Legality Delete legality conjunction Make conjuncts goals for each factor Solve problem, conjoin solutions. Can delete disjunctions too, but treating disjunctions as conjunctions might destroy completeness.
p1 p2 a1 a2 r1 r2 q1 q2 s1 n1 n2 s2 t1 t2 Example l n3 n4 l1 l2 o1 o3 o2 a3 g
Example n3 n4 l1 l2 o1 o2 p1 p2 a1 a2 r1 r2 q1 q2 g1 n1 n2 g2 t1 t2
Performance ? (time (genplan propcompbuttons)) 407 states 1,118 milliseconds. 605,728 bytes of memory allocated. (PROG A B A D E D) ? (time (multiplan propcompbuttons)) 14 states 53 milliseconds 22,320 bytes of memory allocated. (PROG A B A D E D) Partition time: 1 millisecond.