Combinatorial Games Martin Müller
Contents • Combinatorial game theory • Thermographs • Go and Amazons as combinatorial games
Combinatorial Games • Basics • Example: Domineering • Simplifying games • Sums of games • Hot games
What is a Game? • 2 players, Left and Right • Set of positions, starting position • Moves defined by rules • Alternating moves • Player who cannot move loses (no draws) Conway's plan: find the simplest possible definition
Properties of Games • Complete information • Perfect information • No random element (no dice, coin throws, …)
Definition of a Game • Move options of players • Each move leads to a game • Player who cannot move loses G = { L1,…,Ln | R1,…,Rm } { A,B,C | D,E }
Creating Games G = { L1,…,Ln | R1,…,Rm } • Simplest possible game: { | } • Next step: {{ | } | } { | { | }} {{ | } | { | }} • Continue...
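The construction above can be sketched in a few lines. This is only an illustration, assuming a game is stored as a pair of tuples (Left options, Right options); the names ZERO, ONE, MINUS_ONE, STAR, show and birthday are chosen here and do not come from the slides.

```python
# A game is a pair (left_options, right_options); every option is again a game.
ZERO = ((), ())                 # { | }
ONE = ((ZERO,), ())             # {{ | } | }
MINUS_ONE = ((), (ZERO,))       # { | { | }}
STAR = ((ZERO,), (ZERO,))       # {{ | } | { | }}


def show(g):
    """Render a game in brace notation, writing 0 for { | }."""
    if g == ZERO:
        return "0"
    left, right = g
    return "{" + ",".join(map(show, left)) + "|" + ",".join(map(show, right)) + "}"


def birthday(g):
    """Number of construction steps needed to build g (0 for { | })."""
    left, right = g
    return 1 + max((birthday(x) for x in left + right), default=-1)


print(show(STAR), birthday(STAR))   # {0|0} 1
```

Nested tuples are hashable, so positions stored this way can later be used as dictionary keys or cache keys.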
Games and Numbers • Insight: some games represent a number of free moves for one player
Infinite Games • Recursion: option leads back to game G = { A,B | C } A = { |G }
Inverse Game • Swap all Left and Right moves • Compute inverse for all options recursively G = { L1,…,Ln | R1,…,Rm }. • Inverse: -G = { -R1,…,-Rm | -L1,…,-Ln } • Property of inverses: -(-G) = G
Examples of Inverses -(0) = -({ | }) = { | } = 0 -(1) = -({0 | }) = { | -0} = { | 0} = -1 -({0|0}) = {-0 | -0} = {0|0}
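The recursive inverse can be written directly from the definition. A minimal sketch, assuming games are stored as pairs of tuples (Left options, Right options); the function name neg is illustrative.

```python
ZERO = ((), ())                 # 0 = { | }
ONE = ((ZERO,), ())             # 1 = { 0 | }
MINUS_ONE = ((), (ZERO,))       # -1 = { | 0 }
STAR = ((ZERO,), (ZERO,))       # { 0 | 0 }


def neg(g):
    """-G = { -R1,...,-Rm | -L1,...,-Ln }: swap the sides and negate every option."""
    left, right = g
    return (tuple(neg(r) for r in right), tuple(neg(l) for l in left))


print(neg(ONE) == MINUS_ONE)    # True
print(neg(STAR) == STAR)        # True
```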
Domineering Example • Inverse of domineering position: rotate by 90°
Classification of Games G > 0 Left wins G < 0 Right wins G = 0 Second player wins G || 0 First player wins
Classification Examples 0 = { | } First player loses { 0 | 0 } First player wins { 0 | { 0 | 0 } } Left always wins {{ 0 | 0 } | 0 } Right always wins
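The four outcome classes can be found by asking who wins when Left starts and who wins when Right starts. A sketch under the assumption that games are finite (loop-free) and stored as pairs of tuples; UP and DOWN are the conventional names for the last two examples, and the function names are illustrative.

```python
from functools import lru_cache

ZERO = ((), ())
STAR = ((ZERO,), (ZERO,))        # { 0 | 0 }
UP = ((ZERO,), (STAR,))          # { 0 | { 0 | 0 } }
DOWN = ((STAR,), (ZERO,))        # {{ 0 | 0 } | 0 }


@lru_cache(maxsize=None)
def left_wins_first(g):
    """Left, moving first, wins if some Left option leaves Right (to move) losing."""
    return any(not right_wins_first(gl) for gl in g[0])


@lru_cache(maxsize=None)
def right_wins_first(g):
    """Right, moving first, wins if some Right option leaves Left (to move) losing."""
    return any(not left_wins_first(gr) for gr in g[1])


def classify(g):
    l, r = left_wins_first(g), right_wins_first(g)
    if l and r:
        return "G || 0"          # first player wins
    if l:
        return "G > 0"           # Left wins
    if r:
        return "G < 0"           # Right wins
    return "G = 0"               # second player wins


for g in (ZERO, STAR, UP, DOWN):
    print(classify(g))
```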
Comparing Games • G > H if G - H > 0: Left wins difference game • G < H if G - H < 0: Right wins difference game • G = H if G - H = 0: Second player wins difference game • G || H if G - H || 0: First player wins difference game
Canonical Form of Games • Loopfree games have canonical form • Two operations: • Delete dominated options • Reversing reversible options • Apply as long as possible • End result: unique canonical form
Deleting Dominated Options • Example: {2, -5, 6, 3 | -2, 6, 13, -8} = {6|-8} • General problem: compare games • Complete algorithm implemented in David Wolfe's games package
Sums of Games • Two games, G and H • Choice: play either in G or in H G+H = { G+HL, GL+H | G+HR, GR+H } • Example: -5+3 = { -5+3L, -5L+3 | -5+3R, -5R+3 } = {-5+2|-4+3} = {-3|-1} = -2
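The sum rule can be checked mechanically by combining it with the inverse and the test "G = H if the second player wins G - H". A hedged sketch, assuming finite games stored as pairs of tuples; integer, neg, add, ge_zero, le_zero and equal are illustrative names, and the resulting sums are not simplified to canonical form.

```python
from functools import lru_cache

ZERO = ((), ())


def integer(n):
    """n > 0 is {n-1 | }, n < 0 is { | n+1}, and 0 is { | }."""
    g = ZERO
    for _ in range(abs(n)):
        g = ((g,), ()) if n > 0 else ((), (g,))
    return g


def neg(g):
    left, right = g
    return (tuple(neg(r) for r in right), tuple(neg(l) for l in left))


@lru_cache(maxsize=None)
def add(g, h):
    """G+H = { GL+H, G+HL | GR+H, G+HR }: move in exactly one summand."""
    left = tuple(add(x, h) for x in g[0]) + tuple(add(g, y) for y in h[0])
    right = tuple(add(x, h) for x in g[1]) + tuple(add(g, y) for y in h[1])
    return (left, right)


@lru_cache(maxsize=None)
def ge_zero(g):
    """G >= 0: Right, moving first, cannot win."""
    return all(not le_zero(gr) for gr in g[1])


@lru_cache(maxsize=None)
def le_zero(g):
    """G <= 0: Left, moving first, cannot win."""
    return all(not ge_zero(gl) for gl in g[0])


def equal(g, h):
    """G = H if the second player wins the difference game G - H."""
    d = add(g, neg(h))
    return ge_zero(d) and le_zero(d)


HALF = ((ZERO,), (integer(1),))                              # { 0 | 1 }
print(equal(add(integer(-5), integer(3)), integer(-2)))     # True
print(equal(add(HALF, HALF), integer(1)))                   # True
```

Memoizing add, ge_zero and le_zero keeps repeated sub-positions from being expanded more than once, but the position tree of a sum still grows quickly with the size of the summands.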
Fractions • Example: {0|1} + {0|1} = 1 • {-1,0|1} = {0|1} = 1/2
Hot Games • First player gets extra moves • Both are eager to play • Example: {1|-1} The 2x2 square is hot
Sums of Hot Games • Can be much more complex than summands • Example: a = {1|-1}, b = {2|-2}, c = {3|-3}, d = {4|-4} • Sums: a+b = {{3|1}|{-1|-3}} a+b+c = {{{6|4}|{2|0}}|{{0|-2}|{-4|-6}}} a+b+c+d = {{{{10|8}|{6|4}}|{{4|2}|{0|-2}}}|{{{2|0}|{-2|-4}}|{{-4|-6}|{-8|-10}}}}
Mean • Mean m • Average outcome • Means add Examples: m(4|-4) = 0 m(6|-4) = 1 m(4|{-4|-10}) = -3/2 m(4|{-4|-20}) = -4 Theorem: m(a+b) = m(a) + m(b)
Temperature • Measures urgency of move • Sum does not become hotter Examples: temp(4|-4) = 4 temp(4|{-4|-10}) = 11/2 temp(4|{-4|-20}) = 8 temp(4|{-4|-100}) = 8 Theorem: temp(a+b) ≤ max(temp(a), temp(b))
Example • a = 4|-4, b = 5|-5, c = 5 |{-4|-6} • temp(a) = 4, temp(b) = 5, temp(c) = 5 • temp(a + b) = 5 • temp(b + c) = 1 • temp(b + b) = 0
Leftscore and Rightscore • Also called LeftStop and RightStop • Minimax values of game if left (right) plays first • Assumption: play stops in numbers • Base points of thermograph (see next slides)
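The two stops are plain minimax values, with play ending as soon as a number is reached. A small sketch, assuming a position is either a number or a pair (list of Left options, list of Right options); the function names are illustrative.

```python
def left_stop(g):
    """Minimax value when Left moves first; play stops at numbers."""
    if not isinstance(g, tuple):
        return g
    lefts, _ = g
    return max(right_stop(x) for x in lefts)


def right_stop(g):
    """Minimax value when Right moves first; play stops at numbers."""
    if not isinstance(g, tuple):
        return g
    _, rights = g
    return min(left_stop(x) for x in rights)


gote = ([4], [0])                                   # 4|0
one_sided_sente = ([([22], [4])], [0])              # 22|4||0
double_sente = ([([12], [3])], [([-1], [-11.5])])   # 12|3 || -1|-11.5

for g in (gote, one_sided_sente, double_sente):
    print(left_stop(g), right_stop(g))
```

All three positions give a difference of 4 between the two stops, which is why a purely local minimax search cannot tell them apart.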
Thermograph (TG) • Consists of left and right scaffold • May coincide in a mast • Leaf node: TG of numbers are masts • Constructed from TG of followers • Tax right scaffold of left follower by t • Tax left scaffold of right follower by -t • Compute max (min) over all left (right) followers • Cut off above intersection of left, right, add mast
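The construction can be sketched for a restricted case. The code below assumes that every position is either a number or has exactly one Left and one Right option, that every non-number position is hot or tepid (Left stop not below Right stop), and that numbers are treated as having temperature 0. These restrictions, and all function names, are assumptions of this sketch; it is not a general thermograph algorithm and handles neither multiple options nor sub-zero thermography.

```python
from fractions import Fraction


def thermograph(g):
    """Return (mean, temp, left_wall, right_wall, bends) for g.

    g is a number or a pair (left_option, right_option).
    The walls are functions of the tax t >= 0.
    """
    if not isinstance(g, tuple):
        x = Fraction(g)
        # Leaf node: the thermograph of a number is a mast.
        return x, Fraction(0), (lambda t: x), (lambda t: x), set()

    _, _, _, right_of_l, bends_l = thermograph(g[0])
    _, _, left_of_r, _, bends_r = thermograph(g[1])

    def lam(t):          # right wall of the Left follower, taxed by t
        return right_of_l(t) - t

    def rho(t):          # left wall of the Right follower, taxed by -t
        return left_of_r(t) + t

    def gap(t):          # non-increasing and linear between bend points
        return lam(t) - rho(t)

    pts = sorted(bends_l | bends_r | {Fraction(0)})
    if gap(pts[0]) < 0:
        raise ValueError("cold position: not handled in this sketch")
    if gap(pts[0]) == 0:
        temp = pts[0]
    else:
        for a, b in zip(pts, pts[1:]):
            if gap(b) <= 0:      # scaffolds meet inside [a, b]
                temp = a + gap(a) * (b - a) / (gap(a) - gap(b))
                break
        else:                    # beyond the last bend the gap has slope -2
            temp = pts[-1] + gap(pts[-1]) / 2
    mean = lam(temp)

    def left_wall(t):            # cut off above the intersection, add mast
        return lam(t) if t < temp else mean

    def right_wall(t):
        return rho(t) if t < temp else mean

    return mean, temp, left_wall, right_wall, set(pts) | {temp}


mean, temp, left, right, _ = thermograph((4, (-4, -10)))
print(mean, temp)                # -3/2 11/2
```

For the positions listed on the Mean and Temperature slides this reproduces the stated values, for example mean -4 and temperature 8 for 4|{-4|-20}. The walls evaluated at t = 0 give the Left and Right stops.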
Sente and Gote Thermographs • Three examples • Gote • One-sided sente • Double sente • All examples: leftScore - rightScore = 4. • Appear the same to a local minimax search • But they are very different!
Gote • Game: 4|0 • leftScore 4 • rightScore 0 • Mean: 2 • Temperature: 2
One-sided Sente • Game: 22|4||0 • leftScore 4 • rightScore 0 • Mean: 4 • Temperature: 4
Double Sente • Game: 12|3 || -1|-11.5 • leftScore 3 • rightScore -1 • Mean: 0.5 • Temperature: 7
Extensions (1) • Sub-zero thermography • Problem: hard to check when a game is a number • Extend TG to range [-1..0] • “colored ground” rule for zugzwang-like games • Can now construct TG from options in a uniform way • TG = makeTG(left-option-TGs,right-option-TGs)
Extensions (2) • TG for games including loops • Defined by Berlekamp’s “Economist’s View” paper • I did the first practical algorithm and implementation • Much more complex… • Caves, hills, bent masts, backward masts,…
Stable and Unstable Positions • Position H in game G is called stable if its temperature is lower than that of every ancestor • H is unstable if it has an ancestor with lower temperature • H is semistable if it is not unstable and has an ancestor of the same temperature
Subtree of Stable Followers • Root of a game tree is stable by definition • Find first stable node on each line of play • Go on recursively • This subtree of stable followers is a (very good) small summary of the whole game
Mainlines and Sidelines • Given G, play n copies of G optimally • Let n go to infinity • Some lines of play will be played more and more often • Mainlines • Other lines played only finitely often • Sidelines
Stable Followers in Mainlines • Stable mainline gote position: has two stable followers, one for each color • Stable mainline one-sided sente position: • Only stable follower of one color (sente) • In a “rich environment” (e.g. coupon stack), play follows mainlines.
Playing Sum Games • Choose one subgame • Choose move in that subgame • Brute force algorithm: • Compute sum • Find move retaining minimax value • Problem: computing sum is slow
Fast Approximate Methods • Goal: identify good move without computing sum • Two parameters: mean and temperature • Hottest games usually most urgent • Refinement: Thermostrat
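The "hottest game first" heuristic (commonly called Hotstrat) can be sketched for the simplest case. The code assumes every subgame is a simple switch {x|y} with numbers x ≥ y, whose temperature is (x - y)/2 and whose mean is (x + y)/2; the function names are illustrative, and Thermostrat is not implemented here.

```python
from fractions import Fraction


def switch_temperature(left, right):
    """Temperature of a simple switch {left | right} with left >= right."""
    return Fraction(left - right) / 2


def switch_mean(left, right):
    """Mean of a simple switch {left | right}."""
    return Fraction(left + right) / 2


def hottest(switches):
    """Pick the subgame to play in: the one with the highest temperature."""
    return max(switches, key=lambda name: switch_temperature(*switches[name]))


# The four switches from the Sums of Hot Games slide.
switches = {"a": (1, -1), "b": (2, -2), "c": (3, -3), "d": (4, -4)}
print(hottest(switches))                                   # d
print(sum(switch_mean(*s) for s in switches.values()))     # 0
```

No sum is ever built: the choice uses one temperature per subgame, and because means add, the mean of the whole position is the sum of the individual means.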