SARTRE: System Overview. A Case-Based Agent for Two-Player Texas Hold'em • Jonathan Rubin & Ian Watson • University of Auckland Game AI Group • http://www.cs.auckland.ac.nz/research/gameai/
Overview • Introduction • Texas Hold'em • Approaches to Computer Poker • Sartre: System Overview • Results • Future Work
Texas Hold'em • Two-player Limit Hold'em • Very different from the full-table game • Chance events • Hidden information
Approaches to Computer Poker • Near-Equilibrium Strategy • Exploitative Strategy
Near-Equilibrium Strategy • Nash Equilibrium • Assumes the opponent makes no mistakes • Attempts to minimise its losses against this perfect opponent • Near-Equilibrium • An approximation is used because the game tree is too large • Plays not to lose
Exploitative Strategy • Exploitative Strategy • Opponent Modelling • Attempts to punish weaknesses in the opponent's strategy • Plays off the equilibrium • Plays to win
Sartre: System Overview • Similarity Assessment Reasoning for Texas hold'em via Recall of Experience • Our entry for the 2009 Computer Poker Competition • Case-base was constructed from past CPC games
Sartre: System Overview • Hand-picked by the authors • Case Features • Previous betting for the hand • Hand Category • Board Category
1. Previous betting for the hand • Currently represented as a string • f = fold • c = check/call • r = bet/raise • Examples • r • rrc-r • rc-crrc-rc-cr
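The betting-string encoding above can be unpacked with a few lines of Python. This is a minimal sketch, not the authors' code: the function name `parse_betting` is hypothetical, and it assumes that the hyphen in the examples separates betting rounds (preflop, flop, turn, river), which the slide does not state explicitly.

```python
from typing import List

# Letter codes taken from the slide.
ACTIONS = {"f": "fold", "c": "check/call", "r": "bet/raise"}


def parse_betting(betting: str) -> List[List[str]]:
    """Split a betting string into rounds and expand each action letter.

    Assumption: '-' separates betting rounds.
    """
    rounds = []
    for round_str in betting.split("-"):
        actions = []
        for letter in round_str:
            if letter not in ACTIONS:
                raise ValueError(f"unknown action letter: {letter!r}")
            actions.append(ACTIONS[letter])
        rounds.append(actions)
    return rounds


if __name__ == "__main__":
    print(parse_betting("rrc-r"))
```

Under that assumption, the third example, `rc-crrc-rc-cr`, parses into four rounds.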
2. Hand Category • Rule-based System
2. Hand Category • Two components • Hand Category • Hand Potential • Examples • Missed • One-Pair, Two-Pair, Three-of-a-kind • Flush-draw, Straight-draw
3. Board Category • Captures information about potential • Flush draws, or • Straight draws • Information that is likely to be noticed by a good player
3. Board Category • Flush Highly Possible
3. Board Category • Straight Possible
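A rule-based board category could be sketched as follows. Only the label "Flush Highly Possible" comes from the slides. The other labels, the thresholds (four suited board cards for "highly possible", three for "possible"), the card notation, and the function name `board_flush_category` are all assumptions made for illustration.

```python
from collections import Counter
from typing import List


def board_flush_category(board: List[str]) -> str:
    """Classify the flush potential of the community cards.

    Cards are written rank then suit, e.g. 'Ah' or 'Td' (assumed notation).
    Thresholds are illustrative assumptions, not the authors' rules.
    """
    if not board:
        return "No Flush Possible"
    # Count how many board cards share the most common suit.
    most_common_suit_count = max(Counter(card[-1] for card in board).values())
    if most_common_suit_count >= 4:
        return "Flush Highly Possible"
    if most_common_suit_count == 3:
        return "Flush Possible"
    return "No Flush Possible"
```

A straight category would follow the same pattern, using rank gaps instead of suit counts.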
Similarity • Currently either all or nothing • If two collections of cards map to the same category, they are assigned a similarity of 1.0; otherwise 0.
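The all-or-nothing rule reduces to an equality test on category labels. The sketch below illustrates it with a deliberately crude stand-in categoriser. `toy_category` is hypothetical and far simpler than the rule-based system the slides describe.

```python
from typing import Callable, List


def all_or_nothing(cards_a: List[str], cards_b: List[str],
                   categorise: Callable[[List[str]], str]) -> float:
    """Return 1.0 if both card collections map to the same category, else 0.0."""
    return 1.0 if categorise(cards_a) == categorise(cards_b) else 0.0


def toy_category(cards: List[str]) -> str:
    """Toy stand-in: any repeated rank counts as 'One-Pair', otherwise 'Missed'."""
    ranks = [card[0] for card in cards]
    return "One-Pair" if len(set(ranks)) < len(ranks) else "Missed"


if __name__ == "__main__":
    print(all_or_nothing(["Ah", "Ad", "7c"], ["Ks", "Kc", "2d"], toy_category))
    print(all_or_nothing(["Ah", "Ad", "7c"], ["Ah", "Kd", "7c"], toy_category))
```

A pair of aces and a pair of kings are treated as identical here, while a near-miss scores zero. That is the coarseness of an all-or-nothing measure.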
Case Overview • Case Features • 1. Previous betting for the hand • 2. Hand Category • 3. Board Category • Solution • f, c, r • Outcome • +/- value • + Profit • - Loss
Case Overview • Solution + Outcome • Recorded from equilibrium approaching bots from previous AAAI Computer Poker Competition • Separate case-bases for preflop, flop, turn & river • Approx. 250,000 cases in each case-base.
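One way to picture a case and its retrieval is the sketch below. The field names, the `Case` class, and the `retrieve` helper are assumptions for illustration. Exact matching on the betting string is also an assumption, since the slides only describe all-or-nothing similarity for the card categories.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Case:
    betting: str         # e.g. "rrc-r"
    hand_category: str   # e.g. "One-Pair"
    board_category: str  # e.g. "Flush Highly Possible"
    solution: str        # "f", "c" or "r"
    outcome: float       # positive = profit, negative = loss


# Separate case-bases for each betting round, as on the slide.
case_bases: Dict[str, List[Case]] = {
    "preflop": [], "flop": [], "turn": [], "river": [],
}


def retrieve(case_base: List[Case], betting: str,
             hand_category: str, board_category: str) -> List[Case]:
    """Return every stored case whose three features all match the query."""
    query = (betting, hand_category, board_category)
    return [c for c in case_base
            if (c.betting, c.hand_category, c.board_category) == query]
```

Because several stored cases can match one query, the retrieved solutions may disagree, and a decision policy is needed to choose among them.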
Decision Making • Retrieved cases can have different decisions • Three different versions • 1. Probability Triple • 2. Majority rules • 3. Outcome-based
Decision Making • Probability Triple • Proportion of times that the solution indicated to fold, call or raise • (f, c, r) • Majority Rules • Decision made the most is reused • Outcome-Based • Dependent on adjusted average outcome values for each decision • If a call or raise decision was never made, its outcome is unknown and is given a value of +infinity
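The three policies can be sketched in a few functions. Several details here are assumptions, because the slides do not specify them: the "adjusted" average is replaced by a plain mean, ties are broken in (f, c, r) order, and a fold that was never observed is simply left out of the outcome-based comparison. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List, Optional, Tuple

ACTIONS = ("f", "c", "r")  # fold, check/call, bet/raise


def probability_triple(solutions: List[str]) -> Tuple[float, ...]:
    """Proportion of retrieved cases that folded, called and raised."""
    if not solutions:
        raise ValueError("no retrieved cases")
    counts = Counter(solutions)
    return tuple(counts[a] / len(solutions) for a in ACTIONS)


def sample_action(triple: Tuple[float, ...],
                  rng: Optional[random.Random] = None) -> str:
    """Pick an action at random, weighted by the probability triple."""
    rng = rng or random.Random()
    return rng.choices(ACTIONS, weights=triple)[0]


def majority_rules(solutions: List[str]) -> str:
    """Reuse the most frequent decision (ties broken in f, c, r order)."""
    if not solutions:
        raise ValueError("no retrieved cases")
    counts = Counter(solutions)
    return max(ACTIONS, key=lambda a: counts[a])


def outcome_based(cases: List[Tuple[str, float]]) -> str:
    """Pick the decision with the best average outcome.

    cases holds (solution, outcome) pairs. An unseen call or raise is
    valued at +infinity, as on the slide; a plain mean stands in for the
    unspecified 'adjusted' average.
    """
    outcomes = {a: [] for a in ACTIONS}
    for solution, outcome in cases:
        outcomes[solution].append(outcome)
    values = {}
    for a in ACTIONS:
        if outcomes[a]:
            values[a] = sum(outcomes[a]) / len(outcomes[a])
        elif a != "f":
            values[a] = math.inf  # unknown call/raise outcome
    return max(values, key=values.get)
```

The +infinity rule means the outcome-based policy always tries a call or raise that has no recorded outcome before trusting the averages of the decisions it has seen.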
Duplicate Matches • Experimental results derived using duplicate matches • Play N poker hands • Reset each player's memory • Reverse the position of each player and deal the same N hands • Forward + Reverse Directions • Reduces variance
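The duplicate-match procedure can be sketched as below. The callables `play_hand` and `reset_players` are hypothetical placeholders for a real poker engine: `play_hand(deal_seed, seat)` is assumed to deal the cards determined by `deal_seed` and return player A's winnings when A sits in the given seat.

```python
import random
from typing import Callable


def duplicate_match(play_hand: Callable[[int, int], float],
                    reset_players: Callable[[], None],
                    n_hands: int, seed: int = 0) -> float:
    """Average winnings per hand for player A over a duplicate match."""
    rng = random.Random(seed)
    # Fix the N deals up front so both directions see the same cards.
    deal_seeds = [rng.randrange(2**32) for _ in range(n_hands)]

    forward = sum(play_hand(s, 0) for s in deal_seeds)  # A in seat 0
    reset_players()                                     # wipe each player's memory
    reverse = sum(play_hand(s, 1) for s in deal_seeds)  # seats swapped, same deals

    return (forward + reverse) / (2 * n_hands)


def pure_luck(deal_seed: int, seat: int) -> float:
    """Toy game: seat 0 wins on even deals, seat 1 on odd deals."""
    return 1.0 if (deal_seed % 2 == 0) == (seat == 0) else -1.0


if __name__ == "__main__":
    print(duplicate_match(pure_luck, lambda: None, n_hands=10))
```

In the toy game the result depends only on the cards, so each player's luck in the forward direction is cancelled in the reverse direction and the average is exactly zero. This cancellation is the variance reduction that the slide refers to.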
Self-Play Experiments • Small bets per hand (sb/h) • Assuming a $10/$20 game • Sartre-Probability Vs. Sartre-Outcome • Sartre-Probability wins 0.168 sb/h • On average $1.68 profit per hand • Sartre-Probability Vs. Sartre-Majority • Sartre-Majority wins 0.039 sb/h • On average $0.39 per hand
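The dollar figures follow from the small bet being $10 in a $10/$20 limit game. A small helper shows the arithmetic; the function name is hypothetical.

```python
def sb_per_hand_to_dollars(sb_per_hand: float, small_bet: float = 10.0) -> float:
    """Convert a win rate in small bets per hand into dollars per hand."""
    return round(sb_per_hand * small_bet, 2)


if __name__ == "__main__":
    print(sb_per_hand_to_dollars(0.168))  # Sartre-Probability vs. Sartre-Outcome
    print(sb_per_hand_to_dollars(0.039))  # Sartre-Majority vs. Sartre-Probability
```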
Self-Play Experiments • Chose Sartre – Majority Rules. • Results not transitive • Makes Sartre more predictable and hence more exploitable by strong opposition
2009 Computer Poker Competition Results • Duplicate match structure • 3000 hands in forward & reverse direction • Multiple matches against each opponent until statistical significance obtained • Sartre placed 7th out of 13 entrants in limit competition
2009 Computer Poker Competition Results • Overall profit of +0.097 sb/h • Assuming a $10/$20 game • $0.97 per hand profit
Future Work • Investigate loosening of all-or-nothing similarity • CBR and adaptive poker agents • Opponent modelling • Learning • Better solution adaptation • Combination of decision + outcome