This research paper explores the use of DNA as a medium for learning and evolving probabilistic poker strategies. The study aims to obtain strategies for each player in the game by utilizing DNA sequences and genetic algorithms.
DNA Starts to Learn Poker
David Harlan Wood (4*), Hong Bi (1), Steven O. Kimbrough (2), Dongjun Wu (3), Junghuei Chen (1*)
Departments of (1) Chemistry & Biochemistry and (4) Computer & Information Sciences, University of Delaware
(2) The Wharton School, University of Pennsylvania
(3) Bennett S. LeBow College of Business, Drexel University
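[Figure: game tree for a Player dealt an Ace. The Player says Ace (adds $1). The Dealer then either folds (loses $1) or calls (adds $1; loses $2).]
[Figure: game tree for a Player dealt a 2. The Player either says 2 (loses $1) or says Ace (adds $1). If the Player says Ace, the Dealer either folds (loses $1) or calls (adds $1; wins $2).]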
[Figure: the complete game tree, combining the branches for a dealt Ace and a dealt 2. Leaf labels: Loses $1, Loses $1, Loses $1, Loses $2, Wins $2.]
OBJECTIVE: To Obtain Probabilistic Strategies
Each player wants to obtain a strategy for the game. A strategy prescribes an action in every possible situation. That is, at each node, it prescribes whether to raise as a function of the hand dealt.
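The objective can be made concrete with a small sketch of how two probabilistic strategies are scored against each other. It rests on a reading of the game tree that the slides do not spell out: a Player holding an Ace always says Ace, every payoff is counted from the Dealer's side, the Player's $1 loss after saying 2 goes to the Dealer, and both cards are equally likely. The names dealer_payoff, expected_dealer_payoff, p_bluff, p_call, and p_ace are hypothetical.

```python
def dealer_payoff(card, player_says_ace, dealer_calls):
    """Dealer's payoff in dollars for one fully determined play (assumed rules)."""
    if not player_says_ace:
        return 1   # Player says 2 and loses $1 (assumed to go to the Dealer)
    if not dealer_calls:
        return -1  # Dealer folds and loses $1
    # Dealer calls: loses $2 against an Ace, wins $2 against a 2.
    return -2 if card == "A" else 2


def expected_dealer_payoff(p_bluff, p_call, p_ace=0.5):
    """Average Dealer payoff when both sides play probabilistic strategies.

    p_bluff: probability that a Player holding a 2 says Ace.
    p_call:  probability that the Dealer calls after the Player says Ace.
    p_ace:   probability that the deal is an Ace (assumed 0.5).
    """
    total = 0.0
    for card, p_card in (("A", p_ace), ("2", 1.0 - p_ace)):
        # Assumption: with an Ace the Player always says Ace.
        p_say_ace = 1.0 if card == "A" else p_bluff
        for says_ace, p_say in ((True, p_say_ace), (False, 1.0 - p_say_ace)):
            for calls, p_c in ((True, p_call), (False, 1.0 - p_call)):
                total += p_card * p_say * p_c * dealer_payoff(card, says_ace, calls)
    return total


if __name__ == "__main__":
    for p_bluff, p_call in ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)):
        print(p_bluff, p_call, expected_dealer_payoff(p_bluff, p_call))
```

Each strategy here is a single probability because, under the assumed rules, each side has only one real decision. A richer tree would need one probability table per node.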
Learning Poker
[Figure: flowchart of the learning cycle. Stages shown: Deals; Assemble; Play New Game; Separate by Payoffs; Recover & Cut Play Histories for Player’s & Dealer’s Strategies; Recover & Distribute Strategies. Dealer’s Adaptation: Programmable Selection of Recovered Dealer Strategies; Amplify, Mutate, Crossover; New Dealer Strategies. Player’s Adaptation: Programmable Selection of Recovered Player Strategies; Amplify, Mutate, Crossover; New Player Strategies.]
[Figure: the game tree shown alongside the DNA encodings of the strategies. Player’s Strategies carry regions labeled Say 2’, 2’, Say A’, SAY2’, A’, Error, Fold’, Stopper, and R.E. 1. Dealer’s Strategies carry regions labeled Fold’, Say A’, Call’, FOLD’, Stopper, and R.E. 2. Deal strands are labeled Dealt 2 and Dealt A, each flanked by restriction sites R.E. 1 and R.E. 2.]
Sequences from: Sakamoto et al., DNA4 (1997)
Two Strategies and a Deal Define a Game
[Figure: three separate strands. Deal: Ace (Dealt A), with sites R.E. 1 and R.E. 2. Player’s Strategy: Say 2’, 2’, Say A’, SAY2’, Say A’, A’, Error, Fold’, ending in R.E. 1. Dealer’s Strategy: Fold’, Say A’, Call’, FOLD’, with R.E. 2.]
Cut with R.E.1 & R.E.2 and Assemble a Game
[Figure: the Deal, Player’s Strategy, and Dealer’s Strategy strands are cut at R.E. 1 and R.E. 2 and joined into a single game strand carrying the regions A, Say 2’, 2’, Say A’, SAY 2’, Say A’, A’, Error, Fold’, Fold’, Say A’, Call’, FOLD’.]
[Figure: gel of the assembled game strand (Deal, Player’s Strategy, Dealer’s Strategy). Component strands: S1 (74-mer), S2 (57-mer), S3 (48-mer), S4 (53-mer), L1 (25-mer), L2 (28-mer), L3 (28-mer). Lanes: S1, S2, S3, S4, R1, R2, M. R1: Ligation Reaction. R2: Purified Ligation Product. Labeled sizes: 232, 225, 200, 150, 100, 75, 50.]
Player Dealt an Ace
[Figure: game tree branch for a dealt Ace. Player says A. Dealer folds. Dealer MIGHT change to call.]
[Figure: the same play on the assembled game strand (Deal, Player’s Strategy, Dealer’s Strategy). Player says Ace: Extend (Say A) on the Player’s Strategy. Dealer folds: Extend (Fold) on the Dealer’s Strategy. Dealer MIGHT change to call: Extend (Call), with a Fold Preventer, on the Dealer’s Strategy.]
[Figure: gel of the extension products for the steps Player says Ace, Dealer folds, and Dealer MIGHT change to call. Labeled products: 232-mer, 247-mer, 262-mer, 282-mer. Marker sizes: 200, 225, 250, 275, 300.]
Player Dealt a 2
[Figure: game tree branch for a dealt 2. Player says 2. Player changes to Say A (Block Say 2). Dealer folds. Dealer changes to call.]
[Figure: the same play on the assembled game strand (Deal, Player’s Strategy, Dealer’s Strategy). Player says 2: Extend (Say 2) on the Player’s Strategy. Player MIGHT change to Say Ace: Extend (Say A) on the Player’s Strategy. Dealer folds: Extend (Fold) on the Dealer’s Strategy. Dealer MIGHT change to call: Extend (Call), with a Fold Preventer, on the Dealer’s Strategy.]
[Figure: the complete game tree annotated with both plays. Player dealt an Ace: Player says A; Dealer folds; Dealer MIGHT change to call. Player dealt a 2: Player says 2; Player MIGHT change to Say Ace; Dealer folds; Dealer MIGHT change to call.]
Dealer’s Adaptation
[Figure: the Dealer’s half of the learning flowchart, from Separate by Payoffs through Programmable Selection of Recovered Dealer Strategies to Amplify, Mutate, Crossover.]
• Strategies are returned grouped by outcomes: -$2, -$1, +$1, +$2.
• Select the Dealer’s own preferred mix of strategies to be bred.
Breed by using PCR to restore the population size, using a variable mutation rate.
Crossover by pairwise recombination of “change your mind” regions.
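The adaptation step above can be mimicked in software. The following is a minimal sketch, not the laboratory protocol: a strategy is modeled as a tuple of booleans, one per “change your mind” region; PCR amplification is replaced by copying until the population size is restored; the preferred mix is a dictionary of weights per payoff group. All function and parameter names are hypothetical.

```python
import random

PAYOFF_GROUPS = (-2, -1, 1, 2)  # outcome groups, in dollars


def select(recovered, preferred_mix, n_parents, rng):
    """Draw parents from the payoff groups according to the preferred mix."""
    groups = [g for g in PAYOFF_GROUPS
              if recovered.get(g) and preferred_mix.get(g, 0) > 0]
    if not groups:
        raise ValueError("no recovered strategies match the preferred mix")
    weights = [preferred_mix[g] for g in groups]
    parents = []
    for _ in range(n_parents):
        group = rng.choices(groups, weights=weights)[0]
        parents.append(rng.choice(recovered[group]))
    return parents


def mutate(strategy, rate, rng):
    """Flip each region independently with probability `rate`."""
    return tuple((not bit) if rng.random() < rate else bit for bit in strategy)


def crossover(a, b, rng):
    """Pairwise recombination: each region is taken from one of two parents."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def breed(recovered, preferred_mix, population_size, mutation_rate, rng=None):
    """Select, recombine, and mutate until the population size is restored."""
    rng = rng or random.Random()
    parents = select(recovered, preferred_mix, max(2, population_size // 2), rng)
    children = []
    while len(children) < population_size:
        a, b = rng.choice(parents), rng.choice(parents)
        children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
    return children


if __name__ == "__main__":
    recovered = {2: [(True, False)], -1: [(False, False)], 1: [(True, True)]}
    mix = {2: 0.7, 1: 0.3}  # illustrative weights only
    print(breed(recovered, mix, population_size=8, mutation_rate=0.05,
                rng=random.Random(0)))
```

Keeping the mix as an explicit parameter mirrors the “programmable selection” in the flowchart: the breeder, not the payoff alone, decides which outcome groups contribute to the next generation.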
Complexity
Our complexity is linear in the number of nodes in the tree:
# nodes in tree = 2^(players + betting rounds)
At each node, we need a probability distribution giving “level of bet” as a function of “dealt hand”. For us, the probability distribution is replaced by probabilistic hybridization of the DNA-encoded “dealt hand” to an adapting “change your mind about folding” region of the strategy. The output (if generated) is an adapting “level of bet” region of the strategy.
[Figure: a strategy node drawn as a hand evaluator followed by a bet generator, with regions labeled hand’, bet’, and next’, and an Extend step that produces bet and next.]
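In software, the per-node distribution can be sketched as a lookup table from dealt hand to a distribution over bet levels. This is only an in-silico stand-in: a pseudorandom draw takes the place of probabilistic hybridization, and the table name NODE, the bet-level labels, and the probabilities are illustrative assumptions rather than values from the experiment.

```python
import random
from collections import Counter

# One table per tree node: dealt hand -> distribution over bet levels.
# The numbers are illustrative only.
NODE = {
    "A": {"raise": 1.0},
    "2": {"raise": 0.3, "no_raise": 0.7},
}


def act(node, hand, rng):
    """Sample a bet level for `hand` from the node's distribution."""
    levels = list(node[hand])
    weights = [node[hand][level] for level in levels]
    return rng.choices(levels, weights=weights)[0]


def empirical(node, hand, n, rng):
    """Observed frequencies of each bet level over n independent draws."""
    counts = Counter(act(node, hand, rng) for _ in range(n))
    return {level: counts[level] / n for level in node[hand]}


if __name__ == "__main__":
    rng = random.Random(1)
    print(empirical(NODE, "2", 1000, rng))
```

A full strategy would hold one such table per node, so its size grows in proportion to the number of nodes, which matches the linear complexity stated above.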
Comparison
Koller and Pfeffer derive equilibrium mixed strategies with complexity polynomial in
# nodes * # possible deals * 2^(betting levels)
• Two-player games only
• Don’t exploit weakness of opponent
• No dynamics, only equilibrium
“Representations and Solutions for Game-Theoretic Problems,” Artificial Intelligence (1997)
3-Player Poker
[Figure: game tree for three players P1, P2, and P3. Node labels: Pass, Bet $a, C: Call (add $b), F: Fold. A second panel, All Possible Deals, lists the cards dealt to Player 1, Player 2, and Player 3 against the Course of Play.]