Can ADP Bring Down The House? • 6.231 Project Presentation • December 14, 2012 • By Cristian Figueroa
Outline • Introduction • Blackjack • Analysis • Model • S-ALP • Results • Conclusion
Project Objective • Analyze the effect of using ADP to obtain a “winning” policy for the popular casino game Blackjack.
Blackjack Rules • The dealer deals 2 cards to each player. • The player then chooses to: • Hit: Receive an additional card. • Stand: Finish the turn. • Double Down: Double the bet, receive one more card, then stand. • Surrender: Give up and receive 50% of the bet back. • The dealer draws cards as necessary until her score reaches 17 or more. • Whoever has the score closest to 21, without going over 21, wins.
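The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the presentation's model: the infinite deck, the dealer standing on soft 17, ties counting as a push, and all function names (`hand_value`, `draw`, `dealer_play`, `payoff`) are assumptions made here.

```python
import random

def hand_value(cards):
    """Return (total, is_soft) for a list of card values, where an ace is 1."""
    total = sum(cards)
    # Count one ace as 11 when that does not bust the hand.
    if 1 in cards and total + 10 <= 21:
        return total + 10, True
    return total, False

def draw(rng):
    """Draw from an infinite deck (assumption): ace=1, 2-9, four ten-valued ranks."""
    return min(rng.randint(1, 13), 10)

def dealer_play(cards, rng):
    """Dealer draws until her total is 17 or more (assumed to stand on soft 17)."""
    cards = list(cards)
    while hand_value(cards)[0] < 17:
        cards.append(draw(rng))
    return hand_value(cards)[0]

def payoff(player_total, dealer_total, bet=1.0):
    """Net payoff to the player after standing; a tie is a push (assumption)."""
    if player_total > 21:
        return -bet  # player busted, loses regardless of the dealer
    if dealer_total > 21 or player_total > dealer_total:
        return bet
    if player_total == dealer_total:
        return 0.0
    return -bet
```

For example, `hand_value([1, 6])` gives a soft 17, while `hand_value([1, 6, 10])` gives a hard 17 because the ace can no longer count as 11.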
Blackjack Flowchart • [Figure: flowchart of one hand. Labels: Player Bets, Deal, Player (Hit, Stand, Double Down, Surrender), Player Busted, Dealer Turn, Dealer (Hit, Stand > 16), Payoff]
Modeling Challenges • Main challenges when modeling this game: • Decisions depend on previous actions. • How much to bet is decided before seeing the cards. • The number of plays that a game lasts is a random variable. • Consequences: • It is hard to define and obtain a stationary policy. • Discounted schemes may have unexpected outcomes.
Blackjack Model • [Figure: state-transition diagram. Labels: Deal, Player: First Action (H, S, DD, Su), Post Game (if applies) (Hit, Stand, Busted), Dealer Plays (Hit, Stand > 16), Payoff, Next bet]
Blackjack model • States: • Controls • Costs
Smoothed Approximate Linear Program • To solve the Bellman equation: • The Approximate Linear Program suggests solving: • The Smoothed Approximate Linear Program:
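In the standard notation of the ALP literature, the three problems named on this slide can be written as below. The symbols are assumptions and may differ from the slide's own: $T$ is the Bellman operator for a discounted cost-minimization problem with discount factor $\alpha$, $\Phi$ is the matrix of basis functions, $r$ the weight vector, $c$ a vector of state-relevance weights, $\pi$ a distribution over states, $s$ the slack variables, and $\theta \ge 0$ the violation budget.

```latex
% Bellman equation (discounted cost, assumed form)
J^*(x) = (TJ^*)(x) = \min_{u} \; \mathbb{E}\big[\, g(x,u,w) + \alpha J^*\big(f(x,u,w)\big) \big]

% Approximate Linear Program (ALP)
\max_{r} \; c^\top \Phi r
\quad \text{s.t.} \quad \Phi r \le T \Phi r

% Smoothed Approximate Linear Program (S-ALP)
\max_{r,\, s} \; c^\top \Phi r
\quad \text{s.t.} \quad \Phi r \le T \Phi r + s, \qquad \pi^\top s \le \theta, \qquad s \ge 0
```

Setting $\theta = 0$ forces $s = 0$ wherever $\pi$ is positive, which recovers the ALP constraints on those states; a larger $\theta$ allows the Bellman inequality to be violated, on average, by up to that budget.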
Implementation • Steps: Generate States Through Simulation, Sample States, Create Constraints, Solve S-ALP, Simulate Policy, Update q • Running times shown on the slide: <10 seconds, Hours, Hours, Milliseconds, <10 seconds
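The first two steps, generating states through simulation and sampling from them, might look roughly like the sketch below. It is only an illustration under stated assumptions: the state is simplified to (player total, dealer upcard), states are visited under a random hit/stand policy, the deck is infinite, and the names `generate_states` and `sample_states` are hypothetical rather than taken from the project.

```python
import random
from collections import Counter

def draw(rng):
    """Infinite-deck draw (assumption): ace=1, face cards count as 10."""
    return min(rng.randint(1, 13), 10)

def total(cards):
    """Best hand total, counting one ace as 11 when it does not bust."""
    t = sum(cards)
    return t + 10 if 1 in cards and t + 10 <= 21 else t

def generate_states(n_hands, rng):
    """Play hands under a random hit/stand policy and count visited states."""
    visits = Counter()
    for _ in range(n_hands):
        player = [draw(rng), draw(rng)]
        upcard = draw(rng)
        while total(player) <= 21:
            visits[(total(player), upcard)] += 1
            if rng.random() < 0.5:  # stand with probability 1/2
                break
            player.append(draw(rng))
    return visits

def sample_states(visits, k, rng):
    """Sample k states with probability proportional to their visit counts."""
    states = list(visits)
    weights = [visits[s] for s in states]
    return rng.choices(states, weights=weights, k=k)

rng = random.Random(0)
visits = generate_states(5000, rng)
sampled = sample_states(visits, 200, rng)
```

Each sampled state would then contribute one Bellman-inequality constraint per available control to the linear program, which keeps the program far smaller than one with a constraint for every state.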
100 Games of 1000 Hands • S-ALP Policy: 11.5 • Wikipedia Strategy: -11.355
Conclusions • High variability in the outcomes requires a large number of simulations to get reliable results. • Even though the model works correctly, in games where stages are semi-independent, one-step lookahead or rollout would probably work better. • Checking your policy against common sense helps reveal flaws in the model. • The choice of basis functions is just as important as obtaining accurate values.
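The first conclusion can be illustrated by estimating the standard error of the mean game outcome. The sketch below is not the project's simulator: it assumes an infinite deck, unit bets, a fixed "hit below 17" player policy, pushes on ties, and hypothetical function names.

```python
import random
import statistics

def draw(rng):
    """Infinite-deck draw (assumption): ace=1, face cards count as 10."""
    return min(rng.randint(1, 13), 10)

def total(cards):
    """Best hand total, counting one ace as 11 when it does not bust."""
    t = sum(cards)
    return t + 10 if 1 in cards and t + 10 <= 21 else t

def play_hand(rng, stand_at=17):
    """Play one unit-bet hand with a fixed threshold policy; return net payoff."""
    player = [draw(rng), draw(rng)]
    dealer = [draw(rng), draw(rng)]
    while total(player) < stand_at:
        player.append(draw(rng))
    if total(player) > 21:
        return -1.0
    while total(dealer) < 17:
        dealer.append(draw(rng))
    p, d = total(player), total(dealer)
    if d > 21 or p > d:
        return 1.0
    return 0.0 if p == d else -1.0

def simulate_games(n_games, n_hands, seed=0):
    """Return the total payoff of each game of n_hands hands."""
    rng = random.Random(seed)
    return [sum(play_hand(rng) for _ in range(n_hands)) for _ in range(n_games)]

def mean_and_stderr(results):
    """Sample mean and standard error of the mean across games."""
    mean = statistics.fmean(results)
    stderr = statistics.stdev(results) / len(results) ** 0.5
    return mean, stderr

results = simulate_games(100, 1000)
mean, stderr = mean_and_stderr(results)
```

With payoffs of at most one unit per hand, the total over 1000 independent hands has a standard deviation of at most sqrt(1000) ≈ 31.6 units, so averages over only a few games are noisy and differences between policies should be judged against the standard error.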