Falcon on a Cloudy Day A Ro Sham Bo Algorithm by Andrew Post
Let's Review • If you missed my previous presentation: • Ro Sham Bo = Rock Paper Scissors • Can be more complicated though • Ro Sham Bo has important applications • Algorithms compete at Ro Sham Bo in tournaments • Iocaine Powder is the world champ of Ro Sham Bo • Because it uses ‘Sicilian Reasoning’ • I will beat Iocaine Powder • Eventually…
What is Ro Sham Bo? • Also known as Rock Paper Scissors
What is Ro Sham Bo? • Generalized case of Rock Paper Scissors actually • Not always three choices • Ties can be resolved differently • The game is not necessarily zero-sum
Why does it matter? • Many competitive scenarios involve a Ro Sham Bo game • Example: • CBS and NBC choosing primetime TV shows • They can choose to show a Drama, Comedy, or Sports show • Given the choice, viewers prefer Comedy to Drama, Sports to Comedy, and Drama to Sports. • Neither station knows ahead of time what the other will choose • Billions of dollars every day rely on decisions like these.
How it works • Simplest Non-Cooperative Game • Players cannot play to ensure they both win • Governed by the Nash Equilibrium • There are strategies which cannot be dominated • http://www.youtube.com/watch?v=pdrBDfRvpBA (1:31 -- 2:20)
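A minimal sketch of why the uniformly random strategy cannot be exploited: its expected payoff is zero against any opponent distribution, while a fixed move can do better or worse. The move encoding (0 = rock, 1 = paper, 2 = scissors) and the function names are assumptions made for illustration, not taken from the slides.

```python
from fractions import Fraction

# Assumed encoding: 0 = rock, 1 = paper, 2 = scissors,
# and move (x + 1) % 3 beats move x.

def payoff(mine, theirs):
    """+1 if `mine` beats `theirs`, -1 if it loses, 0 on a tie."""
    if mine == theirs:
        return 0
    return 1 if (mine - theirs) % 3 == 1 else -1

def expected_payoff(my_dist, their_dist):
    """Expected payoff per round when both players draw moves independently."""
    return sum(p * q * payoff(i, j)
               for i, p in enumerate(my_dist)
               for j, q in enumerate(their_dist))

uniform = [Fraction(1, 3)] * 3
rock_heavy = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # hypothetical opponent

print(expected_payoff(uniform, rock_heavy))    # random play: break even
print(expected_payoff([0, 1, 0], rock_heavy))  # always paper: exploits the bias
```

Random play guarantees the break-even result, but only a biased reply such as always playing paper turns the opponent's sub-optimal distribution into a profit.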
How to Win • As you just heard, playing randomly can ensure you don’t lose, but how do you win? • How to predict your opponent • Sub-Optimal Frequency Distributions • Pattern Matching • History Analysis
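One way history analysis can be sketched: look for the longest recent run of opponent moves that also appeared earlier, and guess that the opponent will repeat whatever followed it last time. This is an illustrative simplification, not the predictor used by any tournament entry; the function name, the `max_len` cap, and the 0/1/2 move encoding are assumptions.

```python
def predict_by_history(opp_history, max_len=5):
    """Guess the opponent's next move from repeated patterns.

    Tries the longest suffix first (up to max_len moves). If that suffix
    occurred earlier, returns the move that followed its most recent
    earlier occurrence. Returns None when nothing matches.
    """
    n = len(opp_history)
    for length in range(min(max_len, n - 1), 0, -1):
        suffix = opp_history[n - length:]
        # Scan earlier start positions, most recent first.
        for start in range(n - length - 1, -1, -1):
            if opp_history[start:start + length] == suffix:
                return opp_history[start + length]
    return None

# Assumed encoding: 0 = rock, 1 = paper, 2 = scissors.
history = [0, 1, 2, 0, 1, 2, 0, 1]
guess = predict_by_history(history)
counter = None if guess is None else (guess + 1) % 3  # the move that beats the guess
print(guess, counter)
```

Against an opponent who cycles rock, paper, scissors, the sketch predicts scissors next and counters with rock.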
Iocaine Powder • International Ro Sham Bo Programming Tournament Champion • Named for this famous scene: http://youtube.com/watch?v=TUee1WvtQZU (0:57 -- 2:20)
The Tournament • Tournament programs play thousands of rounds • Win by beating the most opponents by a large margin • Most programs play sub-optimally, so exploiting your opponent is more important than playing randomly to avoid losing.
Iocaine Powder • IP is the algorithm which does this best. • IP uses the same heuristics as its competitors to predict what an opponent is most likely to do. • Using the same tools, how can you be better? Sicilian Reasoning!
Sicilian Reasoning • Levels of second guessing: 1. Opponent will play rock, so play paper 2. Opponent knows you will counter rock with paper, and will play scissors – so play rock 3. Opponent knows all this, and will now play paper to beat your rock – so play scissors • Opponent will play rock again – same as 1
Sicilian Reasoning • Use your predictive strategies to evaluate what is going to happen next. • Run SR on yourself and your opponent, and keep a table of what each of the six levels of reasoning says you should do. • Pick the level of reasoning which would have won most often against what your opponent actually did.
Wait, six? Don’t you mean three? • You can use the same predictive tools that your opponent uses to ‘predict’ what you are going to do. • Now you have three more levels of SR: 4. I will play rock. So he plays paper. So play scissors. 5. He knows I will counter with scissors, and will play rock. So play paper. 6. He expects me to counter-counter with paper, and will play scissors. So play rock.
More Sicilian Reasoning • Just because one level of SR is winning now, doesn’t mean it always will be. • Opponents will change how they play if they are losing, so you must change too! • How do you switch your level of SR?
Switching Reasoning • SR-2 has just won the first 100 rounds • Opponent changes strategy • You lose 50 rounds before SR-4 has more than 100 theoretical wins. • You just wasted 50 rounds!
Switching Reasoning • Use several different methodologies for switches • Most wins in last 10, 25, 50, 100, 1000 rounds • Has won the most in similar situations • Causes the opponent to switch to a worse strategy
Switching Reasoning • Here is the real genius – now use the switching methodology which has helped you win the most rounds!
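A rough sketch of the two-layer idea, covering only the "most wins in the last N rounds" family of switching methods: each window size votes for a level, and the window whose votes have won the most rounds gets to decide. The class, its method names, and the tie-breaking rule are assumptions for illustration; the other switching methods on the slides are not modelled.

```python
WINDOWS = (10, 25, 50, 100, 1000)

def pick_level(win_flags, window):
    """Level with the most wins over the last `window` rounds."""
    totals = [sum(flags[-window:]) for flags in win_flags]
    return max(range(len(totals)), key=totals.__getitem__)

class MetaSwitcher:
    def __init__(self, n_levels=6, windows=WINDOWS):
        self.windows = windows
        self.win_flags = [[] for _ in range(n_levels)]  # per level: 0/1 per round
        self.method_wins = [0] * len(windows)           # per window: rounds it called right
        self.pending = None                             # each window's pick for this round

    def choose(self):
        """Return the level chosen by the best-performing window so far."""
        self.pending = [pick_level(self.win_flags, w) for w in self.windows]
        best = max(range(len(self.windows)), key=self.method_wins.__getitem__)
        return self.pending[best]

    def update(self, level_won):
        """level_won[i] is True if level i's move would have won this round."""
        if self.pending is not None:
            for method, level in enumerate(self.pending):
                self.method_wins[method] += int(level_won[level])
        for flags, won in zip(self.win_flags, level_won):
            flags.append(int(won))

# Hypothetical match: level index 1 wins 100 rounds, then the opponent
# changes strategy and level index 3 starts winning.
sw = MetaSwitcher()
for _ in range(100):
    sw.choose()
    sw.update([False, True, False, False, False, False])
for _ in range(20):
    sw.choose()
    sw.update([False, False, False, True, False, False])
print(sw.choose(), pick_level(sw.win_flags, 1000))
```

In this made-up run the short window reacts within a few rounds of the switch and earns the deciding vote, while an all-history count would still be recommending the old level.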
Falcon on a Cloudy Day • So you ask, how do you beat Iocaine Powder? • Improve the basic predictive heuristics • Extend Sicilian Reasoning
Improving Prediction • What I have implemented: • Improved Variable History Analysis • Look at just your history, your opponent’s, or both • Improved Frequency Analysis • EV[x] = Pr[x+2] - Pr[x+1]
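The expected-value formula can be read as follows, assuming moves are numbered 0 = rock, 1 = paper, 2 = scissors with indices taken mod 3 (an assumption; the slide does not state the encoding): move x beats x+2 and loses to x+1, so its expected value is the chance of a win minus the chance of a loss. A minimal sketch that estimates Pr from observed frequencies, with a hypothetical function name:

```python
from collections import Counter

def frequency_ev(opp_history):
    """EV[x] = Pr[x+2] - Pr[x+1], with Pr estimated from opponent move counts.

    Assumed encoding: 0 = rock, 1 = paper, 2 = scissors, indices mod 3.
    """
    n = len(opp_history)
    if n == 0:
        return [0.0, 0.0, 0.0]  # no information yet
    counts = Counter(opp_history)
    pr = [counts[m] / n for m in range(3)]
    return [pr[(x + 2) % 3] - pr[(x + 1) % 3] for x in range(3)]

ev = frequency_ev([0, 0, 1, 2])  # hypothetical opponent who favours rock
best_move = max(range(3), key=ev.__getitem__)
print(ev, best_move)
```

Against a rock-heavy history the highest expected value belongs to paper, as expected.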
Demonstration • Here is how my project does with what is implemented so far.
Improving Prediction • What I have not implemented yet: • Improved Pattern Matching • Markov Models with MegaHAL • Extended Sicilian Reasoning
More on MegaHAL • MegaHAL is a very simple "infinite-order" Markov model. • Stores frequency information about the moves the opponent has made in the past for all possible contexts • Using the ‘context’ of the last few moves, the “appropriate” response is then selected.
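To make the context idea concrete, here is a bounded-order simplification: it stores, for every context of up to `max_order` previous opponent moves, how often each move followed, and predicts from the longest context it has seen before. It is not MegaHAL's actual implementation (which is described above as "infinite-order"); the class name and the `max_order` cap are assumptions.

```python
from collections import Counter, defaultdict

class ContextModel:
    def __init__(self, max_order=3):
        self.max_order = max_order
        self.table = defaultdict(Counter)  # context tuple -> counts of the next move
        self.history = []

    def observe(self, move):
        """Record `move` under every context length from 0 up to max_order."""
        n = len(self.history)
        for k in range(min(self.max_order, n) + 1):
            context = tuple(self.history[n - k:])
            self.table[context][move] += 1
        self.history.append(move)

    def predict(self):
        """Most frequent follower of the longest known context, else None."""
        n = len(self.history)
        for k in range(min(self.max_order, n), -1, -1):
            context = tuple(self.history[n - k:])
            if context in self.table:
                return self.table[context].most_common(1)[0][0]
        return None

# Assumed encoding: 0 = rock, 1 = paper, 2 = scissors.
model = ContextModel()
for move in [0, 1, 2] * 4:
    model.observe(move)
print(model.predict())
```

The predicted move is the opponent's; the "appropriate" response would then be the move that beats it.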
Extended Sicilian Reasoning • Q: Isn’t Sicilian Reasoning complete at 6? • A: Yes, but there is information we are ignoring. • By compressing your strategy decisions into the idea of which of six strategies is best right now, you have no way to keep track of how changing your strategies has paid off best in the past.
Now for some Math • Hilbert Space • Game Trajectory and Game State • Projection Operators • Annotated History Analysis • Project Enigma