Game Theory 2013/03/28
Games With Perfect Information • Alternating moves, complete information, … => 2-player games • Use minimax, alpha-beta, … to find optimal moves
Game Theory • Applies when • Simultaneous moves • Partial information • Stochastic outcomes • Relates to • Auctions • Product development, pricing decisions • National defense
Game Theory Can Be Used In… • Agent design: • Analyze the agent’s decisions and compute the expected utility • Mechanism design: • Define the rules of the environment • Agents maximizing their own utility will maximize the collective good
Definition • Players • Make decisions • Actions • Players can choose • A payoff matrix • Gives the utility to each player
Two-finger Morra • Two players, O and E; O plays 1 or 2, E plays 1 or 2, simultaneously • Payoff matrix • What should E do? … O do? • No single action works…
Player Strategy • Strategy / policy • Should be “rational” • Pure strategy / deterministic action • E.g. O plays “two” • Mixed strategy • E.g. [0.3: one; 0.7: two] • Strategy profile: a strategy for each player • E.g. O[0.3: one; 0.7: two] E[0.9: one; 0.1: two]
Zero-sum Games • Player#1’s gain = Player#2’s loss • Not all games are zero-sum • A single action pair can benefit both • A single action pair can hurt both
Prisoner’s Dilemma • Alice, Bob arrested for burglary and interrogated separately
Prisoner’s Dilemma, cont. • What should A do? • Testify (why?) • Testify is a dominant strategy • Clearly B should testify also • <A:testify; B:testify> is a dominant strategy equilibrium • Payoff: A=-5, B=-5
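The dominance argument can be checked mechanically. The sketch below is illustrative only: the slide's payoff matrix did not survive, so only the (testify, testify) payoff of -5 each comes from the slides. The other three entries are assumed from the common textbook version of the game, and `is_dominant` is a hypothetical helper name.

```python
# Payoffs (A, B) as negative years in prison.
# Only (testify, testify) = (-5, -5) is given on the slide; the rest are ASSUMED.
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}
ACTIONS = ("testify", "refuse")


def is_dominant(action, player):
    """True if `action` is strictly better for `player` (0 = A, 1 = B)
    than every alternative, whatever the other player does."""
    for other in ACTIONS:
        for alt in ACTIONS:
            if alt == action:
                continue
            # Build the (A's move, B's move) profiles for the two choices.
            chosen = (action, other) if player == 0 else (other, action)
            alternative = (alt, other) if player == 0 else (other, alt)
            if PAYOFF[chosen][player] <= PAYOFF[alternative][player]:
                return False
    return True


print(is_dominant("testify", 0), is_dominant("testify", 1))
```

Under these assumed payoffs, testifying is strictly better for each player regardless of what the other does, which is what makes it a dominant strategy.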
Prisoner’s Dilemma, cont. • But <A:refuse; B:refuse> is better for both • The jointly preferred outcome occurs when each player chooses an individually worse strategy • Pareto optimal • An equilibrium is a local optimum in the space of policies
Why not <A:refuse; B:refuse>? • Not an “equilibrium”: • If A knows that B: refuse, then A: testify • A has an incentive to change • Strategy profile S is a Nash equilibrium if • for every player P, P cannot do better by deviating from S[P] when all other players follow S • Every finite game has at least one Nash equilibrium (possibly in mixed strategies) • John Nash, “Equilibrium Points in N-person Games,” Proceedings of the National Academy of Sciences 36 (1950): 48-49. 1994 Nobel prize
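The Nash condition (no player gains by deviating alone) can be tested by brute force over pure strategy profiles. This is a minimal sketch for two-player games only: `pure_nash_equilibria` is a hypothetical helper, and all prisoner's dilemma payoffs except (testify, testify) = (-5, -5) are assumed.

```python
from itertools import product


def pure_nash_equilibria(payoff, actions):
    """Return every pure profile (a, b) in which neither player
    can improve their own payoff by changing only their own action."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        a_ok = all(payoff[(a, b)][0] >= payoff[(alt, b)][0] for alt in actions)
        b_ok = all(payoff[(a, b)][1] >= payoff[(a, alt)][1] for alt in actions)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria


# ASSUMED payoffs (A, B), except (testify, testify), which is from the slides.
PRISONERS = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}

print(pure_nash_equilibria(PRISONERS, ("testify", "refuse")))
```

With these payoffs the only profile returned is (testify, testify). (refuse, refuse) is rejected because either player gains by switching to testify.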
Blu-ray vs. DVD • Acme: video game hardware • Best: video game software • Both win if both use bluray; both win if both use dvd
Blu-ray vs. DVD • Which Nash equilibrium? • <bluray, bluray>, as it is Pareto optimal • Sometimes there is more than one Pareto-optimal Nash equilibrium • E.g. set payoffs for <bluray, bluray> to 5 • Solution: need to communicate • Coordination games
Pure-strategy Nash equilibrium? • Two-finger Morra has no pure-strategy equilibrium • Otherwise O could predict E and beat it • Need an optimal mixed strategy for the 2-player, zero-sum game
Von Neumann’s Maximin Technique, 1928 • Let U(e,o) be the payoff to E if E:e, O:o • Payoff to O is –U(e,o) • So E is maximizing, O is minimizing
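To see why pure strategies are not enough, compare what E can guarantee when committing first with what O can guarantee when committing first. The payoff values below are an assumption: the slide's matrix did not survive, so they are reconstructed from the expected-payoff expressions 5p-3 and 4-7p used in the mixed-strategy derivation.

```python
# U[e][o]: payoff to E when E plays e and O plays o (O receives -U[e][o]).
# Values are RECONSTRUCTED from the expressions 5p-3 and 4-7p, not copied from a matrix.
U = {
    "one": {"one": 2, "two": -3},
    "two": {"one": -3, "two": 4},
}
MOVES = ("one", "two")

# E commits to a pure move first; O then picks the reply that minimizes U.
maximin = max(min(U[e][o] for o in MOVES) for e in MOVES)

# O commits to a pure move first; E then picks the reply that maximizes U.
minimax = min(max(U[e][o] for e in MOVES) for o in MOVES)

print(maximin, minimax)  # -3 2
```

The two values differ, so there is no pure-strategy saddle point. The value of the game must lie between them, which is what motivates mixed strategies.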
Mixed Nash Equilibrium • Suppose E plays [p:one; (1-p):two] • For each fixed p, O’s best response is a pure strategy • If O plays one, payoff to E is p*U(one,one)+(1-p)*U(two,one) = 2p - 3(1-p) = 5p-3 • If O plays two, payoff to E is p*U(one,two)+(1-p)*U(two,two) = -3p + 4(1-p) = 4-7p • For each p, O plays • One if 5p-3 ≤ 4-7p • Two if 5p-3 > 4-7p • At equilibrium, One and Two must have the same expected utility for O: 5p-3 = 4-7p, so p = 7/12 • E should play [7/12:one; 5/12:two] • Utility is –1/12
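The indifference condition can be solved exactly with rational arithmetic. This is a sketch under stated assumptions: `indifference_mix` is a hypothetical helper, it only handles 2x2 zero-sum games without a pure saddle point (so the denominator is non-zero and p lies in [0, 1]), and the Morra payoffs are reconstructed from the expressions 5p-3 and 4-7p.

```python
from fractions import Fraction


def indifference_mix(u):
    """For a 2x2 zero-sum game u[row][col] (payoff to the row player), return
    (p, value): p is the probability of row 0 that makes the column player
    indifferent between the two columns, value is the resulting payoff."""
    (a, b), (c, d) = u
    # Column 0 yields p*a + (1-p)*c, column 1 yields p*b + (1-p)*d.
    # Setting them equal and solving for p gives:
    p = Fraction(d - c, a - b - c + d)
    value = p * a + (1 - p) * c
    return p, value


# Rows: E plays one, two. Columns: O plays one, two. Payoffs are to E.
MORRA = ((2, -3), (-3, 4))
p, value = indifference_mix(MORRA)
print(p, value)  # 7/12 -1/12
```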
What about O? • Suppose O plays [q:one; (1-q):two] • If E plays one, payoff to E is 5q-3; if E plays two, 4-7q • For each q, E plays • One if 5q-3 ≥ 4-7q • Two if 5q-3 < 4-7q • One and Two must have the same expected utility for E • q=7/12 • O should play [7/12:one; 5/12:two] • Utility is –1/12 • Both players use the mixed strategy [7/12:one; 5/12:two] • True utility of the game is –1/12
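As a sanity check, one can confirm that when both players use [7/12:one; 5/12:two], neither gains by switching to a pure move. The payoff matrix is again the reconstructed one (an assumption), and `expected` is a hypothetical helper.

```python
from fractions import Fraction

U = ((2, -3), (-3, 4))  # U[e][o], payoff to E; index 0 = "one", 1 = "two"
mix = (Fraction(7, 12), Fraction(5, 12))
pure = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]


def expected(e_mix, o_mix):
    """Expected payoff to E when both players randomize independently."""
    return sum(e_mix[e] * o_mix[o] * U[e][o] for e in range(2) for o in range(2))


value = expected(mix, mix)
# E is maximizing: no pure deviation should give E more than the value.
e_cannot_gain = all(expected(s, mix) <= value for s in pure)
# O is minimizing: no pure deviation should give E less than the value.
o_cannot_gain = all(expected(mix, s) >= value for s in pure)
print(value, e_cannot_gain, o_cannot_gain)
```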
Repeated games • Known horizon: meet again for exactly 100 rounds • Both testify every round: 500 years each • 99% chance of meeting again: expected number of rounds = 100 • Perpetual punishment • Tit-for-tat (highly robust and effective)
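The strategies named above can be compared in a small simulation of the repeated prisoner's dilemma. This is a sketch only: the per-round payoffs other than (testify, testify) = (-5, -5) are assumed, the function names are hypothetical, and the fixed 100-round horizon is just one of the settings mentioned on the slide.

```python
# Per-round payoffs (A, B); only (testify, testify) is from the slides, the rest ASSUMED.
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}


def always_testify(own_history, other_history):
    return "testify"


def tit_for_tat(own_history, other_history):
    """Refuse in the first round, then copy the other player's previous move."""
    return other_history[-1] if other_history else "refuse"


def perpetual_punishment(own_history, other_history):
    """Refuse until the other player has ever testified, then testify forever."""
    return "testify" if "testify" in other_history else "refuse"


def play(strategy_a, strategy_b, rounds=100):
    """Play the repeated game and return the total payoffs (A, B)."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a += pay_a
        total_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


print(play(always_testify, always_testify))  # (-500, -500)
print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_testify))
```

Two players who always testify end up with 500 years each over 100 rounds, matching the slide. Two tit-for-tat players never testify and do far better under the assumed payoffs.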
Mechanism Design • Inverse game theory • Define the rules of the environment • Agents maximizing their own utility will maximize the collective good • E.g. • Design protocols for Internet traffic routers to maximize global throughput • Auction off cheap airline tickets • Assign medical interns to hospitals • Get soccer players to cooperate
Define Mechanism • Set of strategies each agent may adopt • Outcome rule G determining payoff for any strategy profile of allowable strategies
Tragedy of the Commons • Every farmer can bring livestock to the town commons • Destruction from overgrazing… negative utility to all farmers • Solution: setting prices • Ensure that all externalities (effects on global utility) are accounted for • What is the correct price?
Tragedy of the Commons • A: other farmers, B: individual farmer
Free Rider • A: other people, B: I
Game of Chicken • A: other farmers, B: individual farmer
Volunteer’s Dilemma • A: other people, B: I
Solution • Morality • Powerful Judge • Suitable Mechanism
Auctions • Mechanism for selling goods to individuals • Single “good” • Each bidder Qi has utility vi for the good • Only Qi knows vi
English Auction (ascending-bid) • Auctioneer increments the price of the good until only 1 bidder remains • Bidder with highest vi gets the good, at price bm+d (bm: highest bid among the others, d: increment) • Strategy for Qi: • Bid current price p if p ≤ vi • Dominant, as it is independent of others’ strategies • Strategy-proof mechanism: • Players have a dominant strategy • But… high communication costs! • How to set the reserve price? High or low?
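The ascending-bid process can be sketched as a loop in which each bidder follows the strategy above (stay in while the price does not exceed vi). The function name, the integer increment, the `reserve` parameter, and the tie rule are all illustrative assumptions, not part of the slides.

```python
def english_auction(values, increment=1, reserve=0):
    """Raise the price by `increment` until at most one bidder is still willing
    to pay it. Returns (winner_index, price), or (None, None) if nobody bids.
    Each bidder i stays in as long as price <= values[i]."""
    price = reserve
    active = [i for i, v in enumerate(values) if v >= price]
    while len(active) > 1:
        price += increment
        remaining = [i for i in active if values[i] >= price]
        if not remaining:
            # Everyone left dropped out at once (a tie): ASSUMED rule is that
            # the first of them wins at the last price they all accepted.
            return active[0], price - increment
        active = remaining
    return (active[0], price) if active else (None, None)


print(english_auction([10, 7, 4]))  # (0, 8)
```

In the example the bidder valuing the good at 10 wins at price 8, which is the highest competing valuation (7) plus the increment (1), in line with the bm+d expression. With valuations that are not multiples of the increment, the final price can fall anywhere between bm and bm+d.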
Sealed Bid Auction • Each player posts a single bid to the auctioneer • Qi with highest bid wins; Qi pays bi and gets the good • Q: should Qi bid vi? • A: not dominant! • Better is min{vi, bm+ε}, where bm is the max of the others’ bids • Drawbacks: • Player with highest vi might not get the good: the seller gets too little and the “wrong” bidder gets the good! • Bidders spend time contemplating others’ strategies
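A small numeric example shows why bidding vi is not dominant when the winner pays their own bid. The numbers, the helper name `first_price_utility`, and the rule that ties count as losses are assumptions chosen for illustration.

```python
def first_price_utility(value, bid, best_other_bid):
    """Utility in a sealed-bid first-price auction: the winner pays their own bid.
    Ties are treated as losses (an ASSUMED tie rule)."""
    return value - bid if bid > best_other_bid else 0


value, best_other, eps = 10, 6, 0.5

# Bidding the true value wins but leaves no surplus.
truthful = first_price_utility(value, value, best_other)
# Bidding min{vi, bm + eps} still wins and keeps the difference.
shaded = first_price_utility(value, min(value, best_other + eps), best_other)

print(truthful, shaded)  # 0 3.5
```

The catch, as the slide notes, is that bm is not known in advance, so bidders have to guess at it.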
Sealed-Bid 2nd-Price Auction • Each player posts a single bid to the auctioneer • Qi with highest bid wins; Qi pays bm and gets the good, where bm is the 2nd-highest bid • “Vickrey Auction”: • William Vickrey, 1996 Nobel prize in economics • Simple, with minimal computation requirements
Should Qi bid vi? • Yes, bidding vi is dominant!
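The dominance claim can be checked by brute force on a small grid of integer values and bids. This is a sketch, not a proof: the helper names are hypothetical, ties are assumed to count as losses, and "dominant" is checked in the weak sense (truthful bidding is never worse than any other bid).

```python
def second_price_utility(value, bid, best_other_bid):
    """Utility in a sealed-bid second-price auction: the winner pays the best
    competing bid. Ties are treated as losses (an ASSUMED tie rule)."""
    return value - best_other_bid if bid > best_other_bid else 0


def truthful_is_weakly_dominant(value, grid):
    """True if bidding `value` is at least as good as every bid in `grid`,
    for every possible best competing bid in `grid`."""
    return all(
        second_price_utility(value, value, other)
        >= second_price_utility(value, bid, other)
        for other in grid
        for bid in grid
    )


grid = range(0, 11)
print(all(truthful_is_weakly_dominant(v, grid) for v in grid))  # True
```

Overbidding can only add wins at a price above vi, and underbidding can only lose auctions that would have been profitable, so neither deviation ever helps.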