INTELLIGENT DATABASES: GAME THEORY PART 4: TASTE AND UTILITY. Peter van Emde Boas, ILLC-FNWI-UvA; Bronstee.com Software & Services B.V., 2003. See: http://staff.science.uva.nl/~peter/teaching/idb03.html
TASTE & UTILITY. How do you obtain these numbers and what do they mean? [Figure: characters Abe R. Smith and Golgfag (images © Games Workshop) with the pay-off pairs 40/0, 10/1, 66/4, 6/8, -30/10, 0/30.]
WHY UTILITY FUNCTIONS?
• Backward Induction is based on preferences rather than numbers
• Numbers as a means of expressing preferences work fine when chance moves are absent
• In the strictly competitive framework a single number suffices (what is good for Thorgrim is bad for Urgat and conversely...)
Rational Preferences
Rational preferences correspond to total orders. Axioms:
Totality: a ≤ b or a ≥ b
Reflexivity: a ≤ a
Transitivity: a ≤ b & b ≤ c ==> a ≤ c
Antisymmetry (up to indifference): a ≤ b & b ≤ a ==> a ≈ b
Strict order: a < b <==> a ≤ b and not a ≈ b
Intransitive preferences: a ≤ b ≤ c < a ??!
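Exploiting Intransitivity
Slicky Bob meets Alice the Fool, whose preferences over three items run in a cycle (≤, ≤, <). [The images of the traded items are missing from this transcript.]
Bob: Will you trade [item] for [item]? Alice: Yes.
Bob: Will you trade [item] for [item]? Alice: Certainly.
Bob: Will you trade [item] for [item] + $ 1? Alice: Of course!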
Examples of Intransitivity
The Scissors - Stone - Paper Game: Scissors cuts Paper; Paper wraps Stone; Stone blunts Scissors.
Grimm’s fairy tale Hans im Glück: gold < horse < cow < pig < goose < grindstone < nothing
Negative Cycles in Currency Exchange: Arbitrage
Case 1: 1 $ = 18.4509 Rb ; 1 Rb = 0.277458 Rp ; 1 Rp = 0.195337 $. Product of these exchange rates: 0.99999843427 ; rounded to 6 significant digits: 1.00000
Case 2: 1 $ = 18.4509 Rb ; 1 Rb = 0.277621 Rp ; 1 Rp = 0.195337 $. Product: 1.0005859096 ; rounded to 6 significant digits: 1.00059
For every 1M$ pumped through the second circuit some $ 590 is collected (more precisely, about $ 586) ....
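The arithmetic on this slide can be checked with a short sketch. The function names are illustrative, not from the lecture; the link to "negative cycles" assumes the usual encoding in which each exchange edge gets weight -log(rate), so that a profitable circuit is exactly a cycle of negative total weight.

```python
from math import log, prod

def cycle_product(rates):
    """Multiply the exchange rates along a closed currency cycle."""
    return prod(rates)

def cycle_weight(rates):
    """Total edge weight when every rate r is encoded as -log(r) (assumed encoding)."""
    return -sum(log(r) for r in rates)

# Rates copied from the slide: $ -> Rb -> Rp -> $
case_1 = [18.4509, 0.277458, 0.195337]
case_2 = [18.4509, 0.277621, 0.195337]

for name, rates in (("case 1", case_1), ("case 2", case_2)):
    gain = cycle_product(rates)
    profit = (gain - 1) * 1_000_000  # profit per 1M$ sent around the cycle
    print(f"{name}: product={gain:.10f} weight={cycle_weight(rates):+.3e} "
          f"profit per 1M$={profit:.2f}")
```

Only the second circuit has a product above 1, and hence a negative cycle weight.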
Condorcet Triplets
Aggregation of individual preferences into a social preference ordering by voting procedures may result in cycles in preferences:
Alice: Dijkstal > Balkenende > Fortuin
Bob: Balkenende > Fortuin > Dijkstal
Eve: Fortuin > Dijkstal > Balkenende
The majority outcome of pairwise elections yields: Dijkstal > Balkenende > Fortuin > Dijkstal
This is the point of departure for the Mathematical Theory of Voting and Election procedures.......
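The pairwise majority outcomes can be recomputed directly from the three ballots. This is a minimal sketch; the helper name pairwise_winner is made up for illustration.

```python
from itertools import combinations

# Ballots from the slide, best candidate first.
ballots = {
    "Alice": ["Dijkstal", "Balkenende", "Fortuin"],
    "Bob": ["Balkenende", "Fortuin", "Dijkstal"],
    "Eve": ["Fortuin", "Dijkstal", "Balkenende"],
}

def pairwise_winner(x, y, ballots):
    """Return whichever of x and y a majority of voters ranks higher."""
    votes_for_x = sum(r.index(x) < r.index(y) for r in ballots.values())
    return x if votes_for_x > len(ballots) / 2 else y

candidates = ballots["Alice"]
results = {(x, y): pairwise_winner(x, y, ballots)
           for x, y in combinations(candidates, 2)}

for (x, y), winner in results.items():
    print(f"{x} vs {y}: {winner} wins")
```

Each candidate wins exactly one pairwise election and loses exactly one, which is the cycle stated on the slide.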
Gale’s Roulette
Three roulette wheels: A = {2, 4, 9}, B = {1, 6, 8}, C = {3, 5, 7}.
Thorgrim selects a roulette wheel and spins it. Urgat next selects another wheel and spins it too. The player with the higher number wins the game.
Expected pay-off from each wheel: 1/3 ( 2 + 4 + 9 ) = 1/3 ( 1 + 6 + 8 ) = 1/3 ( 3 + 5 + 7 ) = 5
Gale’s Roulette: Thorgrim spins wheel A ; Urgat spins wheel C
9 > 7 ; 9 > 5 ; 9 > 3 ; 4 > 3 : Thorgrim wins.
4 < 5 ; 4 < 7 ; 2 < 3 ; 2 < 5 ; 2 < 7 : Urgat wins.
Thorgrim wins with probability 4/9 ; Urgat wins with probability 5/9, so wheel C beats wheel A.
Gale’s Roulette: wheel C beats wheel A ; wheel A beats wheel B ; wheel B beats wheel C. INTRANSITIVITY !
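The three "beats" claims can be verified by enumerating all nine equally likely pairs of spins. The sketch assumes the wheel contents A = {2, 4, 9}, B = {1, 6, 8}, C = {3, 5, 7}, as read off the expected pay-off computation and the A-versus-C comparison above.

```python
from fractions import Fraction
from itertools import product

# Wheel contents as inferred from the slides (an assumption of this sketch).
wheels = {"A": (2, 4, 9), "B": (1, 6, 8), "C": (3, 5, 7)}

def win_probability(x, y):
    """Probability that a spin of wheel x shows a higher number than wheel y."""
    wins = sum(a > b for a, b in product(wheels[x], wheels[y]))
    return Fraction(wins, len(wheels[x]) * len(wheels[y]))

for x, y in (("C", "A"), ("A", "B"), ("B", "C")):
    print(f"wheel {x} beats wheel {y} with probability {win_probability(x, y)}")
```

Whatever wheel Thorgrim picks first, Urgat can answer with a wheel that wins with probability 5/9.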
Utility Functions
A utility function u : Outcomes --> R encodes the preferences of a player. For outcomes a and b:
uAlice(a) ≤ uAlice(b) <===> a ≤Alice b
uAlice(a) < uAlice(b) <===> a <Alice b
Selection of the optimal outcome == maximization. No intransitivities possible!
Utility and Optimization (images © Games Workshop)
Golgfag composes her daily meal as a choice of Gobbos and Snotlings. Her taste assigns utility 2 to a Snotling and 5 to a Gobbo. Urgat charges her 3 Skulls for a Snotling and 8 Skulls for a Gobbo. For dietary reasons her meal should include at least one Gobbo. Her budget is 27 Skulls. What is her best choice?
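Utility and Optimization: candidate meals as (#Snotlings, #Gobbos), with utility u and cost b
A: (1,3) : u = 17 ; b = 27
B: (3,2) : u = 16 ; b = 25
C: (6,1) : u = 17 ; b = 26
D: (9,0) : u = 18 ; b = 27 ; prohibited (no Gobbo)!
X: (6 1/3, 1) : u = 17 2/3 ; b = 27 ; not integral
A and C are the optimal choices.
[Figure: #Gobbos (vertical axis) against #Snotlings (horizontal axis), showing the points A, B, C, D and X, the budget line b = 27 and the utility lines u = 10 and u = 20.]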
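Because the meal must be integral, the optimum can be found by simply enumerating every affordable meal. The following is a brute-force sketch with made-up function and parameter names; the prices, tastes, budget and the one-Gobbo rule come from the previous slide.

```python
def best_meals(budget=27, price_snotling=3, price_gobbo=8,
               taste_snotling=2, taste_gobbo=5, min_gobbos=1):
    """Enumerate all integral meals within budget; return (best utility, meals)."""
    best, meals = None, []
    for gobbos in range(min_gobbos, budget // price_gobbo + 1):
        left = budget - gobbos * price_gobbo
        for snotlings in range(left // price_snotling + 1):
            u = taste_snotling * snotlings + taste_gobbo * gobbos
            if best is None or u > best:
                best, meals = u, [(snotlings, gobbos)]
            elif u == best:
                meals.append((snotlings, gobbos))
    return best, meals

best, meals = best_meals()
print(best, meals)  # meals are (#Snotlings, #Gobbos) pairs
```

The enumeration confirms that utility 17 is the best attainable value and that it is reached only at points A and C.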
Arbitrary Utility Values?
For simple outcomes order preservation is all we need. This requirement can always be met since R is dense....
For comparing bundles there are additional constraints on the precise values: uGolgfag(Snotling) < uGolgfag(Gobbo) also holds when uGolgfag(Snotling) = 6 & uGolgfag(Gobbo) = 18, but these utilities don’t express that Golgfag is willing to trade 5 Snotlings for 2 Gobbos (with utilities 2 and 5 both bundles are worth 10).
Utilities illustrate preferences; they don’t cause them!
Russian Roulette: Version 1
[Figure: game tree (© Morris) for the players Bob and Jack. Labels recoverable from the figure: chance branches of probability 1/6 ; the moves Pull (ending in Click! or Blam !!!) and Chicken Out ; leaves labelled W/L, L/W, W/D and D/W.]
Russian Roulette: Version 1
In this version the game is an imperfect-information game: see the use of information sets in the picture. Strategies can’t differentiate between nodes in the same information set.
Meaningful strategies for Bob and Jack: PPP, PPC, PCX, CXX (P: Pull ; C: Chicken Out ; X: irrelevant)
Snag with the notion of subgame: a non-singleton information set can’t be the root of a subgame.....
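Russian Roulette: Version 2
[Figure: game tree in which every Pull is followed by its own chance move. Click! / Blam!!! probabilities in successive rounds: 5/6 and 1/6, 4/5 and 1/5, 3/4 and 1/4, 2/3 and 1/3, 1/2 and 1/2. Moves: Pull, Chicken Out. Leaves labelled W/L, L/W, W/D and D/W.]
Question: Is this the same game as version 1 ??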
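One ingredient of an answer can be checked numerically: the round-by-round conditional probabilities of version 2 (1/6, then 1/5, 1/4, ...) make every chamber equally likely to hold the bullet, just like a single chance move with six branches of probability 1/6. The sketch below assumes a six-chamber revolver with one bullet and says nothing about the information sets, which is where the two versions may still differ.

```python
from fractions import Fraction

def blam_distribution(chambers=6):
    """Unconditional probability that the gun fires in round 1, 2, ..., chambers."""
    survive = Fraction(1)  # probability that all earlier rounds were a Click!
    dist = []
    for remaining in range(chambers, 0, -1):
        blam_now = Fraction(1, remaining)  # conditional chance of Blam!!! this round
        dist.append(survive * blam_now)
        survive *= 1 - blam_now
    return dist

print(blam_distribution())
```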
Backward Induction ?
Backward Induction requires comparing the different lotteries appearing at the nodes. How to compare these three outcome lotteries ??
[Figure: lotteries over the outcomes W/D, W/L, D/W and L/W that arise in Rounds 4, 5 and 6, with branch probabilities 1/2, 1/3 and 2/3 ; the comparisons >Jack and >Bob between them are marked with question marks.]
Von Neumann-Morgenstern Utility
Rational players may be assumed to maximize the expectation of Something. Let’s call this Something Utility. This works nicely for 2-outcome lotteries. So let’s reduce the n-outcome lotteries to 2-outcome compound lotteries: each intermediate outcome is “equivalent” to a suitable 2-outcome lottery. The chance involved determines the utility.
Von Neumann-Morgenstern Utility
Lot-A yields outcome i with probability pi (i = 1, ..., n).
The function u is a Von Neumann-Morgenstern utility when:
1. For arbitrary lotteries the utility of the lottery is the expected utility of its outcomes: u(Lot-A) = E( u(outcome) ) = p1.u(1) + ... + pn.u(n)
2. The preferences of the player are faithfully reflected by u.
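2-Outcome Lotteries
Lot-1 yields W with probability p and L with probability 1-p ; Lot-2 yields W with probability q and L with probability 1-q. The lotteries themselves are abbreviated p and q.
u(W) = b ; u(L) = a ; a < b
ELot-1( u ) = p.b + (1-p).a ; ELot-2( u ) = q.b + (1-q).a
Lot-1 > Lot-2 <==> p > q <==> ELot-1( u ) > ELot-2( u )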
Utility of an Intermediate Outcome
Lot-1 yields W with probability p and L with probability 1-p (abbreviated p) ; Lot-3 yields D for sure.
u(W) = b ; u(L) = a ; u(D) = x ; a < b
ELot-1( u ) = p.b + (1-p).a ; ELot-3( u ) = x
If p is large (almost 1): Lot-1 > Lot-3. For p small (almost 0): Lot-1 < Lot-3.
So for some intermediate p, say q: Lot-1 ≈ Lot-3, i.e. q ≈ Lot-3, whence u(D) = q.b + (1-q).a !
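Utility of a Lottery = Expected Utility of its Outcomes
Lot-1 yields outcome i with probability pi (i = 1, ..., n). Each outcome i is equivalent to a 2-outcome lottery yielding W with probability qi and L with probability 1-qi ; substituting these gives the compound lottery Lot-2. Reducing Lot-2 gives Lot-3, which yields W with probability Σ pi.qi and L with probability 1 - Σ pi.qi. So Lot-1 ≈ Lot-2 ≈ Lot-3.
With u(W) = 1 , u(L) = 0 , u(i) = qi : u(Lot-3) = Σ pi.qi = Σ pi.u(i) = ELot-1( u(outcome) )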
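The reduction argument can be traced with exact fractions. In this sketch the probabilities pi and the calibration chances qi are invented example values, and the function names are illustrative; the point is only that the expected utility of the outcomes equals the utility of the reduced 2-outcome lottery, for any choice of u(L) = a and u(W) = b.

```python
from fractions import Fraction as F

def calibrated_utility(q, a=F(0), b=F(1)):
    """Utility of something equivalent to 'W with chance q, else L'."""
    return q * b + (1 - q) * a

def expected_outcome_utility(ps, qs, a=F(0), b=F(1)):
    """Sum of pi.u(i), where u(i) is fixed by the calibration chance qi."""
    return sum(p * calibrated_utility(q, a, b) for p, q in zip(ps, qs))

def reduced_win_chance(ps, qs):
    """Chance of W after reducing the compound lottery: sum of pi.qi."""
    return sum(p * q for p, q in zip(ps, qs))

ps = [F(1, 2), F(1, 3), F(1, 6)]  # hypothetical outcome probabilities
qs = [F(1), F(3, 5), F(0)]        # hypothetical calibration chances

print(reduced_win_chance(ps, qs))                    # utility when u(W)=1, u(L)=0
print(expected_outcome_utility(ps, qs, F(-2), F(3)))  # same lottery, a=-2, b=3
```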
St. Petersburg Paradox - solved
Flip a fair coin until H appears. Prize = 2^(#T in this sequence). Utility( 2^k ) = 2^(k/2), so:
H ---> prize 1 ; utility 1
TH ---> prize 2 ; utility √2
TTH ---> prize 4 ; utility 2
TTTH ---> prize 8 ; utility 2√2
etc.
Expectation: 1/2 * 1 + 1/4 * 2 + 1/8 * 4 + 1/16 * 8 + ..... = 1/2 + 1/2 + 1/2 + 1/2 + ..... = ∞
Expected Utility: 1/2 * 1 + 1/4 * √2 + 1/8 * 2 + 1/16 * 2√2 + ..... = 1/2 + 1/4 √2 + 1/4 + 1/8 √2 + .... = 1 + 1/2 √2 = 1.71..
Expected Utility 1.71 = u( 1.71 * 1.71 ) = u( 2.92 )
The number 2.92 (2.91 if 1.71.. is not rounded first) indicates what the player is willing to pay!
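Both series can be evaluated numerically by truncating them after a fixed number of rounds. This is a sketch with illustrative function names; the utility u(x) = √x is the one used on the slide.

```python
from math import sqrt

def expected_prize(rounds):
    """Partial sum of the prize expectation; every term contributes exactly 1/2."""
    return sum(0.5 ** (k + 1) * 2 ** k for k in range(rounds))

def expected_utility(rounds=200):
    """Partial sum of the expected utility with u(2^k) = 2^(k/2)."""
    return sum(0.5 ** (k + 1) * 2 ** (k / 2) for k in range(rounds))

eu = expected_utility()
print(expected_prize(10), expected_prize(100))  # grows without bound
print(eu, 1 + sqrt(2) / 2)                      # converges to 1 + 1/2 √2
print(eu ** 2)                                  # sure amount with the same utility
```

The partial sums of the prize expectation grow linearly in the number of rounds, while the expected utility settles at a finite value.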
Utility ≠ Expectation
Asrack’s Sweepstake: win £ 100 with probability p, £ 0 with probability 1-p. Alternative: Keep Your Money, i.e. q.£ 100 for sure.
EAsrack(prize) = p.£ 100
EAsrack(U) = p.U(£ 100) = U( q.£ 100 ) (taking U(£ 0) = 0)
The player willing to pay q.£ 100 for participating in Asrack’s Sweepstake actually ascribes to this lottery a utility corresponding to a chance of winning equal to q ! Observe that for many people q ≠ p ! Is this rational??
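Risk Aversion
U(n) = √n. Lot-A pays £ 16 with probability 1/5 and £ 1 with probability 4/5.
E(Pay-off) = 16/5 + 4/5 = 4
E(Utility) = Utility(Lot-A) = 4/5 + 4/5 = 8/5 = 1.6
Utility( E(Pay-off) ) = 2 > 1.6 : so the player prefers the money over the lottery !!
[Figure: Utility against Pay-off/Price for U(n) = √n, marking the pay-offs 1, 4 and 16 and the utilities 1, 1.6, 2 and 4.]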
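The three numbers on this slide follow from a few lines of arithmetic. The sketch also computes the certainty equivalent, the sure amount whose utility equals that of the lottery; that quantity is an addition of this example and does not appear on the slide.

```python
from math import sqrt

lot_a = [(1 / 5, 16), (4 / 5, 1)]  # (probability, pay-off in £) as on the slide

expected_payoff = sum(p * x for p, x in lot_a)
expected_utility = sum(p * sqrt(x) for p, x in lot_a)
utility_of_expected_payoff = sqrt(expected_payoff)
certainty_equivalent = expected_utility ** 2  # inverse of U(n) = sqrt(n)

print(expected_payoff, expected_utility, utility_of_expected_payoff)
print(certainty_equivalent)
```

With the concave utility √n, the lottery is worth less to the player than its expected pay-off paid out for sure.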
Risk Averse / Risk Loving
[Figure: three plots of Utility against Pay-off/Price.]
Convex, f’’ > 0: Risk Loving
Affine, f’’ = 0: Risk Neutral
Concave, f’’ < 0: Risk Averse
Invariance of Utility
Faithful representation of preferences is preserved under monotone (strictly increasing) transformations of the utility function, e.g., u --> (u+1)^3. But what about the Expected Utility property??
Theorem: The only transformations preserving both properties are the increasing affine transformations u --> c.u + d with c > 0: the utility function u is characterized uniquely by its values u(W) and u(L).
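A small numerical example shows why a monotone but non-affine transformation breaks the Expected Utility property. The utilities 0, 3/5 and 1 for L, D and W and the fifty-fifty lottery are invented for illustration; the cubic transformation is the one named on the slide.

```python
from fractions import Fraction as F

u = {"L": F(0), "D": F(3, 5), "W": F(1)}   # hypothetical utilities
lottery = {"W": F(1, 2), "L": F(1, 2)}     # fifty-fifty between W and L

def prefers_sure_d(transform):
    """True if sure D beats the lottery when utilities are passed through transform."""
    expected = sum(p * transform(u[o]) for o, p in lottery.items())
    return transform(u["D"]) > expected

def identity(x):
    return x

def affine(x):
    return 2 * x + 3     # increasing affine: c = 2, d = 3

def cubic(x):
    return (x + 1) ** 3  # strictly increasing but not affine

for name, t in (("u", identity), ("2u+3", affine), ("(u+1)^3", cubic)):
    print(name, "-> sure D preferred:", prefers_sure_d(t))
```

All three functions rank the sure outcomes L < D < W identically, yet only the cubic one reverses the comparison between D and the lottery.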
Invariance of Utility
Lot-1 yields W with probability p and L with probability 1-p (abbreviated p) ; Lot-3 yields D for sure.
u(W) = b ; u(L) = a ; u(D) = x ; a < b
ELot-1( u ) = p.b + (1-p).a ; ELot-3( u ) = x
For some intermediate p, say q: Lot-1 ≈ Lot-3, i.e. q ≈ Lot-3, whence u(D) = q.b + (1-q).a !
D --> q --> u(D), with only a and b as free parameters.
Waiting for Godot
Jill chooses between Give-up, with outcome D, and Wait: with probability p Jack shows up (outcome W), with probability 1-p there is no Jack (outcome L).
Rational decision: compare u( D ) with p.u( W ) + (1-p).u( L ), which equals p when u(W) = 1 and u(L) = 0.
This comparison involves both Jill’s Taste (expressed by the utility u) and her Belief (expressed by the subjective probability p).
Bayesian Rationality ...??? The sure-thing principle .... ???
The Game of Chaos Revisited
WHY WOULD YOU EVEN CONSIDER PLAYING THIS CARD?
1) Play it when you have more life than your opponent
2) Play it as a last stand
Can we validate this by a calculation ??
Translation (sorry: it still is a French card). Game of Chaos, Sorcery: Play heads or tails against a target opponent. The loser of the game loses one life. The winner of the game gains one life, and may choose to repeat the procedure. For every repetition the ante in life is doubled. © Wizards of the Coast, inc.
Game of Chaos (card image © Wizards of the Coast, inc.)
A node labelled 3 denotes 3 / -3. Every coin flip is a chance move with probabilities 1/2 and 1/2.
The structure of the game tree is independent of the choice of the utilities (vself and vopp: the lives of the player and of the opponent):
uT,1(n) = n
uT,2(n) = if n ≥ vopp then 1 elif n ≤ -vself then -1 else 0 fi
uU,1(n) = -n
uU,2(n) = if n ≥ vself then -1 elif n ≤ -vopp then 1 else 0 fi
Linear Utilities
Thorgrim and Urgat both start with 5 lives ; both use utility u1.
[Figure: game tree over four coin flips. Every node is valued at its own label n/-n: 0/0 at the root, 1/-1 and -1/1 after the first flip, down to leaves such as 11/-11 and -11/11.]
Go for the Kill!
Thorgrim and Urgat both start with 5 lives ; both use utility u2.
[Figure: the same game tree, with node labels 0 ; 1, -1 ; 3, -1, 1, -3 ; 7, -1, 3, -5, 5, -3, 1, -7 ; 7, -9, 11, -5, 5, -11, 9, -7 and their backward-induction values. Root: 0/0 ; after the first flip: 0/0 and 0/0 ; after the second flip: .5/-.5 or -.5/.5 ; all leaves are 1/-1 or -1/1.]
Mixed Utilities
Thorgrim and Urgat both start with 5 lives ; Thorgrim uses u2 ; Urgat uses u1.
[Figure: the same game tree with backward-induction values. Root: 0/0 ; after the first flip: 0/-1 (node 1) and 0/1 (node -1) ; leaves range from 1/-11 to -1/11.]
Winning is all
Thorgrim and Urgat both start with 5 lives.
Utilities: Thorgrim uses u3,T(n) = if n ≥ vopp then 1 else 0 fi ; Urgat uses u3,U(n) = if -n ≥ vopp then 1 else 0 fi
[Figure: the same game tree with backward-induction values. Root: .5/.5 ; after the first flip: .5/.5 and .5/.5 ; after the second flip: .75/.25 or .25/.75 ; all leaves are 1/0 or 0/1.]
Unequal Start
Thorgrim: 6 lives ; Urgat: 4 lives ; utilities used: u2.
[Figure: game tree, now up to five coin flips deep, with backward-induction values. Root: .125/-.125 ; after the first flip: .25/-.25 (node 1) and 0/0 (node -1).]
Thorgrim’s last stand
Thorgrim: 1 life ; Urgat: 6 lives. Thorgrim uses u3(n) = if n ≥ vopp then 1 else 0 fi ; Urgat uses u2.
[Figure: game tree with backward-induction values. Root 0: .125/.75 ; node 1: .25/.5 ; node 3: .5/0 ; node 7: 1/-1 ; every node -1 (Thorgrim dead): 0/1.]
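The root values quoted on the Game of Chaos slides can be re-derived by backward induction. The sketch below rests on assumptions that the slides do not spell out: n is Thorgrim's net gain in lives, the game ends as soon as one player has lost all lives, the winner of a flip decides whether to repeat, and an indifferent player stops. All function names are illustrative.

```python
from fractions import Fraction

# Utility factories take (Thorgrim's lives, Urgat's lives); n = Thorgrim's net gain.
def u2_thorgrim(lt, lu):
    return lambda n: 1 if n >= lu else (-1 if n <= -lt else 0)

def u2_urgat(lt, lu):
    return lambda n: -1 if n >= lu else (1 if n <= -lt else 0)

def u3_thorgrim(lt, lu):
    return lambda n: 1 if n >= lu else 0

def u3_urgat(lt, lu):
    return lambda n: 1 if -n >= lt else 0

def root_value(lt, lu, make_ut, make_uu):
    """Backward-induction value pair (Thorgrim, Urgat) before the first flip."""
    ut, uu = make_ut(lt, lu), make_uu(lt, lu)

    def decide(n, ante, decider):
        # decider: 0 = Thorgrim, 1 = Urgat (whoever won the previous flip)
        stop = (Fraction(ut(n)), Fraction(uu(n)))
        if n >= lu or n <= -lt:  # somebody is dead: the game is over
            return stop
        go = flip(n, ante)
        return go if go[decider] > stop[decider] else stop  # ties: stop (assumed)

    def flip(n, ante):
        win = decide(n + ante, 2 * ante, 0)   # Thorgrim wins the flip
        loss = decide(n - ante, 2 * ante, 1)  # Urgat wins the flip
        return ((win[0] + loss[0]) / 2, (win[1] + loss[1]) / 2)

    return flip(0, 1)

print(root_value(5, 5, u2_thorgrim, u2_urgat))  # Go for the Kill!
print(root_value(6, 4, u2_thorgrim, u2_urgat))  # Unequal Start
print(root_value(1, 6, u3_thorgrim, u2_urgat))  # Thorgrim's last stand
```

Under these assumptions the computed roots agree with the slides: an equal start is worth nothing with kill-oriented utilities, while having more life or making a last stand gives the card a positive value for Thorgrim.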
PREFERENCES, UTILITIES and RATIONALITY: Power and Limitations of the von Neumann-Morgenstern Utility Theory
The Question
Utility Theory provides us with an explanation of the preferences of agents which goes beyond simply maximizing expected pay-offs. Can there be a still more general explanation supporting even more general preferences? OR: Will such a more general framework violate basic rules concerning the mere concept of rationality?
Allais’ Paradox - Savage’s Error
Lotteries over the prizes ($0M, $1M, $5M), given by their probabilities:
First pair: (0, 1, 0) ?? (0.01, 0.89, 0.10)
Second pair: (0.89, 0.11, 0) ?? (0.9, 0, 0.1)
Take u($0M) = 0 ; u($5M) = 1 ; u($1M) = x
> in first pair: x > 0.89x + 0.1 <==> x > 10/11
< in second pair: 0.11x < 0.1 <==> x < 10/11
Hence the opinions ( < , > ) and ( > , < ) are inconsistent with the Von Neumann-Morgenstern Utility theory.
Result of Poll:
        99   00   01   02
  <      2    2   7.5  10    <
  >      3    1   2.5   1    <
  <      3    0   2     0    >
  >      2   13   0     1    >
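The inconsistency can be checked by scanning over all candidate values of x = u($1M). This sketch uses the slide's normalization u($0M) = 0 and u($5M) = 1; the ordering of the lotteries within each pair follows the two inequalities above, and the names A, B, C, D are labels introduced here.

```python
from fractions import Fraction as F

# Probabilities over the prizes ($0M, $1M, $5M).
A = (F(0), F(1), F(0))
B = (F(1, 100), F(89, 100), F(10, 100))
C = (F(89, 100), F(11, 100), F(0))
D = (F(90, 100), F(0), F(10, 100))

def expected_utility(lottery, x):
    """Expected utility with u($0M) = 0, u($1M) = x, u($5M) = 1."""
    p0, p1, p5 = lottery
    return p1 * x + p5

def compare(left, right, x):
    l, r = expected_utility(left, x), expected_utility(right, x)
    return ">" if l > r else "<" if l < r else "="

def opinion(x):
    """Pattern (first pair, second pair) implied by a given value of x."""
    return compare(A, B, x), compare(C, D, x)

patterns = {opinion(F(k, 100)) for k in range(1, 100)}
print(patterns)
print(opinion(F(10, 11)))  # the indifference point
```

No value of x on the grid produces a mixed pattern: an expected-utility maximizer answers either ( > , > ) or ( < , < ), switching at x = 10/11.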
Rationality Principles: Independence / Sure-Thing
If lottery P is preferred over Q then this preference also holds in the context of a compound lottery:
P > Q ==> αP + (1-α)C > αQ + (1-α)C
[Figure: P versus Q on the left ; on the right the compound lotteries that yield P (respectively Q) with probability α and C with probability 1-α.]
Rationality Principles
Forgone Event Independence: If lottery P is preferred over Q then this preference also holds in the context of a previous choice event which didn’t occur.
Dynamic Consistency: This preference remains when asked to commit to it before the chance move occurs.
[Figure: the choice between P and Q placed after a chance move whose other branch (probability 1-α) leads to C ; "Commit Here" marks the point of commitment, first after and then before the chance move.]
Rationality Principles
Context Independence: Preferences are preserved when decisions are drawn at the position they are committed to.
Reduction: Preferences are preserved when compound lotteries are expanded to simple ones.
[Figure: a commitment made before the chance move (branches α and 1-α, the latter leading to C) is redrawn as a choice between two compound lotteries, which reduce to αP + (1-α)C and αQ + (1-α)C.]
Rationality Principles
Independence / Sure-Thing <== { Forgone Event Independence, Context Independence, Dynamic Consistency, Reduction }
Slogans on the slide: No Split Personality ; Don’t believe in ghosts ; Follow Probability Calculus ; Don’t judge a book by its cover. Bayesian Rationality.
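Rationality Principles
[Figure: Independence (from P versus Q to αP + (1-α)C versus αQ + (1-α)C) decomposed into a chain of four steps labelled FEI, DC, CI and Reduction, with "Commit Here" marking the point of commitment in the intermediate trees.]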
Allais’ Example
[Figure: the same chain Independence = FEI, DC, CI, Reduction, instantiated for Allais’ lotteries over the prizes 0M$, 10M$ and 50M$, with chance moves .89 / .11 and 1/11 / 10/11 and the reduced lottery .89 / .10 / .01 compared with a sure prize (probability 1.00).]