Simple Decision Making Ch 16
Decision Making Under Uncertainty • Many environments have multiple possible outcomes • Some of these outcomes may be good; others may be bad • Some may be very likely; others unlikely • What’s a poor agent’s choice??
Non-Deterministic vs. Probabilistic Uncertainty
• Non-deterministic model: an action may lead to any outcome in {a,b,c}; choose the decision that is best for the worst case (~ adversarial search)
• Probabilistic model: an action leads to the outcomes {a(pa),b(pb),c(pc)}, each with its probability; choose the decision that maximizes the expected utility value
Expected Utility
• Random variable X with n values x1,…,xn and distribution (p1,…,pn). E.g.: X is the state reached after doing an action A under uncertainty
• Function U of X. E.g., U is the utility of a state
• The expected utility of A is $EU[A] = \sum_{i=1}^{n} p(x_i \mid A)\, U(x_i)$
One State/One Action Example
• From s0, action A1 leads to s1, s2, s3 with probabilities 0.2, 0.7, 0.1
• Utilities: U(s1) = 100, U(s2) = 50, U(s3) = 70
• U(S0) = 100 x 0.2 + 50 x 0.7 + 70 x 0.1 = 20 + 35 + 7 = 62
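One State/Two Actions Example
• From s0, A1 leads to s1, s2, s3 with probabilities 0.2, 0.7, 0.1; A2 leads to s2, s4 with probabilities 0.2, 0.8
• Utilities: U(s1) = 100, U(s2) = 50, U(s3) = 70, U(s4) = 80
• U1(S0) = 62
• U2(S0) = 50 x 0.2 + 80 x 0.8 = 74
• U(S0) = max{U1(S0),U2(S0)} = 74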
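Introducing Action Costs
• From s0, A1 (cost 5) leads to s1, s2, s3 with probabilities 0.2, 0.7, 0.1; A2 (cost 25) leads to s2, s4 with probabilities 0.2, 0.8
• Utilities: U(s1) = 100, U(s2) = 50, U(s3) = 70, U(s4) = 80
• U1(S0) = 62 - 5 = 57
• U2(S0) = 74 - 25 = 49
• U(S0) = max{U1(S0),U2(S0)} = 57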
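The three examples above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the names `OUTCOMES`, `COSTS`, `expected_utility`, and `best_action` are illustrative, and the assignment of A2 to states s2 (probability 0.2) and s4 (probability 0.8, utility 80) is an assumption read off the flattened figure; it does reproduce U2(S0) = 74.

```python
# Each action maps to (probability, utility) pairs for the states it can
# reach from s0. The A2 branches are an assumed reading of the figure.
OUTCOMES = {
    "A1": [(0.2, 100), (0.7, 50), (0.1, 70)],
    "A2": [(0.2, 50), (0.8, 80)],
}
COSTS = {"A1": 5, "A2": 25}


def expected_utility(outcomes):
    """Probability-weighted sum of utilities for one action."""
    return sum(p * u for p, u in outcomes)


def best_action(outcomes_by_action, costs=None):
    """Return (action, net expected utility) for the MEU choice."""
    costs = costs or {}
    scores = {
        action: expected_utility(outcomes) - costs.get(action, 0)
        for action, outcomes in outcomes_by_action.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


print(best_action(OUTCOMES))         # without action costs
print(best_action(OUTCOMES, COSTS))  # with action costs
```

Without costs the sketch selects A2 (about 74); with costs it selects A1 (about 57), matching the slides.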
Maximum Expected Utility (MEU) Principle
• A rational agent should choose the action that maximizes the agent’s expected utility
• This is the basis of the field of utility theory
• The MEU principle provides a normative criterion for rational choice of action
Basics of Utility Theory
In the language of utility theory, complex scenarios are called lotteries to emphasize the idea that the different attainable outcomes are like different prizes and that the outcome is determined by chance.
• Suppose a lottery L has two possible outcomes:
1. State A with probability p
2. State B with probability 1-p
We write L = [p,A; 1-p,B].
Axioms of Utility Theory
A>B means “A is preferred over B”
A~B means “no preference between A and B”
• Orderability: (A>B) ∨ (A<B) ∨ (A~B)
• Transitivity: (A>B) ∧ (B>C) ⇒ (A>C)
• Continuity: A>B>C ⇒ ∃p [p,A; 1-p,C] ~ B
• Substitutability: A~B ⇒ [p,A; 1-p,C] ~ [p,B; 1-p,C]
• Monotonicity: A>B ⇒ (p≥q ⇔ [p,A; 1-p,B] >~ [q,A; 1-q,B])
• Decomposability: [p,A; 1-p, [q,B; 1-q, C]] ~ [p,A; (1-p)q, B; (1-p)(1-q), C]
Utility functions • Utility functions map states to real numbers. • Utility theory has its roots in economics -> the utility of money • Risk averse • Risk seeking • Certainty equivalent • Risk neutral • Utility scales and utility assessment • Normalization
Some Examples • A or B? • A: 80% chance of $4000 • B: 100% chance of $3000 • C or D? • C: 20% chance of $4000 • D: 25% chance of $3000 • E or F? • E: 1% chance of $1,000,000 • F: 3% chance of $400,000
Some Examples
• A or B? (Risk averse: choose B)
• A: 80% chance of $4000
• B: 100% chance of $3000
• C or D? (Risk neutral: choose C)
• C: 20% chance of $4000
• D: 25% chance of $3000
• E or F? (Risk seeking: choose E)
• E: 1% chance of $1,000,000
• F: 2% chance of $550,000
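One way to read the three labels is to compare each choice with the expected monetary value (EMV) of its lottery. The sketch below is illustrative only: `LOTTERIES`, `EMV`, and `expected_monetary_value` are names introduced here, and each lottery is assumed to pay nothing when the prize is not won.

```python
# Each lottery pays `payoff` dollars with probability `p`, otherwise $0
# (the $0 alternative is an assumption; the slide lists only the prize).
LOTTERIES = {
    "A": (0.80, 4_000),
    "B": (1.00, 3_000),
    "C": (0.20, 4_000),
    "D": (0.25, 3_000),
    "E": (0.01, 1_000_000),
    "F": (0.02, 550_000),
}


def expected_monetary_value(p, payoff):
    """Expected dollars from a single-prize lottery."""
    return p * payoff


EMV = {
    name: expected_monetary_value(p, payoff)
    for name, (p, payoff) in LOTTERIES.items()
}

for first, second in [("A", "B"), ("C", "D"), ("E", "F")]:
    higher = first if EMV[first] >= EMV[second] else second
    print(f"{first}: {EMV[first]:,.0f}  {second}: {EMV[second]:,.0f}  "
          f"higher EMV: {higher}")
```

Under these assumptions A has the higher EMV, so choosing B trades expected money for certainty (risk averse); C has the higher EMV, so choosing C is what a risk-neutral agent does; F has the higher EMV, so choosing E gives up expected money for a shot at the larger prize (risk seeking).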
Decision Networks
• A visual representation of choices, consequences, probabilities, and opportunities.
• Extend BNs to handle actions and utilities
• Also called influence diagrams
• Use BN inference methods to compute probabilities.
• Perform Value of Information calculations
• Decision trees are popular: a way of breaking complicated situations down into easier-to-understand scenarios.
Decision Networks cont. • Chance nodes: random variables, as in BNs • Decision nodes: actions that an agent can take (as in decision trees) • Utility/value nodes: the utility of the outcome state.
Example of a Decision Tree
[Figure: decision tree. Legend: chance node, decision node. Branch labels: Decision 1, Decision 2; Event 1, Event 2, Event 3]
Umbrella Network
• Decision node: umbrella (take/don’t take)
• Chance nodes: weather, forecast, have umbrella
• Utility node: happiness
• P(rain) = 0.4
• P(have|take) = 1.0, P(~have|~take) = 1.0
• Forecast model p(f|w):
  f      w        p(f|w)
  sunny  rain     0.3
  rainy  rain     0.7
  sunny  no rain  0.8
  rainy  no rain  0.2
• U(have,rain) = -25
• U(have,~rain) = 0
• U(~have,rain) = -100
• U(~have,~rain) = 100
Evaluating Decision Networks • Set the evidence variables for current state • For each possible value of the decision node: • Set decision node to that value • Calculate the posterior probability of the parent nodes of the utility node, using BN inference • Calculate the resulting utility for action • Return the action with the highest utility
Decision Making: Umbrella Network
Should I take my umbrella?
• Decision node: umbrella (take/don’t take); chance nodes: weather, forecast, have umbrella; utility node: happiness
• P(rain) = 0.4
• P(have|take) = 1.0, P(~have|~take) = 1.0
• p(f|w): p(sunny|rain) = 0.3, p(rainy|rain) = 0.7, p(sunny|no rain) = 0.8, p(rainy|no rain) = 0.2
• U(have,rain) = -25, U(have,~rain) = 0, U(~have,rain) = -100, U(~have,~rain) = 100
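With no forecast observed, the evaluation procedure reduces to weighting the utility table by the prior P(rain). The following is a small sketch under that reading; `P_RAIN`, `UTILITY`, `expected_utility`, and `best_decision` are hypothetical names, and the network is small enough that the inference step is done by hand rather than with a BN library.

```python
P_RAIN = 0.4  # prior from the weather node

# Utility table U(have_umbrella, rain) from the happiness node.
UTILITY = {
    (True, True): -25,
    (True, False): 0,
    (False, True): -100,
    (False, False): 100,
}


def expected_utility(take, p_rain):
    """EU of one decision value. have == take because
    P(have|take) = 1.0 and P(~have|~take) = 1.0."""
    have = take
    return (p_rain * UTILITY[(have, True)]
            + (1 - p_rain) * UTILITY[(have, False)])


def best_decision(p_rain):
    """Try each value of the decision node and keep the best one."""
    scores = {take: expected_utility(take, p_rain) for take in (True, False)}
    take = max(scores, key=scores.get)
    return take, scores


take, scores = best_decision(P_RAIN)
print("take umbrella" if take else "don't take umbrella", scores)
```

With these numbers EU(take) is about -10 and EU(don't take) is about 20, so the sketch recommends leaving the umbrella at home.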
Another Easy Example
• A decision tree with two choices:
• Go to graduate school to get my master’s in CS.
• Go to work “in the Real World”
Easy Example - Revisited What are some of the costs we should take into account when deciding whether or not to go to graduate school? • Tuition and Fees • Rent / Food / etc. • Opportunity cost of salary • Anticipated future earnings
Simple Decision Tree Model
• Go to graduate school to get my master’s in CS:
  1.5 years of tuition: $30,000; 1.5 years of room/board: $20,000; 1.5 years of opportunity cost of salary: $100,000. Total = $150,000.
  PLUS anticipated 5-year salary after graduate school = $600,000.
  U(graduate school) = $600,000 - $150,000 = $450,000
• Go to work “in the Real World”:
  First 1.5-year salary = $100,000 (from above), minus expenses of $20,000. Next five-year salary = $330,000.
  U(no grad-school) = $100,000 - $20,000 + $330,000 = $410,000
• Is this a realistic model? What is missing?
The Yeaple Study (1994): Benefits of Learning
  School            Net Value ($)
  Harvard           148,378
  Chicago           106,378
  Stanford           97,462
  MIT (Sloan)        85,736
  Yale               83,775
  Northwestern       53,526
  Berkeley           54,101
  Wharton            59,486
  UCLA               55,088
  Virginia           30,046
  Cornell            30,974
  Michigan           21,502
  Dartmouth          22,509
  Carnegie Mellon    18,679
  Texas              17,459
  Rochester            -307
  Indiana            -3,315
  North Carolina     -4,565
  Duke              -17,631
  NYU                -3,749
According to Ronald Yeaple, it is only profitable to go to one of the top 15 business schools; otherwise you have a NEGATIVE NPV (net present value)! (Economist, Aug. 6, 1994)
Things he may have missed • Future uncertainty (interest rates, future salary, etc) • Cost of living differences • Type of Job [utility function = f($, enjoyment)] • Girlfriend / Boyfriend / Family concerns • Others? Utility Function = f ($, enjoyment, family, location, type of job, prestige, gender, age, race) Many Human Factors Considerations
Example – Joe’s Garage Joe’s garage is considering hiring another mechanic. The mechanic would cost them an additional $50,000 / year in salary and benefits. If there are a lot of accidents in Iowa City this year, they anticipate making an additional $100,000 in net revenue. If there are not a lot of accidents, they could lose $20,000 off of last year’s total net revenues. Because of all the ice on the roads, Joe thinks that there will be a 70% chance of “a lot of accidents” and a 30% chance of “fewer accidents”. Assume if he doesn’t expand he will have the same revenue as last year. Draw a decision tree for Joe and tell him what he should do.
Joe’s Decision - How to Decide? • What would be his decision based on: • Maximax? • Maximin? • Expected Return?
Quick primer on Statistics and Probability
Definitions:
• Expected value of x: $E(x) = \sum_x x\,P(x)$, where P(x) represents the probability of x. (Note that $\sum_x P(x) = 1$ because P(x) represents a probability distribution.)
• Variance of x: $\mathrm{Var}(x) = \sum_x (x - E(x))^2\,P(x)$
• Standard deviation = the square root of the variance
• Median = “the center of the set of numbers”; or the point m such that P(x < m) ≤ ½ and P(x > m) ≤ ½.
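As a quick numeric check of these definitions, the sketch below computes the expected value, variance, and standard deviation of a small discrete distribution. The distribution reuses the utilities 100, 50, 70 with probabilities 0.2, 0.7, 0.1 from the one-action example; the function names are illustrative, and a discrete distribution (sums rather than integrals) is assumed.

```python
import math

# Discrete distribution: value -> probability (assumed example data).
DIST = {100: 0.2, 50: 0.7, 70: 0.1}


def expected_value(dist):
    """E(x) = sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(x) = sum of (x - E(x))^2 * P(x)."""
    mean = expected_value(dist)
    return sum(p * (x - mean) ** 2 for x, p in dist.items())


def standard_deviation(dist):
    """Square root of the variance."""
    return math.sqrt(variance(dist))


# Probabilities must sum to 1 for a valid distribution.
assert abs(sum(DIST.values()) - 1.0) < 1e-9

print(round(expected_value(DIST), 2))
print(round(variance(DIST), 2))
print(round(standard_deviation(DIST), 2))
```

For this distribution the expected value is about 62 and the variance about 396, so outcomes typically sit roughly 20 utility units away from the mean.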
Example - Answer
• Hire new mechanic (cost = $50,000):
  • 70% chance of an increase in accidents: profit = $90,000
  • 30% chance of a decrease in accidents: profit = - $20,000
• Don’t hire new mechanic (cost = $0)
• Estimated utility value of “Hire Mechanic”: EU = .7($90,000) + .3(- $20,000) - $50,000 = $7,000
• Therefore Joe should hire the mechanic using Maximax or Expected Return; and not hire using Maximin.
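The three criteria can be compared side by side on a payoff table. This is a sketch under stated assumptions: it uses the $90,000 profit figure from the answer slide, treats "don't hire" as a payoff of $0 in both states, and the names `PAYOFFS`, `maximax`, `maximin`, and `expected_return` are introduced here for illustration.

```python
PROBS = {"many accidents": 0.7, "fewer accidents": 0.3}
SALARY_COST = 50_000

# Net payoff of each action in each state of nature (assumed table).
PAYOFFS = {
    "hire": {
        "many accidents": 90_000 - SALARY_COST,
        "fewer accidents": -20_000 - SALARY_COST,
    },
    "don't hire": {"many accidents": 0, "fewer accidents": 0},
}


def maximax(payoffs):
    """Optimist: pick the action with the best best-case payoff."""
    return max(payoffs, key=lambda a: max(payoffs[a].values()))


def maximin(payoffs):
    """Pessimist: pick the action with the best worst-case payoff."""
    return max(payoffs, key=lambda a: min(payoffs[a].values()))


def expected_return(payoffs, probs):
    """Pick the action with the highest probability-weighted payoff."""
    ev = {
        action: sum(probs[s] * v for s, v in outcomes.items())
        for action, outcomes in payoffs.items()
    }
    best = max(ev, key=ev.get)
    return best, ev[best]


print("Maximax:", maximax(PAYOFFS))
print("Maximin:", maximin(PAYOFFS))
print("Expected return:", expected_return(PAYOFFS, PROBS))
```

Maximax and expected return both select "hire" (expected net gain of about $7,000), while maximin selects "don't hire" because the worst case of hiring is a $70,000 loss.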
Problem: Jenny Lind Jenny Lind is a writer of romance novels. A movie company and a TV network both want exclusive rights to one of her more popular works. If she signs with the network, she will receive a single lump sum, but if she signs with the movie company, the amount she will receive depends on the market response to her movie. What should she do?
Payouts and Probabilities • Movie company Payouts • Small box office: $200,000 • Medium box office: $1,000,000 • Large box office: $3,000,000 • TV Network Payout • Flat rate: $900,000 • Probabilities • P(Small Box Office) = 0.3 • P(Medium Box Office) = 0.6 • P(Large Box Office) = 0.1
Using the Expected Return Criterion
EUmovie = 0.3(200,000) + 0.6(1,000,000) + 0.1(3,000,000) = $960,000
EUtv = $900,000
Therefore, using this criterion, Jenny should select the movie contract.
EU = Expected Utility
Something to Remember Jenny’s decision is only going to be made one time, and she will earn either $200,000, $1,000,000 or $3,000,000 if she signs the movie contract, not the calculated expected return of $960,000!! Nevertheless, this amount is useful for decision-making, as it will maximize Jenny’s expected returns in the long run if she continues to use this approach.
Creating and Solving Decision Trees • Three types of “nodes” • Decision nodes - represented by squares (□) • Chance nodes - represented by circles (Ο) • Terminal nodes - represented by triangles (optional) • Solving the tree involves pruning all but the best decisions at decision nodes, and finding expected values of all possible states of nature at chance nodes • Create the tree from left to right • Solve the tree from right to left
Jenny Lind Decision Tree
• Sign with Movie Co.
  • Small Box Office: $200,000
  • Medium Box Office: $1,000,000
  • Large Box Office: $3,000,000
• Sign with TV Network
  • Small Box Office: $900,000
  • Medium Box Office: $900,000
  • Large Box Office: $900,000
Jenny Lind Decision Tree (with probabilities)
• Sign with Movie Co. (EU?): Small Box Office (.3) $200,000; Medium Box Office (.6) $1,000,000; Large Box Office (.1) $3,000,000
• Sign with TV Network (EU?): Small Box Office (.3) $900,000; Medium Box Office (.6) $900,000; Large Box Office (.1) $900,000
Jenny Lind Decision Tree - Solved
• Sign with Movie Co.: EU = 960,000
• Sign with TV Network: EU = 900,000
• Best decision: Sign with Movie Co., EU = 960,000
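Solving a tree from right to left can be sketched as a short recursive "rollback": terminal nodes return their payoff, chance nodes return the probability-weighted average of their children, and decision nodes keep only the best branch. The dictionary-based node layout and the names `rollback` and `JENNY` are assumptions made for this illustration, and the TV branch is collapsed to a single terminal node because it pays $900,000 in every state.

```python
def rollback(node):
    """Return (expected value, best option label or None) for a node."""
    kind = node["type"]
    if kind == "terminal":
        return node["value"], None
    if kind == "chance":
        # Expected value over all states of nature.
        value = sum(p * rollback(child)[0] for p, child in node["branches"])
        return value, None
    # Decision node: prune all but the best option.
    values = {
        label: rollback(child)[0] for label, child in node["options"].items()
    }
    best = max(values, key=values.get)
    return values[best], best


def terminal(value):
    return {"type": "terminal", "value": value}


JENNY = {
    "type": "decision",
    "options": {
        "Sign with Movie Co.": {
            "type": "chance",
            "branches": [
                (0.3, terminal(200_000)),
                (0.6, terminal(1_000_000)),
                (0.1, terminal(3_000_000)),
            ],
        },
        "Sign with TV Network": terminal(900_000),
    },
}

value, choice = rollback(JENNY)
print(choice, round(value))
```

The same function handles deeper trees, because chance and decision nodes may nest in any order.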
Sam’s Car Deal
Sam has the opportunity to buy a 1996 Spiffycar for $10,000, and he has a prospect who would be willing to pay $11,000 for the auto if it’s in excellent mechanical shape. Sam determines that everything except for the transmission is in excellent shape. If the transmission is bad, it will cost $3000 to fix it. He has a friend who can run a test on the transmission. The test is not always accurate: 30% of the time it judges a good transmission to be bad, and 10% of the time it judges a bad transmission to be good. Sam knows that 20% of the 1996 Spiffycars have a bad transmission. Draw a decision tree for Sam and tell him what he should do.
Sam’s Car Deal
• T: Test judges that the transmission is bad. A: Transmission is good.
• P(T | A) = .3, P(T | not A) = .9
• P(T) = P(T | A) P(A) + P(T | not A) P(not A) = (.3)(.8) + (.9)(.2) = .42
• P(A | T) = P(T | A) P(A) / P(T) = (.3)(.8)/(.42) = .571429
• Tran1: Test says “the transmission is bad”: EU(Tran1) = (.571429)($11000) + (.428571)($8000) = $9714
• Tran2: Test says “the transmission is good”: EU(Tran2) = … = $10897
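The Bayes' rule steps above can be checked numerically. This is a minimal sketch with illustrative names (`posterior_good`, `expected_resale`); it assumes, as the slide does, that a car with a bad transmission nets $11,000 - $3,000 = $8,000 after the repair.

```python
P_GOOD = 0.8                 # P(A): transmission is good
P_SAYS_BAD_GIVEN_GOOD = 0.3  # P(T | A)
P_SAYS_BAD_GIVEN_BAD = 0.9   # P(T | not A)
SALE_PRICE = 11_000
REPAIR_COST = 3_000


def posterior_good(says_bad):
    """Return (P(good | test reading), P(test reading)) via Bayes' rule."""
    like_good = P_SAYS_BAD_GIVEN_GOOD if says_bad else 1 - P_SAYS_BAD_GIVEN_GOOD
    like_bad = P_SAYS_BAD_GIVEN_BAD if says_bad else 1 - P_SAYS_BAD_GIVEN_BAD
    evidence = like_good * P_GOOD + like_bad * (1 - P_GOOD)
    return like_good * P_GOOD / evidence, evidence


def expected_resale(p_good):
    """Expected proceeds from buying and reselling the car."""
    return p_good * SALE_PRICE + (1 - p_good) * (SALE_PRICE - REPAIR_COST)


for says_bad in (True, False):
    p_good, p_reading = posterior_good(says_bad)
    label = "bad" if says_bad else "good"
    print(f"test says {label}: P(reading)={p_reading:.2f}, "
          f"P(good|reading)={p_good:.6f}, EU={expected_resale(p_good):.2f}")
```

Since the car costs $10,000, a "bad" reading (expected proceeds of roughly $9,714) argues against buying, while a "good" reading (roughly $10,897) argues for it.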
Sam’s Car Deal
[Figure: decision tree; the branch where the test says “the transmission is bad” has probability .42]
Sam’s Car Deal
Suppose Sam is in the same situation as the previous example, except that the test is not free. Rather, it costs $200. So Sam must decide whether to run the test, buy the car without running the test, or keep his $10,000. Draw a decision tree for Sam and tell him what he should do.
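Sam’s Car Deal
• EU(Tran1) = $9800 (test says “bad”: keep the $10,000, minus the $200 test)
• EU(Tran2) = $10,697 (test says “good”: buy the car, minus the $200 test)
• EU(Test) = (.42)($9800) + (.58)($10,697) = $10320
• EU(Tran3) = (.8)$11000 + (.2)$8000 = $10400 (buy without running the test)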
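The full three-way decision can be rolled back in a few lines. This sketch follows the slide's convention of measuring each option by the money Sam ends up holding (resale proceeds if he buys, his $10,000 if he does not); the constant and function names are hypothetical.

```python
PRICE, SALE, REPAIR, TEST_FEE = 10_000, 11_000, 3_000, 200
P_GOOD = 0.8
# Probability that the test reports "bad", given the true state.
P_BAD_READING = {"good": 0.3, "bad": 0.9}


def resale(p_good):
    """Expected proceeds if Sam buys, given P(transmission is good)."""
    return p_good * SALE + (1 - p_good) * (SALE - REPAIR)


def eu_after_reading(says_bad):
    """Return (P(reading), best EU after that reading, net of the fee)."""
    like_good = P_BAD_READING["good"] if says_bad else 1 - P_BAD_READING["good"]
    like_bad = P_BAD_READING["bad"] if says_bad else 1 - P_BAD_READING["bad"]
    p_reading = like_good * P_GOOD + like_bad * (1 - P_GOOD)
    p_good = like_good * P_GOOD / p_reading
    # After the reading Sam either buys or keeps his money.
    best = max(resale(p_good), PRICE) - TEST_FEE
    return p_reading, best


eu_test = sum(p * v for p, v in (eu_after_reading(True), eu_after_reading(False)))
options = {
    "test first": eu_test,
    "buy without test": resale(P_GOOD),
    "keep $10,000": PRICE,
}
best_option = max(options, key=options.get)
print({name: round(value) for name, value in options.items()})
print("best:", best_option)
```

Under these numbers buying without the test (about $10,400) edges out testing first (about $10,320), and both beat keeping the $10,000.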
Mary’s Factory Mary is a manager of a gadget factory. Her factory has been quite successful the past three years. She is wondering whether or not it is a good idea to expand her factory this year. The cost to expand her factory is $1.5M. If she does nothing and the economy stays good and people continue to buy lots of gadgets she expects $3M in revenue; while only $1M if the economy is bad. If she expands the factory, she expects to receive $6M if economy is good and $2M if economy is bad. She also assumes that there is a 40% chance of a good economy and a 60% chance of a bad economy. Draw a Decision Tree showing these choices.
Decision Tree Example
• Expand Factory (cost = $1.5M):
  • Good economy (40%): profit = $6M
  • Bad economy (60%): profit = $2M
• Don’t Expand Factory (cost = $0):
  • Good economy (40%): profit = $3M
  • Bad economy (60%): profit = $1M
• EUExpand = (.4(6) + .6(2)) - 1.5 = $2.1M
• EUNo Expand = .4(3) + .6(1) = $1.8M
• $2.1M > $1.8M, therefore Mary should expand the factory
Mary’s Factory – With Options
A few days later she was told that if she expands, she can opt to either (a) expand the factory further, which costs $1.5M but will yield an additional $2M in profit when the economy is good and only $1M when the economy is bad, (b) abandon the project and sell the equipment she originally bought for $1.3M, or (c) do nothing. Draw a decision tree to show these three options for each possible outcome, and compute the NPV for the expansion.
Decision Trees with Options
• Good Market (.4):
  • Expand further: yielding $8M (but costing $1.5M)
  • Stay at expanded levels: yielding $6M
  • Reduce to old levels: yielding $3M (but saving $1.3M by selling the equipment)
• Bad Market (.6):
  • Expand further: yielding $3M (but costing $1.5M)
  • Stay at expanded levels: yielding $2M
  • Reduce to old levels: yielding $1M (but saving $1.3M in equipment cost)
Expected Return Value of the Options • Good Economy • Expand further = 8M – 1.5M = 6.5M • Do nothing = 6M • Abandon Project = 3M + 1.3M = 4.3M • Bad Economy • Expand further = 3M – 1.5M = 1.5M • Do nothing = 2M • Abandon Project = 1M + 1.3M = 2.3M
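Expected Return of the Project
Taking the best option in each state (expand further for 6.5M in a good economy, abandon for 2.3M in a bad economy), the EU of expanding the factory with these options is:
EUExpand = [.4(6.5) + .6(2.3)] - 1.5M = $2.48M
Therefore the value of the option is 2.48 (new EU) - 2.1 (old EU) = $0.38M = $380,000
You would pay up to this amount to exercise that option.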
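The option value can be recomputed by picking the best follow-up option in each state of the economy and then taking the expectation. This is a sketch with assumed names (`option_values`, `best_option`, and so on); all amounts are in millions of dollars.

```python
P = {"good": 0.4, "bad": 0.6}
EXPAND_COST = 1.5
EXPANDED = {"good": 6.0, "bad": 2.0}      # revenue after expanding
UNEXPANDED = {"good": 3.0, "bad": 1.0}    # revenue at old levels
FURTHER_COST = 1.5
FURTHER_GAIN = {"good": 2.0, "bad": 1.0}  # extra yield from expanding further
SALVAGE = 1.3                             # from selling the equipment


def expected(payoff):
    """Probability-weighted payoff over the states of the economy."""
    return sum(P[s] * payoff[s] for s in P)


eu_no_expand = expected(UNEXPANDED)
eu_expand = expected(EXPANDED) - EXPAND_COST

# Value of each follow-up option once the state of the economy is known.
option_values = {
    s: {
        "expand further": EXPANDED[s] + FURTHER_GAIN[s] - FURTHER_COST,
        "do nothing": EXPANDED[s],
        "abandon": UNEXPANDED[s] + SALVAGE,
    }
    for s in P
}
best_option = {s: max(v, key=v.get) for s, v in option_values.items()}
eu_with_options = (
    sum(P[s] * option_values[s][best_option[s]] for s in P) - EXPAND_COST
)
value_of_option = eu_with_options - eu_expand

print(best_option)
print(round(eu_with_options, 2), round(value_of_option, 2))
```

The sketch picks "expand further" in a good economy and "abandon" in a bad one, giving an expected value of about 2.48 and an option value of about 0.38 (that is, $380,000).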
Decision Making: Umbrella Network
If the forecast is rainy, should I take my umbrella?
• Decision node: umbrella (take/don’t take); chance nodes: weather, forecast, have umbrella; utility node: happiness
• P(rain) = 0.4
• P(have|take) = 1.0, P(~have|~take) = 1.0
• p(f|w): p(sunny|rain) = 0.3, p(rainy|rain) = 0.7, p(sunny|no rain) = 0.8, p(rainy|no rain) = 0.2
• U(have,rain) = -25, U(have,~rain) = 0, U(~have,rain) = -100, U(~have,~rain) = 100
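Once the forecast is observed, the only change is that the prior P(rain) is replaced by the posterior P(rain | forecast) before the utilities are weighted. A minimal sketch, with hypothetical names and the inference done by hand with Bayes' rule instead of a BN library:

```python
P_RAIN = 0.4
# p(forecast = rainy | weather), keyed by whether it actually rains.
P_FORECAST_RAINY = {True: 0.7, False: 0.2}

# Utility table U(have_umbrella, rain).
UTILITY = {
    (True, True): -25,
    (True, False): 0,
    (False, True): -100,
    (False, False): 100,
}


def posterior_rain(forecast_rainy):
    """P(rain | forecast) by Bayes' rule."""
    def likelihood(rain):
        p = P_FORECAST_RAINY[rain]
        return p if forecast_rainy else 1 - p

    joint_rain = likelihood(True) * P_RAIN
    joint_dry = likelihood(False) * (1 - P_RAIN)
    return joint_rain / (joint_rain + joint_dry)


def best_decision(forecast_rainy):
    """Return (take?, EU per decision value) given the observed forecast."""
    p_rain = posterior_rain(forecast_rainy)
    scores = {
        # have == take, since P(have|take) = P(~have|~take) = 1.0
        take: p_rain * UTILITY[(take, True)]
        + (1 - p_rain) * UTILITY[(take, False)]
        for take in (True, False)
    }
    return max(scores, key=scores.get), scores


take, scores = best_decision(forecast_rainy=True)
print("take umbrella" if take else "don't take umbrella", scores)
```

A rainy forecast raises P(rain) from 0.4 to about 0.7, which gives EU(take) of about -17.5 against EU(don't take) of about -40, so the sketch recommends taking the umbrella.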