CHAPTER 7 Subjective Probability and Bayesian Inference
7.1. Subjective Probability • Personal evaluation of probability by an individual decision maker • Uncertainty exists for the decision maker: probability is just a way of measuring it • In dealing with uncertainty, a coherent decision maker effectively uses subjective probability
7.2. Assessment of Subjective Probabilities Simplest procedure: • Specify the set of all possible events • Ask the decision maker (DM) to directly estimate the probability of each event • Not a good approach from a psychological point of view • Not easy to conceptualize, especially for a DM not familiar with probability
Standard Device • Physical instrument or conceptual model • Good tool for obtaining subjective probabilities • Example • A box containing 1000 balls • Balls numbered 1 to 1000 • Balls have 2 colors: red, blue
Standard Device example • To estimate a student’s subjective probability of getting an “A” in SE 447, • we ask him to choose between 2 bets: • Bet X: If he gets an A, he wins SR 100; if he doesn’t get an A, he wins nothing • Bet Y: If he picks a red ball, he wins SR 100; if he picks a blue ball, he wins nothing • We start with proportion of red balls P = 50%, then adjust it successively until he is indifferent between the 2 bets
Other standard devices Pie diagram (spinner) Circle divided into 2 sectors: 1 red, 1 blue • Bet Y: If he spins to the red sector, he wins SR 100; if he spins to the blue sector, he wins nothing • Size of the red sector is adjusted until he is indifferent between the 2 bets
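The successive adjustment of the red-ball proportion can be sketched as a bisection search. This is only an illustration: the slide does not say how P is adjusted, and `prefers_event_bet` is a hypothetical stand-in for asking the student which bet he prefers.

```python
def assess_probability(prefers_event_bet, tol=0.01):
    """Narrow down the proportion of red balls at which the two bets are equivalent.

    prefers_event_bet(p) is a hypothetical callback: it returns True when the
    decision maker prefers Bet X (the event) over Bet Y (red ball, proportion p).
    """
    low, high = 0.0, 1.0
    while high - low > tol:
        p = (low + high) / 2          # the first proportion tried is 0.5
        if prefers_event_bet(p):
            low = p                   # event judged more likely than p: add red balls
        else:
            high = p                  # event judged less likely than p: remove red balls
    return (low + high) / 2

# Simulated student whose underlying belief (assumed for the demo) is 0.7
estimate = assess_probability(lambda p: p < 0.7)
print(round(estimate, 2))
```

In a real assessment each call to the callback is a question put to the decision maker, so the tolerance controls how many questions are asked.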
Subjective Probability Bias • Standard device must be easy to perceive, to avoid introducing bias 2 kinds of bias: • Task bias: resulting from assessment method (standard device) • Conceptual bias: resulting from mental procedures (heuristics) used by individuals to process information
Mental Heuristics Causing Bias • Representativeness • If x highly represents set A, high probability is given that x ∈ A • Frequency (proportion) ignored • Sample size ignored • Availability • Limits of memory and imagination • Adjustment & anchoring • Starting from an obvious reference point, then adjusting for new values • Anchoring: adjustment is typically not enough • Overconfidence • Underestimating variance
Fractile Probability Assessment • Quartile Assessment: • Determine 3 values • x1, for which p(x > x1) = 0.5 • x2, for which p(x < x2) = p(x2 < x < x1) • x3, for which p(x > x3) = p(x1 < x < x3)
x       x2     x1     x3
F(x)    0.25   0.5    0.75
Fractile Probability Assessment • Quartile Assessment: 4 intervals • Octile Assessment: 8 intervals • Tertile Assessment: 3 intervals; avoids anchoring at the median
Histogram Probability Assessment • Fix the points x1, x2, …, xm • Ask the decision maker to assess the probabilities p(x1 < x < x2), p(x2 < x < x3), … • Gives a probability distribution (not a cumulative distribution, as the fractile method does)
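The relation between the two methods can be illustrated by accumulating histogram probabilities into the cumulative values that a fractile assessment would give directly. The points and probabilities below are made-up illustration values, not taken from the slides.

```python
from itertools import accumulate

# Hypothetical fixed points x1..x5 and assessed interval probabilities
points = [10, 20, 30, 40, 50]
interval_probs = [0.1, 0.4, 0.3, 0.2]    # p(x_i < x < x_{i+1}), must sum to 1

assert abs(sum(interval_probs) - 1.0) < 1e-9

# Cumulative distribution F(x) at each fixed point, starting from F(x1) = 0
cdf = [0.0] + list(accumulate(interval_probs))

for x, f in zip(points, cdf):
    print(f"F({x}) = {f:.2f}")
```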
Assessment Methods & Bias • No evidence to favor either fractile or histogram methods • One factor that reinforces anchoring bias is self-consistency • Bias can be reduced by “pre-assessment conditioning”: training, for/against arguments • The act of probability assessment causes a re-evaluation of uncertainty
7.3. Impact of New Information (Bayes’ Theorem) • After developing a subjective probability distribution • Assume new information becomes available. Example: new data is collected • According to the coherence principle, the DM must take the new information into consideration, thus the subjective probability must be revised • How? Using Bayes’ theorem
Bayes’ Theorem Example • Suppose your subjective probability distribution for the weather tomorrow is: • chances of being sunny P(S) = 0.6 • chances of being not sunny P(N) = 0.4 • If the TV weather forecast predicts a cloudy day tomorrow, how should you change P(S)? • Assume we are dealing with mutually exclusive and collectively exhaustive events such as sunny or not sunny.
Impact of Information • We assume the weather forecaster predicts either • cloudy day C, or • bright day B. • To change P(S), we use the joint probability = conditional probability * marginal probability • P(C,S) = P(C|S)P(S) P(B,S) = P(B|S)P(S) • P(C,N) = P(C|N)P(N) P(B,N)= P(B|N)P(N)
Impact of Information • To obtain the joint probability mass function we need the conditional probabilities P(C|S) and P(C|N). • These can be obtained from historical data. How? • In the past 100 sunny days, cloudy forecast on 20 days: P(C|S) = 0.2, P(B|S) = 0.8 • In the past 100 not-sunny days, cloudy forecast on 90 days: P(C|N) = 0.9, P(B|N) = 0.1
Joint Probability Calculations • Joint probability P(A,B) = conditional probability (likelihood) P(A|B) * marginal probability P(B) • Cloudy forecast • P(C,S) = P(C|S)P(S) = 0.2(0.6) = 0.12 • P(C,N) = P(C|N)P(N) = 0.9(0.4) = 0.36 • Bright forecast • P(B,S) = P(B|S)P(S) = 0.8(0.6) = 0.48 • P(B,N) = P(B|N)P(N) = 0.1(0.4) = 0.04
Bayes’ Theorem
         S        N        Total
C        P(C,S)   P(C,N)   P(C)
B        P(B,S)   P(B,N)   P(B)
Total    P(S)     P(N)
• P(S|C) = P(C,S) / P(C) = P(C|S)P(S) / P(C) • P(S|C) ∝ P(C|S)P(S)
Joint Probability Table
         S      N      Total
C        0.12   0.36   0.48
B        0.48   0.04   0.52
Total    0.6    0.4    1.0
• P(S|C) = 0.12/0.48 = 0.25, the posterior (conditional) probability • Compare to P(S) = 0.6, the prior (marginal) probability • P(S) decreased because of the C forecast
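The joint table and the posterior P(S|C) can be reproduced in a few lines. This is a minimal sketch using the numbers of the weather example; the dictionary layout and variable names are illustrative choices.

```python
# Prior P(weather) and likelihood P(forecast | weather), values from the example
prior = {"S": 0.6, "N": 0.4}
likelihood = {
    "S": {"C": 0.2, "B": 0.8},
    "N": {"C": 0.9, "B": 0.1},
}

# Joint P(forecast, weather) = P(forecast | weather) * P(weather)
joint = {(f, w): likelihood[w][f] * prior[w] for w in prior for f in ("C", "B")}

# Marginal P(forecast): sum across each row of the table
marginal = {f: sum(joint[(f, w)] for w in prior) for f in ("C", "B")}

# Posterior P(S | C) = P(C, S) / P(C)
p_sunny_given_cloudy = joint[("C", "S")] / marginal["C"]
print(round(p_sunny_given_cloudy, 2))
```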
Prior and Posterior Probabilities • Prior means before. Prior probability is the probability P(S) before the information was heard. • Posterior means after. It is probability obtained after incorporating the new forecast information. It is P(S|C). It is obtained using Bayes’ theorem.
Example with 3 states 3 demand possibilities for new product • High P(H) = 0.6 • Medium P(M) = 0.1 • Low P(L) = 0.3 Market research gives Average result: • 30% of time if true demand is High • 50% of time if true demand is Medium • 90% of time if true demand is Low
Example: Probability Table for Average result
State   Prior   Likelihood   Joint    Posterior
S       P(S)    P(A|S)       P(S,A)   P(S|A)
H       0.6     0.3          0.18     0.36
M       0.1     0.5          0.05     0.10
L       0.3     0.9          0.27     0.54
Total   1.0                  0.50     1.00
• If market research gives Average result: P(H) decreases (0.6 to 0.36), P(L) increases (0.3 to 0.54), P(M) is unchanged (0.1)
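The prior, likelihood, joint, posterior columns follow the same recipe for any number of states. A small helper, sketched here with the hypothetical name `bayes_update`, makes that explicit for the three-state demand example.

```python
def bayes_update(prior, likelihood):
    """Return posterior P(state | data) given prior P(state) and likelihood P(data | state)."""
    joint = {s: prior[s] * likelihood[s] for s in prior}   # Joint column
    total = sum(joint.values())                            # marginal probability of the data
    return {s: joint[s] / total for s in joint}            # Posterior column

prior = {"H": 0.6, "M": 0.1, "L": 0.3}
p_average = {"H": 0.3, "M": 0.5, "L": 0.9}   # P(Average result | demand), from the example

posterior = bayes_update(prior, p_average)
print({s: round(p, 2) for s, p in posterior.items()})
```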
Ex: Sequential Bayesian Analysis • An oil company has 3 drilling sites: X, Y, Z 3 possible reserve states: • No reserves P(N) = 0.5 • Small reserves P(S) = 0.3 • Large reserves P(L) = 0.2 3 possible drilling outcomes: • Dry D • Wet W • Gushing G
Ex: Sequential Bayesian Analysis If reserves are: • None (N): all wells will be dry (D) • Large (L): all wells will be gushing (G) • Small (S): some dry (D) and some wet (W) wells, with P(1D/1) = 0.8, P(2D/2) = 0.4, P(3D/3) = 0, where P(kD/k) is the probability that k wells out of k drilled are dry • All sites are equally favorable • Assume order of drilling is: X, Y, Z • Notation: DX = dry well at site X
Ex: Probability Table for Site X
State   Prior   Conditional                Joint
K       P(K)    P(D|K)   P(W|K)   P(G|K)   DX     WX     GX
N       0.5     1        0        0        0.5    0      0
S       0.3     0.8      0.2      0        0.24   0.06   0
L       0.2     0        0        1        0      0      0.2
Total   1.0                                0.74   0.06   0.2
• Stop exploratory drilling at X if you get: W: Reserves are S, or G: Reserves are L • If you get D, drill at Y (Res. N: P = 0.5/0.74 = 0.68, S: P = 0.24/0.74 = 0.32)
Ex: Probability Table for Site Y
State   Prior   Conditional       Joint
K       P(K)    P(D|K)   P(W|K)   DY     WY
N       0.68    1        0        0.68   0
S       0.32    0.5      0.5      0.16   0.16
Total   1.0                       0.84   0.16
• P(D|S) = 0.5 (= 0.4/0.8) • Stop exploratory drilling at Y if you get W: Reserves are S • If you get D, drill at Z (Res. N: P = 0.68/0.84 = 0.81, S: P = 0.16/0.84 = 0.19)
Ex: Probability Table for Site Z
State   Prior   Conditional       Joint
K       P(K)    P(D|K)   P(W|K)   DZ     WZ
N       0.81    1        0        0.81   0
S       0.19    0        1        0      0.19
Total   1.0                       0.81   0.19
• If you get W: Reserves are S • If you get D: Reserves are N
Ex: Change in P(N) Prior Probability • Before drilling P(N) = 0.5 Posterior Probabilities • After 1 Dry P(N|DX) = 0.68 • After 2 Dry P(N|DX, DY) = 0.81 • After 3 Dry P(N|DX, DY, DZ) = 1
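The sequence of posteriors for P(N) can be reproduced by feeding each posterior back in as the next prior. This sketch uses the conditional probabilities of a dry well from the three site tables; it carries full precision between steps instead of rounding to two decimals, which here gives the same rounded values.

```python
# Prior over reserve states before any drilling
belief = {"N": 0.5, "S": 0.3, "L": 0.2}

# P(Dry | state) at each site, in drilling order, taken from the site tables
p_dry_by_site = [
    ("X", {"N": 1.0, "S": 0.8, "L": 0.0}),
    ("Y", {"N": 1.0, "S": 0.5, "L": 0.0}),
    ("Z", {"N": 1.0, "S": 0.0, "L": 0.0}),
]

history = []
for site, p_dry in p_dry_by_site:
    joint = {k: belief[k] * p_dry[k] for k in belief}    # P(state, Dry at this site)
    total = sum(joint.values())                          # P(Dry at this site)
    belief = {k: joint[k] / total for k in joint}        # posterior becomes next prior
    history.append((site, round(belief["N"], 2)))

print(history)
```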
7.4. Conditional Independence Two events, A and B, are independent iff: • P(A, B) = P(A)*P(B) Implying • P(A|B) = P(A) • P(B|A) = P(B) • Posterior probability is the same as the prior • New information about one event does not affect the probability of the other
Conditional Independence Two events, A and B, are conditionally independent given a 3rd event, C, iff: • P(A, B|C) = P(A|C)*P(B|C) • In general they are not independent unconditionally: P(A, B) ≠ P(A)*P(B) • Useful property in Bayesian analysis
Ex: Horse Race Probability Horse named WR will race at 3:00 p.m. Probability of WR winning P(WR) depends on track condition • Firm (F) P(F) = 0.3 P(WR|F) = 0.9 • Soft (S) P(S) = 0.7 P(WR|S) = 0.2
Ex: Horse Race Probability • Given the results of 2 previous races • At 1:30, horse named MW won the race: P(MW|F) = 0.8, P(MW|S) = 0.4 • At 2:00, horse named AJ won the race: P(AJ|F) = 0.9, P(AJ|S) = 0.5 • What is the new WR win probability P(WR|MW, AJ)?
Ex: Horse Race Probability • The probability that WR wins, given that MW and AJ have won, is the sum over both possible track conditions, F or S: • P(WR|MW, AJ) = P(WR|F) * P(F|MW, AJ) + P(WR|S) * P(S|MW, AJ)
Ex: Horse Race Probability • The 2 events MW and AJ are conditionally independent with respect to a 3rd event: track condition F or S • Recall: P(A|B) = P(B|A)P(A)/P(B), so P(A|B) ∝ P(B|A)P(A) • P(F|MW, AJ) ∝ P(MW, AJ|F) P(F) = P(MW|F) P(AJ|F) P(F) • P(S|MW, AJ) ∝ P(MW, AJ|S) P(S) = P(MW|S) P(AJ|S) P(S)
Ex: Horse Race Probability • P(F|MW, AJ) ∝ P(MW|F) P(AJ|F) P(F) = 0.8 * 0.9 * 0.3 = 0.216 • P(S|MW, AJ) ∝ P(MW|S) P(AJ|S) P(S) = 0.4 * 0.5 * 0.7 = 0.14 • Normalizing: P(F|MW, AJ) = 0.216/(0.216 + 0.14) = 0.61, P(S|MW, AJ) = 0.14/(0.216 + 0.14) = 0.39
Ex: Horse Race Probability • Substituting into • P(WR|MW, AJ) = P(WR|F) * P(F|MW, AJ) + P(WR|S) * P(S|MW, AJ) = 0.9(0.61) + 0.2(0.39) = 0.63
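The whole horse race calculation fits in a short script. This is a sketch with illustrative variable names; it keeps full precision throughout, so it gives about 0.625, whereas the slide's 0.63 comes from rounding the track posteriors to 0.61 and 0.39 first.

```python
p_track = {"F": 0.3, "S": 0.7}     # prior on track condition
p_mw = {"F": 0.8, "S": 0.4}        # P(MW wins | track)
p_aj = {"F": 0.9, "S": 0.5}        # P(AJ wins | track)
p_wr = {"F": 0.9, "S": 0.2}        # P(WR wins | track)

# Conditional independence: P(MW, AJ | track) = P(MW | track) * P(AJ | track)
unnormalized = {t: p_mw[t] * p_aj[t] * p_track[t] for t in p_track}
total = sum(unnormalized.values())
p_track_post = {t: v / total for t, v in unnormalized.items()}

# P(WR | MW, AJ) = sum over track conditions of P(WR | track) * P(track | MW, AJ)
p_wr_post = sum(p_wr[t] * p_track_post[t] for t in p_track)

print({t: round(p, 2) for t, p in p_track_post.items()})
print(round(p_wr_post, 3))
```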
7.5. Bayesian Updating with Functional Likelihoods • Posterior probability P(A|B) = P(A,B)/P(B) = P(B|A)P(A)/P(B) • Conditional probability (likelihood) P(B|A) can be described by a particular probability distribution: • (1) Binomial • (2) Normal
Binomial likelihood distribution • Dichotomous (2-value) data: defective, not defective • Sequence of dichotomous outcomes: series of quality tests • In each test, constant probability (p) of one of the 2 outcomes: defective • Outcome of each test is independent of the others • Total number (r) of one kind of outcome (defective) out of (n) tests follows the binomial distribution • f(r|n, p) = [n!/(r!(n – r)!)] p^r (1 – p)^(n – r) • Given in tables
Binomial likelihood example Demand for new product can be: • High (H) P(H) = 0.2 • Medium (M) P(M) = 0.3 • Low (L) P(L) = 0.5 For each case, the probability (p) that an individual customer buys the product is • H: p = 0.25 • M: p = 0.1 • L: p = 0.05 In a random sample of 5 customers, 1 buys
Binomial likelihood example For (n = 5, r = 1), likelihoods are obtained from the table, or calculated by: • H: [5!/(4!*1!)] 0.25 (0.75)^4 = 0.3955 • M: [5!/(4!*1!)] 0.1 (0.9)^4 = 0.3281 • L: [5!/(4!*1!)] 0.05 (0.95)^4 = 0.2036 The joint probability table can now be constructed
Binomial Likelihood Example
State      Prior   Likelihood   Joint      Posterior
p          P(p)    P(1|5,p)     P(p,1/5)   P(p|1/5)
H: 0.25    0.2     0.3955       0.0791     0.2832
M: 0.1     0.3     0.3281       0.0984     0.3523
L: 0.05    0.5     0.2036       0.1018     0.3645
Total      1.0                  0.2793     1.00
• If 1 in a sample of 5 customers buys: P(H) increases (0.2 to 0.28), P(M) increases (0.3 to 0.35), P(L) decreases (0.5 to 0.36)
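Instead of reading the likelihoods from a table, they can be computed from the binomial formula. The sketch below uses Python's standard `math.comb` for the binomial coefficient and reproduces the Likelihood and Posterior columns; `binomial_pmf` is an illustrative helper name.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of r successes in n independent trials with success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

prior = {"H": 0.2, "M": 0.3, "L": 0.5}
buy_prob = {"H": 0.25, "M": 0.1, "L": 0.05}   # p for each demand state
n, r = 5, 1                                   # 1 buyer in a sample of 5

likelihood = {s: binomial_pmf(r, n, buy_prob[s]) for s in prior}
joint = {s: prior[s] * likelihood[s] for s in prior}
total = sum(joint.values())
posterior = {s: joint[s] / total for s in prior}

for s in prior:
    print(s, round(likelihood[s], 4), round(joint[s], 4), round(posterior[s], 4))
```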
Normal likelihood distribution • Most common, symmetric • Continuous data, can approximate discrete • f(y|μ, σ) = [1/(σ√(2π))] exp[–(y – μ)²/(2σ²)] • Given in tables. Formula usually not used. • Two parameters: mean (μ) and standard deviation (σ).
Normal likelihood Updating • Mean (μ) and standard deviation (σ) can be updated individually (assuming one is known) or together • P(μ|y, σ) ∝ f(y|μ, σ) P(μ|σ) • P(σ|y, μ) ∝ f(y|μ, σ) P(σ|μ) • P(μ, σ|y) ∝ f(y|μ, σ) P(μ, σ)
Normal likelihood example Updating Mean (μ) Average weight setting has 2 possibilities: • High (H) P(H) = 0.5, μ = 8.2, σ = 0.1 • Low (L) P(L) = 0.5, μ = 7.9, σ = 0.1 A sample of 1 bottle has weight = 8.0 oz. What is the posterior probability of H and L?
Normal likelihood example • Likelihood values are obtained from the table, or calculated by • If μ = 8.2: Z = (8.0 – 8.2)/0.1 = –2, f(8|μ, σ) = 0.054 • If μ = 7.9: Z = (8.0 – 7.9)/0.1 = 1, f(8|μ, σ) = 0.242 • These are standard normal ordinates at Z; the common factor 1/σ cancels when normalizing
Normal Likelihood Example
State     Prior   Likelihood   Joint        Posterior
μ         P(μ)    P(8|μ, σ)    P(8, μ, σ)   P(μ|8)
H: 8.2    0.5     0.054        0.027        0.18
L: 7.9    0.5     0.242        0.121        0.82
Total     1.0                  0.148        1.00
• After a sample of 1 bottle with weight = 8: P(H) decreases (0.5 to 0.18), P(L) increases (0.5 to 0.82)
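The same update can be checked with Python's standard `statistics.NormalDist`. Note that its `pdf` returns the full density, which includes the 1/σ factor, so the likelihood values are 10 times the table ordinates 0.054 and 0.242; because σ is the same for both settings, the posterior is unaffected.

```python
from statistics import NormalDist

prior = {"H": 0.5, "L": 0.5}
mean = {"H": 8.2, "L": 7.9}     # candidate mean weight settings (oz)
sigma = 0.1                     # known standard deviation
observed = 8.0                  # weight of the sampled bottle

# Likelihood of the observation under each candidate mean
likelihood = {s: NormalDist(mu=mean[s], sigma=sigma).pdf(observed) for s in prior}

joint = {s: prior[s] * likelihood[s] for s in prior}
total = sum(joint.values())
posterior = {s: joint[s] / total for s in prior}

print({s: round(p, 2) for s, p in posterior.items()})
```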