
Lecture Eleven


serena

Presentation Transcript


  1. Lecture Eleven Probability Models

  2. Outline • Bayesian Probability • Duration Models

  3. Bayesian Probability • Facts • Incidence of the disease in the population is one in a thousand • The probability of testing positive if you have the disease is 99 out of 100 • The probability of testing positive if you do not have the disease is 2 out of 100

  4. Joint and Marginal Probabilities

  5. Filling In Our Facts

  6. Using Conditional Probability • Pr(+ ∩ H) = Pr(+/H)*Pr(H) = 0.02*0.999 = 0.01998 • Pr(+ ∩ S) = Pr(+/S)*Pr(S) = 0.99*0.001 = 0.00099

  7. Filling In Our Facts

  8. By Sum and By Difference

  9. False Positive Paradox • Probability of Being Sick If You Test + • Pr(S/+) = ? • From Conditional Probability: • Pr(S/+) = Pr(S ∩ +)/Pr(+) = 0.00099/0.02097 • Pr(S/+) = 0.0472

  10. Bayesian Probability By Formula • Pr(S/+) = Pr(S ∩ +)/Pr(+) = Pr(+/S)*Pr(S)/Pr(+) • Where Pr(+) = Pr(+/S)*Pr(S) + Pr(+/H)*Pr(H) • And using our facts: Pr(S/+) = 0.99*0.001/[0.99*0.001 + 0.02*0.999] • Pr(S/+) = 0.00099/[0.00099 + 0.01998] • Pr(S/+) = 0.00099/0.02097 = 0.0472
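
The formula on this slide can be checked numerically. The following is a minimal Python sketch, not part of the original lecture; the function and argument names are illustrative assumptions, and the three inputs are the facts from slide 3.

```python
def posterior_sick_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return Pr(S/+) from Bayes' formula.

    prevalence          = Pr(S)
    sensitivity         = Pr(+/S)
    false_positive_rate = Pr(+/H)
    """
    p_sick = prevalence
    p_healthy = 1.0 - prevalence
    joint_sick_pos = sensitivity * p_sick                 # Pr(+ and S)
    joint_healthy_pos = false_positive_rate * p_healthy   # Pr(+ and H)
    p_positive = joint_sick_pos + joint_healthy_pos       # marginal Pr(+)
    return joint_sick_pos / p_positive


# Facts from the lecture: 1 in 1000 incidence, 99% detection, 2% false positives.
p = posterior_sick_given_positive(0.001, 0.99, 0.02)
print(round(p, 4))  # 0.0472
```

Even with a 99% detection rate, fewer than 5% of positive tests come from sick people, because healthy people outnumber sick people 999 to 1.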

  11. Duration Models • Exploratory (Graphical) Estimates • Kaplan-Meier • Functional Form Estimates • Exponential Distribution

  12. Duration of Post-War Economic Expansions in Months

  13. Estimated Survivor Function for Ten Post-War Expansions

  14. Kaplan-Meier Estimate of Survivor Function • Survivor Function = (# at risk - # ending)/# at risk
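
A small Python sketch of this calculation follows. It assumes every spell is observed to its end (no censoring) and applies the slide's ratio at each ending time, multiplying the ratios to get the running survivor function. The durations in the list are hypothetical placeholders, not the post-war expansion data from slide 12, which the transcript does not reproduce.

```python
from collections import Counter


def kaplan_meier(durations):
    """Return [(t, S(t))] for completed (uncensored) durations."""
    ends = Counter(durations)      # number of spells ending at each time
    at_risk = len(durations)       # spells still ongoing just before t
    survivor = 1.0
    curve = []
    for t in sorted(ends):
        ending = ends[t]
        # The slide's ratio, applied cumulatively.
        survivor *= (at_risk - ending) / at_risk
        curve.append((t, survivor))
        at_risk -= ending
    return curve


# HYPOTHETICAL durations in months, for illustration only.
hypothetical_months = [12, 24, 24, 36, 45, 58, 73, 92, 106, 120]
curve = kaplan_meier(hypothetical_months)
for t, s in curve:
    print(t, round(s, 3))
```

With no censoring, the product of ratios reduces to the share of spells still ongoing after each ending time.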

  15. Exponential Distribution • Density: f(t) = λ exp[-λt], 0 ≤ t • Cumulative Distribution Function F(t) • F(t) = ∫_0^t λ exp[-λu] du • F(t) = -exp[-λu], evaluated from u = 0 to u = t • F(t) = -1 {exp[-λt] - exp[0]} • F(t) = 1 - exp[-λt] • Survivor Function, S(t) = 1 - F(t) = exp[-λt] • Taking logarithms, ln S(t) = -λt

  16. So λ = 0.022
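
The transcript does not show how this value was obtained. One plausible route, given that ln S(t) is linear in t with slope equal to minus the rate, is to regress ln S(t) on t through the origin. The Python sketch below illustrates that idea under this assumption; the function name is hypothetical, and the survivor values are synthetic, generated from an exponential with rate 0.022, not taken from the lecture's data.

```python
import math


def fit_rate(times, survivor_values):
    """Least-squares slope through the origin of ln S(t) on t; returns the rate.

    Points with S(t) = 0 must be dropped beforehand because ln 0 is undefined.
    """
    numerator = sum(t * math.log(s) for t, s in zip(times, survivor_values))
    denominator = sum(t * t for t in times)
    return -numerator / denominator


# SYNTHETIC survivor values generated from a rate of 0.022 (illustration only).
times = [12, 24, 36, 48, 60]
survivor_values = [math.exp(-0.022 * t) for t in times]
rate = fit_rate(times, survivor_values)
print(round(rate, 3))  # 0.022
```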

  17. Exponential Distribution (Cont.) • Mean = 1/λ = ∫_0^∞ t f(t) dt • Memoryless feature: • Duration conditional on surviving until t = τ: • DURC(τ) = [∫_τ^∞ t f(t) dt]/S(τ) = τ + 1/λ • Expected remaining duration = duration conditional on surviving until time τ, i.e. DURC(τ), minus τ • Or 1/λ, which is equal to the overall mean, so the distribution is memoryless
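
The memoryless property can be checked numerically. The Python sketch below approximates the expected duration conditional on surviving to a given time with a midpoint-rule integral and compares it with that time plus the unconditional mean. The function name, the integration cutoff, and the step count are illustrative assumptions; the rate 0.022 is the value from slide 16, and the survival time of 24 months is an arbitrary example.

```python
import math


def conditional_mean_duration(rate, survived_to, upper=2000.0, steps=200_000):
    """Approximate E[T | T > survived_to] for an exponential duration.

    Integrates t * f(t) from survived_to to `upper` by the midpoint rule
    and divides by the survivor function S(survived_to).
    """
    width = (upper - survived_to) / steps
    total = 0.0
    for i in range(steps):
        t = survived_to + (i + 0.5) * width
        total += t * rate * math.exp(-rate * t) * width
    return total / math.exp(-rate * survived_to)


rate = 0.022
survived_to = 24.0  # arbitrary example, in months
durc = conditional_mean_duration(rate, survived_to)
remaining = durc - survived_to
print(round(remaining, 2), round(1.0 / rate, 2))
```

The two printed numbers should agree: the expected remaining duration does not depend on how long the spell has already lasted.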

  18. Exponential Distribution (Cont.) • Hazard rate or function, h(t), is the probability of failure conditional on survival until that time, and is the ratio of the density function to the survivor function. It is a constant for the exponential. • h(t) = f(t)/S(t) = λ exp[-λt]/exp[-λt] = λ

  19. Model Building • Reference: Ch 20

  20. 20.2 Polynomial Models • There are models where the independent variables (xi) may appear as functions of a smaller number of predictor variables. • Polynomial models are one such example.

  21. Polynomial Models with One Predictor Variable • General model: y = b0 + b1x1 + b2x2 + … + bpxp + e • Polynomial model: y = b0 + b1x + b2x^2 + … + bpx^p + e

  22. Polynomial Models with One Predictor Variable • First order model (p = 1) • y = b0 + b1x + e • Second order model (p = 2) • y = b0 + b1x + b2x^2 + e • [Figure: second order curves for b2 < 0 and b2 > 0]

  23. Polynomial Models with One Predictor Variable • Third order model (p = 3) • y = b0 + b1x + b2x^2 + b3x^3 + e • [Figure: third order curves for b3 < 0 and b3 > 0]
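
A polynomial model is still linear in its coefficients: each power of x acts as a separate regressor. The Python sketch below builds those regressors for one observation and evaluates the fitted polynomial. The function names and the coefficient values are hypothetical, chosen only to illustrate the structure.

```python
def polynomial_row(x, p):
    """Return [1, x, x**2, ..., x**p], the regressors for one observation."""
    return [x ** k for k in range(p + 1)]


def predict(coefficients, x):
    """Evaluate b0 + b1*x + ... + bp*x**p for coefficients [b0, b1, ..., bp]."""
    row = polynomial_row(x, len(coefficients) - 1)
    return sum(b * term for b, term in zip(coefficients, row))


# HYPOTHETICAL second order model with b2 < 0 (curve opens downward).
coefficients = [1.0, 0.5, -0.25]
print(polynomial_row(2, 3))        # [1, 2, 4, 8]
print(predict(coefficients, 2.0))  # 1.0
```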

  24. Polynomial Models with Two Predictor Variables • First order model • y = b0 + b1x1 + b2x2 + e • [Figure: planes over the (x1, x2) axes for b1 > 0, b1 < 0, b2 > 0, and b2 < 0]

  25. 20.3 Nominal Independent Variables • In many real-life situations one or more independent variables are nominal. • Including nominal variables in a regression analysis model is done via indicator variables. • An indicator variable (I) can assume one of two values, “zero” or “one”. Examples: • I = 1 if the temperature was below 50°; 0 if the temperature was 50° or more • I = 1 if a first condition out of two is met; 0 if a second condition out of two is met • I = 1 if data were collected before 1980; 0 if data were collected after 1980 • I = 1 if a degree earned is in Finance; 0 if a degree earned is not in Finance

  26. Nominal Independent Variables; Example: Auction Car Price (II) • Example 18.2 - revised (Xm18-02a) • Recall: A car dealer wants to predict the auction price of a car. • The dealer believes now that odometer reading and the car color are variables that affect a car’s price. • Three color categories are considered: • White • Silver • Other colors • Note: Color is a nominal variable.

  27. Nominal Independent Variables; Example: Auction Car Price (II) • Example 18.2 - revised (Xm18-02b) • I1 = 1 if the color is white; 0 if the color is not white • I2 = 1 if the color is silver; 0 if the color is not silver • The category “Other colors” is defined by: I1 = 0; I2 = 0

  28. How Many Indicator Variables? • Note: To represent the situation of three possible colors we need only two indicator variables. • Conclusion: To represent a nominal variable with m possible categories, we must create m-1 indicator variables.
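
The m-1 rule can be sketched in Python as follows. The function name and the choice of baseline argument are illustrative assumptions; the three color categories come from the auction car example, with "other" as the baseline category whose indicators are all zero.

```python
def indicator_variables(value, categories, baseline):
    """Encode a nominal value as m-1 indicators; the baseline maps to all zeros."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories if c != baseline]


COLORS = ["white", "silver", "other"]  # m = 3 categories

for color in COLORS:
    # Prints [I1, I2] for each color.
    print(color, indicator_variables(color, COLORS, baseline="other"))
```

A third indicator for "other" would add no information, since it is always equal to 1 - I1 - I2.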

  29. Nominal Independent Variables; Example: Auction Car Price • Solution • the proposed model is y = b0 + b1(Odometer) + b2I1 + b3I2 + e • The data White car Other color Silver color

  30. Example: Auction Car Price The Regression Equation • From Excel (Xm18-02b) we get the regression equation PRICE = 16701 - .0555(Odometer) + 90.48(I-1) + 295.48(I-2) • The equation for a white color car: Price = 16701 - .0555(Odometer) + 90.48(1) + 295.48(0) = 16791.48 - .0555(Odometer) • The equation for a silver color car: Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(1) = 16996.48 - .0555(Odometer) • The equation for an “other color” car: Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(0) = 16701 - .0555(Odometer) • [Figure: the three parallel lines of Price against Odometer]

  31. Example: Auction Car Price The Regression Equation • From Excel we get the regression equation PRICE = 16701 - .0555(Odometer) + 90.48(I-1) + 295.48(I-2) • For one additional mile the auction price decreases by 5.55 cents. • A white car sells, on average, for $90.48 more than a car of the “Other color” category. • A silver color car sells, on average, for $295.48 more than a car of the “Other color” category.
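
The interpretation of the coefficients can be verified with a short Python sketch that evaluates the regression equation from this slide. The coefficients are the ones reported in the lecture; the function name and the 40,000-mile odometer reading are illustrative assumptions.

```python
# Coefficients from the regression equation on the slide.
B0 = 16701.0
B_ODOMETER = -0.0555
B_WHITE = 90.48    # coefficient on I-1
B_SILVER = 295.48  # coefficient on I-2


def predicted_price(odometer, color):
    """Predicted auction price in dollars for a given odometer reading and color."""
    i1 = 1 if color == "white" else 0
    i2 = 1 if color == "silver" else 0
    return B0 + B_ODOMETER * odometer + B_WHITE * i1 + B_SILVER * i2


miles = 40_000  # HYPOTHETICAL odometer reading
for color in ("white", "silver", "other"):
    print(color, round(predicted_price(miles, color), 2))
```

At any fixed odometer reading, the gap between a silver car and an "other color" car is the silver coefficient, which is why the three fitted lines are parallel.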

  32. Example: Auction Car Price The Regression Equation (Xm18-02b) • There is insufficient evidence to infer that a white color car and a car of “other color” sell for a different auction price. • There is sufficient evidence to infer that a silver color car sells for a larger price than a car of the “other color” category.

  33. Nominal Independent Variables; Example: MBA Program Admission (MBA II) • Recall: The Dean wanted to evaluate applications for the MBA program by predicting future performance of the applicants. • The following three predictors were suggested: • Undergraduate GPA • GMAT score • Years of work experience • It is now believed that the type of undergraduate degree should be included in the model. Note: The undergraduate degree is nominal data.

  34. Nominal Independent Variables; Example: MBA Program Admission (II) • I1 = 1 if B.A.; 0 otherwise • I2 = 1 if B.B.A.; 0 otherwise • I3 = 1 if B.Sc. or B.Eng.; 0 otherwise • The category “Other group” is defined by: I1 = 0; I2 = 0; I3 = 0

  35. Nominal Independent Variables; Example: MBA Program Admission (II) MBA-II

  36. 20.4 Applications in Human Resources Management: Pay-Equity • Pay-equity can be handled in two different forms: • Equal pay for equal work • Equal pay for work of equal value. • Regression analysis is extensively employed in cases of equal pay for equal work.

  37. Human Resources Management: Pay-Equity • Solution • Construct the following multiple regression model: y = b0 + b1Education + b2Experience + b3Gender + e • Note the nature of the variables: • Education – Interval • Experience – Interval • Gender – Nominal (Gender = 1 if male; = 0 otherwise).

  38. Human Resources Management: Pay-Equity • Solution – Continued (Xm20-03) • Analysis and Interpretation • The model fits the data quite well. • The model is very useful. • Experience is a variable strongly related to salary. • There is no evidence of sex discrimination.

  39. Human Resources Management: Pay-Equity • Solution – Continued (Xm20-03) • Analysis and Interpretation • Further studying the data we find: • Average experience (years) for women is 12 • Average experience (years) for men is 17 • Average salary for female managers is $76,189 • Average salary for male managers is $97,832

  40. Midterm Grade Distribution • A: 68 and above, 7 students • A-: 65-67, 7 students • B+: 61-64, 9 students • B: 59 and below, 7 students • Total: 30 students

  41. Midterm Grade Distribution: Normal Distribution • If you scored above the median: A- or A • Otherwise: B or B+
