Power 16
Review • Post-Midterm • Cumulative
Logistics • Put the PowerPoint slide show on a high-density floppy disk, or e-mail it as an attachment, for a WINTEL machine. • E-mail the slide show to Llad@econ.ucsb.edu as a PowerPoint attachment
Assignments • 1. Project choice • 2. Data Retrieval • 3. Statistical Analysis • 4. PowerPoint Presentation • 5. Executive Summary • 6. Technical Appendix • 7. Graphics Power_13
PowerPoint Presentations: Member 4 • 1. Introduction: Members 1, 2, 3 • What • Why • How • 2. Executive Summary: Member 5 • 3. Exploratory Data Analysis: Member 3 • 4. Descriptive Statistics: Member 3 • 5. Statistical Analysis: Member 3 • 6. Conclusions: Members 3 & 5 • 7. Technical Appendix: Table of Contents, Member 6
Technical Appendix • Table of Contents • Spreadsheet of data used and sources or, if extensive, a subsample of the data • Descriptive statistics and histograms for the variables in the study • If time series data, a plot of each variable against time • If relevant, a plot of the dependent variable vs. each of the explanatory variables
Technical Appendix (Cont.) • Statistical Results, for example regression • Plot of the actual, fitted and error and other diagnostics • Brief summary of the conclusions, meanings drawn from the exploratory, descriptive, and statistical analysis.
Post-Midterm Review • Project I: Power 16 • Contingency Table Analysis: Power 14, Lab 8 • ANOVA: Power 15, Lab 9 • Survival Analysis: Power 12, Power 11, Lab 7 • Multivariate Regression: Power 11, Lab 6
Slide Show • Challenger disaster
Project I • Number of O-Rings Failing On Launch i: y_i(#) = a + b*temp_i + e_i • OLS is biased because of the zeros, even if the equation is divided by 6 • Two Ways to Proceed • Tobit, non-linear estimation: y_i(#) = a + b*temp_i + e_i • Bernoulli variable: probability models • Probability Models: y_i(0,1) = a + b*temp_i + e_i
Project I (Cont.) • Probability Models: y_i(0,1) = a + b*temp_i + e_i • OLS, Linear Probability Model, linear approximation to the sigmoid • Probit, non-linear estimate of the sigmoid • Logit, non-linear estimate of the sigmoid • Significant Dependence on Temperature • t-test (or z-test) on slope, H0: b = 0 • F-test • Wald test
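For reference, the three probability models named above can be written side by side. These are the standard textbook forms, not formulas copied from the slides; a and b are the intercept and slope used on the slides, and \Phi is assumed to denote the standard normal CDF.

```latex
\begin{align*}
\text{Linear probability (OLS):}\quad
  & P(y_i = 1 \mid temp_i) = a + b\,temp_i \\
\text{Probit:}\quad
  & P(y_i = 1 \mid temp_i) = \Phi(a + b\,temp_i) \\
\text{Logit:}\quad
  & P(y_i = 1 \mid temp_i) = \frac{e^{a + b\,temp_i}}{1 + e^{a + b\,temp_i}}
\end{align*}
```

The linear form can give fitted probabilities below 0 or above 1 when extrapolated, which is one reason the sigmoid-shaped probit and logit are estimated as well.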
Project I (Cont.) • Plots of Number or Probability vs. Temp. • Label the axes • Answer all parts, a-f • The most frequent sins • Did not explicitly address significance • Did not answer b, 66°: all launches at lower temperatures had one or more o-ring failures • Did not execute c, estimate linear probability model
Challenger Disaster • Failure of O-rings that sealed grooves on the booster rockets • Was there any relationship between o-ring failure and temperature? • Engineers knew that the rubber o-rings hardened and were less flexible at low temperatures • But was there launch data that showed a problem?
Challenger Disaster • What: Was there a relationship between launch temperature and o-ring failure prior to the Challenger disaster? • Why: Should the launch have proceeded? • How: Analyze the relationship between launch temperature and o-ring failure
Launches Before Challenger • Data • number of o-rings that failed • launch temperature
Exploratory Analysis • Launches where there was a problem
Orings   temperature
1        58
1        57
1        70
1        63
1        70
2        75
3        53
Exploratory Analysis • All Launches • Plot of failures per observation versus temperature range shows temperature dependence • Mean temperature for the 7 launches with o-ring failures was lower, 63.7, than for the 17 launches without o-ring failures, 72.6 • Contingency table analysis
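The 63.7 figure can be checked directly from the table of launches with o-ring problems. The sketch below uses only those seven rows; the temperatures of the 17 launches without failures are not listed in these slides, so the 72.6 mean is not recomputed. Variable names are illustrative.

```python
# (o-rings failed, launch temperature in F) for the 7 launches with a problem,
# copied from the table on the "Exploratory Analysis" slide.
failure_launches = [(1, 58), (1, 57), (1, 70), (1, 63), (1, 70), (2, 75), (3, 53)]

temps = [temp for _, temp in failure_launches]
mean_temp = sum(temps) / len(temps)

# Launches with failures that were colder than 66 F (the cutoff in part b).
cold_failures = [temp for temp in temps if temp < 66]

print(f"mean temperature of failure launches: {mean_temp:.1f}")
print(f"failure launches below 66 F: {len(cold_failures)}")
```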
Launches and O-Ring Failures • Chi-square (2 degrees of freedom) = 9.08; critical value (α = 0.05) = 5.99 (≈ 6)
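The comparison of 9.08 with the critical value can be reproduced without tables. This is a minimal sketch that relies on one fact: with 2 degrees of freedom the chi-square distribution is exponential with mean 2, so its upper-tail probability is exp(-x/2). The function names are illustrative, and the shortcut does not apply to other degrees of freedom.

```python
import math

CHI2_STAT = 9.08  # statistic reported on the slide, 2 degrees of freedom
ALPHA = 0.05


def chi2_upper_tail_2dof(x):
    """P(chi-square with 2 dof > x); valid only for 2 degrees of freedom."""
    return math.exp(-x / 2)


def chi2_critical_2dof(alpha):
    """Value exceeded with probability alpha, again only for 2 dof."""
    return -2 * math.log(alpha)


critical = chi2_critical_2dof(ALPHA)
p_value = chi2_upper_tail_2dof(CHI2_STAT)

print(f"critical value at alpha={ALPHA}: {critical:.2f}")
print(f"p-value for {CHI2_STAT}: {p_value:.4f}")
print("reject independence" if CHI2_STAT > critical else "do not reject independence")
```

Since 9.08 exceeds the critical value, the hypothesis that o-ring failure is independent of temperature range is rejected at the 5% level.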
[Figures: logit extrapolated to 31 F; probit extrapolated to 31 F]
Conclusions • From extrapolating the probability models to 31 F, Linear Probability, Probit, or Logit, there was a high probability of one or more o-rings failing • From extrapolating the Number of O-rings failing to 31 F, OLS or Tobit, 3 or more o-rings would fail. • There had been only one launch out of 24 where as many as 3 o-rings had failed. • Decision theory argument: expected cost/benefit ratio:
Ways to Analyze Challenger • Difference in mean temperatures for failures and successes • Difference in probability of one or more o-ring failures for high and low temperature ranges • Probability models: LPM (OLS), probit, logit • Number of o-ring failures per launch vs. temp.: OLS, Tobit • Contingency table analysis • ANOVA
Contingency Table Analysis • Challenger example
ANOVA and O-Rings • Probability one or more o-rings fail • Low temp: 53-62 degrees • Medium temp: 63-71 degrees • High temp: 72-81 degrees • Average number of o-rings failing per launch • Low temp: 53-62 degrees • Medium temp: 63-71 degrees • High temp: 72-81 degrees
Outline • ANOVA and Regression • (Non-Parametric Statistics) • (Goodman Log-Linear Model)
ANOVA and Regression: One-Way • Salesaj = c(1)*convenience + c(2)*quality + c(3)*price + e • E[Salesaj | convenience=1, quality=0, price=0] = c(1) = mean for city(1) • c(1) = mean for city(1) (convenience) • c(2) = mean for city(2) (quality) • c(3) = mean for city(3) (price) • Test the null hypothesis that the means are equal using a Wald test: c(1) = c(2) = c(3)
One-Way ANOVA and Regression Regression Coefficients are the City Means; F statistic
ANOVA and Regression: One-Way, Alternative Specification • Salesaj = c(1) + c(2)*convenience + c(3)*quality + e • E[Salesaj | convenience=0, quality=0] = c(1) = mean for city(3) (price, the omitted one) • E[Salesaj | convenience=1, quality=0] = c(1) + c(2) = mean for city(1) (convenience) • c(1) = mean for city(3), the omitted city • c(2) = mean for city(1) minus mean for city(3) • Test that the mean for city(1) = mean for city(3) • Using the t-statistic for c(2)
ANOVA and Regression: One-Way, Alternative Specification • Salesaj = c(1) + c(2)*convenience + c(3)*price + e • E[Salesaj | convenience=0, price=0] = c(1) = mean for city(2) (quality, the omitted one) • E[Salesaj | convenience=1, price=0] = c(1) + c(2) = mean for city(1) (convenience) • c(1) = mean for city(2), the omitted city • c(2) = mean for city(1) minus mean for city(2) • Test that the mean for city(1) = mean for city(2) • Using the t-statistic for c(2)
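The link between dummy-variable coefficients and city means can be seen in a small numerical sketch. The sales figures below are made up for illustration and are not the lecture's data; the `ols` helper is a hypothetical name and solves the normal equations directly so that no statistics package is needed.

```python
def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y, by Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical sales by marketing strategy (NOT the lecture's data).
sales = {"convenience": [500, 520, 540],
         "quality": [600, 620, 640],
         "price": [580, 600, 620]}
strategies = ["convenience", "quality", "price"]

y = [s for name in strategies for s in sales[name]]
labels = [name for name in strategies for _ in sales[name]]

# Specification 1: three dummies, no intercept -> coefficients are the group means.
X_means = [[float(lab == name) for name in strategies] for lab in labels]
b_means = ols(X_means, y)

# Specification 2: intercept plus convenience and quality dummies (price omitted)
# -> intercept is the price mean, the other coefficients are differences from it.
X_diff = [[1.0, float(lab == "convenience"), float(lab == "quality")] for lab in labels]
b_diff = ols(X_diff, y)

print("no-intercept coefficients:", b_means)
print("omitted-category coefficients:", b_diff)
```

With these made-up numbers the first regression returns the three group means, and the second returns the mean of the omitted group followed by each included group's difference from it, which is why the t-statistic on c(2) tests equality of two means.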
ANOVA and Regression: Two-Way, Series of Regressions; Compare to Table 11, Lecture 15 • Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + c(5)*convenience*television + c(6)*quality*television + e, SSR = 501,136.7 • Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + e, SSR = 502,746.3 • Test for interaction effect: F(2, 54) = [(502746.3 - 501136.7)/2]/(501136.7/54) = (1609.6/2)/9280.3 = 0.09
ANOVA and Regression: Two-Way, Series of Regressions • Salesaj = c(1) + c(2)*convenience + c(3)*quality + e, SSR = 515,918.3 • Test for media effect: F(1, 54) = [(515918.3 - 502746.3)/1]/(501136.7/54) = 13172/9280.3 = 1.42 • Salesaj = c(1) + e, SSR = 614,757 • Test for strategy effect: F(2, 54) = [(614757 - 515918.3)/2]/(501136.7/54) = (98838.7/2)/9280.3 = 5.32
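The three F statistics follow mechanically from the four sums of squared residuals quoted on the slides. The sketch below redoes that arithmetic; the constant and function names are illustrative, and the denominator is the full model's SSR over its 54 error degrees of freedom, as on the slides.

```python
# Sums of squared residuals reported on the slides.
SSR_FULL = 501136.7            # strategy, television, and interactions
SSR_NO_INTERACTION = 502746.3  # strategy and television
SSR_STRATEGY_ONLY = 515918.3   # strategy dummies only
SSR_CONSTANT_ONLY = 614757.0   # intercept only
DF_ERROR = 54                  # error degrees of freedom of the full model

mean_square_error = SSR_FULL / DF_ERROR


def f_stat(ssr_restricted, ssr_less_restricted, num_restrictions):
    """Increase in SSR per restriction, relative to the full model's error variance."""
    return ((ssr_restricted - ssr_less_restricted) / num_restrictions) / mean_square_error


f_interaction = f_stat(SSR_NO_INTERACTION, SSR_FULL, 2)
f_media = f_stat(SSR_STRATEGY_ONLY, SSR_NO_INTERACTION, 1)
f_strategy = f_stat(SSR_CONSTANT_ONLY, SSR_STRATEGY_ONLY, 2)

print(f"interaction: F(2, {DF_ERROR}) = {f_interaction:.2f}")
print(f"media:       F(1, {DF_ERROR}) = {f_media:.2f}")
print(f"strategy:    F(2, {DF_ERROR}) = {f_strategy:.2f}")
```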
Survival Analysis • Density, f(t) • Cumulative distribution function, CDF, F(t) • Probability you failed up to time t* = F(t*) • Survivor Function, S(t) = 1 - F(t) • Probability you survived longer than t*, S(t*) • Kaplan-Meier estimates: (# at risk - # ending)/# at risk • Applications • Testing a new drug
Chemotherapy Drug Taxol • Current standard for ovarian cancer is taxol and a platinate such as cisplatin • Previous standard was cyclophosphamide and cisplatin • Kaplan-Meier survival curves comparing the two regimens • Lab 7: (# at risk - # ending)/# at risk
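The Kaplan-Meier ratio, (# at risk - # ending)/# at risk, is applied at each time an event occurs, and the survivor function is the running product of those ratios. The sketch below assumes a tiny made-up sample, not the Lab 7 or taxol data; `kaplan_meier` is a hypothetical helper name, and observations flagged False are treated as censored (they leave the risk set without counting as an ending).

```python
def kaplan_meier(durations, observed):
    """Return [(time, estimated S(time))] at each time with at least one event."""
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        ending = sum(1 for d, e in data if d == t and e)   # events at time t
        leaving = sum(1 for d, _ in data if d == t)        # events plus censored at t
        if ending:
            survival *= (at_risk - ending) / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical follow-up times (months) and event flags (False = censored).
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

km_curve = kaplan_meier(durations, observed)
for time, s in km_curve:
    print(f"t = {time}: S(t) = {s:.2f}")
```

Running the same function on each treatment group separately would give the two curves that the comparison of regimens plots against each other.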