Stat 112 -- Notes 5. p-values for one-sided tests Caution about forecasting outside the range of the explanatory variable (Chapter 3.7.2) Fitting a linear trend to Time Series Data (Chapter 3.6) Practice Question
Stat 112 -- Notes 5 • p-values for one-sided tests • Caution about forecasting outside the range of the explanatory variable (Chapter 3.7.2) • Fitting a linear trend to Time Series Data (Chapter 3.6) • Practice Question • The quiz on Tuesday will take the first thirty minutes of class. I will send out review materials tonight or tomorrow morning.
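P-Values for One and Two Sided Tests • Two-sided test (H0: slope = 0 vs. Ha: slope ≠ 0): Reject H0 if the p-value is less than 0.05, where p-value = Prob>|t| as reported in JMP under Parameter Estimates. • One-sided test I (H0: slope ≥ 0 vs. Ha: slope < 0): Reject H0 if the p-value is less than 0.05, where p-value = (Prob>|t|)/2 if t is negative, and p-value = 1-(Prob>|t|)/2 if t is positive.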
P-values for One and Two Sided Tests Continued • One-sided test II (H0: slope ≤ 0 vs. Ha: slope > 0): Reject H0 if the p-value is less than 0.05, where p-value = (Prob>|t|)/2 if t is positive, and p-value = 1-(Prob>|t|)/2 if t is negative.
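The conversion from JMP's two-sided Prob>|t| to a one-sided p-value can be sketched in a few lines of Python. The function name `one_sided_p_value` and the labels "less" and "greater" are assumptions made for this illustration, not JMP features: "less" stands for one-sided test I (small p-value when t is negative) and "greater" for one-sided test II (small p-value when t is positive). The example inputs, t = -1.52 and Prob>|t| = 0.13, are the values reported for the hurricane regression later in these notes.

```python
def one_sided_p_value(two_sided_p: float, t_stat: float, alternative: str) -> float:
    """Convert a two-sided p-value (JMP's Prob>|t|) into a one-sided p-value.

    alternative="less"    -> one-sided test I  (evidence comes from a negative t)
    alternative="greater" -> one-sided test II (evidence comes from a positive t)
    These labels are hypothetical names chosen for this sketch.
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = two_sided_p / 2
    # Does the sign of t point in the direction of the alternative?
    if alternative == "less":
        t_supports_alternative = t_stat < 0
    else:
        t_supports_alternative = t_stat > 0
    return half if t_supports_alternative else 1 - half


# Example inputs: t = -1.52 and Prob>|t| = 0.13 (hurricane regression in these notes).
p_less = one_sided_p_value(0.13, -1.52, "less")
p_greater = one_sided_p_value(0.13, -1.52, "greater")
print(f"one-sided test I  p-value: {p_less:.3f}")
print(f"one-sided test II p-value: {p_greater:.3f}")
```

When t is negative, the p-value for test II is always above 0.5, so the data cannot support that alternative no matter how large |t| is.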
Forecasting Outside the Range of the Explanatory Variable (Extrapolation) • When estimating the mean of Y, or predicting an individual value of Y, at a new value of x, caution must be used if that value is outside the range of the observed x’s. The data do not provide information about whether the simple linear regression model continues to hold outside the range of the observed x’s. • Prediction intervals only account for (1) variability in Y given X; (2) uncertainty in the estimates of the slope and intercept, given that the simple linear regression model is true. When the new x value is outside the range of the observed x’s, the prediction interval might not be accurate.
Olympic Long Jump: Length of gold medal jump (Y) vs. Year (X)
Predictions from Long Jump Simple Linear Regression Model • Predicted Olympic gold medal winning long jumps: • 2008 (Beijing): -78.42771+0.053557*2008 = 29.11 feet • 2028: -78.42771+0.053557*2028 = 30.19 feet • 3000: -78.42771+0.053557*3000 = 82.24 feet • 95% Prediction Interval for Year 3000: (70.32, 94.17) feet. This prediction interval is not reasonable! Predicting the winning distance for the year 3000 is an extrapolation.
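The three predictions above can be reproduced with a short Python sketch. The intercept and slope are the fitted values quoted on the slide; the names `INTERCEPT`, `SLOPE`, and `predicted_jump` are made up for this illustration.

```python
# Fitted long jump line from the slide:
# predicted winning jump (feet) = INTERCEPT + SLOPE * year
INTERCEPT = -78.42771
SLOPE = 0.053557


def predicted_jump(year: int) -> float:
    """Plug a year into the fitted simple linear regression line."""
    return INTERCEPT + SLOPE * year


predictions = {year: predicted_jump(year) for year in (2008, 2028, 3000)}
for year, feet in predictions.items():
    print(f"{year}: {feet:.2f} feet")
```

The formula returns a number for any year at all. Nothing in the arithmetic signals that 3000 lies far outside the observed data, so checking for extrapolation is left to the analyst.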
Time Series Data (Chapter 3.6) • Cross-sectional data: Data gathered on different individuals at the same point in time. • Time series: Data gathered on a single individual (person, firm, and so on) over a sequence of time periods, which may be days, weeks, months, quarters, years or any other measure of time. • One goal in analyzing time series is to understand the trend in Y over time: E(Y|Time), i.e., we treat Time as our explanatory variable in the regression analysis. • Simple Linear Regression Model for Trend in Time Series Data: E(Y|Time) = β0 + β1*Time
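As a rough illustration of fitting a straight-line trend (mean of Y = intercept + slope * Time) by least squares, here is a minimal Python sketch. The function name `fit_linear_trend` and the tiny data set are invented for this example and are not from the course data. The course itself uses JMP for the fit.

```python
def fit_linear_trend(times, values):
    """Least squares fit of values = intercept + slope * time.

    Returns (intercept, slope).
    """
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need two or more (time, value) pairs of equal length")
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    # Sum of squared deviations of time, and cross-product with the response
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time values are identical; slope is undefined")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


# Hypothetical yearly series (made-up numbers, exactly linear for easy checking)
years = [2001, 2002, 2003, 2004, 2005]
counts = [4.0, 4.5, 5.0, 5.5, 6.0]

intercept, slope = fit_linear_trend(years, counts)
print(f"estimated change in mean per year: {slope:.3f}")
```

The slope is the estimated change in the mean of Y from one time period to the next, which is the quantity of interest when asking whether there is a trend.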
Hurricane Data • Is there a trend in the number of hurricanes in the Atlantic over time (possibly an increase because of global warming)? • hurricane.JMP contains data on the number of hurricanes in the Atlantic basin from 1950-1997.
Inferences for Hurricane Data • Residual plots and normal quantile plots indicate that the assumptions of linearity, constant variance and normality in the simple linear regression model are reasonable. • 95% confidence interval for the slope (change in mean number of hurricanes between year t and year t+1): (-0.086, 0.012) • Hypothesis test of the null hypothesis that the slope equals zero: test statistic = -1.52, p-value = 0.13. We do not reject the null hypothesis since the p-value > 0.05. There is no evidence of a trend in hurricanes from 1950-1997.
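The confidence interval and the test above tell the same story, which a few lines of Python arithmetic can confirm. This is only a consistency check on the rounded numbers reported on the slide, so the results are approximate. The variable names are chosen for this sketch, and it relies on the standard facts that the interval is centered at the slope estimate and that the test statistic is the estimate divided by its standard error.

```python
# Reported on the slide (rounded values)
ci_low, ci_high = -0.086, 0.012
t_stat = -1.52

# A confidence interval for the slope is centered at the slope estimate
slope_estimate = (ci_low + ci_high) / 2
half_width = (ci_high - ci_low) / 2

# t = estimate / standard error, so the standard error can be backed out
standard_error = slope_estimate / t_stat

# half-width = multiplier * standard error; for a 95% interval the
# multiplier should be near 2
implied_multiplier = half_width / standard_error

# A 95% interval containing 0 matches a two-sided p-value above 0.05
contains_zero = ci_low <= 0 <= ci_high

print(f"slope estimate     ~ {slope_estimate:.3f}")
print(f"standard error     ~ {standard_error:.4f}")
print(f"implied multiplier ~ {implied_multiplier:.2f}")
print(f"interval contains 0: {contains_zero}")
```

Because the interval contains 0, the two-sided test at the 0.05 level cannot reject a slope of zero, which agrees with the reported p-value of 0.13.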