Testing Individual Coefficients
Test of Slope Coefficient βp
• Tests if there is a Linear Relationship Between one X & Y
• Involves one single population Slope βp
• Hypotheses: H0: βp = 0 vs. Ha: βp ≠ 0
Test of Slope Coefficient βp: Test Statistic
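The test statistic is the usual single-coefficient t ratio (cf. the β̂p/s annotation on the SAS output slide that follows):

t = \hat{\beta}_p / s_{\hat{\beta}_p}, \qquad df = n - (k+1)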
Test of Slope Coefficient: Rejection Rule
• Reject H0 in favor of Ha if t falls in a rejection region, i.e., if |t| > t1-α/2(n-k-1)
• Equivalently, reject H0 in favor of Ha if P-value = 2P(T > |t|) < α, where T ~ t(n-k-1)
[Figure: two-tailed rejection regions of area α/2 each, beyond -t1-α/2(n-k-1) and t1-α/2(n-k-1) on the t(n-k-1) distribution]
Individual Coefficients: SAS Output

Parameter Estimates
                  Parameter    Standard
Variable     DF   Estimate     Error       t Value   Pr > |t|
Intercept    1    0.06397      0.25986     0.25      0.8214
Food         1    0.20492      0.05882     3.48      0.0399
weight       1    0.28049      0.06860     4.09      0.0264

(The rows give β̂0, β̂1, β̂2; each t Value is β̂p/s(β̂p) and Pr > |t| is its two-sided P-value.)
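As a quick check on reading this output (with n = 6 observations, as shown in the Pearson correlation output later in the deck): for Food, t = 0.20492 / 0.05882 ≈ 3.48 on n - (k+1) = 6 - 3 = 3 error df; since t0.975(3) ≈ 3.182 and the two-sided P-value 0.0399 < 0.05, H0: β1 = 0 is rejected at α = 0.05.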
Testing Model Portions
Testing Model Portions
1. Tests the Contribution of a Set of X Variables to the Relationship With Y
2. Null Hypothesis H0: βg+1 = ... = βk = 0
• The variables in the set do not significantly improve the model when all other variables are included
3. Used in Selecting X Variables or Models
Testing Model Portions: Nested Models
H0: Reduced model (βg+1 = ... = βk = 0) vs. Ha: Full model
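Spelled out, the two nested models being compared are:

Reduced: E(Y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_g x_g
Full:    E(Y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_g x_g + \beta_{g+1} x_{g+1} + \cdots + \beta_k x_k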
F-Test for Nested Models
• Numerator: reduction in SSE from the additional parameters; df = k - g = number of additional parameters
• Denominator: SSE of the full model; df = n - (k+1) = error df of the full model
(see the formula and SAS sketch below)
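Putting the two pieces together, the partial (nested-model) F statistic is

F = \frac{(SSE_R - SSE_F)/(k - g)}{SSE_F / [\,n - (k+1)\,]}

In SAS, one way to obtain such a partial F is the TEST statement in PROC REG. A minimal sketch using the Cow data from these slides, testing the contribution of weight given food (several terms could be listed in one TEST statement):

PROC REG data=Cow;
  model milk = food weight;
  /* partial F-test: does 'weight' add to a model already containing 'food'? */
  WeightSet: test weight = 0;
run;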
Selecting Variables in Model Building
Model Building with Computer Searches
1. Rule: Use as Few X Variables as Possible
2. Stepwise Regression
• Computer Selects X Variable Most Highly Correlated With Y
• Continues to Add or Remove Variables Depending on SSE
3. Best Subset Approach
• Computer Examines All Possible Sets
(SAS sketches below)
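A minimal sketch of both approaches in PROC REG, assuming the Cow data used elsewhere in these slides (with only two candidate X variables the search is trivial, but the same options apply when there are many):

/* Stepwise selection; the entry/stay significance levels shown are illustrative */
PROC REG data=Cow;
  model milk = food weight / selection=stepwise slentry=0.15 slstay=0.15;
run;

/* All-subsets search ranked by R-square; ADJRSQ and CP request those statistics as well */
PROC REG data=Cow;
  model milk = food weight / selection=rsquare adjrsq cp;
run;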
Residual Analysis for Goodness of Fit
Residual (Estimated Error) Analysis
• Graphical Analysis of Residuals
• Plot estimated errors vs. Xi values (or predicted values)
• Plot a histogram or stem-and-leaf of the residuals
• Purposes
• Examine functional form (linear vs. non-linear model)
• Evaluate violations of assumptions (to ensure validity of the statistical tests on the β's)
Recall the Linear Regression Assumptions
• Mean of the Distribution of Errors Is 0
• Distribution of Errors Has Constant Variance
• Distribution of Errors Is Normal
• Errors Are Independent
Residual Plot for Functional Form
[Figure: residual plots contrasting a nonlinear pattern (mis-specified model) with correct specification (random scatter)]
Residual Plot for Equal Variance
[Figure: residual plots contrasting unequal variance (fan-shaped spread) with correct specification]
Standardized residuals (the residual divided by the standard error of prediction) are typically used.
Residual Plot for Independence
[Figure: residual plots contrasting a systematic pattern (errors not independent) with correct specification (random scatter)]
Residual Diagnostics in SAS

symbol v=dot h=2 c=green;
PROC REG data=Cow;
  model milk = food weight;
  plot residual.*predicted. / cHREF=red cframe=ligr;
run;
Check for Outlying Observations and Influence Analysis

symbol v=dot h=2 c=green;
PROC REG data=cow;
  model milk = food weight / influence;
  plot rstudent.*obs. / vref=-2 2 cvref=blue lvref=2
                        HREF=0 to 7 by 1 cHREF=red cframe=ligr;
run;
Influence Analysis of Each Observation

The REG Procedure
Model: MODEL1
Dependent Variable: Milk

Output Statistics
                             Hat       Cov                ----------DFBETAS----------
Obs  Residual   RStudent     Diag H    Ratio     DFFITS   Intercept    Food     weight
 1    0.1701     0.8283      0.5473    3.0770    0.9108    0.8436    -0.5503    0.0565
 2    0.0527     0.2040      0.4552    5.8235    0.1865   -0.0632    -0.0215    0.1145
 3    0.0408     0.1688      0.5271    6.8398    0.1782    0.1530     0.0335   -0.1211
 4   -0.0520    -0.2266      0.5678    7.2379   -0.2597   -0.0164     0.1767   -0.2170
 5   -0.4155    -4.0459      0.2260    0.0056   -2.1863   -0.9217    -1.0080    1.0753
 6    0.2039     1.4531      0.6766    1.2013    2.1019   -0.5540     1.7420   -0.9265
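By common rules of thumb (with n = 6 and p = 3 estimated parameters), observations with |RStudent| > 2 or |DFFITS| > 2*sqrt(p/n) = 2*sqrt(3/6) ≈ 1.41 merit a closer look: observation 5 (RStudent = -4.05, DFFITS = -2.19) is clearly flagged, and the DFFITS of observation 6 (2.10) also exceeds that cutoff.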
Multicollinearity
1. High Correlation Between X Variables
2. Coefficients Measure Combined Effect
3. Leads to Unstable Coefficients Depending on Which X Variables Are in the Model
4. Always Exists
5. Example: Using Both Age & Height of Children as Independent Variables in the Same Model
Detecting Multicollinearity
1. Examine the Correlation Matrix
• Correlations between pairs of X variables are higher than their correlations with Y
2. Examine the Variance Inflation Factor (VIF)
• If VIFj > 5 (or > 10, according to most references), multicollinearity exists
3. Few Remedies
• Obtain new sample data
• Eliminate one of the correlated X variables
(see the VIF formula below)
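For reference, the VIF for the j-th regressor is

\mathrm{VIF}_j = \frac{1}{1 - R_j^2}

where R_j^2 is the R-square from regressing X_j on the remaining X variables.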
SAS Code: Vet Example

PROC CORR data=vet;
  VAR milk food weight;
run;
Correlation Matrix: SAS Output

Pearson Correlation Coefficients, N = 6
Prob > |r| under H0: Rho=0

              Milk       Food      weight
Milk       1.00000    0.90932    0.93117
                       0.0120     0.0069
Food       0.90932    1.00000    0.74118
            0.0120                0.0918
weight     0.93117    0.74118    1.00000
            0.0069     0.0918

(Diagonal entries are all 1; the off-diagonal entries are rY1, rY2, and r12.)
Variance Inflation Factors: SAS Code

/* VIF measures the inflation in the variances of the parameter estimates
   due to collinearity among the regressors (independent variables) */
PROC REG data=Cow;
  model milk = food weight / VIF;
run;
Variance Inflation Factors: Computer Output

Parameter Estimates
                  Parameter    Standard                          Variance
Variable     DF   Estimate     Error       t Value   Pr > |t|    Inflation
Intercept    1    0.06397      0.25986     0.25      0.8214      0
Food         1    0.20492      0.05882     3.48      0.0399      2.21898
weight       1    0.28049      0.06860     4.09      0.0264      2.21898

(Both VIFs are below 5, so no serious multicollinearity is indicated.)
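With only two regressors, R_j^2 is just the squared Food-weight correlation, so the printed VIF can be checked against the earlier correlation matrix: 1/(1 - 0.74118^2) = 1/0.4507 ≈ 2.219, matching the 2.21898 above.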
Types of Regression Models viewed from the explanatory variables standpoint
Regression Models based on a Single Quantitative Explanatory Variable
First-Order Model With 1 Independent Variable
1. Relationship Between 1 Dependent & 1 Independent Variable Is Linear
2. Used When the Expected Rate of Change in Y Per Unit Change in X Is Stable
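Written out, the first-order (straight-line) model with one independent variable is

E(Y) = \beta_0 + \beta_1 x_1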
First-Order Model Relationships
[Figure: plots of Y vs. X1, an increasing straight line when β1 > 0 and a decreasing straight line when β1 < 0]
First-Order Model Worksheet
Run regression with Y and X1.
Second-Order Model With 1 Independent Variable
1. Relationship Between 1 Dependent & 1 Independent Variable Is a Quadratic Function
2. Useful First Model If a Non-Linear Relationship Is Suspected
3. Model: E(Y) = β0 + β1x1 + β2x1²  (β1x1 is the linear effect; β2x1² is the curvilinear effect)
Second-Order Model Relationships
[Figure: quadratic curves, opening upward when β2 > 0 and opening downward when β2 < 0]
Second-Order Model Worksheet
Create an X1² column. Run regression with Y, X1, X1² (see the SAS sketch below).
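A minimal sketch in SAS, assuming the worksheet's X1 corresponds to the food variable in the Cow data used earlier in these slides:

data cow2;
  set Cow;
  food_sq = food*food;        /* the X1-squared column */
run;

PROC REG data=cow2;
  model milk = food food_sq;  /* Y, X1, X1^2 */
run;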
Third-Order Model With 1 Independent Variable
1. Relationship Between 1 Dependent & 1 Independent Variable Has a 'Wave'
2. Used If There Is 1 Reversal in Curvature
3. Model: E(Y) = β0 + β1x1 + β2x1² + β3x1³  (β1x1 is the linear effect; β2x1² and β3x1³ are the curvilinear effects)
Third-Order Model Relationships
[Figure: S-shaped cubic curves, one for β3 > 0 and one for β3 < 0]
Third-Order Model Worksheet
Multiply X1 by X1 to get X1². Multiply X1 by X1 by X1 to get X1³. Run regression with Y, X1, X1², X1³ (see the SAS sketch below).
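Extending the second-order sketch above, again assuming X1 corresponds to food in the Cow data:

data cow3;
  set Cow;
  food_sq  = food**2;   /* X1^2 */
  food_cub = food**3;   /* X1^3 */
run;

PROC REG data=cow3;
  model milk = food food_sq food_cub;  /* Y, X1, X1^2, X1^3 */
run;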