Testing statistical significance of differences between coefficients
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.

Overview
• Review: Inferential statistical tests for coefficients
• Testing statistical significance of differences
  • Between coefficients in the same model
  • Between coefficients in independent models
• Standard error of the difference
• Presenting results of tests of differences between coefficients

Review: Statistical significance of βs
• In the standard output from a regression model, inferential statistics provide the information needed to test whether the coefficient on an independent variable (IV) is statistically significantly different from zero
• For continuous independent variables
  • Whether the marginal effect of a one-unit increase in that IV is different from zero
• For categorical independent variables
  • Whether the difference between the mean of the dependent variable (DV) for the specified group and that for the reference category is different from zero

[Regression results table not reproduced. Table notes: ** denotes p < 0.01; reference categories in parentheses.]

Example: β on a continuous IV
• OLS model of birth weight in grams includes mother’s age in years as an independent variable
• βmother’s age = 10.7 with a standard error (s.e.) of 1.2, p < 0.001
• Thus we reject the null hypothesis H0: βmother’s age = 0
• We conclude that the slope of the association between mother’s age and birth weight is statistically significantly different from zero at p < 0.001

Example: β on a categorical IV
• The birth weight model includes an ordinal measure of mother’s educational attainment
  • > HS is the reference category
• β<HS = –55.5 with a standard error (s.e.) of 19.3, p < 0.01
• Thus we reject the null hypothesis H0: β<HS = 0, that is, H0: mean birth weight for < HS = mean birth weight for > HS
• Mean birth weight for infants born to mothers with < high school education is statistically significantly different from that for infants born to mothers with > HS
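
The two tests against zero can be checked by hand. The sketch below is illustrative only: it assumes a large sample, so the two-sided p-value is approximated with the standard normal distribution (the ∞ degrees of freedom case used later in these slides). The function names are hypothetical, not from the book.

```python
import math

def t_statistic(beta, se):
    """Test statistic for H0: beta = 0 (coefficient divided by its s.e.)."""
    return beta / se

def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation (assumed)."""
    return math.erfc(abs(z) / math.sqrt(2))

# Values taken from the two example slides.
t_age = t_statistic(10.7, 1.2)     # mother's age (continuous IV)
t_lths = t_statistic(-55.5, 19.3)  # < HS versus the reference category > HS

p_age = two_sided_p(t_age)
p_lths = two_sided_p(t_lths)

print(f"mother's age: t = {t_age:.2f}, p < 0.001: {p_age < 0.001}")
print(f"< HS:         t = {t_lths:.2f}, p < 0.01:  {p_lths < 0.01}")
```

Under this approximation both coefficients differ from zero; the test statistic for < HS is about –2.88, which clears the p < 0.01 threshold but not p < 0.001.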

Testing other hypotheses
• For some research questions, you might need to test a hypothesis in addition to H0: βi = 0, e.g., whether
  • Two βs in a given model are statistically significantly different from one another
    • E.g., H0: β<HS = β=HS
  • The size and statistical significance of a β change across models when additional covariates such as confounders or mediators are included in the model
    • E.g., H0: βnon-Hispanic black (I) = βnon-Hispanic black (II)
  • The effect of a covariate differs across models estimated for independent subgroups (stratified models)
    • E.g., H0: β<HS is the same for males as for females

Testing statistical significance of differences between coefficients
• To formally test statistical significance of differences between coefficients, e.g., H0: βj = βk
  • Divide the difference between the estimated coefficients (βj − βk) by the standard error of the difference to obtain the test statistic
  • Compare the calculated test statistic against the pertinent critical value with one degree of freedom

Standard error of the difference
• The standard error of the difference is calculated: √[var(βj) − (2 × cov(βj, βk)) + var(βk)]
  • var(βj) and var(βk) are the variances of βj and βk
  • cov(βj, βk) is the covariance between βj and βk
• When βj and βk are from different models
  • Considered statistically independent of one another
  • cov(βj, βk) = 0
• When βj and βk are from within one regression model
  • Not independent of one another
  • cov(βj, βk) ≠ 0
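
A minimal sketch of the calculation, using the fact that the variance of a difference between two estimates is the sum of their variances minus twice their covariance. The function name `se_difference` is hypothetical; the covariance argument defaults to zero to cover the independent-models case.

```python
import math

def se_difference(var_j, var_k, cov_jk=0.0):
    """Standard error of (beta_j - beta_k).

    var_j, var_k: variances of the two coefficients.
    cov_jk: their covariance; leave at 0.0 when the coefficients come
            from models treated as independent.
    """
    variance_of_difference = var_j - 2.0 * cov_jk + var_k
    if variance_of_difference < 0:
        raise ValueError("inputs imply a negative variance for the difference")
    return math.sqrt(variance_of_difference)

# Same model: the covariance term is included.
print(round(se_difference(370.9, 218.8, 137.8), 2))

# Independent models: variances recovered from standard errors, no covariance.
print(round(se_difference(16.7 ** 2, 17.6 ** 2), 1))
```

The inputs are the values used in the worked examples in these slides. A positive covariance shrinks the standard error of the difference, so omitting it for two coefficients from the same model would make the test too conservative.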

Testing differences of βs from one model
• When βj and βk are from the same model, you must include the covariance in the calculation of the standard error of the difference: √[var(βj) − (2 × cov(βj, βk)) + var(βk)]
• The complete variance-covariance matrix for a regression can be requested as part of the output
• The variance of each coefficient can be calculated from its standard error (s.e.): var(βj) = [s.e.(βj)]²
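
When the full variance-covariance matrix is available, the same quantity can be read off as a contrast: with weights c = (1, −1) on the two coefficients, the variance of the difference is c′Vc. The sketch below is one way to do this in plain Python; the helper names and the two-by-two matrix layout are assumptions for illustration, and the numbers are the ones used in the education example.

```python
import math

def variance_from_se(se):
    """var(beta) = [s.e.(beta)]^2."""
    return se ** 2

def contrast_se(vcov, weights):
    """Standard error of sum(weights[i] * beta[i]), computed as sqrt(c' V c)."""
    n = len(weights)
    quad_form = sum(
        weights[i] * vcov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(quad_form)

# Rows and columns ordered as (<HS, =HS); off-diagonal entries are the covariance.
vcov = [
    [370.9, 137.8],
    [137.8, 218.8],
]

# Weights (1, -1) pick out the difference beta_<HS - beta_=HS.
print(round(contrast_se(vcov, [1, -1]), 2))

# A rounded s.e. only approximately recovers the variance: 19.3^2 is close
# to, but not exactly, 370.9.
print(round(variance_from_se(19.3), 2))
```

Working from the matrix rather than from rounded standard errors avoids small rounding discrepancies, and the same function handles contrasts among more than two coefficients.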

Example: Testing whether β<HS = β=HS
• From the table, β<HS = –55.5 and β=HS = –53.9
• The difference between β<HS and β=HS is calculated: β<HS – β=HS = –55.5 – (–53.9) = –1.6
• For that model,
  • var(β<HS) = 370.9
  • var(β=HS) = 218.8
  • cov(β<HS, β=HS) = 137.8
• Plugging those values into the formula for the standard error of the difference yields √[370.9 − (2 × 137.8) + 218.8] = 17.72

Example, cont.: Testing β<HS = β=HS
• To calculate the test statistic, divide the difference between β<HS and β=HS by the standard error of the difference: (β<HS – β=HS) / s.e.(β<HS – β=HS) = –1.6/17.7 = –0.09
• In absolute value, 0.09 < 1.96 (the critical value for a t-test with ∞ degrees of freedom at p < 0.05)
• Cannot reject the null hypothesis that β<HS = β=HS

TEST statement
• Many software packages can do these calculations for you
• To test other contrasts among categories, request the test statistic for equality of coefficients for pairs of coefficients: H0: βj = βk
• E.g., to test whether predicted birth weight is statistically significantly different for infants born to mothers with < HS than for those with = HS (neither is the reference category)
  • Specify “TEST ‘<HS’ = ‘=HS’” in your SAS syntax
  • Output for H0: β<HS = β=HS reports an F-statistic of 0.01 with a p-value of 0.93
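
The hand calculation and the software output can be reconciled: for a single contrast, the F-statistic is the square of the t-statistic. The sketch below reworks the education example end to end. It assumes a large sample, so the p-value comes from a standard normal approximation rather than from the exact F distribution that the software would use.

```python
import math

# Estimates and (co)variances from the education example.
b_lths, b_hs = -55.5, -53.9
var_lths, var_hs, cov = 370.9, 218.8, 137.8

diff = b_lths - b_hs                               # difference in coefficients
se_diff = math.sqrt(var_lths - 2 * cov + var_hs)   # s.e. of the difference
t_stat = diff / se_diff                            # test statistic
f_stat = t_stat ** 2                               # F with 1 numerator d.f.

# Two-sided p-value, standard normal approximation (an assumption here).
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"difference = {diff:.1f}")
print(f"s.e.       = {se_diff:.2f}")
print(f"t          = {t_stat:.2f}")
print(f"F          = {f_stat:.2f}")
print(f"p          = {p_value:.2f}")
```

Rounded to two decimals, the F-statistic and p-value from this sketch agree with the 0.01 and 0.93 reported in the output described above.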

Testing differences of βs from independent models
• When βj and βk are from different models, they can be assumed to be independent of one another: cov(βj, βk) = 0
• Thus the formula for the standard error of the difference, √[var(βj) − (2 × cov(βj, βk)) + var(βk)], simplifies to √[var(βj) + var(βk)]
• Reminder: var(βj) and var(βk) can be calculated from the standard errors reported in the regression output: var(βj) = [s.e.(βj)]²

Example: Change in βs across nested models
• In nested models I and II, the βs on non-Hispanic black (NHB) are
  • βNHB(I) = –244.5, s.e. = 16.7
  • βNHB(II) = –147.2, s.e. = 17.6
• The change in β between models I and II: βNHB(II) – βNHB(I) = –147.2 – (–244.5) = 97.3
• Plugging the standard errors for βNHB(I) and βNHB(II) into the formula for the standard error of the difference yields s.e.(difference) = √[(16.7)² + (17.6)²] = 24.3

Change in βs across nested models, cont.
• The t statistic for the difference in β is calculated: (difference in β) / s.e.(difference in β)
• Plugging in the values from the previous slide: 97.3 ÷ 24.3 = 4.01 (computed with the unrounded standard error, 24.26)
• 4.01 exceeds the critical value of 2.58 for p < 0.01, so we conclude that the change in βNHB between models I and II is statistically significant at p < 0.01
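
The nested-model comparison needs only the published coefficients and standard errors. The sketch below follows the slides in treating the two estimates as independent, and it compares the result against the large-sample two-sided critical value for p < 0.01 (2.576, which rounds to 2.58). Variable names are illustrative.

```python
import math

# Coefficient on non-Hispanic black (NHB) and its s.e. in each model.
b_model_1, se_model_1 = -244.5, 16.7
b_model_2, se_model_2 = -147.2, 17.6

# Change from model I to model II.
change = b_model_2 - b_model_1

# Independence assumed, so the covariance term drops out.
se_change = math.sqrt(se_model_1 ** 2 + se_model_2 ** 2)

t_stat = change / se_change

CRITICAL_P01 = 2.576  # two-sided, infinite degrees of freedom
significant_at_01 = abs(t_stat) > CRITICAL_P01

print(f"change = {change:.1f}, s.e. = {se_change:.1f}, t = {t_stat:.2f}")
print(f"significant at p < 0.01: {significant_at_01}")
```

Carrying the unrounded standard error through the division is what produces t = 4.01; dividing by the rounded 24.3 gives 4.00.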

Tables to present multivariate results
• In the table of multivariate statistics, for each independent variable in the model, present
  • The estimated coefficient (β)
  • The standard error
• See chapters 5 and 11 of Writing about Multivariate Analysis, 2nd Edition for guidelines and examples of multivariate tables

Prose to present results of differences between coefficients
• Introduce the substantive reason behind the test for difference between βs, given your
  • Research question
  • Variables (categories, units)
• Report and interpret the results of the formal statistical test of difference between coefficients
  • Test statistic
  • Accompanying degrees of freedom
• Explain the conclusions you draw from that test about specification of your model

Poor presentation: Results of tests of differences between βs
• “From table 15.3, Model III we have β<HS = –55.5 and β=HS = –53.9, so the difference between β<HS and β=HS is β<HS – β=HS = –55.5 – (–53.9) = –1.6. For that model, var(β<HS) = 370.9, var(β=HS) = 218.8, and cov(β<HS, β=HS) = 137.8. Plugging those values into the formula for the standard error of the difference yields √[370.9 − (2 × 137.8) + 218.8] = 17.7. To calculate the test statistic, divide the difference between β<HS and β=HS by the standard error of the difference: (β<HS – β=HS) / s.e.(β<HS – β=HS) = –1.6/17.7 = –0.09, which is smaller in absolute value than the critical value of 1.96 for a t-test with ∞ degrees of freedom at p < 0.05. Thus we cannot reject the null hypothesis that β<HS = β=HS.”
• Except for an assignment in a course where you must demonstrate that you know this logic, spare your readers the statistics lesson!

Better presentation: Results of tests of differences between βs
• “The 1.6 unit (gram) birth weight difference between the estimated coefficients for ‘less than high school’ and ‘high school graduate’ in Model III is not statistically significant (F-statistic for the test of difference = 0.01; p = 0.93).”
• Mentions the
  • Dependent variable
  • Independent variable (educational attainment)
  • Units or categories
  • Purpose of the test (a difference between two coefficients in the same model)
  • Magnitude
  • Statistical significance
  • Direction (not mentioned because trivially small)

Example presentation: Change in β across nested models
• “As shown in table 15.3, the coefficient on non-Hispanic black decreases in absolute value by 97.3 points (grams), from –244.5 in model I to –147.2 in model II (t = 4.01; p < 0.01). Thus, the addition of controls for socioeconomic characteristics is associated with a large, statistically significant decrease in the birth weight deficit for non-Hispanic black compared to non-Hispanic white infants.”
• Mentions the
  • Dependent variable
  • Independent variables and their units or categories
  • Purpose of the test for a change in βNHB across nested models
  • Direction
  • Magnitude
  • Statistical significance

Summary
• To test hypotheses other than H0: βi = 0, calculate a test statistic from the difference in coefficients and the standard error of the difference
  • Compare that test statistic against the critical value
• βs from different models are considered statistically independent of one another, so the covariance is not needed to compute the standard error of the difference
  • E.g., nested models, stratified models
• βs from the same model are not statistically independent of one another, so the covariance is needed to compute the standard error of the difference

Summary, cont.
• If coefficients are not statistically significantly different from one another, the model specification often can be simplified by combining terms
  • Then test the effect of the simplified specification on overall fit using model goodness-of-fit (GOF) statistics
• Present results of tests of differences between coefficients
  • Use a combination of tables and prose
  • Describe conclusions, not process
  • Relate to the topic at hand

Suggested resources
• Miller, Jane E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. Chicago: University of Chicago Press. Chapters 11 and 15.
• Freedman, David, Robert Pisani, and Roger Purves. 2007. Statistics, 4th Edition. New York: W. W. Norton.
• Gujarati, Damodar N. 2002. Basic Econometrics, 4th Edition. New York: McGraw-Hill/Irwin.

Suggested online resources
• Podcasts on
  • Interpreting coefficients from OLS and logit models
  • Comparing overall goodness of fit across models
  • Testing whether a multivariate specification can be simplified

Suggested practice exercises
• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  • Questions #2, 3, and 5 in the problem set for chapter 11
  • Suggested course extensions for chapter 11
    • “Reviewing” exercise #2
    • “Applying statistics and writing” exercise #3

Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html