290 likes | 492 Views
Interpreting multivariate OLS and logit coefficients. Jane E. Miller, PhD. Overview. What elements to report for coefficients Coefficients on Continuous independent variables (IVs; predictors) Categorical independent variables Ordinary least squares (OLS) and logit coefficients
E N D
Interpreting multivariate OLS and logit coefficients Jane E. Miller, PhD
Overview • What elements to report for coefficients • Coefficients on • Continuous independent variables (IVs; predictors) • Categorical independent variables • Ordinary least squares (OLS) and logit coefficients • Topic sentences for paragraphs reporting multivariate coefficients
Report and interpret results • Report detailed multivariate results in tables. • Coefficients. • Inferential statistical results: • standard error or test statistic, • p-value or symbol. • Model goodness of fit statistics. • Interpret coefficients in the text. • Refer to associated table.
What to report for coefficients • Topic • Independent variable (IV) • Dependent variable (DV) • Direction (AKA “sign”) • Magnitude (AKA “size”) • Units or categories • Statistical significance • Most authors remember to report statistical significance, so I have listed that element last!
Interpreting coefficients • Poor: “The effect of public insurance was –7.2 (p < 0.05).” • Reports the coefficient without interpreting it. Without units or reference group, the meaning of “– 7.2” cannot be interpreted. • Better: “Children with private insurancestayed on average 7.2 dayslonger than those with public insurance(p < 0.05).” • Interprets the β in intuitive terms, mentioning the topic, units, categories, direction, magnitude and statistical significance.
More examples of interpreting βs • Poor: “Insurance and length of stay were associated (p < 0.05).” • Topic and statistical significance, but not direction or size. • Better: “Privately-insured children stayedlonger than publicly insured children (p < 0.05).” • Statistical significance and direction, but not size. • Best: “Children with private insurancestayed on average 7.2 dayslonger than those with public insurance(p < 0.05).” • Topic, direction, magnitude, units, categories, and statistical significance.
Interpretation of βs depends on types of variables in your models • The type of dependent variable: • Continuous dependent variable • Ordinary least squares (OLS) • Categorical dependent variable • Logistic (logit) regression model • Type of independent variable: • Continuous • Categorical
Interpreting coefficients from ordinary least squares (OLS) models
Coefficients for OLS models • For ordinary least squares (OLS) models, the coefficient (β) is a measure of difference in the DV for a 1-unit increase in the IV. • For unstandardized coefficients, difference in the same units as the dependent variable. • Can be explained using wording for results of subtraction. • For standardized coefficients, β measures difference in standardized units (multiples of standard deviations). • See podcast about resolving the Goldilocks problem using model specification.
Interpreting βs: continuous predictors • The unstandardized coefficient on a continuous predictor in an OLS model measures • The difference in the dependent variable for a one-unit increase in the independent variable. • Effect size is in original units of the DV. • Example topic: Mother’s age as a predictor of birth weight: • Dependent variable = birth weight in grams. • Independent variable = mother’s age in years. • Both are continuous variables.
Example: Mother’s age as a predictor of birth weight • Poor: “Mother’s age andchild’s birth weight are correlated (p<0.01).” • Names the dependent and independent variables and conveys statistical significance, but not direction or magnitude of the association. • Better: “As mother’s age increases, her child’s birth weight also increases(p<0.01).” • Concepts, direction, and statistical significance, but not size. • Best: “For each additional year of mother’s age at the time of her child’s birth, the child’s birth weight increases by 10.7grams(p<0.01).” • Concepts, units, direction, size, and statistical significance.
Interpreting βs: categorical predictors • The β on a categorical IV in an OLS model measures the difference in the DV for the category of interest compared to the reference category. • A “1-unit increase” does NOT make sense. • Example: gender • Dummy variable (AKA “binary variable”) coded • 1 = boy • 0 = girl = omitted (reference) category
Example: Gender as a predictor of birth weight • Poor: “The β for ‘BBBOY’ is 116.1 with an s.e. of 12.3 (table 15.3).” • Uses a cryptic acronym rather than naming the independent variable or conveying that it is categorical. • Doesn’t convey the dependent variable. • Reports the same information as the table (size of coefficient and standard error), but does not interpret them. • The direction of the effect cannot be determined because categories and units are not specified. • To assess statistical significance, readers must calculate test statistic and compare it against critical value.
Gender as a predictor of birth weight, cont. • Slightly better: “Gender is associated with a difference of 116.1grams in birth weight (p < 0.01).” • Concepts, magnitude, units, and statistical significance but not direction: Was birth weight higher for boys or for girls? • Best: “At birth, boys weigh on average 116gramsmorethan girls (p < 0.01).” • Concepts, reference category and units, direction, magnitude, and statistical significance.
Identifying the reference category • For categorical variables, mention identity of reference category. • E.g., effect size is relative to whom? • Example for 2-category comparison: • “Boys weighed 116 grams more than girls.” • Example for multicategory comparison: • “Compared to white infants, black and Hispanic infants weighed 62 and 16 grams less on average.” • OR “Mean birth weight was 62 and 16 grams less, for black and Hispanic infants, respectively, when each is compared to white infants.”
Logit models for categorical dependent variables • Logit = log[p/(1 – p)] = log(odds of the category you are modeling) • p is the proportion of the sample in the modeled category • βmeasures the log relative-odds of the outcome for different values of the independent variable • Exponentiate the logit coefficient eβ= relative odds, or “odds ratio” • Compares the odds of the outcome for different values of the independent variable
Example: Logit model of LBW • Low birth weight (LBW) = birth weight <2,500 grams • Log-odds = log[pLBW/(1 – pLBW )] • Where pLBW is the proportion of the sample that is LBW. • Log relative odds of LBW = comparison of log-odds of LBW for different values of the independent variable. • eβ= relative odds of LBW for different values of the independent variable.
Wording for odds ratios • βs for logit models are in the form of ratios. • For suggestions on how to phrase descriptions of ratios with minimal jargon, see • Table 5.3 in The Chicago Guide to Writing about Numbers OR • Table 8.3 in The Chicago Guide to Writing about Multivariate Analysis
Phrases for ratios See tables in WA#s or WAMA
Odds ratios for categorical independent variables • Odds ratio of the outcome for the category of interest compared to the reference category. • “Infants born to smokers had 1.4times the odds of low birth weight (LBW) as those born to nonsmokers (p < 0.01).” • Concepts,reference category,direction,magnitude, and statistical significance.
Odds ratios for continuous independent variables • Odds ratio of the outcome for a one-unit increase in the independent variable. • “Odds of LBWdecreased by about 0.8%for each 1 year increase in mother’s age (NS).” • Concepts,units,direction,magnitude, and statistical significance.
Topic sentences for paragraphs reporting multivariate results • Start each paragraph of the results section with a restatement of topic addressed by analysis to be reported in that paragraph. • Can paraphrase title of table or chart that reports the detailed statistical results. • Topic sentence should mention: • Dependent variable. • Independent variable(s). • Use summary phrase rather than long list of variables. • Type of analysis.
Example topic sentences • “Multivariate logistic regressionresults show thatinsuranceis a powerful predictor oflength of stay (table X).” [Next sentence goes into detail about direction, size, and statistical significance.] • Mentionstype of analysis, dependent variable, andindependent variable. • “As shown in figure Y,race and income levelinteract in their effect on risk of asthma.”
Summary • Report detailed multivariate results in tables. • Interpret coefficients in prose. • Specify direction, magnitude, and statistical significance of associations. • Units for continuous variables • Categories for nominal or ordinal variables • Write about concepts, not acronyms. • Introduce concepts under study in topic sentences.
Suggested resources • Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. • Chapter 5, on creating effective multivariate tables • Chapter 8, on wording for results of • subtraction (OLS βs) • ratios (logit βs) • Chapter 9, on writing about βs from OLS and logit models • Chapter 10, on the Goldilocks problem for choosing a fitting contrast size for interpreting coefficients
Suggested online resources • Podcasts on • Comparing two numbers or series • Choosing a reference category • Defining the Goldilocks problem • Resolving the Goldilocks problem: Presenting results • Differentiating between statistical significance and substantive importance
Suggested practice exercises • Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. • Problem sets for chapters 9 and 15 • Suggested course extensions for chapters 9 and 15 • “Reviewing,” “writing” and “revising” exercises.
Contact information Jane E. Miller, PhD jmiller@ifh.rutgers.edu Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html