Joyful mood is a meritorious deed that cheers up people around you like the showering of cool spring breeze.
Categorical Data Analysis Chapter 8: Loglinear Models for Contingency Tables (SAS: Chapter 12)
Loglinear vs. Logit Models
• Loglinear models treat categorical variables equally, focusing on associations and interactions in their joint distribution.
• Logit models, by contrast, describe how a single categorical variable (response) depends on other (explanatory) variables.
Distributions/Models for Categorical Data
• Binomial data (logit models): PROC LOGISTIC
  • When the sample size n is fixed and the response per subject is binary
  • Use logit models; SAS link: logit
• Multinomial data (general logit models): PROC LOGISTIC
  • When the sample size n is fixed and the response per subject has more than two categories
  • Use baseline-category logit models (SAS link: glogit) for a nominal response; use cumulative logit (proportional odds) models (SAS link: clogit) for an ordinal response
• Poisson data (loglinear models): PROC GENMOD
  • When the sample size n is not fixed and the response per subject is binary or multi-category
  • A Poisson model conditioned on a given n is a binomial/multinomial model (Sec. 1.2.5)
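The last point (a Poisson model conditioned on the total n is a multinomial model) can be checked numerically. The sketch below is only an illustration: the Poisson means and the counts are hypothetical values chosen for the check, not data from these slides.

```python
from math import exp, factorial, isclose

def poisson_pmf(k, mu):
    """P(N = k) for N ~ Poisson(mu)."""
    return exp(-mu) * mu**k / factorial(k)

def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts` with cell probabilities `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)          # exact integer division at every step
    p = float(coef)
    for c, pr in zip(counts, probs):
        p *= pr**c
    return p

def conditional_poisson_pmf(counts, means):
    """P(counts | total) for independent Poisson cells with the given means."""
    joint = 1.0
    for c, m in zip(counts, means):
        joint *= poisson_pmf(c, m)
    # The sum of independent Poisson counts is Poisson with the summed mean.
    total = poisson_pmf(sum(counts), sum(means))
    return joint / total

# Hypothetical cell means and observed counts (illustration only).
means = [2.0, 3.0, 5.0]
counts = [1, 2, 3]
probs = [m / sum(means) for m in means]   # implied multinomial probabilities

lhs = conditional_poisson_pmf(counts, means)
rhs = multinomial_pmf(counts, probs)
print(isclose(lhs, rhs, rel_tol=1e-9))
```

The implied multinomial probabilities are the Poisson means divided by their sum, which is why loglinear (Poisson) fits and multinomial fits lead to the same inference about associations.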
Example: Z = Sex [tables not reproduced: partial tables for each level of Z, and the marginal table]
What Models to Use
• If the row totals are pre-fixed (prospective study):
  • Can only study the column (response) distribution within a given row (factor level)
  • Binomial or multinomial data (logit models)
• If the grand total is pre-fixed (cross-sectional study):
  • Can study the joint distribution of response and factors (all variables)
  • Can be treated as binomial or multinomial data (logit models)
• If nothing is pre-fixed / totally observational (retrospective study):
  • Can study the joint distribution of response and factors (all variables)
  • Poisson data / loglinear models
Loglinear Models for Counts
• Poisson counts: count ~ Poisson(μ)
• Qualitative factors: X, Y, …
• Saturated model: $\log \mu_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \lambda_{ij}^{XY}$
As usual, the baseline (last level) effects are set to 0 for each term.
Independence Model
• No interaction effect between X and Y on counts; that is, X and Y are independent: $\log \mu_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y}$
As usual, the baseline effects are set to 0 for each term.
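As a minimal sketch of what fitting the independence model involves: its maximum-likelihood fitted counts have the closed form (row total × column total) / n, and the fit can be compared with the saturated model through the likelihood-ratio statistic G². The 2×2 counts below are hypothetical and serve only to make the arithmetic concrete.

```python
from math import log

def independence_fit(table):
    """Fitted counts under independence: row total * column total / n."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]

def deviance(observed, fitted):
    """Likelihood-ratio statistic G^2 = 2 * sum(obs * log(obs / fitted))."""
    g2 = 0.0
    for obs_row, fit_row in zip(observed, fitted):
        for o, f in zip(obs_row, fit_row):
            if o > 0:                  # cells with a zero count contribute 0
                g2 += 2.0 * o * log(o / f)
    return g2

# Hypothetical 2x2 table (rows = levels of X, columns = levels of Y).
observed = [[20, 30],
            [40, 10]]

fitted = independence_fit(observed)
g2 = deviance(observed, fitted)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(fitted)   # the fitted odds ratio is exactly 1 under independence
print(round(g2, 3), df)
```

A large G² relative to a chi-squared distribution with df degrees of freedom suggests that the XY interaction term is needed.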
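Interpretation of Parameters
• The effect of a factor on the log odds, for an I×2 table, comparing level i with baseline level I:
  • Without XY term: 0 (the log odds of Y do not depend on X)
  • With XY term: $\lambda_{i1}^{XY}$ (the XY interaction parameter for cell (i, 1), with baseline effects set to 0)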
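The slide's formulas did not survive extraction, so the following derivation is a reconstruction under stated assumptions: the standard saturated parameterization $\log \mu_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \lambda_{ij}^{XY}$ for an I×2 table, with all parameters involving a last level (i = I or j = 2) set to 0.

```latex
\begin{aligned}
\log\frac{\mu_{i1}}{\mu_{i2}}
  &= \bigl(\lambda_1^{Y}-\lambda_2^{Y}\bigr)
   + \bigl(\lambda_{i1}^{XY}-\lambda_{i2}^{XY}\bigr)
   = \lambda_1^{Y} + \lambda_{i1}^{XY}, \\[4pt]
\log\frac{\mu_{i1}/\mu_{i2}}{\mu_{I1}/\mu_{I2}}
  &= \bigl(\lambda_1^{Y} + \lambda_{i1}^{XY}\bigr)
   - \bigl(\lambda_1^{Y} + \lambda_{I1}^{XY}\bigr)
   = \lambda_{i1}^{XY}.
\end{aligned}
```

Dropping the XY term sets every $\lambda_{ij}^{XY}$ to 0, so the log odds equal $\lambda_1^{Y}$ at every level of X and the log odds ratio against the baseline is 0.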
Associations in 3-way Tables
• Let Y be the response, X the primary factor, and Z a nuisance factor
• The observed marginal association between X and Y might be due simply to the other factor Z
• In general, we cannot collapse a 3-way table over Z and interpret the resulting 2-way marginal table
Example: Z = Sex [tables not reproduced: partial tables for each level of Z, and the marginal table]
Types of Independence of X, Y [diagram: the types below arranged on a scale from weak to strong]
• Conditionally independent given Z
• Mutually independent with Z
• Jointly independent of Z
• Marginally independent
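Associations in 3-way Tables
E.g., 2×2×K tables
• Conditional odds ratio: $\theta_{XY(k)} = \dfrac{\mu_{11k}\,\mu_{22k}}{\mu_{12k}\,\mu_{21k}}$, k = 1, …, K
• Marginal odds ratio: $\theta_{XY} = \dfrac{\mu_{11+}\,\mu_{22+}}{\mu_{12+}\,\mu_{21+}}$, where $\mu_{ij+} = \sum_k \mu_{ijk}$
• Marginal independence of X, Y: the marginal X-Y odds ratio is 1
• Conditional independence of X, Y given Z: the conditional X-Y odds ratios given Z are all 1
• Homogeneous association of X, Y given Z: the conditional X-Y odds ratios given Z are all equal (not necessarily 1)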
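The distinction between conditional and marginal odds ratios can be made concrete with a small sketch. The 2×2×2 counts below are hypothetical, constructed so that X and Y are conditionally independent given Z while the collapsed table still shows an association.

```python
def odds_ratio(table):
    """Odds ratio (n11 * n22) / (n12 * n21) of a 2x2 table."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

def marginal_table(partials):
    """Collapse a list of 2x2 partial tables over Z by summing cell-wise."""
    return [[sum(t[i][j] for t in partials) for j in range(2)]
            for i in range(2)]

# Hypothetical partial tables (rows = X, columns = Y), one per level of Z.
partials = [
    [[10, 20], [20, 40]],   # Z = 1
    [[40, 20], [20, 10]],   # Z = 2
]

conditional = [odds_ratio(t) for t in partials]
marginal = odds_ratio(marginal_table(partials))

print(conditional)   # [1.0, 1.0]: conditional independence given Z
print(marginal)      # 1.5625: the marginal table shows an association
```

Here both conditional odds ratios are 1, yet the marginal odds ratio is not, which is why a 3-way table generally cannot be collapsed over Z before interpreting the X-Y association.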
Partial Association (Sec 2.3)
• The associations in partial tables are called “partial” associations between X and Y given Z
• They are measured by conditional odds ratios
Associations in 3-way Tables
• Ideally we would condition on all important variables, but this is rarely practical.
• In randomized experiments this (confounding) problem is less likely to arise.
• The goal is to study whether an association exists between a primary factor and the response variable AFTER controlling for other possibly confounding variables, such as:
  • Different medical centers
  • Severity of condition
  • Age
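Loglinear Models for 3-way Tables
• Saturated (also called full) model: $\log \mu_{ijk} = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \lambda_k^{Z} + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} + \lambda_{ijk}^{XYZ}$
Deviance is the likelihood-ratio test statistic for H0: current model vs. H1: saturated model; it can serve as a measure of goodness of fit.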
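To illustrate how the deviance compares a reduced model with the saturated one, the sketch below fits the homogeneous association model (XY, XZ, YZ), i.e. the saturated model without the XYZ term, and computes G². It uses iterative proportional fitting, a standard algorithm for this model that the slides do not mention (they refer to SAS PROC GENMOD); the function names and the 2×2×2 counts are hypothetical, and all margins are assumed to be positive.

```python
from math import log

def fit_homogeneous_association(n, max_iter=500, tol=1e-10):
    """Fit the (XY, XZ, YZ) loglinear model to counts n[i][j][k] by
    iterative proportional fitting; returns fitted counts keyed by (i, j, k)."""
    I, J, K = len(n), len(n[0]), len(n[0][0])
    cells = [(i, j, k) for i in range(I) for j in range(J) for k in range(K)]
    mu = {c: 1.0 for c in cells}
    # Each two-way margin is identified by the pair of axes it keeps.
    margins = [(0, 1), (0, 2), (1, 2)]
    for _ in range(max_iter):
        biggest_change = 0.0
        for axes in margins:
            obs_tot, fit_tot = {}, {}
            for (i, j, k) in cells:
                key = tuple((i, j, k)[a] for a in axes)
                obs_tot[key] = obs_tot.get(key, 0.0) + n[i][j][k]
                fit_tot[key] = fit_tot.get(key, 0.0) + mu[(i, j, k)]
            # Rescale fitted counts so this margin matches the observed one.
            for c in cells:
                key = tuple(c[a] for a in axes)
                new = mu[c] * obs_tot[key] / fit_tot[key]
                biggest_change = max(biggest_change, abs(new - mu[c]))
                mu[c] = new
        if biggest_change < tol:
            break
    return mu

def deviance(n, mu):
    """G^2 = 2 * sum(obs * log(obs / fitted)) over cells with obs > 0."""
    return 2.0 * sum(n[i][j][k] * log(n[i][j][k] / m)
                     for (i, j, k), m in mu.items() if n[i][j][k] > 0)

# Hypothetical counts n[i][j][k] (i = X level, j = Y level, k = Z level).
n = [[[10, 25], [20, 15]],
     [[30, 10], [40, 30]]]

fitted = fit_homogeneous_association(n)
g2 = deviance(n, fitted)
print(round(g2, 3))   # compare with chi-squared, df = (I-1)(J-1)(K-1) = 1
```

Because the fitted counts contain no three-factor term, their conditional X-Y odds ratios are identical across the levels of Z; a large G² indicates that this restriction does not hold in the data.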
Interpreting Model Parameters
• X: effect of X on (expected) counts
• XY: the partial association between X and Y given Z
• XYZ:
  • significant: the XY association depends on Z
  • insignificant: the XY association does not depend on Z
Inference for Loglinear Models
• Goodness-of-fit tests
• Residuals
• Tests for partial associations
• Confidence intervals for odds ratios
The Loglinear-Logit Connection
• Using logit models to interpret loglinear models
• Correspondence between loglinear and logit models
Connection with Logit Models
• The loglinear model which corresponds to a logit model is the one with the most general interaction among explanatory variables from the logit model. It has the same association and interaction structure relating the explanatory variables to the response.
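A short derivation may make this correspondence concrete. It assumes a three-way table with a binary response Y and explanatory variables X and Z, and takes the loglinear model (XY, XZ, YZ) as the example; the symbols α, β are introduced here for illustration only.

```latex
\begin{aligned}
\operatorname{logit} P(Y=1 \mid X=i, Z=k)
  &= \log\frac{\mu_{i1k}}{\mu_{i2k}} \\
  &= \bigl(\lambda_1^{Y}-\lambda_2^{Y}\bigr)
   + \bigl(\lambda_{i1}^{XY}-\lambda_{i2}^{XY}\bigr)
   + \bigl(\lambda_{1k}^{YZ}-\lambda_{2k}^{YZ}\bigr) \\
  &= \alpha + \beta_i^{X} + \beta_k^{Z}.
\end{aligned}
```

Every term that does not involve Y, including the XZ interaction among the explanatory variables, cancels in the log odds. The loglinear model (XY, XZ, YZ) therefore corresponds to the logit model with additive main effects of X and Z, while the XZ term stays in the loglinear model to represent the most general association among the explanatory variables.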