Summer School Week 2
Contents • Logistic regression refresher • Some familiar + some less familiar polytomous models • 1PL/2PL in Stata and R • PCM/RSM/GRM in R • Link IRT to CFA/UV in Mplus • DIF/MIMIC in Mplus
Types of outcome • Two categories • Binary / dichotomous • Ordered • e.g. low birthweight (< 2500g), height > 6ft, age > 70 • Unordered • e.g. gender, car-ownership, disease status • Presence of ordering is unimportant for binaries
Types of outcome • 3+ categories • Polytomous • Ordered (ordinal) • Age (<30,30-40,41+) • “Likert” items (str disagree, disagree, …, str agree) • Unordered (nominal) • Ethnicity (white/black/asian/other) • Pet ownership (none/cat/dog/budgie/goat)
Binary Logistic Regression The probability of a positive response / outcome given a covariate:

P(u = 1 | x) = exp(α + βx) / [1 + exp(α + βx)]

where α is the intercept and β the regression coefficient
Binary Logistic Regression The probability of a negative response is the complement:

P(u = 0 | x) = 1 − P(u = 1 | x) = 1 / [1 + exp(α + βx)]
Logit link function • Probabilities lie only in the range [0,1] • The logit transformation, logit(p) = ln[p/(1−p)], maps (0,1) onto the whole real line (−inf, inf) • The logit is linear in the covariates
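The transformation and its inverse are one-liners; a minimal Python sketch (illustrative, not part of the course code):

```python
import math

def logit(p):
    """Map a probability in (0, 1) to a log-odds value on the whole real line."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Map a log-odds value back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-x))

# A probability of 0.5 corresponds to a log-odds of exactly zero
print(logit(0.5))      # 0.0
print(inv_logit(0.0))  # 0.5
```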
Simple example – cts predictor • Relationship between birthweight and head circumference (at birth) • Exposure: birthweight (standardized)

 variable |    mean      sd
----------+----------------
      bwt | 3381.5g  580.8g
---------------------------
Simple example – cts predictor • Outcome: Head-circumference ≥ 53cm

 headcirc |   Freq.       %
----------+----------------
        0 |   8,898   84.4%
        1 |   1,651   15.7%
----------+----------------
    Total |  10,549
Simple example – cts predictor The raw data – doesn’t show much
Simple example – cts predictor Logistic regression models the probabilities (here shown for deciles of bwt)

           |                  bwt_z_grp
  headcirc |     0      1      2      3      4 |
-----------+-----------------------------------+--
         0 | 1,006    993  1,050    946  1,024 |
           | 99.80  98.12  97.95  96.04  93.35 |
-----------+-----------------------------------+--
         1 |     2     19     22     39     73 |
           |  0.20   1.88   2.05   3.96   6.65 |
-----------+-----------------------------------+--

  headcirc |     5      6      7      8      9 |  Total
-----------+-----------------------------------+-------
         0 |   931    922    856    688    381 |  8,797
           | 89.95  84.98  81.84  66.67  35.94 |  84.33
-----------+-----------------------------------+-------
         1 |   104    163    190    344    679 |  1,635
           | 10.05  15.02  18.16  33.33  64.06 |  15.67
-----------+-----------------------------------+-------
Simple example – cts predictor Increasing, non-linear relationship
Simple example – cts predictor

Logistic regression                       Number of obs =     10432
                                          LR chi2(1)    =   2577.30
                                          Prob > chi2   =    0.0000
Log likelihood = -3240.9881               Pseudo R2     =    0.2845

------------------------------------------------------------------------------
    headcirc | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       bwt_z |   7.431853    .378579    39.38   0.000     6.72569    8.212159
------------------------------------------------------------------------------

Or in the less familiar log-odds format

------------------------------------------------------------------------------
    headcirc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       bwt_z |   2.005775   .0509401    39.38   0.000    1.905935    2.105616
       _cons |  -2.592993   .0474003   -54.70   0.000   -2.685896    -2.50009
------------------------------------------------------------------------------
Simple example – cts predictor Fitted model – logit scale
Simple example – cts predictor Fitted model – logit scale Cons = -2.59 Slope = 2.00
Simple example – cts predictor But also… a logit of zero represents the point at which both levels of the outcome are equally likely
Simple example – cts predictor Fitted model – probability scale
Simple example – cts predictor Fitted model – probability scale Point at which curve changes direction
Simple example – cts predictor Observed and fitted values (within deciles of bwt)
LogR cts predictor - summary • Logit is linearly related to the covariate • Gradient gives the strength of association • Intercept is related to the prevalence of the outcome, but is seldom interpreted directly • Non-linear (S-shaped) relationship between probabilities and the covariate • Steepness of the linear section reflects the strength of association • The point at which the curve changes direction, where P(u=1|X) = P(u=0|X), can be thought of as the location, and is related to the prevalence of the outcome
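The fitted line can be turned back into probabilities by hand; an illustrative Python sketch (not from the course materials), plugging in the rounded estimates α ≈ −2.593 and β ≈ 2.006 from the Stata output above:

```python
import math

alpha, beta = -2.593, 2.006   # intercept and slope from the Stata output

def p_hat(x):
    """Fitted P(headcirc >= 53cm | standardized bwt = x)."""
    return 1 / (1 + math.exp(-(alpha + beta * x)))

# Fitted probability at average birthweight (bwt_z = 0): roughly 0.07
print(round(p_hat(0), 3))

# The curve changes direction where the logit is zero: x = -alpha/beta,
# about 1.29 SDs above mean birthweight
x_mid = -alpha / beta
print(round(x_mid, 2))
```

At x_mid the fitted probability is exactly 0.5, which is the "location" described in the summary above.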
LogR – binary predictor • Define a binary predictor: bwt ≥ 8lb • 32% of the sample had a birthweight of 8lb+ • Same outcome • Head circumference ≥ 53cm • Does being 8lb+ at birth increase the chance of being born with a larger head?
Association can be cross-tabbed

           |      headcirc
   bwt_8lb |      0       1 |   Total
-----------+----------------+--------
         0 |  6,704     384 |   7,088
           |  94.58    5.42 |  100.00
-----------+----------------+--------
         1 |  2,093   1,251 |   3,344
           |  62.59   37.41 |  100.00
-----------+----------------+--------
     Total |  8,797   1,635 |  10,432
           |  84.33   15.67 |  100.00
Association can be cross-tabbed From the crosstab:

Familiar with (6704 × 1251) / (2093 × 384) = 10.43 = odds-ratio

However ln[(6704 × 1251) / (2093 × 384)] = 2.345 = log odds-ratio

and ln(384 / 6704) = ln(0.057) = −2.86 = intercept on logit scale
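The cell-count arithmetic is easy to verify directly; a Python sketch using the counts from the crosstab:

```python
import math

# Cell counts from the crosstab (rows: bwt_8lb = 0/1, cols: headcirc = 0/1)
a, b = 6704, 384    # bwt_8lb = 0
c, d = 2093, 1251   # bwt_8lb = 1

odds_ratio = (a * d) / (c * b)
log_or = math.log(odds_ratio)
intercept = math.log(b / a)   # log-odds of the outcome in the unexposed group

print(round(odds_ratio, 2))   # 10.43
print(round(log_or, 3))       # 2.345
print(round(intercept, 2))    # -2.86
```

These match the `bwt_8lb` coefficient and `_cons` in the Stata logit output below: with a single binary predictor, the ML estimates can be read straight off the crosstab.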
Logit output (from Stata)

. logit headcirc bwt_8lb

Logistic regression                       Number of obs =     10432
                                          LR chi2(1)    =   1651.89
                                          Prob > chi2   =    0.0000
Log likelihood = -3703.6925               Pseudo R2     =    0.1823

------------------------------------------------------------------------------
    headcirc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     bwt_8lb |   2.345162    .063486    36.94   0.000    2.220732    2.469592
       _cons |  -2.859817   .0524722   -54.50   0.000   -2.962661   -2.756974
------------------------------------------------------------------------------
What lovely output figures! Intercept = -2.86, Slope = 2.35 The relationship is linear in logit space; an S-shape is still assumed on the probability scale, although with only two levels of the predictor the curve itself is not apparent
LogR binary predictor - summary • The same maths/assumptions underlie the models with a binary predictor • Estimation is simpler – can be done from crosstab rather than needing ML • Regression estimates relate to linear relationship on logit scale
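Because this model is saturated (one parameter per predictor level), the fitted probabilities from the Stata coefficients reproduce the crosstab row percentages; a quick Python check (illustrative sketch):

```python
import math

def inv_logit(x):
    """Map a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-x))

alpha, beta = -2.859817, 2.345162   # coefficients from the Stata output above

# Fitted P(headcirc = 1) at each level of bwt_8lb
p0 = inv_logit(alpha)          # bwt_8lb = 0
p1 = inv_logit(alpha + beta)   # bwt_8lb = 1

print(round(100 * p0, 2))   # 5.42  -- matches the crosstab row percentage
print(round(100 * p1, 2))   # 37.41 -- likewise
```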
Multinomial Logistic Regression • Typically used for non-ordinal (nominal) outcomes • Can be used for ordered data (though some information is then ignored) • 3+ outcome levels • Adding another level adds another set of parameters, so more than 4 or 5 levels can be unwieldy
Multinomial Logistic Regression

P(y = c | x) = exp(αc + βc·x) / Σk exp(αk + βk·x), where β0 = α0 = 0

Here the probabilities are obtained by a “divide-by-total” procedure
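The divide-by-total step can be sketched numerically; a hedged Python illustration (the coefficient values plugged in are the rounded estimates reported for the continuous-bwt mlogit later in the deck):

```python
import math

def mlogit_probs(x, alphas, betas):
    """Divide-by-total: P(y=c | x) = exp(a_c + b_c*x) / sum_k exp(a_k + b_k*x),
    with the baseline category fixed at a_0 = b_0 = 0."""
    scores = [math.exp(a + b * x) for a, b in zip(alphas, betas)]
    total = sum(scores)
    return [s / total for s in scores]

# Baseline category first (0, 0), then the three fitted sets of estimates
alphas = [0, 1.06, 0.78, 0.33]
betas  = [0, 2.10, 3.52, 4.88]

# Probabilities across the four outcome levels at average birthweight;
# they necessarily sum to one
probs = mlogit_probs(0.0, alphas, betas)
print([round(p, 3) for p in probs])
```

At large positive x the category with the steepest slope dominates, which is the pattern seen in the fitted-model slides below.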
Examples • Outcome: head-circumference • 4 roughly equal groups (quartiles) • Ordering will be ignored

   headcirc4 |   Freq.  Percent
-------------+------------------
     <= 49cm |   2,574    24.4%
 49.1–50.7cm |   2,655    25.2%
 50.8–51.9cm |   2,260    21.4%
      52+ cm |   3,060    29.0%
-------------+------------------
       Total |  10,549   100.00

• Exposure 1: birthweight of 8lb or more • Exposure 2: standardized birthweight
Exposure 1: bwt > 8lb • 32% of the sample had a birthweight of 8lb+ • Does being 8lb+ at birth increase the chance of being born with a larger head? • Unlike the logistic model, we are concerned with three probabilities • P(headcirc = 49.1 – 50.7cm) • P(headcirc = 50.8 – 51.9cm) • P(headcirc = 52+cm) • Each is referenced against the “negative response”, i.e. that headcirc <= 49cm
Exposure 1: bwt > 8lb

. mlogit headcirc4 bwt_8lb, baseoutcome(0)

Multinomial logistic regression
--------------------------------------------------------------
   headcirc4 |  Coef.    SE       z     P>|z|     [95% CI]
-------------+------------------------------------------------
1            |
     bwt_8lb |   1.56   .135   11.53   0.000    1.30    1.83
       _cons |   -.07   .029   -2.30   0.022   -0.12   -0.01
-------------+------------------------------------------------
2            |
     bwt_8lb |   3.09   .129   23.98   0.000    2.84    3.34
       _cons |   -.58   .034  -17.33   0.000   -0.65   -0.52
-------------+------------------------------------------------
3            |
     bwt_8lb |   4.39   .127   34.43   0.000    4.14    4.64
       _cons |   -.99   .039  -25.56   0.000   -1.06   -0.92
--------------------------------------------------------------
(headcirc4==0 is the base outcome)

3 sets of results. Each is referenced against the “baseline” group, i.e. <=49cm
Exposure 1: bwt > 8lb

. mlogit headcirc4 bwt_8lb, baseoutcome(0)

Multinomial logistic regression
---------------------------------
   headcirc4 |  Coef.   (SE)
-------------+-------------------
1            |
     bwt_8lb |   1.56   (.135)
       _cons |   -.07   (.029)
-------------+-------------------
2            |
     bwt_8lb |   3.09   (.129)
       _cons |   -.58   (.034)
-------------+-------------------
3            |
     bwt_8lb |   4.39   (.127)
       _cons |   -.99   (.039)
---------------------------------
(headcirc4==0 is the base outcome)

Logistic regression
----------------------------------
  head_1 |      Coef.   Std. Err.
---------+------------------------
 bwt_8lb |    1.56099   .1353772
   _cons |  -.0664822   .0289287
----------------------------------

Logistic regression
----------------------------------
  head_2 |      Coef.   Std. Err.
---------+------------------------
 bwt_8lb |   3.088329   .1287576
   _cons |  -.5822197   .0335953
----------------------------------

Logistic regression
----------------------------------
  head_3 |      Coef.   Std. Err.
---------+------------------------
 bwt_8lb |   4.389338    .127473
   _cons |  -.9862376   .0385892
----------------------------------
Exposure 1: bwt > 8lb • For a categorical exposure, a multinomial logistic model fitted over 4 outcome levels gives the same estimates as 3 logistic models, i.e. Logit(0v1) Multinomial(0v1,0v2,0v3) ≡ Logit(0v2) Logit(0v3) • In this instance, the single model is merely more convenient and allows the testing of equality constraints
Exposure 2: Continuous bwt • Using standardized birthweight we are interested in how the probability of having a larger head, i.e. • P(headcirc = 49.1 – 50.7cm) • P(headcirc = 50.8 – 51.9cm) • P(headcirc = 52+cm) increases as birthweight increases • As with the binary logistic models, estimates will reflect • A change in log-odds per SD change in birthweight • The gradient or slope on the logit scale
Exposure 2: Continuous bwt

. mlogit headcirc4 bwt_z, baseoutcome(0)

Multinomial logistic regression
--------------------------------------------------------------
   headcirc4 |  Coef.    SE       z     P>|z|     [95% CI]
-------------+------------------------------------------------
1            |
       bwt_z |   2.10   .063   33.11   0.000    1.97    2.22
       _cons |   1.06   .044   23.85   0.000    0.97    1.14
-------------+------------------------------------------------
2            |
       bwt_z |   3.52   .078   44.89   0.000    3.37    3.68
       _cons |   0.78   .046   16.95   0.000    0.69    0.87
-------------+------------------------------------------------
3            |
       bwt_z |   4.88   .086   56.90   0.000    4.72    5.05
       _cons |   0.33   .051    6.51   0.000    0.23    0.43
--------------------------------------------------------------
(headcirc4==0 is the base outcome)
Exposure 2: Continuous bwt

Logistic regression
-------------------------------------
  head_1 |      Coef.   Std. Err.
---------+---------------------------
   bwt_z |   2.093789   .0650987
   _cons |   1.058445   .0447811
-------------------------------------

Logistic regression
-------------------------------------
  head_2 |      Coef.   Std. Err.
---------+---------------------------
   bwt_z |   3.355041   .0959539
   _cons |   .6853272   .0464858
-------------------------------------

Logistic regression
-------------------------------------
  head_3 |      Coef.   Std. Err.
---------+---------------------------
   bwt_z |   3.823597   .1065283
   _cons |   .3129028   .0492469
-------------------------------------

Multinomial logistic regression
------------------------------
   headcirc4 |  Coef.   (SE)
-------------+----------------
1            |
       bwt_z |   2.10   (.063)
       _cons |   1.06   (.044)
-------------+----------------
2            |
       bwt_z |   3.52   (.078)
       _cons |   0.78   (.046)
-------------+----------------
3            |
       bwt_z |   4.88   (.086)
       _cons |   0.33   (.051)
------------------------------
(headcirc4==0 is the base outcome)

No longer identical
Exposure 2: Continuous bwt • Outcome level 2 = [49.1 – 50.7]: Intercept = 1.06, Slope = 2.10 (shallowest) • Outcome level 3 = [50.8 – 51.9]: Intercept = 0.78, Slope = 3.52 • Outcome level 4 = [52.0+]: Intercept = 0.33, Slope = 4.88 (steepest) • Risk of being in outcome level 4 increases most sharply as bwt increases
Ordinal Logistic Models • When applicable, it is useful to favour ordinal models over multinomial models • If outcome levels are increasing, e.g. in severity of a condition or agreement with a statement, we expect the model parameters to behave in a certain way • The typical approach is to fit ordinal models with constraints, resulting in greater parsimony (fewer parameters)
Contrasting among response categories - some alternative models For a 4-level outcome there are three comparisons to be made • Model 1 – baseline-category contrasts, as used in the multinomial logistic model: 0 vs 1, 0 vs 2, 0 vs 3 • Model 2 – cumulative contrasts, used with the proportional-odds ordinal model: {0} vs {1,2,3}, {0,1} vs {2,3}, {0,1,2} vs {3} • Model 3 – adjacent-category contrasts: 0 vs 1, 1 vs 2, 2 vs 3
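The three schemes can be made concrete by enumerating the comparisons each implies for outcome levels 0–3 (illustrative Python, not from the course materials):

```python
# For a 4-level ordered outcome (0, 1, 2, 3), each scheme makes 3 comparisons
levels = [0, 1, 2, 3]

# Model 1 -- baseline category (multinomial): each level vs level 0
baseline = [(0, k) for k in levels[1:]]

# Model 2 -- cumulative (proportional odds): every split of the ordered scale
cumulative = [(levels[:k], levels[k:]) for k in range(1, len(levels))]

# Model 3 -- adjacent categories: each level vs the next one up
adjacent = [(k, k + 1) for k in levels[:-1]]

print(baseline)    # [(0, 1), (0, 2), (0, 3)]
print(cumulative)  # [([0], [1, 2, 3]), ([0, 1], [2, 3]), ([0, 1, 2], [3])]
print(adjacent)    # [(0, 1), (1, 2), (2, 3)]
```

Only the cumulative scheme uses all of the ordering information in every comparison, which is why the proportional-odds model can constrain all three comparisons to share a single slope.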