
SC968: Panel Data Methods for Sociologists


Presentation Transcript


  1. SC968: Panel Data Methods for Sociologists Random coefficients models

  2. Overview • Random coefficients models • Continuous data • Binary data • Growth curves

  3. Random coefficients models Also known as • Multilevel models (MLwiN, http://www.cmm.bristol.ac.uk/) • Hierarchical models (HLM, http://www.ssicentral.com/) • Mixed models (Stata)

  4. Random coefficients models for continuous outcomes

  5. Random coefficients models • We started off with OLS models that pooled data across many waves of panel data • We separated the between and within variance with fixed effects and between effects models • Then we allowed intercepts to vary for each individual using random effects models • We can also allow the coefficients for the independent variables to vary for each individual • These models are called random coefficients or random slopes models

  6. Example of a random coefficients model • Schools’ mean maths scores and student socioeconomic status (SES) • Students at level 1 nested within schools at level 2 • Using a random coefficients model we can estimate • Overall mean maths score • How SES relates to individual maths scores • Within school variability in maths scores • Between school variability in mean maths scores • Between school variability in the relationship between SES and individual maths scores
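
A minimal sketch of how such a model might be specified in Stata, using hypothetical variable names mathscore, ses, and schoolid:

* students at level 1 nested within schools at level 2;
* random intercept and random SES slope across schools
xtmixed mathscore ses || schoolid: ses, mle cov(unstr) variance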

  7. Another example • Children’s emotional problems • Suppose we have problems measured each year from 2000-2007 for pupils in a junior school • Want to know if a school policy implemented in 2004 reduces problems • Emotional problems for each year at level-1 and pupils at level-2 • Using a random coefficients model we can examine • Levels of emotional problems, averaged over years 2000-2007 • Within pupil variability in emotional problems • Between pupil variability in emotional problems • Whether the intervention reduced emotional problems • Whether the intervention had different effects for different children • What pupil characteristics made the intervention more or less successful

  8. Possible combinations of slopes and intercepts with panel data: constant slopes, constant intercept (the OLS model)

  9. Possible combinations of slopes and intercepts with panel data: constant slopes, varying intercepts (the random effects model)

  10. Possible combinations of slopes and intercepts with panel data: varying slopes, constant intercept (unlikely to occur)

  11. Possible combinations of slopes and intercepts with panel data: varying slopes, varying intercepts (the random coefficients model: a separate regression for each individual)

  12. Random coefficients model for continuous data: yij = (β0 + ui) + (β1 + bi)xij + εij. The fixed coefficients are β0 and β1, the random coefficients are ui and bi, and εij is the residual.

  13. Random coefficients model for continuous data: β0 is the fixed intercept, β1 the fixed slope, ui the random intercept, bi the random slope, and εij the random error.

  14. Partitioning unexplained variance in a random coefficients model (diagram): the total variance at each level is divided into variance explained by the predictors, variance due to the random intercept, variance due to the random slopes, and remaining unexplained variance.

  15. Steps in multi-level modelling (Hox, 1995) 1. Compute variance for the baseline/null/unconditional model, which includes only the intercept. 2. Compute variance for the model with level-1 independent variables included and the variance components of the slopes constrained to zero (that is, a fixed coefficients model). 3. Use a chi-square difference test to see if the fixed coefficients model has a significantly better fit than the baseline model. If it does, proceed to investigate random coefficients. At this stage, non-significant level-1 independent variables can be dropped from the model.

  16. Steps in multi-level modelling (Hox, 1995) 4. Identify which level-1 regression coefficients have significant variance across level-2 groups. Compute -2LL for the model with the variance components of the level-1 coefficients constrained to zero only for the coefficients which do not have significant variance across level-2 groups. 5. Add level-2 independent variables, determining which improve model fit. Drop variables which do not improve model fit. 6. Add cross-level interactions between explanatory level-2 variables and level-1 independent variables that had random coefficients (in step 3). Drop interactions which do not improve model fit.
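
A hedged sketch of these steps in Stata, using the variables from the worked example that follows (hlghq1, lnfihhmn, pid):

* step 1: baseline/null model with a random intercept only
xtmixed hlghq1 || pid:, mle variance
estimates store m0
* step 2: add the level-1 predictor with its coefficient fixed
xtmixed hlghq1 lnfihhmn || pid:, mle variance
estimates store m1
* step 3: chi-square (likelihood-ratio) difference test of model fit
lrtest m0 m1
* step 4: let the level-1 coefficient vary across individuals
xtmixed hlghq1 lnfihhmn || pid: lnfihhmn, mle cov(unstr) variance
estimates store m2
lrtest m1 m2   // conservative: variances are tested on their boundary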

  17. Worked example • Random 20% sample from BHPS • Waves 1 - 15 • Ages 21 to 59 • Outcome: GHQ likert scores • Explanatory variable: household income last month (logged)

  18. Random coefficients model example: yij = (β0 + ui) + (β1 + bi)xij + εij, where
      yij = GHQ score for subject i at wave j, j = 1, …, J
      xij = logged household income in the month before wave j
      β1 = mean slope
      bi = subject-specific random deviation from the mean slope
      ui = subject-specific random intercept

  19. Linear random coefficients model Stata output

  20. Stata command and output:

. xtmixed hlghq1 lnfihhmn || pid: lnfihhmn, mle cov(unstr) variance

Mixed-effects ML regression                     Number of obs      =     18541
Group variable: pid                             Number of groups   =      2508
                                                Obs per group: min =         1
                                                               avg =       7.4
                                                               max =        15
                                                Wald chi2(1)       =     28.11
Log likelihood = -55286.004                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
      hlghq1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnfihhmn |  -.4015666    .075741    -5.30   0.000    -.5500162    -.253117
       _cons |   14.40864   .5917387    24.35   0.000     13.24885    15.56843
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
pid: Unstructured            |
               var(lnfihhmn) |   2.073304   .3231129      1.527594    2.813961
                  var(_cons) |   144.5579   19.37199      111.1664    187.9793
         cov(lnfihhmn,_cons) |  -16.63265   2.488823     -21.51065   -11.75465
-----------------------------+------------------------------------------------
               var(Residual) |   18.10746   .2092684      17.70191     18.5223
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(3) = 5099.77          Prob > chi2 = 0.0000

Annotations: "|| pid: lnfihhmn" specifies the random slopes; lnfihhmn in the coefficient table is the fixed effect; cov(unstr) estimates the covariance between all random effects, giving the least restrictive model.

  21. The same xtmixed output, highlighting the fixed part: in the coefficient table, lnfihhmn is the fixed coefficient (slope) and _cons is the fixed intercept.

  22. The same xtmixed output, highlighting the random part: var(lnfihhmn) is the random slope variance, var(_cons) is the random intercept variance, and cov(lnfihhmn,_cons) is the covariation between the random intercept and random slope.

  23. Post estimation predictions Stata output

  24. Post estimation predictions – random coefficients

. predict re_slope re_int, reffects

      +----------------------------------+
      |      pid     re_int    re_slope  |
      |----------------------------------|
  1.  | 10019057  -4.833731    .4040526  |
  4.  | 10028005  -5.241494    .3758182  |
 16.  | 10042571  -1.442705    .1409662  |
 17.  | 10051538  -2.836288     .305106  |
 35.  | 10059377   5.487209   -.5189736  |
      +----------------------------------+

  25. Predicted individual regression lines

. gen intercept = _b[_cons] + re_int
. gen slope = _b[lnfihhmn] + re_slope

      +---------------------------------+
      |      pid   intercept      slope |
      |---------------------------------|
      | 10019057    9.574909    .002486 |
      | 10028005    9.167147  -.0257484 |
      | 10042571    12.96594  -.2606004 |
      | 10051538    11.57235  -.0964606 |
      | 10059377    19.89585  -.9205403 |
      +---------------------------------+
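
One possible way to visualise a few of these person-specific lines (the chosen pids and graph options are purely illustrative):

* predicted GHQ for each person at their observed incomes
gen yhat = intercept + slope*lnfihhmn
* overlay two individual regression lines
twoway (line yhat lnfihhmn if pid == 10019057, sort) ///
       (line yhat lnfihhmn if pid == 10059377, sort), ///
       ytitle("Predicted GHQ") xtitle("Logged household income")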

  26. Partitioning unexplained variance in a random coefficients model (diagram repeated from slide 14): the total variance at each level is divided into variance explained by the predictors, variance due to the random intercept, variance due to the random slopes, and remaining unexplained variance.

  27. Calculating the variance partition coefficient • Random intercepts model: VPC = between variance / total variance (i.e. between + within) = σ²u / (σ²u + σ²e), where σ²u is the random intercept variance and σ²e the residual variance.

  28. Calculating the variance partition coefficient • Random slopes model: the between variance now depends on x, giving σ²u + 2σub x + σ²b x² in the numerator. • At the intercept, x = 0, so these extra terms vanish and the VPC for the random slopes model reduces to the same as the random intercepts model.

  29. Variance partition coefficient for our example (figure). Tentative interpretation: least variability in GHQ for those on average incomes.
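
A rough sketch of how this VPC profile can be reproduced from the slide 20 variance estimates (the plotting range for logged income is an assumption):

* VPC as a function of logged household income x
twoway function y = (144.5579 - 2*16.63265*x + 2.073304*x^2) / (144.5579 - 2*16.63265*x + 2.073304*x^2 + 18.10746), range(4 12) xtitle("Logged household income") ytitle("VPC")
* the between variance, and hence the VPC, is smallest here:
display "VPC is smallest at lnfihhmn = " 16.63265/2.073304

The minimum falls at roughly 8 on the logged income scale, broadly in line with the slide's reading that GHQ varies least between people on average incomes.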

  30. Random coefficients models for categorical outcomes

  31. Random coefficients model for binary data: logit[Pr(yij = 1)] = β0 + Σk (βk + bik)xkij + ui, where
      βk is the mean coefficient or fixed effect of covariate k
      bik is a subject-specific random deviation from the mean coefficient
      ui is a subject-specific random intercept with mean zero

  32. Worked example • Random 20% sample • 15 waves of BHPS • Ages 21 to 59 • Outcome: GHQ binary scores (psychological morbidity cases: hlghq2 > 2) • Explanatory variable: employment status (jbstat recoded to employed / unemployed / out of the labour force (olf))
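
A minimal sketch of how the outcome and employment-status dummies in this worked example might be constructed (the jbstat code groupings shown are assumptions about the BHPS coding):

* psychological morbidity case if GHQ caseness score exceeds 2
gen ghq = (hlghq2 > 2) if !missing(hlghq2)
* dummies from jbstat: unemployed, and out of the labour force (olf)
gen unemp = (jbstat == 3) if !missing(jbstat)
gen olf = !inlist(jbstat, 1, 2, 3) if !missing(jbstat)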

  33. Logistic random coefficients example: logit[Pr(yij = 1)] = β0 + (β1 + bi)xij + ui, where
      yij = binary GHQ score for subject i at wave j, j = 1, …, J
      xij = employment status in wave j
      β1 = mean slope
      bi = subject-specific random deviation from the mean slope
      ui = subject-specific random intercept

  34. Logistic random coefficients model Stata output

  35. . xtmelogit ghq unemp olf || pid: unemp olf, variance cov(unstr)

  36. Notes on the output: no constant term is reported when the coefficients are displayed as odds ratios, and there is no random residual (level-1) variance term because this is a logit model.
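
One way to obtain the odds-ratio display these notes refer to is to replay the fitted model with the or option (a sketch, assuming the slide 35 model has just been estimated):

. xtmelogit, or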

  37. Random coefficients models for development over time

  38. Growth curve models • Models change over time as a continuous trajectory • Suitable for research questions such as • What is the trajectory for the population? • Are there distinct trajectories for each respondent? • If individuals have distinct trajectories, what variables predict these individual trajectories?

  39. Linear growth curve model • Individual growth curves: yit = (β0 + ui) + (β1 + bi)t + εit • t = 0 at baseline and 1, 2, 3, …, T in successive waves • Mean population growth curve: β0 + β1t

  40. Worked example • Random 20% sample from BHPS • Waves 1 - 15 • All respondents over 16 years • Outcome: self-rated health (hlstat) • 5-point Likert scale with higher scores indicating poorer health • Linear growth function
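
A hedged sketch of fitting this linear growth curve in Stata, assuming a variable wave indexing the BHPS waves 1-15:

* t = 0 at baseline (wave 1), increasing by 1 each successive wave
gen t = wave - 1
* linear growth curve with a random intercept and random slope on time
xtmixed hlstat t || pid: t, mle cov(unstr) variance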

  41. Output annotations: the coefficient on time is the slope (change in health over time); the constant is the intercept (mean health at baseline)

  42. Output annotations: the random slope variance reflects individual differences in health change; the random intercept variance reflects individual differences in baseline health

  43. Adding time invariant covariates

  44. Interacting gender with time

  45. Adding time varying covariates
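
Hedged sketches of the three extensions on slides 43-45, assuming sex is coded 1 = male, 2 = female, reusing t from the earlier sketch, and taking lnfihhmn as the time-varying covariate (the names female and femXt are illustrative):

* slide 43: adding a time-invariant covariate (gender)
gen female = (sex == 2) if !missing(sex)
xtmixed hlstat t female || pid: t, mle cov(unstr) variance
* slide 44: interacting gender with time
gen femXt = female*t
xtmixed hlstat t female femXt || pid: t, mle cov(unstr) variance
* slide 45: adding a time-varying covariate (logged household income)
xtmixed hlstat t female femXt lnfihhmn || pid: t, mle cov(unstr) variance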

  46. Beyond linear change • Polynomial trajectories • Quadratic or cubic trajectories • Piecewise linear trajectories • Exponential trajectories
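
For example, a quadratic trajectory can be fitted by adding a squared time term (a sketch, reusing t from the earlier growth curve sketch; the random part is kept to the linear term for simplicity):

gen t2 = t^2
xtmixed hlstat t t2 || pid: t, mle cov(unstr) variance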

  47. Non-linear growth curves
