PART 8

brandi
Presentation Transcript


  1. PART 8 Two Stage & Joint Models BIO656--Multilevel Models

  2. SEERMED DATA. Motivation: End of Life Colorectal Cancer Costs. [Figure: Expenditure axis, $0 to $500,000]

  3. Data. [Diagram labels: Factors: Need-based, Enabling, Predisposing; Patient – Physician; Professional Health-Care Services; HMO, Hospice, FFS, Medicare, Private Ins.; Rejected, Allowed, Co-Pay, Deductibles; Claims; Cancer Diagnosis; Terminal-Phase Costs (12 mos); Medicare Payments; Death]

  4. Data. [Diagram labels: Patient – Physician; Cancer Diagnosis; Medicare Payments; Terminal-Phase Costs (12 mos, 3 mos); Death]

  5. SEERMED DATA. Motivation: End of Life Colorectal Cancer Costs. [Figure: Expenditure axis, $0 to $500,000]

  6. A “Normal” Distribution. [Figure: Density of Y]

  7. A Complex Distribution. [Figure: Density of Y]

  8. Complex Distributions: Mixtures of Simple Distributions. [Figure: Density of Y]
     • Mixtures-of-Experts Models (MEM): Jacobs, Jordan (1991), Neural Comp
     • Finite Mixture Models (FMM): McLachlan, Peel (2001)

  9. A simple, two-part mixture: $0 and $+
     1. P(Y>0)
     2. E(Y | Y>0) = E(Y+)
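The two parts on this slide combine into the overall mean: E(Y) = P(Y>0) × E(Y | Y>0). A minimal Python sketch of that identity follows; the function name `two_part_mean` and the four cost values are made up for illustration and do not come from the SEERMED data.

```python
def two_part_mean(costs):
    """Return (P(Y>0), E(Y | Y>0), their product) for a list of non-negative costs."""
    positives = [y for y in costs if y > 0]
    p_pos = len(positives) / len(costs)          # part 1: intensity, P(Y > 0)
    mean_pos = sum(positives) / len(positives)   # part 2: size, E(Y | Y > 0)
    return p_pos, mean_pos, p_pos * mean_pos     # the product is the overall mean E(Y)


# Hypothetical costs: two patients with $0 and two with positive spending.
costs = [0.0, 0.0, 100.0, 300.0]
p_pos, mean_pos, overall = two_part_mean(costs)
```

Here `overall` matches the plain average of `costs`, which is the point of the decomposition: the zeros are handled by the first part and never enter the second.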

  10. A Two-Part Model (Intensity & Size): IS – logit/lognormal
      1. \(\operatorname{logit}\{\Pr(Y_i > 0)\} = x\alpha\)
      2. i.) \(\log_{10}(Y_i^+) = x\beta + \varepsilon_i\)
         ii.) \(\varepsilon_i \sim N(0, \sigma^2)\)

      0. “Tobit” model: Tobin (1958)
      1. Selection (hurdle) models: Amemiya (1984); Heckman (1976)
      2. Zero-inflated models: Lambert (1992); Greene (1994)
      3. Two-part models: Manning (1981); Mullahy (1998)
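To see what data from a logit/lognormal two-part model look like, one can simulate from it. The sketch below is an illustration only: it fixes the two linear predictors at single values (`eta0` for the intensity part, `eta1` for the size part) rather than building them from covariates, and all names and numbers are assumptions.

```python
import math
import random


def simulate_two_part(n, eta0, eta1, sigma, seed=0):
    """Draw n costs from a logit/lognormal two-part model with fixed linear predictors.

    eta0:  intensity linear predictor, logit{Pr(Y > 0)}
    eta1:  size linear predictor, mean of log10(Y) given Y > 0
    sigma: standard deviation of the log10-scale error
    """
    rng = random.Random(seed)
    p_pos = 1.0 / (1.0 + math.exp(-eta0))   # inverse logit
    costs = []
    for _ in range(n):
        if rng.random() < p_pos:
            # Positive cost: lognormal on the log10 scale.
            costs.append(10.0 ** (eta1 + rng.gauss(0.0, sigma)))
        else:
            # Point mass at zero.
            costs.append(0.0)
    return costs


costs = simulate_two_part(n=20000, eta0=1.0, eta1=3.0, sigma=0.5)
share_positive = sum(y > 0 for y in costs) / len(costs)
```

The simulated share of positive costs should sit close to the inverse logit of `eta0`, and a histogram of `costs` would show the spike at zero plus a right-skewed positive part.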

  11. Another Two-Part Model (Intensity & Size): IS – Probit/log-Gamma
      1. \(\Phi^{-1}\{\Pr(Y_i > 0)\} = x\alpha\)
      2. i.) \(\log_{10}\{E(Y_i^+)\} = x\beta\)
         ii.) \(Y_i^+ \sim \Gamma(\cdot\,, \cdot)\)
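Under this version the expected cost follows directly from the two inverse links. The Python sketch below assumes fixed linear predictors `eta0` and `eta1` (hypothetical names and values) and uses only the standard library.

```python
from statistics import NormalDist


def expected_cost_probit_gamma(eta0, eta1):
    """E(Y) = Pr(Y > 0) * E(Y | Y > 0) with a probit intensity link and a log10 link on the mean."""
    p_pos = NormalDist().cdf(eta0)   # inverse probit link
    mean_pos = 10.0 ** eta1          # inverse log10 link, applied to E(Y+) itself
    return p_pos * mean_pos


overall = expected_cost_probit_gamma(eta0=0.0, eta1=3.0)
```

Because the link is placed on E(Y+) rather than on E(log10 Y+), no retransformation correction is needed here. The lognormal version models the mean of log10(Y+), so its back-transformed mean also depends on the error variance.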

  12. A Two-Part Model: The Intensity-Size GLM (IS – GLM)
      • \(h_1\): binary data link function
      • \(h_2\): continuous data link function
      • \(f\): exponential family w/ dispersion \(\phi\)

  13. Multiple Levels 1. [Diagram: outcome split into 0 and +]

  14. Monthly SEERMED Data. [Figure: panels for Month 10, Month 11, and Month 12]

  15. Multiple Levels 2. [Diagram: HMREM1 over Time for Months 10, 11, and 12; labels f10, f11, f12, g1, g2, a, b, X, 0, +]

  16. A 2-Part Model
      • Intensity: \(\operatorname{logit}(\pi_i) = x\alpha\)
      • Size:
        • \(\mu_i = x\beta\)
        • \(Y_i^+ \sim f(\mu_i, \phi)\)

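  17. A Longitudinal 2-Part Model
      • Intensity: \(\operatorname{logit}(\pi_{ic}) = x\alpha + z a_i\)
      • Size:
        • \(\mu_{ic} = x\beta + z b_i\)
        • \(Y^+_{ic} \sim f(\mu_{ic}, \phi)\)
      • Random Effects: \(u_i = (a_i, b_i)^\top \sim N(0, \Sigma)\), \(\Sigma = \begin{pmatrix} \tau_{aa} & \tau_{ba} \\ \tau_{ba} & \tau_{bb} \end{pmatrix}\)

      1. Olsen, Schafer (2001)
      2. Tooze, Grunwald, Jones (2002)
      3. Yau, Lee, Ng (2002)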
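The correlated random effects are what tie the two parts together for a given subject. As a rough illustration, the Python sketch below draws pairs (a_i, b_i) from a bivariate normal with variances `tau_aa`, `tau_bb` and covariance `tau_ba`, using a hand-written 2×2 Cholesky factor. The numeric values are arbitrary assumptions, not estimates from the slides.

```python
import math
import random


def draw_random_effects(n, tau_aa, tau_ba, tau_bb, seed=0):
    """Draw n pairs (a_i, b_i) from N(0, Sigma) via a 2x2 Cholesky factor of Sigma."""
    rng = random.Random(seed)
    l11 = math.sqrt(tau_aa)
    l21 = tau_ba / l11
    l22 = math.sqrt(tau_bb - l21 ** 2)   # requires a positive-definite Sigma
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((l11 * z1, l21 * z1 + l22 * z2))
    return draws


draws = draw_random_effects(n=50000, tau_aa=1.0, tau_ba=0.3, tau_bb=0.25)
var_a = sum(a * a for a, _ in draws) / len(draws)    # sample estimate of tau_aa
var_b = sum(b * b for _, b in draws) / len(draws)    # sample estimate of tau_bb
cov_ab = sum(a * b for a, b in draws) / len(draws)   # sample estimate of tau_ba
```

A positive `tau_ba` means subjects who are more likely to incur any cost also tend to incur larger costs when they do.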

  18. Data Analysis: 3 General Steps
      • Exploration
      • Model Fitting and Estimation
      • Diagnostics
      and the greatest of these is…

  19. Uncooked Spaghetti Plot

  20. Monthly SEERMED Data. [Figure: panels for Month 10, Month 11, and Month 12]

  21. Month 10 & Month 11 log10(Costs). [Figure 5: Seermed log10 month 1 & 2. Axes: Expenditure 10 and Expenditure 11 (0 to 5), Density. Components: Bivariate Point Mass, Univariate Continuous Distbs., Bivariate Continuous Distb.]

  22. PRISM plot: Month 10 & 11 SEERMED Costs. PRISM = Paired Response Intensity Size Mixture plot. [Figure; panel labels: aa, ba, bb]

  23. PRISM Matrix: Months 10-12

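  24. SEERMED MREM
      Intensity: Probit, Logistic. Size: Lognormal, Gamma.
      • Intensity: \(h_1(\pi_{ic}) = \alpha_0 + \alpha_1\,\mathrm{Obs} + \alpha_2\,\mathrm{Male} + \alpha_3\,\mathrm{Obs} \times \mathrm{Male} + a_i\)
      • Size:
        • \(h_2(\mu_{ic}) = \beta_0 + \beta_1\,\mathrm{Obs} + \beta_2\,\mathrm{Male} + b_i\)
        • \(Y^+_{ic} \sim f(\mu_{ic}, \phi)\)
      • Random Effects: \(u_i = (a_i, b_i)^\top \sim N(0, \Sigma)\), \(\Sigma = \begin{pmatrix} \tau_a^2 & \tau_{ab} \\ \tau_{ab} & \tau_b^2 \end{pmatrix}\)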

  25. Estimation
      Likelihood: \(L_i(\theta)\). Whoa.
      But: Non-Linear Mixed Model (NLMM)
      • PQL, MCEM, MCMC, …
      • Adaptive Quadrature – Newton-Raphson
      Zeger, Karim (1991); Davidian, Giltinan (1993); Pinheiro, Bates (1995); McCulloch (1997); Booth et al. (2001); Rabe-Hesketh et al. (2004)
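The likelihood formula itself did not survive in the transcript. One form that is consistent with the log-likelihood coded in the SAS programs on the following slides is sketched below. The notation is an assumption: \(G_{ic} = 1\) if \(y_{ic} > 0\) and 0 otherwise, \(\pi_{ic}\) is the probability of a positive cost, \(\mu_{ic}\) is the size-part mean, both depend on the random effects \((a, b)\), and \(\varphi_2\) is the bivariate normal density with covariance \(\Sigma\).

```latex
L_i(\theta) \;=\; \int\!\!\int \prod_{c}
  \Big[ (1-\pi_{ic})^{\,1-G_{ic}}
        \big\{ \pi_{ic}\, f(y_{ic} \mid \mu_{ic}, \phi) \big\}^{G_{ic}} \Big]
  \; \varphi_2\big( (a, b)^\top ;\, 0,\, \Sigma \big) \; da \, db
```

The double integral over the random effects has no closed form, which is why the slide turns to approximations such as adaptive quadrature.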

  26. Estimation: SAS

      proc nlmixed data=SEERMED;
        parms / data=parms_start;

        *- 1) logistic: logit{Pr( Y>0 | a )} = Xalpha + a = eta0 -*;
        eta0 = alpha0_c + alpha1_c*obs + alpha2_c*male + alpha3_c*obsmale + a;
        pi_c = exp(eta0) / (1+exp(eta0));

        *- 2) log-normal: E( log10(Y) | Y>0, b ) = XB + b = eta1 -*;
        eta1 = beta0_c + beta1_c*obs + beta2_c*male + b;

        *- log-likelihood (Gpos and log10y are read from the data set) -*;
        pi = CONSTANT('PI');
        if y=0 then ll1 = 0;
        else ll1 = -.5*log(2*pi*sigma**2) - .5*((log10y-eta1)/sigma)**2;
        ll = (1-Gpos)*log(1-pi_c) + Gpos*log(pi_c) + Gpos*(ll1);

        model y ~ GENERAL(ll);
        RANDOM a b ~ NORMAL([0,0],[tau_aa, tau_ba, tau_bb]) SUBJECT=id;
      run;
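For readers who do not use SAS, the per-observation log-likelihood in this program can be mirrored in a few lines of Python. This is a sketch of the conditional contribution only (given the random effects, which enter through `eta0` and `eta1`), not of the full marginal fit, and the function name is an assumption. Like the SAS code, it evaluates the normal density on the log10 scale.

```python
import math


def two_part_loglik(y, eta0, eta1, sigma):
    """Conditional log-likelihood of one cost y under the logit/lognormal two-part model."""
    pi_c = 1.0 / (1.0 + math.exp(-eta0))   # Pr(Y > 0), inverse logit of eta0
    if y == 0:
        return math.log(1.0 - pi_c)        # zero cost: only the intensity part contributes
    z = (math.log10(y) - eta1) / sigma     # standardized log10 cost
    ll1 = -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * z ** 2
    return math.log(pi_c) + ll1            # positive cost: intensity part plus size part


ll_zero = two_part_loglik(0.0, eta0=0.0, eta1=2.0, sigma=1.0)
ll_pos = two_part_loglik(100.0, eta0=0.0, eta1=2.0, sigma=1.0)
```

The branch on `y == 0` plays the role of the `Gpos` indicator in the SAS `ll` statement.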

  27. Estimation: SAS (better)

      proc nlmixed data=sanfran qpoints=10;
        parms / data=parms_start;

        *-logit-*;
        eta0 = alpha0_c + alpha1_c*obs + alpha2_c*male + alpha3_c*obsmale + a;
        expeta = exp(eta0);
        pi_c = expeta / (1+expeta);
        tau_aa = exp(logtau_a)**2;

        *-lognormal-*;
        eta1 = beta0_c + beta1_c*obs + beta2_c*male + b;
        phi = 10**(log10phi);  *std dev of log10(Y+1)|b;
        tau_bb = (10**(log10tau_b))**2;

        *- RE Var -*;
        rho_ba = (exp(2*zrho_ba) - 1) / (exp(2*zrho_ba) + 1);
        tau_ba = rho_ba*(tau_aa*tau_bb)**.5;

        *- log-likelihood -*;
        pi = CONSTANT('PI');
        if y=0 then ll1 = 0;
        else ll1 = -.5*log(2*pi*phi**2) - .5*((log10y-eta1)/phi)**2;
        ll = (1-Gpos)*log(1-pi_c) + Gpos*log(pi_c) + Gpos*(ll1);

        model y ~ GENERAL(ll);
        RANDOM a b ~ NORMAL([0,0],[tau_aa, tau_ba, tau_bb]) SUBJECT=id;
        ods output ParameterEstimates = parms_new;
      run;
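What makes this version "better" appears to be the reparameterization: the optimizer works on unconstrained quantities (log standard deviations and a Fisher-z transformed correlation), so every step yields a valid covariance matrix. The Python sketch below reproduces just that mapping; the function name is an assumption, while the argument names follow the SAS parameters.

```python
import math


def covariance_from_unconstrained(logtau_a, log10tau_b, zrho_ba):
    """Map unconstrained parameters to (tau_aa, tau_ba, tau_bb), as in the SAS program."""
    tau_aa = math.exp(logtau_a) ** 2       # variance of a, positive by construction
    tau_bb = (10.0 ** log10tau_b) ** 2     # variance of b, positive by construction
    rho_ba = math.tanh(zrho_ba)            # equals (exp(2z) - 1) / (exp(2z) + 1), inside (-1, 1)
    tau_ba = rho_ba * math.sqrt(tau_aa * tau_bb)
    return tau_aa, tau_ba, tau_bb


tau_aa, tau_ba, tau_bb = covariance_from_unconstrained(0.3, -0.2, 1.5)
determinant = tau_aa * tau_bb - tau_ba ** 2   # positive for any finite inputs
```

Any real-valued triple maps to a positive-definite 2×2 covariance matrix, so the Newton-Raphson iterations cannot wander into negative variances or correlations outside (-1, 1).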

  28. SEERMED MREM Results 1

  29. MREM Profile Likelihood Plots for \(\alpha_3^c\). [Figure: Scaled Profile Likelihood, Profile ll (alpha3), for Probit*-Lognormal, Probit*-Gamma, Logit-Lognormal, and Logit-Gamma; x-axis: Intensity model Obs*Male interaction term (\(\alpha_3^c\)); annotation: LR, 6]

  30. SEERMED MREM Results 2

  31. PRISM plot: Month 10 & 11 SEERMED Costs. PRISM = Paired Response Intensity Size Mixture plot. [Figure; panel labels: aa, ba, bb]

  32. SEERMED MREM Results 2. But do these models fit?…

  33. Data vs. MREM Models. [Figure legend: Obs: ,Y; Exp: P, L, G]

  34. Diagnostic PRISM Matrix: lognormal IS-GLMM. [Figure: Residuals; Expected vs. Observed]

  35. Diagnostic PRISM Matrix: lognormal IS-GLMM. [Figure: Residuals; Expected vs. Observed]

  36. Review & Related Work
      Models: MEM, MREM, HMREM, HMMMM
      Ideas:
      1. Simple Combinations of Simple Models (0, +)
      2. Complex (Multi-Level) Data: Many Models & Many Pictures

  37. Data vs. HMREM Models. Data vs. HMMMM Models.

  38. Review & Related Work
      • These ideas are not just for Zero-Inflated Data
      • Latent Variables are useful for “connecting” things

  39. Opportunistic Infection & IDU. [Figure: each line represents 1 subject’s time in the study; x-axis: Day in Study, starting 6 months prior to 1st interview; groups: Always Users, Intermittent Users, Never Users; markers: Interview: Reported Drug Use, Interview: Reported No Drug Use, Opportunistic Infection]

  40. But what about Possible Informative Missingness? [Diagram nodes: Drug Use, OI, Death / Dropout]

  41. Jointly Analyze Survival & OIs
      1) Logistic Model: \(\operatorname{logit}\{\Pr(OI_{ij} \mid a_i)\} = \alpha_0 + \alpha_1 SU_{ij} + \alpha_2 SU_{ij} \times HCuse_{ij} + \alpha_3 AU_{ij} + \alpha_4 Period_j + a_i\)
      2) Survival Model: \(\log\{\lambda(t)\} = \beta_0 + \beta_1 SU_{ij} + \beta_2 AU_{ij} + a_i\)
      3) Latent Effects: \(a_i \sim N(0, \tau^2)\)
      Guo & Carlin (2004)

  42. Warning!
      • But “Buyer Beware”
        • Model Assumptions
        • Identifiability
        • Model Fit
        • Marginalize & Check whenever possible
      • MLMs require even more due diligence than usual

  43. References
      • Mixture Models:
        • McLachlan, G. J. and Peel, D. (2001), Finite Mixture Models, John Wiley & Sons.
        • Jacobs, R. A. and Jordan, M. I. (1991), “Adaptive mixtures of local experts,” Neural Computation, 3, 79–87.
      • Two-Part Models:
        • Tobin, J. (1958), “Estimation of relationships for limited dependent variables,” Econometrica, 26, 24–36.
        • Amemiya, T. (1984), “Tobit models: A survey,” Journal of Econometrics, 24, 3–61.
        • Heckman, J. (1976), “The common structure of statistical models of truncation, sample selection, and limited dependent variables, and a simple estimator for such models,” Annals of Economic and Social Measurement, 5, 475–492.
        • Lambert, D. (1992), “Zero-inflated Poisson regression, with an application to defects in manufacturing,” Technometrics, 34, 1–14.
        • Greene, W. (1994), “Accounting for excess zeros and sample selection in Poisson and negative binomial regression models,” Working Paper EC-94-10, Department of Economics, New York University.
        • Manning, W., Newhouse, J., Orr, L., Duan, N., Keeler, E., Leibowitz, A., Marquis, M., and Phelps, C. (1981), “A two-part model of the demand for medical care: Preliminary results from the health insurance experiment,” in Health, Economics, and Health Economics, eds. van der Gaag, J. and Perlman, M., pp. 103–104.
        • Mullahy, J. (1998), “Much ado about two: Reconsidering retransformation and the two-part model in health economics,” Journal of Health Economics, 17, 247–281.

  44. References
      • Longitudinal 2-part models:
        • Olsen, M. K. and Schafer, J. L. (2001), “A two-part random-effects model for semicontinuous longitudinal data,” Journal of the American Statistical Association, 96, 730–745.
        • Tooze, J. A., Grunwald, G. K., and Jones, R. H. (2002), “Analysis of repeated measures data with clumping at zero,” Statistical Methods in Medical Research, 11, 341–355.
        • Yau, K. K. W., Lee, A. H., and Ng, A. S. K. (2002), “A zero-augmented gamma mixed model for longitudinal data with many zeros,” Australian and New Zealand Journal of Statistics, 44, 177–183.
      • Estimation:
        • Zeger, S. L. and Karim, M. R. (1991), “Generalized linear models with random effects: A Gibbs sampling approach,” Journal of the American Statistical Association, 86, 79–86.
        • Davidian, M. and Giltinan, D. M. (1993), “Some general estimation methods for nonlinear mixed-effects models,” Journal of Biopharmaceutical Statistics, 3, 23–55.
        • Pinheiro, J. C. and Bates, D. M. (1995), “Approximations to the log-likelihood function in the nonlinear mixed-effects model,” Journal of Computational and Graphical Statistics, 4, 12–35.
        • McCulloch, C. E. (1997), “Maximum likelihood algorithms for generalized linear mixed models,” Journal of the American Statistical Association, 92, 162–170.
        • Booth, J. G., Hobert, J. P., and Jank, W. (2001), “A survey of Monte Carlo algorithms for maximizing the likelihood of a two-stage hierarchical model,” Statistical Modelling: An International Journal, 1, 333–349.
        • Rabe-Hesketh, S., Skrondal, A., and Pickles, A. (2004), “Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects,” Journal of Econometrics, in press.
      • Other:
        • Guo, X. and Carlin, B. P. (2004), “Separate and joint modeling of longitudinal and event time data using standard computer packages,” The American Statistician, 58, 16–24.
