1 / 37

PROJECTS ARE DUE

PROJECTS ARE DUE. By midnight, Friday, May 19 th Electronic submission only to tl o uis@jhsph.edu Please name the file: [myname]-project.[filetype] or [name1_name2]-project.[filetype]. Efficiency-Robustness Trade-offs.

mickey
Download Presentation

PROJECTS ARE DUE

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PROJECTS ARE DUE By midnight, Friday, May 19th Electronic submission only to tlouis@jhsph.edu Please name the file: [myname]-project.[filetype] or [name1_name2]-project.[filetype] BIO656--Multilevel Models

  2. Efficiency-Robustness Trade-offs • First, we consider alternatives to the Gaussian distribution for random effects • Then, we move to issues of weighting, starting with some formalism • Then, move to an example of informative sample size • And, finally give a basic example that has broad implications of choosing among weighting schemes BIO656--Multilevel Models

  3. Alternatives to the Gaussian Distribution for Random Effects BIO656--Multilevel Models

  4. The t-distribution Broader tails than the Gaussian So, shrinks less for deviant Y-values The t-prior allows “outlying” parameters and so a deviant Y is not so indicative of a large, level 1 residual BIO656--Multilevel Models

  5. Creating a t-distribution • Assume a Gaussian sampling distribution, • Using the sample standard deviation produces the t-distribution • Z is t with a large df • t3 is the most different from Z for t-distributions with a finite variance BIO656--Multilevel Models

  6. BIO656--Multilevel Models

  7. With a t-prior, B is B(Y), increasing with |Y - | BIO656--Multilevel Models

  8. (1-B) = ½ = 0.50 Z is distance from the center BIO656--Multilevel Models

  9. (1- B) = 2/3 = 0.666 Z is distance from the center BIO656--Multilevel Models

  10. Estimated Gaussian & Fully Non-parametric priors for the USRDS data BIO656--Multilevel Models

  11. USRDS estimated Priors BIO656--Multilevel Models

  12. BIO656--Multilevel Models

  13. BIO656--Multilevel Models

  14. BIO656--Multilevel Models

  15. BIO656--Multilevel Models

  16. BIO656--Multilevel Models

  17. Informative Sample Size(Similar to informative Censoring)See Louis et al. SMMR 2006 BIO656--Multilevel Models

  18. BIO656--Multilevel Models

  19. BIO656--Multilevel Models

  20. BIO656--Multilevel Models

  21. BIO656--Multilevel Models

  22. BIO656--Multilevel Models

  23. BIO656--Multilevel Models

  24. BIO656--Multilevel Models

  25. Choosing among weighting schemes“Optimality” versus goal achievement BIO656--Multilevel Models

  26. Inferential Context Question What is the average length of in-hospital stay? A more specific question • What is the average length of stay for: • Several hospitals of interest? • Maryland hospitals? • All hospitals? • ....... BIO656--Multilevel Models

  27. “Data” Collection & Goal Data gathered from 5 hospitals • Hospitals are selected by some method • nhosp patient records are sampled at random • Length of stay (LOS) is recorded Goal is to: Estimate the “population” mean BIO656--Multilevel Models

  28. Procedure • Compute hospital-specific means • “Average” them • For simplicity assume that the population variance is known and the same for all hospitals How should we compute the average? • Need a goal and then a good/best way to combine information BIO656--Multilevel Models

  29. “DATA” BIO656--Multilevel Models

  30. Weighted averages & Variances (Variances are based on FE not RE) Each weighted average is mean = Reciprocal variance weights minimize variance Is that our goal? BIO656--Multilevel Models

  31. There are many weighting choices and weighting goals • Minimize variance by using reciprocal variance weights • Minimize bias for the population mean by using population weights (“survey weights”) • Use policy weights (e.g., equal weighting) • Use “my weights,” ... BIO656--Multilevel Models

  32. General Setting When the model is correct All weighting schemes estimate the same quantities same value for slopes in a multiple regression So, it is clearly best to minimize variance by using reciprocal variance weights When the model is incorrect Must consider analysis goals and use appropriate weights Of course, it is generally true that our model is not correct! BIO656--Multilevel Models

  33. Weights and their properties • But if m1 = m2 = m3 = m4 =m5 = m = then all weighted averages estimate the population mean: =  kk So, it’s best to minimize the variance But, if the hospital-specific mk are not all equal, then • Each set of weights estimates a different target • Minimizing variance might not be “best” • For an unbiased estimate of setwk = pk BIO656--Multilevel Models

  34. The variance-bias tradeoff General idea Trade-off variance & bias to produce low Mean Squared Error (MSE) MSE = Expected(Estimate - True)2 = Variance + (Bias)2 • Bias is unknown unless we know the mk (the true hospital-specific mean LOS) • But, we can study MSE (m, w, p) • In practice, make some “guesses” and do sensitivity analyses BIO656--Multilevel Models

  35. Variance, Bias and MSE as a function of (the ms, w, p) • Consider a true value for the variation of the between hospital means (* is the “overall mean”) T = (k - *)2 • Study BIAS, Variance, MSE for weights that optimize MSE for an assumed value (A) of the between-hospital variance • So, when A = T, MSE is minimized by this optimizer • In the following plot, A is converted to a fraction of the total variance A/(A + within-hospital) • Fraction = 0  minimize variance • Fraction = 1  minimize bias BIO656--Multilevel Models

  36. The bias-variance trade-offX-axis is assumed variance fractionY is performance computed under the true fraction Assumed k BIO656--Multilevel Models

  37. Summary • Much of statistics depends on weighted averages • Weights should depend on assumptions and goals • If you trust your (regression) model, • Then, minimize the variance, using “optimal” weights • This generalizes the equal m case • If you worry about model validity (bias for mp), • You can buy full insurance by using population weights • But, you pay in variance (efficiency) • So, consider purchasing only the insurance you need by using compromise weights BIO656--Multilevel Models

More Related