370 likes | 443 Views
PROJECTS ARE DUE. By midnight, Friday, May 19 th Electronic submission only to tl o uis@jhsph.edu Please name the file: [myname]-project.[filetype] or [name1_name2]-project.[filetype]. Efficiency-Robustness Trade-offs.
E N D
PROJECTS ARE DUE By midnight, Friday, May 19th Electronic submission only to tlouis@jhsph.edu Please name the file: [myname]-project.[filetype] or [name1_name2]-project.[filetype] BIO656--Multilevel Models
Efficiency-Robustness Trade-offs • First, we consider alternatives to the Gaussian distribution for random effects • Then, we move to issues of weighting, starting with some formalism • Then, move to an example of informative sample size • And, finally give a basic example that has broad implications of choosing among weighting schemes BIO656--Multilevel Models
Alternatives to the Gaussian Distribution for Random Effects BIO656--Multilevel Models
The t-distribution Broader tails than the Gaussian So, shrinks less for deviant Y-values The t-prior allows “outlying” parameters and so a deviant Y is not so indicative of a large, level 1 residual BIO656--Multilevel Models
Creating a t-distribution • Assume a Gaussian sampling distribution, • Using the sample standard deviation produces the t-distribution • Z is t with a large df • t3 is the most different from Z for t-distributions with a finite variance BIO656--Multilevel Models
With a t-prior, B is B(Y), increasing with |Y - | BIO656--Multilevel Models
(1-B) = ½ = 0.50 Z is distance from the center BIO656--Multilevel Models
(1- B) = 2/3 = 0.666 Z is distance from the center BIO656--Multilevel Models
Estimated Gaussian & Fully Non-parametric priors for the USRDS data BIO656--Multilevel Models
USRDS estimated Priors BIO656--Multilevel Models
Informative Sample Size(Similar to informative Censoring)See Louis et al. SMMR 2006 BIO656--Multilevel Models
Choosing among weighting schemes“Optimality” versus goal achievement BIO656--Multilevel Models
Inferential Context Question What is the average length of in-hospital stay? A more specific question • What is the average length of stay for: • Several hospitals of interest? • Maryland hospitals? • All hospitals? • ....... BIO656--Multilevel Models
“Data” Collection & Goal Data gathered from 5 hospitals • Hospitals are selected by some method • nhosp patient records are sampled at random • Length of stay (LOS) is recorded Goal is to: Estimate the “population” mean BIO656--Multilevel Models
Procedure • Compute hospital-specific means • “Average” them • For simplicity assume that the population variance is known and the same for all hospitals How should we compute the average? • Need a goal and then a good/best way to combine information BIO656--Multilevel Models
“DATA” BIO656--Multilevel Models
Weighted averages & Variances (Variances are based on FE not RE) Each weighted average is mean = Reciprocal variance weights minimize variance Is that our goal? BIO656--Multilevel Models
There are many weighting choices and weighting goals • Minimize variance by using reciprocal variance weights • Minimize bias for the population mean by using population weights (“survey weights”) • Use policy weights (e.g., equal weighting) • Use “my weights,” ... BIO656--Multilevel Models
General Setting When the model is correct All weighting schemes estimate the same quantities same value for slopes in a multiple regression So, it is clearly best to minimize variance by using reciprocal variance weights When the model is incorrect Must consider analysis goals and use appropriate weights Of course, it is generally true that our model is not correct! BIO656--Multilevel Models
Weights and their properties • But if m1 = m2 = m3 = m4 =m5 = m = then all weighted averages estimate the population mean: = kk So, it’s best to minimize the variance But, if the hospital-specific mk are not all equal, then • Each set of weights estimates a different target • Minimizing variance might not be “best” • For an unbiased estimate of setwk = pk BIO656--Multilevel Models
The variance-bias tradeoff General idea Trade-off variance & bias to produce low Mean Squared Error (MSE) MSE = Expected(Estimate - True)2 = Variance + (Bias)2 • Bias is unknown unless we know the mk (the true hospital-specific mean LOS) • But, we can study MSE (m, w, p) • In practice, make some “guesses” and do sensitivity analyses BIO656--Multilevel Models
Variance, Bias and MSE as a function of (the ms, w, p) • Consider a true value for the variation of the between hospital means (* is the “overall mean”) T = (k - *)2 • Study BIAS, Variance, MSE for weights that optimize MSE for an assumed value (A) of the between-hospital variance • So, when A = T, MSE is minimized by this optimizer • In the following plot, A is converted to a fraction of the total variance A/(A + within-hospital) • Fraction = 0 minimize variance • Fraction = 1 minimize bias BIO656--Multilevel Models
The bias-variance trade-offX-axis is assumed variance fractionY is performance computed under the true fraction Assumed k BIO656--Multilevel Models
Summary • Much of statistics depends on weighted averages • Weights should depend on assumptions and goals • If you trust your (regression) model, • Then, minimize the variance, using “optimal” weights • This generalizes the equal m case • If you worry about model validity (bias for mp), • You can buy full insurance by using population weights • But, you pay in variance (efficiency) • So, consider purchasing only the insurance you need by using compromise weights BIO656--Multilevel Models