Comments: The Big Picture for Small Areas • Alan M. Zaslavsky, Harvard Medical School
Thanks to presenters • 3 interesting talks • Raise significant policy issues
Voting rights tabulation • Generic approach for beta-binomial modeling • Shrinkage calculations (R. Little) • Approach to quasi-Bayesian estimation for clustered survey data (D. Malec) • Why jurisdictional classes rather than prior centered on prediction? • Use of classes predictably biases up or down just above or below class boundary. • Problem of discreteness/thresholds
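The shrinkage calculation in a beta-binomial model can be illustrated with a minimal sketch. It assumes a Beta prior whose mean comes from the jurisdictional class (or, alternatively, from a regression prediction) and whose "prior sample size" is known. The function name `shrink` and all numbers are hypothetical and are not taken from the talks.

```python
def shrink(y, n, prior_mean, prior_size):
    """Posterior mean of a proportion under a Beta prior (illustrative only).

    y, n        : observed count and sample size in the small area
    prior_mean  : hypothetical center of the prior (class mean or prediction)
    prior_size  : hypothetical prior "sample size" a + b
    """
    a = prior_mean * prior_size
    b = (1.0 - prior_mean) * prior_size
    weight = n / (n + prior_size)        # weight given to the direct estimate y/n
    post_mean = (a + y) / (a + b + n)    # equals weight*(y/n) + (1-weight)*prior_mean
    return post_mean, weight


# Same data, two different class priors: an area just above or below a class
# boundary is pulled toward a different center, which is the bias noted above.
low_class, _ = shrink(3, 20, prior_mean=0.03, prior_size=80)
high_class, _ = shrink(3, 20, prior_mean=0.08, prior_size=80)
```

Under these assumptions the two estimates differ only because of class membership, which is one way to see why a prior centered on a smooth prediction might behave better near boundaries.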
Voting rights tabulation • How ‘general purpose’ is the product? • Inference for point estimate of % vs. inference for P(>5%) • Presentation of results • Bayes methods → posterior distributions • Present results for multiple inferences? • SAE of aggregates ≠ aggregate of SAEs • Perils of thresholds/discreteness
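The distinction between a point estimate and P(>5%), and the point that a threshold summary for an aggregate cannot be obtained by combining area-level summaries, can be sketched with posterior draws. This is an illustration only: it assumes independent Beta posteriors for each area, and the function name, parameters, and population sizes are hypothetical.

```python
import random


def exceedance_summaries(posteriors, pops, threshold=0.05, draws=20000, seed=1):
    """Monte Carlo exceedance probabilities (illustrative sketch).

    posteriors : list of (a, b) Beta parameters, one per area (assumed independent)
    pops       : population sizes used to weight areas into an aggregate
    Returns (per-area P(p > threshold), P(aggregate > threshold)).
    """
    rng = random.Random(seed)
    total = sum(pops)
    area_hits = [0] * len(posteriors)
    agg_hits = 0
    for _ in range(draws):
        p = [rng.betavariate(a, b) for a, b in posteriors]
        for i, p_i in enumerate(p):
            area_hits[i] += p_i > threshold
        # The aggregate is evaluated draw by draw, not from area-level summaries.
        agg = sum(w * p_i for w, p_i in zip(pops, p)) / total
        agg_hits += agg > threshold
    return [h / draws for h in area_hits], agg_hits / draws


areas, aggregate = exceedance_summaries([(7, 93), (3, 97)], [1000, 1000])
```

The aggregate's exceedance probability comes from the joint draws. It is not an average of the two area-level probabilities, which is one concrete sense in which the SAE of an aggregate differs from the aggregate of SAEs.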
“Context specificity” • What does it add beyond predictive variance? • Model error worse than sampling error – why? • Might be better understood as a measure of model-robustness. • Might not have unambiguous definition • In lead example, should precision of NHIS or BRFSS data define ‘specificity’? (NHIS-BRFSS association is a model estimate.) • Depends on which inference: • Estimates of absolute levels are sensitive to calibration • Estimates of differences/rankings among areas are unaffected by calibration
“Context specificity” • Highlights value of transparency of methodology • Develop heuristic explanations of components contributing to estimation and their ‘weights’ • “For estimation of XXX … • “Total (predictive) SE is … • “XX% from sampling in BRFSS … • “YY% from estimation of NHIS calibration model… • “ZZ% from model error of covariate model…”
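A heuristic breakdown of this kind can be sketched as a simple variance decomposition. The sketch assumes the components are independent so that their variances add, and it reports shares of the total variance rather than of the SE. The component labels follow the slide; the numeric values are placeholders, not results.

```python
import math


def se_breakdown(components):
    """Total predictive SE and percentage share of variance per component.

    components : dict of label -> variance contribution (hypothetical values)
    Assumes independent components, so the total variance is their sum.
    """
    total_var = sum(components.values())
    total_se = math.sqrt(total_var)
    shares = {label: 100.0 * v / total_var for label, v in components.items()}
    return total_se, shares


total_se, shares = se_breakdown({
    "sampling in BRFSS": 4.0,                  # placeholder variance
    "NHIS calibration model estimation": 3.0,  # placeholder variance
    "model error of covariate model": 2.0,     # placeholder variance
})
```

If the components were correlated, cross terms would have to be reported as well, and the shares would no longer sum cleanly to 100%.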
Outcome screening • Prioritizing more global SAE program • Technical concerns • Do methods properly account for sampling variance of domain proportions? • In this 2-level model, why use ad hoc methods for level-2 variance estimation? • Strategic concerns • Consider costs & benefits as well as variances • Posterior ranking ∈ {overkill}? • Consider families of outcomes, not just individual outcomes • e.g. 12 binomial variables, likely related, for same Asian population
Current state of SAE • Typically one variable or a few closely related • Relationships only as explicitly selected for models • Not higher-order interactions • Each major SAE a major project • High-level statistical expertise involved • Takes a long time • Lack of fully generic methods • (… although principles fairly well established) • Depends on amount & structure of available data, distributions & relationships, etc. • Often new methods required for each project
Path that extends current methods • More estimation projects • Elaborate more generic methods • Adapt to various data structures • More use of multilevel structure • Still univariate or low-dimensional • OK for many… • single-purpose surveys • health care applications (“profiling”)
Some goals for general-purpose surveys • Generate SAE for all current products • Detailed cross-tabulations • Microdata • Plausible (not “correct”) for all relationships • Valid presentation of uncertainty • Consistency of all products • Margins and aggregation of estimates
What might this look like? • Almost certainly requires some form of microdata synthesis • Yields consistency • Units that look ‘enough’ like real units • Two approaches • “Bottom up” synthesis of units (persons, households) • “Top down” imposition of constraints on synthetic samples of real units
Advantages of ‘top-down’ approach • Building from observed units makes high-order interactions realistic • Otherwise most difficult to model • Impose constraints via weighting or constrained resampling • Weighting is like predictive mean estimation; properties more readily controllable • Constraints may be from direct estimates, SAE, purely predictive estimates • Uncertainty via stochastic prediction of constraints and MI
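Imposing constraints on real units via weighting can be sketched with raking (iterative proportional fitting). Raking is used here only as a familiar stand-in for the reweighting step; the talk does not specify this algorithm. The units, variables, and target margins below are hypothetical, and the sketch assumes every target category is represented among the units and that the margins share a common total.

```python
def rake(units, weights, margins, iters=50):
    """Adjust unit weights so weighted totals match target margins (sketch).

    units   : list of dicts of categorical attributes for observed units
    weights : starting weights, one per unit
    margins : {variable: {category: target_total}} (hypothetical constraints,
              e.g. from direct estimates, SAE, or predictive estimates)
    """
    w = list(weights)
    for _ in range(iters):
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = {c: 0.0 for c in targets}
            for u, w_i in zip(units, w):
                totals[u[var]] += w_i
            # Scale each unit so its category hits the target total.
            w = [w_i * targets[u[var]] / totals[u[var]] for u, w_i in zip(units, w)]
    return w


units = [
    {"age": "young", "tenure": "own"},
    {"age": "young", "tenure": "rent"},
    {"age": "old", "tenure": "own"},
    {"age": "old", "tenure": "rent"},
]
new_w = rake(units, [1.0] * 4,
             {"age": {"young": 60.0, "old": 40.0},
              "tenure": {"own": 70.0, "rent": 30.0}})
```

Because the reweighted file still consists of observed units, interactions not covered by the constraints are inherited from the real data. Repeating the step with stochastically drawn constraints would give one way to propagate uncertainty, in the spirit of the last bullet above.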
Previous applications • Reweighting/Imputation of households for census undercount (Zaslavsky 1988, 1989) • Reweighting for food stamp microsimulations • “Large numbers of estimates for small areas” (Schirm & Zaslavsky 1997-2002) • High-order interactions crucial to simulation of program provisions • Reweight national CPS data to simulate each state in turn (direct and SAE controls)
Synthesis • Work will proceed on many fronts • Develop and integrate new data sources • Targeted SAE projects responsive to needs • Advances in dissemination & explication • Integrate improvements in SAE for marginal (single-variable) estimates into overall synthetic framework.