National Health Interview Survey (NHIS) Sample Design, Weighting, and Variance Estimation Chris Moriarity U.S. General Accounting Office moriarityc@gao.gov NCHS Data Users Conference July 16, 2002
Presentation Outline • The NHIS sample • Is it "random"? • Cost / administrative considerations that influence the sample design • Estimates from NHIS data • Use of weights • Accounting for sampling variability
The NHIS has a "complex" sample design • The NHIS is a random sample, but not a "simple random sample" • The sample design has features to control costs and meet specified analytic goals
NHIS sample design features • Multistage sample selection process, beginning with geographic areas such as counties • County populations vary from a few hundred to several million • Quality of survey estimates improved if population variability is taken into account during sample selection
NHIS sample design features (continued) • Sampling geographic areas helps to control survey costs • Travel and administrative costs for personal visit interviewing are lower than they would be under a simple random sample • Sample is designed to give improved estimates for specific groups (Black persons, Hispanic persons)
NHIS sample weights • Data from all "random" or "probability" sample surveys have weights, resulting from sample selection probabilities • The NHIS weights vary, due primarily to "oversampling" Black and Hispanic persons (to give improved estimates)
NHIS sample weights (continued) • NHIS weights contain factors for sampling and nonresponse, and a ratio adjustment using independent population estimates from the U.S. Census Bureau
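The three weight components named above can be sketched in a few lines of Python. This is a minimal illustration only, not the actual NHIS weighting procedure: the function names, selection probabilities, response rate, and control total are all made-up assumptions.

```python
# Hypothetical illustration: every name and number here is an assumption,
# not taken from NHIS documentation.

def base_weight(selection_probability):
    """Sampling factor: inverse of the overall probability of selection."""
    return 1.0 / selection_probability

def nonresponse_adjusted(weight, response_rate):
    """Inflate a respondent's weight to cover nonrespondents in the same cell."""
    return weight / response_rate

def ratio_adjust(weights, control_total):
    """Scale the weights in one adjustment cell so they sum to an
    independent population estimate (the control total)."""
    factor = control_total / sum(weights)
    return [w * factor for w in weights]

# Three respondents in one adjustment cell; the second was sampled at
# twice the rate of the others (oversampled), so its weight is half as large.
selection_probs = [0.001, 0.002, 0.001]
response_rate = 0.8
weights = [nonresponse_adjusted(base_weight(p), response_rate)
           for p in selection_probs]
final_weights = ratio_adjust(weights, control_total=3000.0)
```

After the ratio adjustment the weights sum to the control total, while the 2-to-1 ratio between the weights of regularly sampled and oversampled respondents is preserved.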
Pitfalls of not using the NHIS sample weights • Unweighted estimates of totals would be too small • Unweighted estimates of rates and other ratios could be distorted; unweighted sample proportions of certain groups (e.g., Black persons) would be different from the corresponding U.S. population proportions
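A toy example may make these pitfalls concrete. The sketch below assumes a made-up population of 1,000 people in which group "B" (100 people) is sampled at five times the rate of group "A" (900 people); the groups, weights, and outcome values are invented for illustration and are not NHIS data.

```python
# Made-up sample records: (group, weight, has_condition).
# Group A: 90 sampled out of 900 (weight 10); group B: 50 out of 100 (weight 2).
sample = ([("A", 10.0, 1)] * 9 + [("A", 10.0, 0)] * 81 +
          [("B", 2.0, 1)] * 20 + [("B", 2.0, 0)] * 30)

total_weight = sum(w for _, w, _ in sample)

# Totals: the unweighted count only describes the sample, not the population.
unweighted_total = sum(y for _, _, y in sample)
weighted_total = sum(w * y for _, w, y in sample)

# Rates: the unweighted rate is pulled toward the oversampled group.
unweighted_rate = unweighted_total / len(sample)
weighted_rate = weighted_total / total_weight

# Group shares: the sample share of group B differs from its population share.
unweighted_share_b = sum(1 for g, _, _ in sample if g == "B") / len(sample)
weighted_share_b = sum(w for g, w, _ in sample if g == "B") / total_weight
```

In this constructed case the unweighted total is far below the weighted total, the unweighted rate overstates the weighted rate because group B has the higher prevalence, and group B makes up a much larger share of the sample than of the weighted population.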
Assessing variability in NHIS estimates • NHIS estimates are based on a sample, and thus contain sampling variability • Estimation of sampling variability must be consistent with complex survey design to be valid
Complex survey features that affect variability estimation • Partitioning of the sampling universe into chunks (primary sampling units) that are assigned to groups (sampling strata) • Clustering of sample cases within primary sampling units
Consequences of ignoring NHIS sample design features (sampling strata, primary sampling units) when doing variance estimation • Variance estimates probably would be too small • “Degrees of freedom” for hypothesis tests, etc. would be too large
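As a rough sketch of what design-consistent variance estimation involves, the Python below estimates the variance of a weighted total from weighted PSU totals within strata. It uses the common with-replacement ("ultimate cluster") approximation and the common rule of thumb that degrees of freedom equal the number of PSUs minus the number of strata. Both are assumptions of this example rather than a statement of the NHIS-recommended method, and the record layout and data are hypothetical.

```python
from collections import defaultdict

def design_variance_of_total(records):
    """records: iterable of (stratum, psu, weight, y) tuples.

    Returns (variance estimate of the weighted total, design degrees of freedom).
    """
    # Weighted total of y within each PSU.
    psu_totals = defaultdict(float)
    for stratum, psu, weight, y in records:
        psu_totals[(stratum, psu)] += weight * y

    # Group the PSU totals by stratum.
    by_stratum = defaultdict(list)
    for (stratum, _psu), total in psu_totals.items():
        by_stratum[stratum].append(total)

    # Sum the between-PSU variability within each stratum.
    variance = 0.0
    for totals in by_stratum.values():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        mean_h = sum(totals) / n_h
        variance += n_h / (n_h - 1) * sum((t - mean_h) ** 2 for t in totals)

    design_df = len(psu_totals) - len(by_stratum)
    return variance, design_df

# Hypothetical data: 2 strata, 2 PSUs per stratum, 2 persons per PSU.
records = [
    (1, 1, 100.0, 1), (1, 1, 100.0, 0), (1, 2, 100.0, 1), (1, 2, 100.0, 1),
    (2, 1, 50.0, 0), (2, 1, 50.0, 1), (2, 2, 50.0, 0), (2, 2, 50.0, 0),
]
variance, design_df = design_variance_of_total(records)
naive_df = len(records) - 1  # what one would use if the design were ignored
```

With eight sample persons but only four PSUs in two strata, the design-based degrees of freedom are 2, compared with 7 if the records were treated as a simple random sample. For real analyses, software built for complex survey designs should be used rather than hand-written code like this.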
Estimating variances using NHIS public use files • For 1997 and more recent years, refer to the appendix in the NHIS survey description document • For earlier years, refer to the variance estimation documentation in publications and at the NHIS web site http://www.cdc.gov/nchs/nhis.htm
General reference for variance estimation software • http://www.fas.harvard.edu/~stats/survey-soft/survey-soft.html • Provides descriptions and links to software packages, along with reviews
References for the NHIS sample design: NCHS publications • Design and Estimation for the National Health Interview Survey, 1995-2004. Series 2, No. 130 (2000) • National Health Interview Survey: Research for the 1995-2004 Design. Series 2, No. 126 (1999) • Both are available at the NCHS website
Summary • The NHIS is a random sample, but not a simple random sample • Sample is geographically clustered to reduce personal interview costs • Weights vary because Black and Hispanic persons are oversampled • Estimates from NHIS data: sampling weights should be used, and variances should be estimated with software designed for complex sample designs