This presentation discusses the sample design of the National Health Interview Survey (NHIS), including cost considerations and estimating variance. It emphasizes the use of weights and complex design variance estimation software.
National Health Interview Survey (NHIS) Sample Design, Weighting, and Variance Estimation Chris Moriarity U.S. General Accounting Office moriarityc@gao.gov NCHS Data Users Conference July 16, 2002
Presentation Outline • The NHIS sample • Is it "random"? • Cost / administrative considerations that influence the sample design • Estimates from NHIS data • Use of weights • Accounting for sampling variability
The NHIS has a "complex" sample design • The NHIS is a random sample, but not a "simple random sample" • The sample design has features to control costs and meet specified analytic goals
NHIS sample design features • Multistage sample selection process, beginning with geographic areas such as counties • County populations vary from a few hundred to several million • Quality of survey estimates is improved if population variability is taken into account during sample selection
NHIS sample design features (continued) • Sampling geographic areas helps to control survey costs • Travel, administration costs decreased for personal visit interviewing, relative to a simple random sample • Sample is designed to give improved estimates for several groups (Black persons, Hispanic persons)
NHIS sample weights • Data from all "random" or "probability" sample surveys have weights, resulting from sample selection probabilities • The NHIS weights vary, due primarily to "oversampling" Black and Hispanic persons (to give improved estimates)
NHIS sample weights (continued) • NHIS weights contain factors for sampling and nonresponse, and a ratio adjustment using independent population estimates from the U.S. Census Bureau
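The three weight components named on this slide can be sketched as follows. This is a minimal illustration, not the NCHS production procedure: the function name `build_weights`, the record keys (`prob`, `responded`, `nr_class`, `ps_cell`), and all numbers are hypothetical, and the choice of adjustment classes is an assumption made only to show how the factors multiply together.

```python
from collections import defaultdict

def build_weights(sample, control_totals):
    """Return one final weight per sampled case (0.0 for nonrespondents).

    sample: list of dicts with hypothetical keys
        'prob'      selection probability
        'responded' True if the case responded
        'nr_class'  nonresponse adjustment class
        'ps_cell'   cell used for the ratio adjustment
    control_totals: dict mapping ps_cell -> independent population estimate
    """
    # Factor 1: base weight is the inverse of the selection probability.
    base = [1.0 / s["prob"] for s in sample]

    # Factor 2: respondents absorb the weight of nonrespondents in their class.
    all_w = defaultdict(float)
    resp_w = defaultdict(float)
    for s, w in zip(sample, base):
        all_w[s["nr_class"]] += w
        if s["responded"]:
            resp_w[s["nr_class"]] += w
    adjusted = [
        w * all_w[s["nr_class"]] / resp_w[s["nr_class"]] if s["responded"] else 0.0
        for s, w in zip(sample, base)
    ]

    # Factor 3: ratio adjustment so weights in each cell sum to the control total.
    cell_sum = defaultdict(float)
    for s, w in zip(sample, adjusted):
        cell_sum[s["ps_cell"]] += w
    return [
        w * control_totals[s["ps_cell"]] / cell_sum[s["ps_cell"]] if w > 0 else 0.0
        for s, w in zip(sample, adjusted)
    ]

# Invented toy data: four sampled cases, one nonrespondent.
toy_sample = [
    {"prob": 0.01, "responded": True,  "nr_class": "A", "ps_cell": "x"},
    {"prob": 0.01, "responded": False, "nr_class": "A", "ps_cell": "x"},
    {"prob": 0.02, "responded": True,  "nr_class": "B", "ps_cell": "x"},
    {"prob": 0.02, "responded": True,  "nr_class": "B", "ps_cell": "y"},
]
toy_controls = {"x": 300.0, "y": 60.0}
toy_weights = build_weights(toy_sample, toy_controls)
```

After the ratio adjustment, the respondent weights in each cell add up to that cell's control total, which is why weighted totals line up with the independent population estimates.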
Pitfalls of not using the NHIS sample weights • Unweighted estimates of totals would be too small • Unweighted estimates of rates and other ratios could be distorted; unweighted sample proportions of certain groups (e.g., Black persons) would be different from the corresponding U.S. population proportions
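A small worked example may make these pitfalls concrete. The sample below is invented (it is not NHIS data): group "A" is oversampled and therefore carries smaller weights, and the helper name `proportion` is hypothetical.

```python
# Toy sample (all numbers invented for illustration).
# Each tuple: (group, has_condition, weight). Group "A" is oversampled,
# so its members carry smaller weights than members of group "B".
sample = (
    [("A", True, 500.0)] * 2 + [("A", False, 500.0)] * 2
    + [("B", True, 3000.0)] * 1 + [("B", False, 3000.0)] * 5
)

def proportion(flags, weights=None):
    """Share of cases with the flag set; unweighted when weights is None."""
    if weights is None:
        weights = [1.0] * len(flags)
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)

weights = [w for _, _, w in sample]
has_condition = [c for _, c, _ in sample]
in_group_a = [g == "A" for g, _, _ in sample]

# Totals: the unweighted count only counts sample cases, not people.
unweighted_total = sum(has_condition)
weighted_total = sum(w for c, w in zip(has_condition, weights) if c)

# Rates and group shares: the unweighted versions over-represent group "A".
unweighted_rate = proportion(has_condition)
weighted_rate = proportion(has_condition, weights)
unweighted_share_a = proportion(in_group_a)
weighted_share_a = proportion(in_group_a, weights)
```

In this toy sample the unweighted total is 3 cases while the weighted total is 4,000 persons; the unweighted rate is 0.3 against a weighted rate of 0.2; and group "A" makes up 40% of the sample but only 10% of the weighted population.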
Assessing variability in NHIS estimates • NHIS estimates are based on a sample, and thus contain sampling variability • Estimation of sampling variability must be consistent with complex survey design to be valid
Complex survey features that affect variability estimation • Partitioning of the sampling universe into chunks (primary sampling units) that are assigned to groups (sampling strata) • Clustering of sample cases within primary sampling units
Consequences of ignoring NHIS sample design features (sampling strata, primary sampling units) when doing variance estimation • Variance estimates probably would be too small • “Degrees of freedom” for hypothesis tests, etc. would be too large
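The effect described on this slide can be illustrated with a small sketch. It uses a Taylor linearization ("ultimate cluster", with-replacement) variance estimator for a weighted mean, which is one common approach in complex-survey software; the slides do not say which method to use, so treat the choice as an assumption. The function names and the toy data are hypothetical, and real analyses should rely on the stratum and PSU variables documented with the public use files.

```python
from collections import defaultdict

def design_variance_of_mean(records):
    """records: list of (stratum, psu, weight, y) tuples.

    Returns (weighted mean, linearized variance, degrees of freedom),
    treating PSUs as sampled with replacement within strata.
    """
    total_w = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / total_w

    # Sum the linearized values w * (y - mean) / total_w within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    variance = 0.0
    n_psus = 0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        z_bar = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in psus.values())
        n_psus += n_h

    # Usual rule of thumb: number of PSUs minus number of strata.
    return mean, variance, n_psus - len(psu_totals)

def naive_variance_of_mean(records):
    """Simple-random-sample formula s^2 / n, ignoring strata, PSUs, and weights."""
    ys = [y for _, _, _, y in records]
    n = len(ys)
    m = sum(ys) / n
    return sum((y - m) ** 2 for y in ys) / (n - 1) / n

# Invented toy data: 2 strata x 2 PSUs x 3 persons, equal weights,
# with responses that are similar within each PSU (clustering).
toy = (
    [(1, 1, 1.0, y) for y in (1, 1, 1)] + [(1, 2, 1.0, y) for y in (0, 0, 0)]
    + [(2, 1, 1.0, y) for y in (1, 1, 0)] + [(2, 2, 1.0, y) for y in (0, 0, 1)]
)
toy_mean, toy_design_var, toy_df = design_variance_of_mean(toy)
toy_naive_var = naive_variance_of_mean(toy)
toy_naive_df = len(toy) - 1
```

With these toy numbers the design-based variance (about 0.069) is roughly three times the naive estimate (about 0.023), and the design-based degrees of freedom (2) are far fewer than the naive n - 1 (11), matching both bullets on the slide.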
Estimating variances using NHIS public use files • For 1997 and more recent years, refer to the appendix in the NHIS survey description document • For earlier years, refer to variance estimation documentation in publications and at the NHIS web site http://www.cdc.gov/nchs/nhis.htm
General reference for variance estimation software http://www.fas.harvard.edu/~stats/survey-soft/survey-soft.html Provides descriptions and links to software packages, along with reviews
References for the NHIS sample design: NCHS publications Design and Estimation for the National Health Interview Survey, 1995-2004. Series 2, No. 130 (2000) National Health Interview Survey: Research for the 1995-2004 design. Series 2, No. 126 (1999) Both are available at NCHS website
Summary • The NHIS is a random sample • Sample is geographically clustered to reduce personal interview costs • Weights vary because Black and Hispanic persons are oversampled • Estimates from NHIS data • Sampling weights should be used • Should use complex design variance estimation software