Sampling for a Highly Skewed Population: Sample Design for the National Survey of Residential Care Facilities Margie Byron, Joshua Wiener, John Loft, Vince Iannacchione and Angela Greene RTI International International Conference on Establishment Surveys III June 21, 2007 Montreal, Que.
RTI International is a trade name of Research Triangle Institute
3040 Cornwallis Road ■ P.O. Box 12194 ■ Research Triangle Park, North Carolina, USA 27709
Phone 919-541-6086 ■ Fax 919-541-6086 ■ e-mail mzb@rti.org
Introduction
• National Survey of Residential Care Facilities (NSRCF) to be conducted in early 2009
• Joint initiative of the Office of the Assistant Secretary for Planning and Evaluation (ASPE), the National Center for Health Statistics (NCHS), and the Agency for Healthcare Research and Quality (AHRQ)
• Very little nationally representative data available on residential care facilities (RCFs)
What are Residential Care Facilities?
• There is no commonly used definition
• The terms used for these types of residences vary across states in the U.S.
  • Residential care facilities
  • Assisted living facilities
  • Homes for the aged
  • Board and care homes
  • Congregate care facilities
Goals of NSRCF
• General purpose survey of residential care facilities (RCFs)
  • How many RCFs are in the U.S. and what are their characteristics?
  • How many people reside in RCFs and what are their characteristics?
• Want sufficient sample sizes and power to perform comparative analyses at the facility and resident levels
RCF Eligibility Criteria
• Provide care to a predominantly older population (age 65+)
• State licensed or regulated
• Licensed for 4+ beds
• Provide room and board and 2+ meals/day
• Provide 24 hour/7 day on-site supervision
• Provide assistance with personal care and/or health related services
• Nursing homes and retirement communities are not eligible
Sample Design Challenges
• Want sufficient sample sizes and power for both facility and resident level comparative analyses
• Want to conduct in-person interviews with facility staff about the facility and its residents
• Keep estimated data collection costs within the specified budget
• Adding a facility to the study costs more than adding a resident within an already sampled facility
Estimated Distributions
Note: Table total excludes 21,583 facilities in the SSS data file where bed size is missing (36.4% of 59,304 facilities on the file).
Data Source: Social and Statistical Systems, Inc. sampling frame data file; compiled 2003
Sample Design Options
• Stratified random sample by bed size
• Probability proportional to size (PPS) random sample with bed size used to calculate size measure
• Stratified PPS using bed size for stratification and to calculate size measures
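The PPS option can be illustrated with a minimal Python sketch of systematic PPS selection. The function name `pps_systematic`, the toy bed sizes, and the choice of systematic selection are assumptions for illustration; the slides do not state which PPS selection method was used.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Select n unit indices with probability proportional to size.

    Assumes every size is smaller than sum(sizes) / n, so no unit can be
    hit twice. In a highly skewed frame, larger units would usually be
    set aside first (for example, in their own stratum).
    """
    total = sum(sizes)
    interval = total / n                 # sampling interval on the size scale
    start = rng.uniform(0, interval)     # random start within the first interval
    cum = list(accumulate(sizes))        # cumulative size measure
    picks = []
    idx = 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative range contains this point.
        while cum[idx] <= point and idx < len(cum) - 1:
            idx += 1
        picks.append(idx)
    return picks


# Hypothetical bed sizes for ten facilities (not taken from the SSS frame).
bed_sizes = [5, 8, 12, 20, 6, 9, 15, 25, 10, 10]
sample = pps_systematic(bed_sizes, n=3, rng=random.Random(0))
```

For the stratified PPS option, the same routine would be applied separately within each bed size stratum.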
Sample Size and Power Simulations
• Selected 10 samples of RCFs under each sample design option and various sample sizes to estimate design effects
• Determined number of RCFs needed to achieve desired precision requirements
• Used equal and unequal subgroup comparison tests
  • H0: p1 = p2
  • Prevalence rate of 0.50 for subgroup 1
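The slides do not say how the design effects were computed from the 10 simulated samples. As one possible sketch, the Python below averages Kish's approximate design effect from unequal weighting over repeated PPS draws. The use of Kish's approximation, with-replacement selection via `random.choices`, the function names, and the toy frame are all assumptions, not the authors' procedure.

```python
import random


def kish_deff(weights):
    """Kish's approximate design effect due to unequal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def simulate_deff(bed_sizes, n, reps=10, seed=1):
    """Average the weighting design effect over `reps` PPS samples.

    Facilities are drawn with replacement with probability proportional
    to bed size (a simplification), so the facility-level weight is
    proportional to 1 / bed size.
    """
    rng = random.Random(seed)
    deffs = []
    for _ in range(reps):
        sample = rng.choices(bed_sizes, weights=bed_sizes, k=n)
        weights = [1.0 / beds for beds in sample]
        deffs.append(kish_deff(weights))
    return sum(deffs) / reps


# Hypothetical skewed frame: many small facilities, a few large ones.
frame = [4] * 60 + [12] * 25 + [40] * 10 + [150] * 5
avg_deff = simulate_deff(frame, n=20)
```

The more skewed the bed sizes in the sample, the more the weights vary and the larger this design effect becomes, which is one reason stratifying by bed size before PPS selection can help facility-level estimates.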
Design Effect Simulation Results
Note: Assumptions: alpha = 0.05, power = 80%, prevalence of characteristic in subgroup 1 = 0.50. Design effect estimates are based on sample selection simulations conducted on the SSS sampling frame data.
Source: RTI analysis of SSS data.
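To show how the stated assumptions (alpha = 0.05, power = 80%, prevalence of 0.50 in subgroup 1) translate into a required sample size, here is a hedged Python sketch using the standard normal-approximation formula for a two-sided test of two proportions with equal group sizes, inflated by a design effect. The subgroup 2 prevalence of 0.60, the design effect of 1.5, and the function name are hypothetical; the slides do not give these values or the exact formula used.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80, deff=1.0):
    """Sample size per subgroup to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    # Simple-random-sample size from the normal approximation.
    srs_n = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / (p1 - p2) ** 2
    # Inflate by the design effect to account for the complex design.
    return ceil(deff * srs_n)


# Hypothetical comparison: 0.50 in subgroup 1 vs 0.60 in subgroup 2.
n_srs = n_per_group(0.50, 0.60)
n_complex = n_per_group(0.50, 0.60, deff=1.5)
```

A larger design effect raises the required number of sampled units in direct proportion, which is why the simulated design effects drive the comparison of design options.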
Power for Resident Group Comparisons
Note: Assumptions: alpha = 0.05, power = 80%, prevalence of characteristic in subgroup 1 = 0.50. Design effect estimates are based on sample selection simulations conducted on the SSS sampling frame data. Design effect calculations include an intracluster correlation of 0.01.
Source: RTI analysis of SSS data.
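The note above states that an intracluster correlation of 0.01 enters the resident-level design effect but does not give the formula. A common approximation, assumed here, is Kish's clustering design effect, 1 + (m - 1) × ICC, where m is the number of residents sampled per facility. The function names and the values of m and the facility count below are hypothetical.

```python
def cluster_deff(residents_per_facility, icc=0.01):
    """Approximate design effect from clustering residents in facilities."""
    return 1.0 + (residents_per_facility - 1) * icc


def effective_sample_size(n_facilities, residents_per_facility, icc=0.01):
    """Resident sample size deflated by the clustering design effect."""
    n_residents = n_facilities * residents_per_facility
    return n_residents / cluster_deff(residents_per_facility, icc)


# Hypothetical design: 4 residents sampled in each of 2,000 facilities.
deff = cluster_deff(4)
n_eff = effective_sample_size(2000, 4)
```

With an intracluster correlation this small, adding residents within a facility costs little in precision, which is consistent with the cost argument for sampling more residents per facility instead of more facilities.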
Conclusions
• Access to preliminary sampling frame data containing characteristics of the target population can be very useful in determining the optimal sample design, even if the frame does not provide complete data for the whole target population.
• Because adding a facility to the sample costs more than adding a resident, and because of the power requirements, we focused on finding an optimal design for facility level analysis that would not sacrifice the power of the resident level analysis.
Conclusions
• Balancing sample size and power criteria against data collection costs was a complicated, iterative process.
• The population of RCFs is very dynamic. The analysis will be repeated once the final sampling frame for the NSRCF is constructed, to see whether the optimal bed size stratification cutoffs for the sample design should change.