Multiple Frame Surveys Tracy Xu Kim Williamson Department of Statistical Science Southern Methodist University
Multiple Frame Surveys • Introduction – What is a multiple frame survey? • Different estimators for the population total • Variance estimators for those estimators • Conclusion • References
Introduction • Introduced by Hartley (1962) • A multiple frame survey draws samples from two or more frames that together cover the target population • Very useful for sampling rare or hard-to-reach populations • A dual frame design may yield considerable cost savings over a single frame design of comparable precision
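Example 1 – Cost Reduction • Agriculture [Hartley 1962, 1974] + List frame (incomplete; names, addresses): less costly to sample + Area frame (complete, insensitive to changes): expensive to sample + Combining the two can achieve the same precision at lower cost • Linear cost function: $C = n_A c_A + n_B c_B$, where $n_A$, $n_B$ are the sample sizes and $c_A$, $c_B$ the per-unit sampling costs in Frames A and B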
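The linear cost function can be illustrated with a minimal Python sketch. The per-unit costs and sample sizes below are hypothetical, and labelling the area frame as Frame A and the list frame as Frame B is an assumption made only for this example.

```python
def total_cost(n_A, c_A, n_B, c_B):
    """Linear cost function C = n_A * c_A + n_B * c_B."""
    return n_A * c_A + n_B * c_B

# Hypothetical costs: the area frame (A) is expensive, the list frame (B) is cheap.
c_area, c_list = 50.0, 5.0

# A dual frame design with a small area sample and a large list sample.
dual_cost = total_cost(n_A=100, c_A=c_area, n_B=400, c_B=c_list)

# The same budget spent on the area frame alone buys far fewer units.
area_only_n = int(dual_cost // c_area)

print(dual_cost, area_only_n)
```

Whether the dual frame design actually matches the precision of the single frame design depends on the variances in each domain, which this sketch does not model.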
Example 2 – Rare Populations • AIDS [Kalton and Anderson 1986] + Frames: a general population frame as well as STD clinics, drug treatment centers, and hospitals • Homeless [Iachan and Dennis 1993] + Frames: homeless shelters, soup kitchens, and street areas • Alzheimer’s + Frames: general population and adult day-care centers
Issues to Consider • Statisticians must address the following issues + How should the information from the samples be combined to estimate population quantities? + How should variance estimates be calculated?
Notation • Universe $U = A \cup B = a \cup ab \cup b$ • $N$ = # of elements in the population • $N_A$ = # of elements in Frame A • $N_B$ = # of elements in Frame B • $N_a$ = # of elements in Frame A, but not Frame B • $N_b$ = # of elements in Frame B, but not Frame A • $N_{ab}$ = # of elements in both Frame A and Frame B • $S_A$ = sample drawn from Frame A, with inclusion probability $\pi_i^A = P\{i\text{th element is in } S_A\}$ • $Y$ = population total = $Y_a + Y_{ab} + Y_b$
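The domain notation can be checked with a small sketch using Python sets. The unit identifiers are made up for illustration; only the set relations come from the notation above.

```python
# Hypothetical frames given as sets of unit identifiers.
frame_A = set(range(1, 7))    # units 1..6
frame_B = set(range(5, 10))   # units 5..9

# The three non-overlapping domains.
domain_a = frame_A - frame_B   # in A only
domain_ab = frame_A & frame_B  # in both frames
domain_b = frame_B - frame_A   # in B only

N_A, N_B = len(frame_A), len(frame_B)
N_a, N_ab, N_b = len(domain_a), len(domain_ab), len(domain_b)
N = len(frame_A | frame_B)

# The domain sizes add up to the frame sizes and to the population size.
assert N_A == N_a + N_ab
assert N_B == N_b + N_ab
assert N == N_a + N_ab + N_b == N_A + N_B - N_ab
```

The last identity shows why the overlap size matters: adding the two frame sizes counts the units in domain ab twice.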
Estimators • Hartley (H) • Fuller and Burmeister (FB) • Single Frame estimators • Pseudo-Maximum Likelihood (PML)
Hartley & FB Estimators • Minimize the variance among the class of linear unbiased estimators of Y • Have minimum variance for a single response variable • Use a different set of weights for each response variable • Disadvantages: more computation (the weights use covariances estimated from the data) and possible inconsistencies between estimates for different variables • With estimated weights, the estimators are not in general linear functions of y • FB has the greatest asymptotic efficiency
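As a rough illustration, the Hartley estimator is usually written as a weighted combination of the two overlap-domain estimates, $\hat{Y}_H(\theta) = \hat{Y}_a + \theta \hat{Y}_{ab}^A + (1-\theta)\hat{Y}_{ab}^B + \hat{Y}_b$. The sketch below assumes independent samples from the two frames; the function names and numeric inputs are hypothetical, and in practice the variances and covariances are themselves estimated from the data.

```python
def hartley_total(Y_a, Y_ab_A, Y_ab_B, Y_b, theta):
    """Hartley-type estimator: the overlap domain is estimated from both
    samples and the two estimates are mixed with weight theta."""
    return Y_a + theta * Y_ab_A + (1.0 - theta) * Y_ab_B + Y_b

def optimal_theta(var_ab_A, var_ab_B, cov_a_abA, cov_b_abB):
    """Value of theta that minimizes the variance of hartley_total when the
    frame A and frame B samples are independent (assumption of this sketch)."""
    return (var_ab_B + cov_b_abB - cov_a_abA) / (var_ab_A + var_ab_B)

# Hypothetical domain estimates and variance components.
theta = optimal_theta(var_ab_A=4.0, var_ab_B=1.0, cov_a_abA=0.0, cov_b_abB=0.0)
estimate = hartley_total(Y_a=100.0, Y_ab_A=60.0, Y_ab_B=40.0, Y_b=50.0, theta=theta)
print(theta, estimate)
```

Because the variance components differ from one response variable to the next, so does the optimal theta, which is the source of the variable-specific weights noted above.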
Single Frame Estimators • Bankier (1986), Kalton and Anderson (1986), and Skinner (1991) • Treat all observations as if they had been sampled from a single frame, with modified weights for observations in the intersection of the frames • Do not use any auxiliary information about the population totals • Linear in y • Other techniques may be applied: regression estimation and raking ratio estimation
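One common way to form the modified weights is to weight each sampled unit by the inverse of its total inclusion probability across both frames. The sketch below assumes that form and assumes the inclusion probability of an overlap unit is known under both designs; the data and function names are hypothetical.

```python
def single_frame_weight(pi_A, pi_B):
    """Weight 1 / (pi_A + pi_B); use 0.0 for a frame the unit does not belong to."""
    return 1.0 / (pi_A + pi_B)

def single_frame_total(sample):
    """sample: iterable of (y, pi_A, pi_B) over the pooled A and B samples.
    A unit selected in both samples appears twice."""
    return sum(y * single_frame_weight(pi_A, pi_B) for y, pi_A, pi_B in sample)

# Hypothetical pooled sample: one unit from each of domains a, ab, and b.
pooled = [
    (10.0, 0.1, 0.0),  # domain a: only Frame A can select it
    (20.0, 0.1, 0.1),  # domain ab: both frames can select it
    (30.0, 0.0, 0.2),  # domain b: only Frame B can select it
]
print(single_frame_total(pooled))
```

The same weight applies to every response variable, so the estimator is linear in y.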
Pseudo-Maximum Likelihood Estimator • Skinner and Rao (1996) derived a pseudo-ML (PML) estimator for dual frame surveys that uses the same set of weights for all items of y, like the “single frame” estimators, while maintaining efficiency. • The idea of pseudo-ML estimation is discussed in Roberts, Rao, and Kumar (1987) and Skinner, Holt, and Smith (1989): an ML estimator derived under simple random sampling is modified to achieve consistent estimation under complex designs.
Pseudo-Maximum Likelihood Estimator • The main advantages of the PML estimator are that it is design-consistent and typically has a simple form. • The potential disadvantage is that it may not be asymptotically efficient, although it is hoped that any loss of efficiency will be small in practice.
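Pseudo-Maximum Likelihood Estimator • The pseudo-ML estimator of Y has a closed form [equation not recovered from the slide] • It uses an estimate of the overlap size $N_{ab}$, taken as the smallest root of a quadratic equation [equation not recovered from the slide]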
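For orientation, here is a sketch of the form in which the dual frame PML estimator is commonly written in the literature. It is not a reconstruction of the slide's own equations. The mixing parameter theta, the "hat" inputs (design-weighted estimates of domain sizes and totals from each sample), and all numeric values are assumptions of this example.

```python
import math

def pml_overlap_size(N_A, N_B, Nab_hat_A, Nab_hat_B, theta):
    """Smaller root x of p*x^2 - q*x + r = 0, used as the estimate of N_ab."""
    p = theta / N_B + (1.0 - theta) / N_A
    q = 1.0 + theta * Nab_hat_A / N_B + (1.0 - theta) * Nab_hat_B / N_A
    r = theta * Nab_hat_A + (1.0 - theta) * Nab_hat_B
    disc = q * q - 4.0 * p * r  # assumed non-negative for sensible inputs
    return (q - math.sqrt(disc)) / (2.0 * p)

def pml_total(N_A, N_B, Na_hat, Nb_hat, Nab_hat_A, Nab_hat_B,
              Ya_hat, Yb_hat, Yab_hat_A, Yab_hat_B, theta):
    """Estimated domain means scaled by PML-adjusted domain sizes."""
    Nab = pml_overlap_size(N_A, N_B, Nab_hat_A, Nab_hat_B, theta)
    mean_ab = ((theta * Yab_hat_A + (1.0 - theta) * Yab_hat_B)
               / (theta * Nab_hat_A + (1.0 - theta) * Nab_hat_B))
    return ((N_A - Nab) * Ya_hat / Na_hat
            + (N_B - Nab) * Yb_hat / Nb_hat
            + Nab * mean_ab)

# Hypothetical inputs in which both samples agree that the overlap has 100 units.
Nab = pml_overlap_size(N_A=1000, N_B=500, Nab_hat_A=100, Nab_hat_B=100, theta=0.5)
Y = pml_total(1000, 500, 900, 400, 100, 100, 9000, 8000, 3000, 3000, 0.5)
print(Nab, Y)
```

When the two overlap estimates agree, the smaller root reproduces that common value, which is a quick sanity check on the quadratic. The frame sizes $N_A$ and $N_B$ are treated as known.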
Comparison of All Estimators • Lohr and Rao (2006) ran extensive simulations to evaluate the performance of all the estimators • Findings: In all the simulations, the PML method had either the smallest empirical mean squared error (EMSE) or an EMSE close to the minimum. With its high efficiency and ease of computation, and the practical advantage of using the same set of weights for all response variables, the PML method appears to be a good choice for estimation in multiple frame surveys.
Comparison of All Estimators • Findings: When the number of frames Q ≥ 3, the theoretically optimal Fuller-Burmeister and Hartley methods became unstable, because they require solving systems of equations that involve a large estimated covariance matrix.
Asymptotic Variance • Under some conditions, the H, FB, and PML estimators are all consistent estimators of the total. • The FB estimator has the smallest asymptotic variance of the three, but neither the H estimator nor the PML estimator is necessarily more efficient than the other.
Asymptotic Variance • Lohr and Rao (2006) give a general formula for the asymptotic variance of all the above estimators, which can be used to construct optimal designs for multiple frame surveys.
Variance Estimators • Two methods: + Skinner and Rao (1996) described a method for estimating the variance of the estimated total using Taylor linearization. + Lohr and Rao (2000) defined a jackknife variance estimator for estimators from dual frame surveys and showed that it is asymptotically equivalent to the Taylor linearization variance estimator.
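The jackknife idea can be sketched for the simplest case. The code below assumes a simple random sample from each frame, no finite population correction, and a fixed mixing weight theta in a Hartley-type estimator; these are simplifying assumptions of this example, not the stratified multistage setting treated by Lohr and Rao (2000). All names and data are hypothetical.

```python
def hartley(sample_A, sample_B, theta):
    """Each sample is a list of (weight, y, in_overlap) tuples."""
    tot_A = sum(w * y * (theta if ab else 1.0) for w, y, ab in sample_A)
    tot_B = sum(w * y * ((1.0 - theta) if ab else 1.0) for w, y, ab in sample_B)
    return tot_A + tot_B

def drop_one(sample, i):
    """Delete unit i and scale the remaining weights by n / (n - 1)."""
    n = len(sample)
    scale = n / (n - 1)
    return [(w * scale, y, ab) for j, (w, y, ab) in enumerate(sample) if j != i]

def jackknife_variance(sample_A, sample_B, theta):
    """Delete-one jackknife applied separately within each frame's sample."""
    full = hartley(sample_A, sample_B, theta)
    n_A, n_B = len(sample_A), len(sample_B)
    ss_A = sum((hartley(drop_one(sample_A, i), sample_B, theta) - full) ** 2
               for i in range(n_A))
    ss_B = sum((hartley(sample_A, drop_one(sample_B, j), theta) - full) ** 2
               for j in range(n_B))
    return (n_A - 1) / n_A * ss_A + (n_B - 1) / n_B * ss_B

# Hypothetical samples: (weight, y, in_overlap).
s_A = [(10.0, 1.0, False), (10.0, 2.0, True), (10.0, 4.0, False)]
s_B = [(5.0, 3.0, True), (5.0, 1.0, False)]
print(jackknife_variance(s_A, s_B, theta=0.5))
```

For an estimator that is linear in y, as here with theta fixed, this jackknife reduces to the usual with-replacement variance estimator of a weighted total, which is consistent with the jackknife and linearization estimates coinciding in the linear case.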
Variance Estimators • Simulation results (Lohr and Rao 2000) comparing the linearization, full jackknife, and modified jackknife variance estimators showed: 1. The jackknife estimator exhibited smaller bias than the linearization estimator. 2. The relative bias of all three variance estimators tends to decrease as the sample size increases. 3. For the smaller sample sizes, the linearization and modified jackknife methods underestimate the EMSE. 4. Coverage probabilities, though similar for the three variance estimators, were slightly higher for the full jackknife.
Variance Estimators 5. The jackknife methods are less stable than the linearization estimator of the variance, as judged by relative standard error. 6. For the single frame estimator, the jackknife and linearization estimates of the variance coincide. 7. For the other estimators, both the linearization and modified jackknife estimates of the variance are biased downward.
Conclusion • Multiple frame surveys can be extremely beneficial when sampling rare populations and when a complete frame is very expensive to sample • Several estimators of the total have been proposed. The choice of estimator depends on the survey design and its complexity: FB is the most efficient, but because of its additional calculations and complexity, PML may be preferred
References • H. O. Hartley (1974), “Multiple Frame Methodology and Selected Applications”, Sankhya, the Indian Journal of Statistics, Series C, 36, 99-118. • C. J. Skinner and J. N. K. Rao (1996), “Estimation in Dual Frame Surveys with Complex Designs”, Journal of the American Statistical Association, 91, 349-356. • Sharon L. Lohr and J. N. K. Rao (2000), “Inference from Dual Frame Surveys”, Journal of the American Statistical Association, 95, 271-280. • Sharon L. Lohr and J. N. K. Rao (2006), “Estimation in Multiple-Frame Surveys”, Journal of the American Statistical Association (under revision). • J. Lessler and W. Kalsbeek (1992), Non-sampling Error in Surveys, John Wiley & Sons, Inc.