
A Resampling Study of NASS Survey MPPS Sampling Strategy



Presentation Transcript


  1. A Resampling Study of NASS Survey MPPS Sampling Strategy By Stanley Weng National Agricultural Statistics Service U.S. Department of Agriculture

  2. INTRODUCTION MPPS • Multivariate Probability Proportional to Size • Address multiple, and often competing, purposes (multi targets) of a survey • Used for NASS Crops Survey (CS) etc., since 1999

  3. MPPS • Technically, the sample was selected using a Poisson method. Each farm i had a unique probability of selection, formed by $\pi_i = \max_m \pi_i^{(m)}$

  4. MPPS where $\pi_i^{(m)}$ is the item m selection probability, determined by ▪ auxiliary data, with the assumption that the variance is proportional to (a power of) the auxiliary variable value ▪ optimal allocation ▪ a desired item-level sample size
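
The selection scheme on slides 3 and 4 can be illustrated with a minimal sketch. It assumes that the farm-level probability is the maximum of the item-level probabilities, and that each item-level probability is proportional to a power of the auxiliary value, scaled to a desired item sample size and capped at 1. The power 0.75, the function names, and the toy acreages are hypothetical choices for illustration, not values taken from the presentation.

```python
import random

def item_probabilities(x, n_target, power=0.75):
    """Item-level probabilities proportional to x**power, capped at 1.

    The exponent 0.75 is an assumed value used only for illustration.
    """
    sizes = [v ** power for v in x]
    total = sum(sizes)
    if total == 0:
        return [0.0] * len(x)
    return [min(1.0, n_target * s / total) for s in sizes]

def mpps_probabilities(aux, n_targets, power=0.75):
    """aux[m][i] is the auxiliary value of farm i for item m.

    Each farm receives the largest of its item-level probabilities.
    """
    per_item = [item_probabilities(x, n, power) for x, n in zip(aux, n_targets)]
    return [max(p) for p in zip(*per_item)]

def poisson_sample(probs, rng):
    """Poisson sampling: each unit is selected independently with its own probability."""
    return [i for i, p in enumerate(probs) if rng.random() < p]

# Hypothetical auxiliary data: two items (rows) for four farms (columns).
aux = [
    [100.0, 0.0, 50.0, 10.0],  # item 1, e.g. acres of one crop
    [0.0, 80.0, 5.0, 10.0],    # item 2, e.g. acres of another crop
]
probs = mpps_probabilities(aux, n_targets=[2, 2])
sample = poisson_sample(probs, random.Random(0))
```

Because selection is independent across farms, the realized sample size is random; only its expectation (the sum of the probabilities) is controlled.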

  5. MPPS • Development and application of the MPPS strategy at NASS: Amrhein, Hicks and Kott (1996) Amrhein and Bailey (1998) Bailey and Kott (1997) Hicks, Amrhein and Kott (1996) Kott, Amrhein and Hicks (1998).

  6. A COMPARISON STUDY • This study was designed to compare MPPS with the previously used SRS ((Stratified) Simple Random Sampling) strategy

  7. THIS STUDY • Explored the resampling approach to reveal the statistical characteristics / behavior of NASS Ag survey data • Raised issues for further investigation to improve our understanding and practice of NASS Ag survey sampling / estimation

  8. RESAMPLING ● Population bootstrap • Base sample: June Crop Survey MPPS samples • Pseudo population: composed of replicates of base sample elements, according to the (integerized) weight of each element

  9. RESAMPLING • Resamples: independent samples, drawn from the pseudo population by the Poisson and SRS sampling strategies, respectively

  10. RESAMPLING ● Resample totals, computed under the Poisson and SRS strategies, respectively

  11. RESAMPLING • Resampling variance estimate for the sample total estimate • Bootstrap statistic
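
The population bootstrap outlined on slides 8 to 11 can be sketched as follows. This is a toy illustration under stated assumptions, not the study's implementation: each base-sample element is replicated according to its rounded weight, Poisson resamples reuse the element's original inclusion probability (taken here as 1 / weight), SRS resamples use the base sample size, and the resampling variance is the ordinary sample variance of the resample totals. The data values and function names are hypothetical.

```python
import random
import statistics

def pseudo_population(values, weights):
    """Replicate each base-sample element round(weight) times.

    Each replicate keeps the assumed inclusion probability 1 / weight.
    """
    pop = []
    for y, w in zip(values, weights):
        pop.extend([(y, 1.0 / w)] * max(1, round(w)))
    return pop

def poisson_total(pop, rng):
    """One Poisson resample; returns the probability-weighted total."""
    return sum(y / p for y, p in pop if rng.random() < p)

def srs_total(pop, n, rng):
    """One SRS resample without replacement; returns the expansion total."""
    drawn = rng.sample(pop, n)
    return len(pop) / n * sum(y for y, _ in drawn)

# Hypothetical base sample: reported acres and sampling weights.
values = [120.0, 40.0, 75.0, 10.0, 0.0]
weights = [2.0, 5.0, 3.0, 10.0, 4.0]

pop = pseudo_population(values, weights)
rng = random.Random(2005)
B = 1000  # number of resamples, as on slide 13

poisson_totals = [poisson_total(pop, rng) for _ in range(B)]
srs_totals = [srs_total(pop, len(values), rng) for _ in range(B)]

var_poisson = statistics.variance(poisson_totals)
var_srs = statistics.variance(srs_totals)
```

Both estimators are unbiased for the pseudo-population total, so the two lists of resample totals should center on the same value while their spreads differ by design.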

  12. DATA • The crop component of the 2004 and 2005 June QAS, for all 42 participating states • Certainty elements were eliminated from the sample, to avoid unnecessary complication

  13. RESAMPLING VAR ESTIMATES ● Based on 1000 resamples • Naive Comparison ● Log-Log Plot ▪ Resampling variance est vs sample total across crops – for each state ▪ Overlay: Poisson (*) vs SRS (^)

  14. Naive Comparison • General linear trend (Assumption: the variance is proportional to a power of the total) • For the majority of crops, the SRS variance appeared greater than the Poisson variance (but often not appreciably)
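
If the variance is proportional to a power of the total, then log(variance) is linear in log(total), and the slope of that line is the power. The sketch below fits the line by ordinary least squares and returns residuals, which could be used to spot crops lying far from the general trend. The totals and variances are synthetic numbers generated from an assumed power law, not values from the study.

```python
import math

def loglog_fit(totals, variances):
    """Least-squares fit of log(variance) = intercept + slope * log(total)."""
    xs = [math.log(t) for t in totals]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def loglog_residuals(totals, variances, slope, intercept):
    """Vertical distance of each point from the fitted line, on the log scale."""
    return [math.log(v) - (intercept + slope * math.log(t))
            for t, v in zip(totals, variances)]

# Synthetic crop totals, with variances generated as 2 * total**1.5.
totals = [1e3, 5e3, 2e4, 1e5, 8e5]
variances = [2.0 * t ** 1.5 for t in totals]

slope, intercept = loglog_fit(totals, variances)
residuals = loglog_residuals(totals, variances, slope, intercept)
```

On real resampling output, the fit would be run once per state and per strategy, and the two fitted lines compared.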

  15. [Figure] Log-Log Plot of Resampling Variance Est vs Total Across Crops: CA. Overlay: Poisson (*) vs SRS (^). Crops shown: pot, srg, saf, sun, dwh, bar, ctp, oat, ohy, ctu, wwh, ric, crn, alf

  16. Validity of the Comparison • Additional information is needed to justify the comparison • The quality of the resampling variance estimate depends on the statistical quality of the resample totals, which also provides evidence for the appropriateness of the sampling strategy • Among various aspects, the most important: NORMALITY

  17. Normality ● Q-Q plot of resample totals • Demonstration: CA ▪ Most crops: good shape of Q-Q plot (Corn, Potatoes) ▪ Exception: Other Hay - evidence that Poisson was better than SRS
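
A Q-Q plot pairs the ordered resample totals with standard normal quantiles; a nearly straight line indicates approximate normality. The sketch below computes those pairs and summarizes straightness with the correlation between the two coordinates. The plotting position (k + 0.5) / n and the simulated inputs are assumptions made for illustration; the presentation does not say how its plots were constructed.

```python
import math
import random
from statistics import NormalDist, mean

def qq_points(totals):
    """Return (theoretical normal quantiles, ordered sample values)."""
    n = len(totals)
    ordered = sorted(totals)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((k + 0.5) / n) for k in range(n)]
    return theoretical, ordered

def qq_correlation(totals):
    """Correlation of the Q-Q points; values near 1 suggest normality."""
    theo, obs = qq_points(totals)
    mt, mo = mean(theo), mean(obs)
    sxy = sum((t - mt) * (o - mo) for t, o in zip(theo, obs))
    sxx = sum((t - mt) ** 2 for t in theo)
    syy = sum((o - mo) ** 2 for o in obs)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(17)
# Simulated stand-ins for resample totals: one symmetric, one skewed.
normal_like = [rng.gauss(1000.0, 50.0) for _ in range(500)]
skewed = [rng.expovariate(1.0) for _ in range(500)]

r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
```

A skewed set of totals, such as one dominated by a few large reports, bends the Q-Q line and lowers the correlation.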

  18. Outliers on the log-log plot • Located far apart from the general trend • The two sampling strategies gave appreciably different estimates • Demonstration: ▪ CA: Other Hay ▪ MT: Potatoes Evidence that SRS was better

  19. [Figure] Log-Log Plot of Resampling Variance Est vs Total Across Crops: MT. Overlay: Poisson (*) vs SRS (^). Crops shown: mus, sun, can, pot, saf, fla, crn, oat, ohy, dwh, bar, alf, wwh, swh

  20. FINITE SAMPLE RESAMPLING Complexities - Due to the special features of survey sampling ● Nonindependence arising in sampling without replacement ● Other complexities of finite population structure by designs and estimators

  21. FINITE SAMPLE RESAMPLING Effects of discreteness (Davison & Hinkley, 1997, 2.3.2) ▪ Discrete empirical distribution and in particular, ▪ In finite population sampling, the pseudo population formed by replicates of sample elements

  22. FINITE SAMPLE RESAMPLING Issues with this study • Comparable sample size - Addressed by size adjustment • Impact of the base sample - Not clear

  23. Impact of Base Sample For finite population resampling, the general guideline: ▪ The resampling population mimics the original population, and ▪ The resamples mimic the base sample, being drawn from the resampling population by a design identical to the one by which the base sample was originally drawn (Särndal et al., 1992, Ch. 11)

  24. AT ISSUE ● How should the resampling technique be modified to correctly accommodate the finite population sampling situation?

  25. AT ISSUE ● In the literature, most reported finite sample resampling studies used (stratified) SRS, which bears the most similarity to infinite population independent random sampling - the standard setting on which the resampling technique is based

  26. SUMMARY • An approach: resampling and analysis of resamples, using statistical graphical and diagnostic techniques, to reveal statistical characteristics / behavior of NASS Ag survey data

  27. SUMMARY ●Sampling strategy comparison ▪ Poisson seemed to be preferable to stratified simple random sampling ▪ A national comparison table of the two strategies across crops and states is to be produced for a comprehensive picture with likely causal factors identified

  28. FURTHER INVESTIGATION To develop statistical understanding, the resampling setting of this study and other statistical information techniques will be further explored

  29. FURTHER INVESTIGATION ▪ Behavior of Studentized bootstrap statistics ▪ Statistical function (Booth, Butler, and Hall, 1994; Davison & Hinkley, 1997) ▪ Examine different survey data

  30. THANK YOU

  31. Crop codes
      ALF: Alfalfa All Harvested Acres
      BAR: Barley All Planted Acres
      CAN: Canola All Planted Acres
      CRN: Corn Planted Acres
      CTP: Pima Cotton Planted Acres
      CTU: Upland Cotton Planted Acres
      DEB: Dry Beans Planted Acres
      DWH: Durum Wheat Planted Acres
      FLA: Flaxseed Planted Acres
      MUS: Mustard All Planted Acres
      OAT: Oats All Planted Acres
      OHY: Other Hay Harvested Acres
      PNT: Peanuts All Planted Acres
      POT: Potatoes All Planted Acres
      RIC: Rice All Grain Planted Acres
      RYE: Rye All Planted Acres
      SAF: Safflower All Planted Acres
      SGB: Sugarcane All Planted Acres*
      SOY: Soybeans All Planted Acres
      SPT: Sweet potatoes Planted Acres
      SRG: Sorghum All Planted Acres
      SUG: Sugarcane For Sugar Harvested Acres
      SUN: Sunflowers All Planted Acres
      SWH: Spring Wheat Irr Planted Acres
      WWH: Winter Wheat All Planted Acres
