
On the use of auxiliary variables in agricultural surveys design

Federica Piersimoni (ISTAT - Italian National Institute of Statistics), Roberto Benedetti (University “G. d’Annunzio” of Chieti-Pescara, Italy), Giuseppe Espa (University of Trento, Italy).


Presentation Transcript


  1. On the use of auxiliary variables in agricultural surveys design. Federica Piersimoni, ISTAT - Italian National Institute of Statistics; Roberto Benedetti, University “G. d’Annunzio” of Chieti-Pescara, Italy; Giuseppe Espa, University of Trento, Italy

  2. Contents: Actual situation; Proposal (estimators, sampling designs); Data description; Simulation; Analysis of the results; Conclusions

  3. Actual situation Sample units Population units

  4. in sample surveys

  5. 2001 scatter plot matrix. tc1 = cattle slaughterings 2001; tc2 = sheep and goats slaughterings 2001; tc3 = pigs slaughterings 2001; tc4 = equines slaughterings 2001

  6. 2000 scatter plot matrix. tc10 = cattle slaughterings 2000; tc20 = sheep and goats slaughterings 2000; tc30 = pigs slaughterings 2000; tc40 = equines slaughterings 2000

  7. 1999 scatter plot matrix. tc19 = cattle slaughterings 1999; tc29 = sheep and goats slaughterings 1999; tc39 = pigs slaughterings 1999; tc49 = equines slaughterings 1999

  8. SCATTER PLOTS. tc1: cattle slaughterings 2001; tc2: sheep and goats slaughterings 2001; tc3: pigs slaughterings 2001; tc4: equines slaughterings 2001; tc10: cattle slaughterings 2000; tc20: sheep and goats slaughterings 2000; tc30: pigs slaughterings 2000; tc40: equines slaughterings 2000; tc19: cattle slaughterings 1999; tc29: sheep and goats slaughterings 1999; tc39: pigs slaughterings 1999; tc49: equines slaughterings 1999

  9. Year 2001 Year 2000 Year 1999

  10. Sampling frame: N = 2,211 units (enterprises) and 12 variables: the number of • cattle, • pigs, • sheep and goats, • equines slaughtered, as recorded in the census surveys of 1999, 2000 and 2001.

  11. 2000 samples of size n = 200 are drawn, using the complete frame at 1999 and at 2000 as auxiliary information to obtain estimates at 2001. Estimates are obtained through the Horvitz-Thompson expansion estimator and the calibration estimator (PV) by Deville and Särndal (1992). [Formula not preserved; its labelled elements are the vector of the totals of the auxiliary variables and the distance function.]
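A minimal sketch of the two estimators named on this slide, assuming an equal-probability design and a single auxiliary variable. The Horvitz-Thompson estimator weights each sampled value by the inverse of its inclusion probability; the calibration estimator adjusts those design weights so that the weighted sample total of x reproduces the known frame total. The chi-square (linear) distance is used here only because it has a closed form; the transcript does not say which distance function the authors chose, and the function names and toy numbers are illustrative.

```python
def ht_total(y, pi):
    """Horvitz-Thompson estimate of a total: sum of y_k / pi_k over the sample."""
    return sum(yk / pk for yk, pk in zip(y, pi))


def linear_calibration_weights(x, pi, x_total):
    """Calibrated weights for one auxiliary variable under the chi-square distance.

    Minimises sum((w_k - d_k)**2 / d_k) subject to sum(w_k * x_k) == x_total,
    where d_k = 1 / pi_k are the design weights.
    """
    d = [1.0 / pk for pk in pi]
    ht_x = sum(dk * xk for dk, xk in zip(d, x))
    denom = sum(dk * xk * xk for dk, xk in zip(d, x))
    lam = (x_total - ht_x) / denom  # Lagrange multiplier of the constraint
    return [dk * (1.0 + lam * xk) for dk, xk in zip(d, x)]


# Toy sample (hypothetical numbers): n = 4 units drawn with equal probability
# from a frame of N = 10 units whose total of x is known to be 100.
x_sample = [8.0, 12.0, 9.0, 15.0]
y_sample = [9.0, 13.0, 8.0, 17.0]
pi = [4 / 10] * 4

w = linear_calibration_weights(x_sample, pi, x_total=100.0)
ht_estimate = ht_total(y_sample, pi)
cal_estimate = sum(wk * yk for wk, yk in zip(w, y_sample))
```

With several auxiliary variables the multiplier becomes a vector and the scalar division becomes a linear system, which is omitted here to keep the sketch dependency-free.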

  12. Sample selection: • simple random sampling (SRS) • stratified sampling (ST) • ranked set sampling (RSS) • probability proportional to size (PS) • balanced sampling • PS + balanced sampling

  13. SRS: the direct estimate does not use auxiliary information. ST: auxiliary information is used ex ante to set up the strata; five strata are planned; allocation follows the multivariate model by Bethel (1989).

  14. RSS, original formulation: • an SRS of n units is selected without replacement; • the n sampled units are ranked in increasing order of an auxiliary variable x known for every population unit; • the variable of interest y is measured on the first-ranked unit only; • a second SRS is drawn and ranked; • y is measured on the second-ranked unit only; • and so on, up to n replications.
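The steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the sets are drawn independently of one another, so the same unit could in principle be retained in two different replications, and the function name and synthetic data are hypothetical.

```python
import random


def ranked_set_sample(x, n, rng):
    """Return indices of n units chosen by the original ranked set sampling scheme.

    x   : auxiliary values known for every population unit
    n   : set size and number of replications
    rng : random.Random instance
    """
    population = range(len(x))
    chosen = []
    for r in range(n):
        # Draw a simple random sample of n units without replacement.
        candidates = rng.sample(population, n)
        # Rank the set in increasing order of the auxiliary variable.
        candidates.sort(key=lambda k: x[k])
        # Keep only the unit of rank r + 1; y is measured on it alone.
        chosen.append(candidates[r])
    return chosen


rng = random.Random(0)
# Synthetic skewed auxiliary variable for a population of 500 units.
x = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]
idx = ranked_set_sample(x, 5, rng)
```

Only the n retained units need to be measured on y, although n × n units are ranked; the ranking relies on x alone.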

  15. Ranking variable: with k =1,…,N, i =1,4 and t=1999, 2000. For the units k: 

  16. PS: if a positive auxiliary variable x exists, units are selected with probability proportional to x. This ex ante probability is
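The formula on this slide did not survive in the transcript. As a hedged illustration, the sketch below uses the textbook fixed-size form, pi_k = n x_k / sum(x), with units whose probability would exceed 1 taken with certainty; whether the authors used exactly this form is an assumption. The systematic selection step is likewise a stand-in, since the slides do not name the PS selection scheme.

```python
import math
import random


def pps_inclusion_probabilities(x, n):
    """Inclusion probabilities proportional to x for a fixed sample size n.

    Units whose probability would reach 1 are taken with certainty and the
    remaining probabilities are recomputed over the rest of the frame.
    Assumes every x_k is strictly positive.
    """
    certain = set()
    while True:
        rest_total = sum(xk for k, xk in enumerate(x) if k not in certain)
        m = n - len(certain)
        over = [k for k, xk in enumerate(x)
                if k not in certain and m * xk / rest_total >= 1.0]
        if not over:
            break
        certain.update(over)
    return [1.0 if k in certain else m * xk / rest_total
            for k, xk in enumerate(x)]


def systematic_pps(pi, rng):
    """Select units by systematic sampling on the cumulated probabilities."""
    u = rng.random()  # random start; selection points are u, u + 1, u + 2, ...
    sample, cum = [], 0.0
    for k, pk in enumerate(pi):
        before = max(0, math.ceil(cum - u))  # points below the interval start
        cum += pk
        after = max(0, math.ceil(cum - u))   # points below the interval end
        if after > before:
            sample.append(k)
    return sample


x = [1, 2, 3, 4, 10, 80]  # hypothetical size measures
pi = pps_inclusion_probabilities(x, 2)
sample = systematic_pps(pi, random.Random(1))
```

In this toy frame the last unit dominates the total, so it enters every sample with probability 1 and the second draw is spread over the remaining units.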

  17. BALANCED SAMPLING and PS + BALANCED SAMPLING: the balance constraint has been imposed for the four variables to be estimated. The difference between the two criteria is that in the second case the constraint is imposed ex post on PS samples.
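A sample is balanced on a set of auxiliary variables when the Horvitz-Thompson estimates of their totals match the known frame totals. The sketch below shows one simple way to obtain an approximately balanced sample: draw SRS samples repeatedly and keep the first one within a tolerance. This rejective procedure is a stand-in chosen for brevity; the transcript does not say which balancing algorithm the authors used, and rejection slightly alters the nominal inclusion probabilities. All names and data are hypothetical.

```python
import random


def relative_imbalance(sample, pi, aux, totals):
    """Largest relative gap between HT estimates of auxiliary totals and the known totals."""
    gaps = []
    for xj, tj in zip(aux, totals):
        ht = sum(xj[k] / pi[k] for k in sample)
        gaps.append(abs(ht - tj) / tj)
    return max(gaps)


def rejective_balanced_srs(aux, n, tol, rng, max_draws=10000):
    """Draw SRS samples until one is balanced on every auxiliary variable within tol."""
    N = len(aux[0])
    pi = [n / N] * N
    totals = [sum(xj) for xj in aux]
    for _ in range(max_draws):
        sample = rng.sample(range(N), n)
        if relative_imbalance(sample, pi, aux, totals) <= tol:
            return sample
    raise RuntimeError("no sample met the balance tolerance")


rng = random.Random(7)
# Two synthetic auxiliary variables on a frame of 200 units.
aux = [[rng.uniform(1, 10) for _ in range(200)] for _ in range(2)]
balanced = rejective_balanced_srs(aux, n=50, tol=0.05, rng=rng)
```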

  18. TOTAL

  19. Conclusions. It is better to impose the balance constraints in the design phase than ex post (cf. RMSE SRS - RMSE BAL). Best performances: balanced PS selections and PS with calibration. A joint use of complex estimators together with efficient sampling designs may considerably reduce the variability of the estimates, but…

  20. …but: PS and PS with calibration selection criteria: less robust than the others when outliers are present; more efficient. Bad performance of the RSS method: forced univariate use of the auxiliary information for the ranking set-up when linear independence is present.
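The RMSE comparisons and simulated sampling distributions in this talk come from drawing many samples from the complete frame and comparing each estimate with the known total. A minimal sketch of such a loop, for SRS with the Horvitz-Thompson estimator only, is given below. The population is synthetic (a lognormal stand-in for the slaughtering counts, which are not available here); only N = 2,211, n = 200 and the 2000 replications are taken from the slides.

```python
import math
import random


def simulate_rmse(y, n, replications, rng):
    """Monte Carlo RMSE of the HT total under simple random sampling without replacement."""
    N = len(y)
    true_total = sum(y)
    squared_errors = []
    for _ in range(replications):
        sample = rng.sample(range(N), n)
        estimate = (N / n) * sum(y[k] for k in sample)
        squared_errors.append((estimate - true_total) ** 2)
    return math.sqrt(sum(squared_errors) / replications)


rng = random.Random(2001)
# Synthetic skewed population standing in for the real frame.
y = [rng.lognormvariate(3.0, 1.0) for _ in range(2211)]
rmse = simulate_rmse(y, n=200, replications=2000, rng=rng)
relative_rmse = rmse / sum(y)
```

Swapping the sampling line and the estimator for another design and estimator pair gives the other cells of such a comparison.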

  21. Simulated sampling distribution of the tc2 estimates in the case of PPS, with calibration estimator based on the auxiliary variables of 2000 (plot label: TRUE VALUE)

  22. Simulated sampling distribution of the tc3 estimates in the case of PPS, with calibration estimator based on the auxiliary variables of 1999 (plot label: TRUE VALUE)

  23. Simulated sampling distribution of the tc4 direct estimates in the case of balanced PPS, based on the auxiliary variables of 1999 (plot label: TRUE VALUE)

  24. Simulated sampling distribution of the tc2 direct estimates in the case of balanced PPS, based on the auxiliary variables of 2000 (plot label: TRUE VALUE)

  25. References
  Al-Saleh M.F., Al-Omari A.I. (2002) Multistage ranked set sampling, Journal of Statistical Planning and Inference, 102, 273–286.
  Bai Z., Chen Z. (2003) On the theory of ranked-set sampling and its ramifications, Journal of Statistical Planning and Inference, 109, 81–99.
  Bethel J. (1989) Sample allocation in multivariate surveys, Survey Methodology, 15, 47–57.
  Deville J.-C., Särndal C.-E. (1992) Calibration estimators in survey sampling, Journal of the American Statistical Association, 87, 418, 376–382.
  Dorfman A.H., Valliant R. (2000) Stratification by size revisited, Journal of Official Statistics, 16, 2, 139–154.
  Espa G., Benedetti R., Piersimoni F. (2001) Prospettive e soluzioni per il data editing nelle rilevazioni in agricoltura, Statistica Applicata, 13, 4, 363–391.
  Hidiroglou M.A. (1986) The construction of a self-representing stratum of large units in survey design, The American Statistician, 40, 1, 27–31.
  Li D., Sinha B.K., Perron F. (1999) Random selection in ranked set sampling and its applications, Journal of Statistical Planning and Inference, 76, 185–201.
  McIntyre G.A. (1952) A method for unbiased selective sampling, using ranked sets, Australian Journal of Agricultural Research, 3, 385–390.
  Patil G.P., Sinha A.K., Taillie C. (1994a) Ranked set sampling, in G.P. Patil and C.R. Rao (eds) Handbook of Statistics, Volume 12, Environmental Statistics, North Holland Elsevier, New York, 167–200.
  Patil G.P., Sinha A.K., Taillie C. (1994b) Ranked set sampling for multiple characteristics, International Journal of Ecology and Environmental Sciences, 20, 94–109.
  Ridout M.S. (2003) On ranked set sampling for multiple characteristics, Environmental and Ecological Statistics, 10, 255–262.
  Rosén B. (1997) On sampling with probability proportional to size, Journal of Statistical Planning and Inference, 62, 159–191.
  Royall R.M. (1970) On finite population sampling theory under certain linear regression models, Biometrika, 57, 2, 377–387.
  Royall R.M. (1992) Robustness and optimal design under prediction models for finite populations, Survey Methodology, 18, 179–185.
  Royall R.M., Herson J. (1973a) Robust estimation in finite populations I, Journal of the American Statistical Association, 68, 344, 880–889.
  Royall R.M., Herson J. (1973b) Robust estimation in finite populations II: stratification on a size variable, Journal of the American Statistical Association, 68, 344, 890–893.
  Särndal C.-E., Swensson B., Wretman J. (1992) Model Assisted Survey Sampling, Springer Verlag, New York.

  26. THANK YOU FOR YOUR ATTENTION!
