This report provides recommendations on improving estimation and availability of standard errors for European statistics, including variance estimation methods and available tools. It also discusses the dissemination of confidence intervals to assess survey data quality.
General Recommendations of the DIME Task Force on Accuracy WG on HBS, Luxembourg, 13 May 2011
Main objectives of the DIME Task Force • Make recommendations on how to improve the estimation and availability of standard errors of European statistics • Recommend variance estimation methods suitable to different sampling designs and types of statistics, and review available tools • HBS Quality Report: the dissemination of confidence intervals together with survey estimates is a key step to assess survey data quality
Main objectives of the Task Force • Propose a standard formulation of precision requirements and advise on precision measures adequate to different types of statistics • Make recommendations on how to assess compliance with precision requirements
Variance estimation • HBS Quality Report: the basic assumption is that HBS indicators are liable to sampling errors only • But the aim is to incorporate all sources of variability: • Sampling variability (sampling errors) • Variability from non-sampling errors: • Unit non-response (the original sample is reduced) • Usually viewed as an additional sampling phase (the second-phase sample is taken by Poisson sampling, post-stratified sampling…) • Use of the net sample size instead of the gross sample size in the formulae for variance estimation
Variance estimation (2) • Item non-response (imputed values are not real values) • Adjusted analytical methods • Eurostat, 2002 – Variance estimation methods in the EU • Deville and Särndal, 1994 – SEVANI etc. • Resampling methods • Bootstrap (Saigo, Shao and Sitter, 2001) • Adjusted Jackknife method (Rao and Shao, 1992) etc. • Multiple imputation framework • Eurostat, 2002 – Variance estimation methods in the EU • Herzog and Rubin, 1985 etc.
Variance estimation (3) • Calibration - Use e.g. resampling methods with adjusted final weights - The variance of a calibrated estimator is (asymptotically) equal to that of the estimator based on the non-calibrated weights, but where the study variable has been replaced by the residuals of the regression on the calibration variables (Deville and Särndal, 1992). Etc.
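The residual idea for calibrated estimators can be illustrated with a minimal sketch. It assumes simple random sampling without replacement (so all design weights are equal) and a linear calibration; the function names and the sample layout are hypothetical and not taken from the Task Force material.

```python
import numpy as np

def srs_total_variance(values, N):
    """Estimated variance of an expansion total under simple random
    sampling without replacement from a population of size N."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    return N**2 * (1 - n / N) * np.var(values, ddof=1) / n

def calibrated_total_variance(y, X, N):
    """Approximate variance of a calibrated total: the study variable y
    is replaced by the residuals of its regression on the calibration
    variables X (one row per sampled unit)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    # Equal design weights under SRS, so ordinary least squares suffices here.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    # ddof=1 is kept for simplicity; the result is an asymptotic approximation.
    return srs_total_variance(residuals, N)
```

When the calibration variables explain the study variable well, the residuals are small and the estimated variance drops accordingly, which is the gain that calibration is expected to bring.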
Methods for variance estimation • Methods should take account of the sampling design (stratification, clustering…) and the type of statistics • Analytic, Linearization, Resampling methods • The choice is guided by a matrix developed by the DIME TF • Suitable methods • Unsuitable methods • References
Methods for variance estimation (2) • Consistency (the first is the best): 1. Balanced Repeated Replication and Bootstrap 2. Jackknife 3. Linearization • Stability (the first is the best): 1. Linearization 2. Jackknife 3. Balanced Repeated Replication and Bootstrap
Tools for variance estimation • Reviewed existing tools from the point of view of: • Their adequacy to different sampling designs; • Their capacity to take into account the effect of: • Implicit stratification • Rotational sampling designs • Unit non-response • Item non-response (imputation) • Calibration etc.
Tools for variance estimation (2) • No ideal/single tool; several tools with different characteristics. • POULPE (INSEE): • sophisticated, strong theoretical background, accounts for calibration (unlike most other tools) • needs expertise, does not account for the variability due to imputation • SEVANI (Statistics Canada) • multi-phase framework • incorporates effects of non-response and imputation • ....
Approaches for the estimation of standard errors of statistics at European level
Decentralized approach • Usual approach for ESS surveys • NSIs transmit to Eurostat standard errors for the national estimates, for a limited set of indicators/breakdowns • Eurostat estimates the standard errors for the European estimates, for the same set of indicators and for the European breakdowns • It satisfies the requirements of a standard delivery of an aggregated table • Low flexibility: it does not meet the need for standard errors for unforeseen/extra data aggregations/indicators • Huge burden on NSIs if required to send standard errors for all indicators/breakdowns • Reduced comparability of standard errors (different methods and tools)
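As an illustration of how European standard errors could be derived from the transmitted national ones, here is a minimal sketch. It assumes that national samples are drawn independently, so that variances add up, and that population shares are known constants; the function names are hypothetical, and the slides do not specify the formula Eurostat actually uses.

```python
import math

def european_total_se(national_ses):
    """Standard error of a European total that is the sum of national
    totals, assuming independent national samples."""
    return math.sqrt(sum(se**2 for se in national_ses))

def european_mean_se(national_ses, population_shares):
    """Standard error of a European mean built as a share-weighted
    average of national means (shares treated as fixed, summing to 1)."""
    return math.sqrt(sum((share * se)**2
                         for se, share in zip(national_ses, population_shares)))
```

The sketch only works for the indicators and breakdowns for which national standard errors were transmitted, which is the low-flexibility drawback noted above.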
Integrated approach (burden shared) 1. Use of one resampling method by NSIs and Eurostat • Information needed for the transmission of the microdata: • the sets of replicate weights • the full sample weights. • Eurostat calculates the replicate estimates and the variance • Flexibility to estimate variance for unforeseen data aggregations • High comparability of standard errors (one method) • Need for changes of methods in NSIs (-> burden) • Guidelines and training needed for NSIs and Eurostat
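The computation from transmitted weights can be sketched as follows. This is a minimal illustration for a weighted mean; the function names are hypothetical, and the scaling factor depends on the resampling method agreed between NSIs and Eurostat (the default 1/R below is an assumption that matches a common bootstrap convention).

```python
import numpy as np

def weighted_mean(y, w):
    """Weighted mean of y with weights w."""
    return np.sum(w * y) / np.sum(w)

def replicate_variance(y, full_weights, replicate_weights, scale=None):
    """Point estimate and variance from full-sample and replicate weights.

    replicate_weights has one column per replicate. The variance is
    scale * sum over replicates of (replicate estimate - full estimate)^2.
    """
    y = np.asarray(y, dtype=float)
    full_weights = np.asarray(full_weights, dtype=float)
    replicate_weights = np.asarray(replicate_weights, dtype=float)
    theta = weighted_mean(y, full_weights)
    replicates = np.array([weighted_mean(y, w_r) for w_r in replicate_weights.T])
    if scale is None:
        scale = 1.0 / len(replicates)  # assumed convention, method-dependent
    return theta, scale * np.sum((replicates - theta) ** 2)
```

Because only the weights change from one replicate to the next, the same routine can be rerun on any subpopulation or indicator, which is where the flexibility for unforeseen aggregations comes from.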
Integrated approach (2) 2. Use of generalized variance functions by NSIs and Eurostat • They express a relationship between the variance estimates and the point estimates • Parameters determined by NSIs by using analytic or resampling methods and sent to Eurostat • No direct computation of variance; quick method to estimate variance for hundreds/thousands of indicators/breakdowns • Very easy for users • No need for microdata • Specific to sampling designs, types of statistics, population domains • Mainly empirical • Reduced comparability of standard errors (different methods and tools)
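A generalized variance function can be sketched as below. The model form, relative variance = a + b / estimate, is one commonly used choice for estimated totals and is an assumption here; the slides do not state which functional form is intended, and the function names are hypothetical.

```python
import numpy as np

def fit_gvf(estimates, variances):
    """Fit relvariance = a + b / estimate by least squares, using a set
    of direct variance estimates for selected indicators."""
    x = np.asarray(estimates, dtype=float)
    relvar = np.asarray(variances, dtype=float) / x**2
    design = np.column_stack([np.ones_like(x), 1.0 / x])
    (a, b), *_ = np.linalg.lstsq(design, relvar, rcond=None)
    return a, b

def gvf_standard_error(estimate, a, b):
    """Standard error predicted by the fitted function for any estimate:
    variance = a * estimate^2 + b * estimate."""
    return np.sqrt(a * estimate**2 + b * estimate)
```

Once a and b are published, a user only needs the point estimate to obtain an approximate standard error, which explains both the ease of use and the mainly empirical nature of the method.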
Fully centralized approach Use of one resampling method by Eurostat e.g. Jackknife in EU-SILC with SAS programmes • Information needed for the transmission of the microdata: • the stratum to which the ultimate sampling unit belongs; • the cluster to which the ultimate sampling unit belongs; • the final weight of the units used in the estimation (adjusted for non-response and calibration). • Eurostat calculates the replicate weights, the replicate estimates and the variance • Flexibility to estimate variance for unforeseen/extra data aggregations • High comparability of standard errors (one method) • (Heavy) burden on Eurostat
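How replicate weights could be derived from the three transmitted items (stratum, cluster, final weight) is sketched below for a delete-one-cluster stratified jackknife. This is a minimal Python illustration with hypothetical names, not the SAS programmes used for EU-SILC; it also leaves out re-calibration of each set of replicate weights.

```python
import numpy as np

def jackknife_variance(y, weight, stratum, cluster, estimator):
    """Delete-one-cluster jackknife within strata.

    For each cluster in stratum h (with n_h clusters), the cluster's units
    get weight 0 and the other units of the stratum are scaled by
    n_h / (n_h - 1). Variance = sum_h (n_h - 1) / n_h * sum_i (theta_hi - theta)^2.
    """
    y = np.asarray(y, dtype=float)
    weight = np.asarray(weight, dtype=float)
    stratum = np.asarray(stratum)
    cluster = np.asarray(cluster)
    theta = estimator(y, weight)
    variance = 0.0
    for h in np.unique(stratum):
        in_h = stratum == h
        clusters_h = np.unique(cluster[in_h])
        n_h = len(clusters_h)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least two clusters")
        for c in clusters_h:
            dropped = in_h & (cluster == c)
            w_rep = weight.copy()
            w_rep[dropped] = 0.0
            w_rep[in_h & ~dropped] *= n_h / (n_h - 1)
            variance += (n_h - 1) / n_h * (estimator(y, w_rep) - theta) ** 2
    return theta, variance
```

Passing the estimator as a function (a total, a mean, a ratio) keeps the replication loop independent of the indicator, so extra aggregations only require a new estimator, at the cost of the computing burden falling on Eurostat.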
For the NSIs that send the microdata files, generalized variance functions can be applied by Eurostat (HBS, G. Osier) • NSIs transmit to Eurostat the necessary info at record level • Eurostat then fits a variance function to ‘direct variance estimates’ computed for a reduced set of HBS indicators and uses it to estimate the variance for all relevant HBS indicators • This streamlines the procedure, considering: - the time burden (especially with resampling methods) - the high number of possible HBS indicators (one per COICOP category + all the socio-demographic breakdowns)
Assessment of compliance with precision requirements • On the basis of info from quality reports • A detailed metadata template was developed by the DIME TF - Aim: obtain structured & detailed info on the sampling design, variance estimation methods and tools • Are the methods appropriate for the sampling design and the type of indicator? Are there systematic deviations? • Do the variance estimates incorporate the variability due to correction for non-response, imputation, calibration etc.? - Useful in a decentralized approach but also for the other approaches
Assessment of compliance with requirements (2) • In the integrated and fully centralized approaches • A commonly used resampling method facilitates the compliance assessment • Principles for the compliance assessment: • Transparency on the procedures; • Tolerance. Why? • For the decentralized approach, the results produced with different methods might not perfectly match • What can be computed is not the "true" standard error for a given estimate, but only an estimated standard error, which in turn has its own variance • etc.