Dealing with Household Non-response using Generalised Calibration – a simulation

This study explores the use of generalised calibration for reducing non-response bias in household surveys, specifically when the non-response mechanism is non-ignorable. The simulation uses data from the Household Budget Survey in Luxembourg to test different weighting strategies, including the generalised calibration method. The results show the impact of different variables and calibration techniques on relative bias, variance, and mean square error.

Presentation Transcript


  1. Guillaume Osier, Institut National de la Statistique et des Etudes Economiques (STATEC), Social Statistics Division, Guillaume.Osier@statec.etat.lu • Dealing with Household Non-response using Generalised Calibration – a simulation • Task Force on Victimization, Eurostat, 14-15 February 2012

  2. The generalised calibration • We seek to determine new weights as a function F of the design weights and a vector z of non-response variables • F is a « calibration » function and z is a vector of non-response explanatory variables • The parameter vector of F is determined by solving the calibration equations based on a vector x of calibration variables
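
In the usual notation for generalised calibration, the two statements on this slide can be written as follows. The symbols are assumptions and are not taken from the slide: d_k is the design weight of household k, r is the set of respondents, λ is the parameter vector of F, and t_x is the vector of known totals of the calibration variables.

```latex
% New weights: design weight times the calibration function of z
w_k = d_k \, F\!\left(z_k^{\top} \lambda\right), \qquad k \in r

% Calibration equations: the weighted respondent totals of x match the known totals
\sum_{k \in r} d_k \, F\!\left(z_k^{\top} \lambda\right) x_k = t_x
```

Ordinary calibration is the special case z = x.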

  3. Objective of the simulation • To test generalised calibration as a method for reducing non-response bias, particularly when non-response is non-ignorable, that is, when the non-response mechanism depends on variables of interest measured within the same survey and observed only for the responding units • We used the Household Budget Survey (HBS) data for Luxembourg: the population comprised all the households which participated in the survey at least once between 2005 and 2010 (nearly 8000 households)

  4. • Step 1: All the households which participated at least once in the HBS in 2005-2010 (about 8000) • Step 2: Selection of K = 2000 replications S1*, S2*, …, SK* according to a non-ignorable response mechanism (mimicking the HBS) • Step 3: Estimation of the target parameter (mean consumption expenditure per household) from each replication • Measures computed over the replications: 1. Bias 2. Variance 3. Mean Square Error (MSE)
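
The three performance measures can be computed from the K replicated estimates as in the minimal sketch below. The function name `simulation_summary` and the toy numbers are hypothetical; the slides do not say how the figures were actually computed.

```python
import numpy as np

def simulation_summary(estimates, true_value):
    """Relative bias (%), variance and MSE of replicated estimates.

    `estimates` holds one estimate of the target parameter per replication;
    `true_value` is the parameter computed on the full population.
    """
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value
    variance = estimates.var()                      # empirical variance over replications
    mse = np.mean((estimates - true_value) ** 2)    # equals bias**2 + variance
    return {
        "relative_bias_pct": 100.0 * bias / true_value,
        "variance": variance,
        "mse": mse,
    }

# Toy illustration with four made-up replicated estimates
summary = simulation_summary([9.0, 10.0, 11.0, 12.0], true_value=10.0)
print(summary)
```

With the variance taken over the replications (no degrees-of-freedom correction), the MSE decomposes exactly into squared bias plus variance.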

  5. When estimating the target parameter, this simulation offered an opportunity to test different weighting strategies, including the generalised calibration

  6. Construction of a non-ignorable response mechanism – 1/2 • Non-response is described by a logistic model based on: • Age, gender and citizenship of the household’s reference person (observed for both the respondents and the non-respondents) • Log of the household total expenditure (observed only for the respondents) • To estimate the model parameters, the explanatory variables must be known for both the respondents and the non-respondents, so we first need to impute expenditure values to the non-respondents

  7. Construction of a non-ignorable response mechanism – 2/2 • We assumed a log-normal distribution of household total expenditure, with different mean parameters m and m* for the respondents and the non-respondents respectively • m* > m: on average, non-responding households have higher expenditure than responding households • Thus, by using the log of the household expenditure in the model (observed for the respondents, imputed for the non-respondents), we construct a response model in which the higher the expenditure, the lower the probability of response
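
A response mechanism of this kind can be sketched as below. Everything numeric here is an assumption made for illustration: the coefficients, the log-normal parameters and the single auxiliary variable `age_over_60` are hypothetical, and the imputation step for non-respondents is skipped by simulating expenditure for the whole population directly.

```python
import numpy as np

rng = np.random.default_rng(2012)
n = 8000                                            # roughly the population size in the slides

# Hypothetical population: log-normal expenditure and one binary auxiliary variable
log_exp = rng.normal(loc=10.5, scale=0.6, size=n)   # log of household total expenditure
age_over_60 = rng.integers(0, 2, size=n)            # stand-in for age/gender/citizenship

def response_probability(log_exp, age_over_60, beta0=11.0, beta_age=0.3, beta_exp=-1.0):
    """Logistic response model; beta_exp < 0 makes high spenders respond less."""
    eta = beta0 + beta_age * age_over_60 + beta_exp * log_exp
    return 1.0 / (1.0 + np.exp(-eta))

p = response_probability(log_exp, age_over_60)
responds = rng.random(n) < p                        # one simulated set of respondents

print("response rate:", responds.mean())
print("mean expenditure, population: ", np.exp(log_exp).mean())
print("mean expenditure, respondents:", np.exp(log_exp[responds]).mean())
```

Because the response probability depends on expenditure itself, the unweighted respondent mean understates the population mean, which is the bias the weighting strategies try to remove.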

  8. Weighting strategies – 1/4 1. The classical strategies • Division by the global response rate calculated from the sample observations • One-step calibration using household size, age, gender and citizenship of the household’s reference person 2. Generalised Calibration • Non-response variables: household size, age, gender and citizenship of the household’s reference person + household total expenditure • Calibration variables: household size, age, gender and citizenship of the household’s reference person + one more variable

  9. Weighting strategies – 2/4 • For the generalised calibration equations to be identifiable, we need as many non-response variables as calibration variables • Thus, if we add household expenditure as a non-response variable, we must also add one more calibration variable to the system • We simulated an additional calibration variable having a given correlation with household expenditure. As we’ll see next, the results of the simulation are strongly affected by the value of this correlation
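
The identifiability requirement is easiest to see in the linear case, where the calibration equations reduce to a square linear system. The sketch below assumes the linear form F(u) = 1 + u and uses made-up data and totals; the function name, the variables and the numbers are all hypothetical, and this is not the Calmar2 implementation.

```python
import numpy as np

def linear_generalised_calibration(d, Z, X, totals):
    """Weights w_k = d_k * (1 + z_k' lam), with lam chosen so that sum_k w_k x_k = totals.

    Z (non-response variables) and X (calibration variables) must have the same
    number of columns, otherwise the system for lam is not square.
    """
    d, Z, X = (np.asarray(a, dtype=float) for a in (d, Z, X))
    if Z.shape != X.shape:
        raise ValueError("need as many non-response variables as calibration variables")
    A = X.T @ (d[:, None] * Z)                     # sum_k d_k x_k z_k'
    lam = np.linalg.solve(A, np.asarray(totals, dtype=float) - X.T @ d)
    return d * (1.0 + Z @ lam)

# Made-up respondent data (expenditure in thousands, purely illustrative)
rng = np.random.default_rng(0)
n = 200
d = np.full(n, 40.0)                               # design weights
size = rng.integers(1, 6, size=n).astype(float)    # household size
expenditure = rng.lognormal(3.5, 0.5, size=n)      # non-response variable
proxy = expenditure * rng.lognormal(0.0, 0.3, n)   # extra calibration variable

Z = np.column_stack([np.ones(n), size, expenditure])
X = np.column_stack([np.ones(n), size, proxy])
totals = np.array([10000.0, 30000.0, 420000.0])    # hypothetical known totals of X

w = linear_generalised_calibration(d, Z, X, totals)
print(X.T @ w)                                     # matches `totals` up to rounding
```

When the extra calibration variable is only weakly related to expenditure, the matrix `A` becomes close to singular and the weights become unstable, which is consistent with the sensitivity to the correlation noted on the slide.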

  10. Weighting strategies – 3/4 • Several calibration variables were created, each having a correlation, fixed in advance, with the household total expenditure (denoted x).
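
One common way to build a variable with a prescribed correlation to x is to mix the standardised x with independent noise. The sketch below is an assumption about how this could be done, not the construction used in the simulation; the function name `correlated_variable` is hypothetical.

```python
import numpy as np

def correlated_variable(x, rho, rng):
    """Return a variable whose sample correlation with x is exactly rho."""
    x = np.asarray(x, dtype=float)
    xs = (x - x.mean()) / x.std()                  # standardised x
    e = rng.standard_normal(x.size)
    e = e - e.mean()
    e = e - (e @ xs) / (xs @ xs) * xs              # remove the part of the noise aligned with x
    e = e / e.std()
    return rho * xs + np.sqrt(1.0 - rho ** 2) * e

rng = np.random.default_rng(1)
x = rng.lognormal(10.5, 0.6, size=8000)            # hypothetical expenditure values
for rho in (0.2, 0.5, 0.8):
    y = correlated_variable(x, rho, rng)
    print(rho, np.corrcoef(x, y)[0, 1])
```

Orthogonalising the noise against x makes the sample correlation hit the target exactly; without that step the target would only hold on average.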

  11. Weighting strategies – 4/4 • We also tested what would happen if non-response was in fact ignorable, that is, if the non-response mechanism does not depend on the variable of interest (a scenario in which we wrongly assume non-ignorability) • Finally, we tested what the result would be if we changed the functional form of the calibrated weights (function F of the calibration equation): • Linear • Raking ratio • Logistic (LO = 0.1 and UP = 30)
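
The three functional forms can be sketched as below, using the textbook definitions from the calibration literature. This assumes the slide refers to those standard forms; the exact parameterisation used inside Calmar2 may differ, and the function names are hypothetical.

```python
import numpy as np

def linear(u):
    """Linear method: unbounded, can produce negative weight ratios."""
    return 1.0 + np.asarray(u, dtype=float)

def raking(u):
    """Raking ratio method: always positive, unbounded above."""
    return np.exp(np.asarray(u, dtype=float))

def logit(u, lo=0.1, up=30.0):
    """Logistic method: weight ratios stay strictly between lo and up."""
    u = np.asarray(u, dtype=float)
    a = (up - lo) / ((1.0 - lo) * (up - 1.0))
    e = np.exp(a * u)
    return (lo * (up - 1.0) + up * (1.0 - lo) * e) / ((up - 1.0) + (1.0 - lo) * e)

u = np.linspace(-3.0, 3.0, 7)
for name, f in (("linear", linear), ("raking", raking), ("logit", logit)):
    print(name, np.round(f(u), 3))
```

All three functions equal 1 at u = 0, so a zero parameter vector leaves the design weights unchanged. Only the logistic form caps how far the final weight can move from the design weight.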

  12. The macro Calmar2 • New version of the SAS macro Calmar developed by France’s Statistics Office (INSEE) to calculate weights calibrated on external data sources • Implements the generalised calibration technique very easily • The macro was used as the software tool to calculate the weights in the simulation • The generalised calibration is also implemented in the software R (package ‘sampling’, function ‘gencalib’)

  13. Results (Non-ignorable) – Relative Bias (%)

  14. Results (Non-ignorable) – Variance

  15. Results (Non-ignorable) – MSE

  16. Results (Ignorable) – Relative Bias (%)

  17. Results (Ignorable) – Variance

  18. Results (Ignorable) – MSE

  19. Conclusion – 1/2 • According to this simulation, assuming that non-response is non-ignorable, the generalised calibration approach led to a substantial reduction of the non-response bias • At the same time, the lower the correlation between the additional calibration variable and the household total expenditure, the higher the sampling variance, so « bounded » methods (logistic) are preferable • When the bias is substantially reduced, the Mean Square Error (MSE) is reduced as well, unless the loss in sampling precision is too high (correlation near zero)

  20. Conclusion – 2/2 • When we apply the generalised calibration to an ignorable non-response mechanism: • The bias appears to remain stable • The sampling variance increases, as does the Mean Square Error (MSE), hence lower accuracy • Because of this, generalised calibration should be used only if we have strong convictions that non-response is caused by variables of interest of the survey, whose values are observed only for the respondents. In the absence of any information on the non-respondents which could help check this assumption, we have no other choice but to « trust » it
