This study tests the use of administrative data editing in the 2009 Agriculture Census in Spain. It utilizes a selective editing procedure to prioritize suspect units for manual review, aiming to reduce errors efficiently. By comparing data from previous censuses and surveys, thresholds are set based on pseudo-bias and re-contact rates to balance accuracy and efficiency in editing. The findings suggest that this approach can effectively identify units with discrepancies and guide follow-up actions.
Testing the use of administrative data to edit the 2009 Agriculture Census
Dolores Lorca
National Statistical Institute of Spain
Summary • A selective editing procedure is applied to test the use of administrative data to edit the Agriculture Census • Case study: using data from the previous Agriculture Census and the 2005 Farm Structure Survey (FSS)
1. Introduction • The Spanish NSI carries out an Agriculture Census every 10 years • Many interviewers collect a large number of questionnaires in a short time, covering different kinds of holdings, so the data can contain a considerable number of errors to correct during editing
1. Introduction Different editing approaches are applied to the complex set of collected data: • In the data collection phase, simple checks are applied using built-in edits in a CAPI system • Selective editing is applied to determine the units that will be manually reviewed • Automatic editing is applied to the rest of the units using the DIA system • Macroediting
2. Selective editing procedure • Administrative data are used to edit the census data • The selective editing procedure determines and prioritises the suspect units to be manually reviewed
2. Selective editing procedure Using simple expansion estimators: $\hat{X} = \sum_{i=1}^{n} w_i X_i$, where $w_i$ is the sample weight for unit $i$, $n$ is the sample size, and $X_i$ denotes the value of variable $X$ for unit $i$
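The simple expansion estimator can be sketched in a few lines. This is a minimal illustration, not the NSI's production code; the function name `expansion_total` and the three sample holdings are hypothetical.

```python
from typing import Sequence


def expansion_total(values: Sequence[float], weights: Sequence[float]) -> float:
    """Simple expansion estimate of a total: sum of weight * value over units."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values))


# Hypothetical sample of three holdings (areas in hectares, invented weights).
areas = [2.0, 5.5, 1.0]
weights = [10.0, 4.0, 20.0]
total = expansion_total(areas, weights)  # 10*2.0 + 4*5.5 + 20*1.0

# For census data every weight is 1, so the estimate is the plain sum.
census_total = expansion_total(areas, [1.0] * len(areas))
```

With census data the weights drop out, which is why the later scores can be computed directly from the reported and administrative values.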
2. Selective editing procedure Local score function, where $X_i$ is the reported value, $\tilde{X}_i$ is the administrative value, and $w_i = 1$ (census data)
2. Selective editing procedure Scaled local score, where $\hat{\tilde{X}}$ is the total estimate of variable $X$ calculated using administrative data
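A minimal sketch of the scaled local score, assuming it takes the form commonly used in selective editing: the weighted absolute difference between the reported and administrative values, divided by the administrative total. That form is an assumption here, and the function name `scaled_local_scores` and the sample values are hypothetical.

```python
from typing import List, Optional, Sequence


def scaled_local_scores(
    reported: Sequence[float],
    admin: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Assumed form: w_i * |reported_i - admin_i| / (administrative total)."""
    if len(reported) != len(admin):
        raise ValueError("reported and admin must have the same length")
    if weights is None:
        # Census data: every unit has weight 1.
        weights = [1.0] * len(reported)
    admin_total = sum(w * a for w, a in zip(weights, admin))
    if admin_total == 0:
        raise ValueError("administrative total must be non-zero")
    return [
        w * abs(x - a) / admin_total
        for x, a, w in zip(reported, admin, weights)
    ]


# Hypothetical olive grove areas (hectares) for three holdings.
scores = scaled_local_scores([10.0, 4.0, 6.0], [8.0, 4.0, 8.0])
```

Dividing by the total makes scores comparable across variables measured on different scales, which matters once several variables feed a single global score.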
2. Selective editing procedure Global score: $GS_i = \max(LS_i)$
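As a sketch, the global score can be read as the largest scaled local score a unit obtains across the edited variables. Taking the maximum over variables is an assumption about what the slide intends; the function name `global_scores` and the variable names in the example are hypothetical.

```python
from typing import Dict, List, Sequence


def global_scores(local_scores: Dict[str, Sequence[float]]) -> List[float]:
    """Per-unit maximum of the scaled local scores across variables."""
    columns = list(local_scores.values())
    if not columns:
        raise ValueError("at least one variable is required")
    n_units = len(columns[0])
    if any(len(col) != n_units for col in columns):
        raise ValueError("every variable needs one score per unit")
    # zip(*columns) groups the scores of each unit across all variables.
    return [max(unit_scores) for unit_scores in zip(*columns)]


# Hypothetical scaled local scores for three units and two variables.
gs = global_scores({
    "olive_area": [0.10, 0.00, 0.30],
    "vineyard_area": [0.05, 0.20, 0.10],
})
```

When only one variable is edited, the global score of a unit is simply its scaled local score for that variable.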
2. Selective editing procedure Selection of the thresholds: the simulation study approach of Lawrence and McKenzie (2000) • Absolute pseudo-bias $PB_p$ of Latouche and Berthelot (1992), where $\hat{X}_p$ is the total estimate obtained by replacing every reported value whose global score exceeds the pre-determined threshold (the percentile p) with its administrative value and leaving the other reported values in place
2. Selective editing procedure • Re-contact rate: the number of units with a global score larger than the pre-determined threshold (percentile p) divided by the total number of units • The thresholds are chosen to balance a low pseudo-bias against a low re-contact rate
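The threshold simulation can be sketched as follows for a single variable with census weights of 1. Several points are assumptions: the pseudo-bias is taken as the absolute difference between the partially edited total and the administrative total, relative to the administrative total; the percentile uses the nearest-rank rule; and the function names and the ten holdings are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile of the values (p between 1 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_threshold(
    reported: Sequence[float], admin: Sequence[float], p: int
) -> Tuple[float, float]:
    """Return (pseudo-bias, re-contact rate) for the percentile-p threshold."""
    admin_total = sum(admin)
    # Single variable, so the global score equals the scaled local score.
    scores: List[float] = [abs(x - a) / admin_total for x, a in zip(reported, admin)]
    threshold = percentile(scores, p)
    flagged = [s > threshold for s in scores]
    # Flagged units take the administrative value, the rest keep the reported one.
    edited_total = sum(a if f else x for x, a, f in zip(reported, admin, flagged))
    pseudo_bias = abs(edited_total - admin_total) / admin_total
    recontact_rate = sum(flagged) / len(scores)
    return pseudo_bias, recontact_rate


# Hypothetical data: ten holdings, two of them with discrepancies.
reported = [10, 4, 6, 7, 3, 9, 2, 2, 2, 30]
admin = [10, 4, 5, 7, 3, 9, 2, 2, 2, 6]
for p in (95, 90, 85, 80, 75, 70):
    pb, rate = evaluate_threshold(reported, admin, p)
    print(f"p={p}: pseudo-bias={pb:.3f}, re-contact rate={rate:.2f}")
```

Running the loop for each candidate percentile gives the pairs of pseudo-bias and re-contact rate that are weighed against each other when choosing the threshold.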
3. Case study • Selected variable: planted area of olive groves • 1999 Agriculture Census and administrative register: 253,038 units • 2005 FSS and administrative register: 5,804 units • Selection of thresholds by geographical area: • $PB_p$ for p = 95, 90, 85, 80, 75, 70 • Number of re-contacts for each p
3. Case study • The reduction in pseudo-bias obtained with a re-contact rate of 5% is much greater than that obtained at the other rates • For most regions, at least, we would use the 95th percentile (p = 95) of the global score distribution as the threshold
4. Conclusions The pilot study shows that this selective editing approach could help us prioritise follow-up actions for units whose reported values differ significantly from the administrative data