A selective editing method considering both suspicion and potential impact, developed and applied to the Swedish foreign trade statistics. Topic (ii), WP 12. Anders Jäder and Anders Norberg, Statistics Sweden
The data. Main variables collected monthly: • Commodity code (8-digit CN codes) • Country of dispatch/arrival • Quantity (weight and supplementary unit) • Invoiced Value. 350 000 observations per month.
Score function. Computed as a weighted geometric mean of the measures of Suspicion and Potential impact.
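A weighted geometric mean of two measures can be sketched as below. The slides do not give the weights, so the exponent `alpha` and the function name `score` are assumptions made for illustration only.

```python
def score(suspicion, impact, alpha=0.5):
    """Weighted geometric mean of two non-negative measures.

    alpha is a hypothetical weight on Suspicion; (1 - alpha) goes to
    Potential impact. The value used in production is not stated.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if suspicion < 0 or impact < 0:
        raise ValueError("both measures must be non-negative")
    return (suspicion ** alpha) * (impact ** (1.0 - alpha))


if __name__ == "__main__":
    # An observation needs both some suspicion and some impact to score high.
    print(score(4.0, 9.0))      # both measures contribute
    print(score(0.0, 1000.0))   # no suspicion gives a zero score
```

With a geometric mean, an observation with zero Suspicion gets a zero score however large its Potential impact is, which is one reason to prefer it over an arithmetic mean here.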
Selective editing. The 1,500 observations with the highest scores are flagged.
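Flagging a fixed number of top-scoring observations can be sketched as follows. The dictionary layout and the name `flag_highest` are hypothetical; only the cut-off of 1,500 comes from the slide.

```python
import heapq


def flag_highest(scores, n=1500):
    """Return the ids of the n observations with the highest scores.

    scores: dict mapping an observation id to its score (assumed layout).
    The result is ordered from highest to lowest score.
    """
    return heapq.nlargest(n, scores, key=scores.get)


if __name__ == "__main__":
    example = {"a": 6.0, "b": 0.0, "c": 1.0, "d": 3.5}
    print(flag_highest(example, n=2))
```

`heapq.nlargest` avoids sorting all 350 000 monthly observations when only a small top slice is needed.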
Suspicion. The difference between the Unit price and the lower or upper quartile, divided by the interquartile distance. Logarithmic scale (Euro/kg).
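One possible reading of this definition is sketched below. It assumes that the quartiles belong to a reference group of comparable observations, that distances are measured on log unit prices, and that Suspicion is zero inside the interquartile range; none of these details is spelled out on the slide.

```python
import math


def suspicion(unit_price, q1, q3):
    """Distance of a unit price from the nearest quartile, in units of
    the interquartile distance, on a logarithmic scale.

    q1, q3: lower and upper quartile of the unit price (Euro/kg) in the
    reference group. Returning 0 inside [q1, q3] is an assumption.
    """
    if unit_price <= 0 or q1 <= 0 or q3 <= q1:
        raise ValueError("need unit_price > 0 and 0 < q1 < q3")
    x, low, high = math.log(unit_price), math.log(q1), math.log(q3)
    iqd = high - low  # interquartile distance on the log scale
    if x < low:
        return (low - x) / iqd
    if x > high:
        return (x - high) / iqd
    return 0.0


if __name__ == "__main__":
    # Quartiles 1 and 2 Euro/kg; a price of 8 Euro/kg lies two
    # interquartile distances above the upper quartile on the log scale.
    print(suspicion(8.0, 1.0, 2.0))
```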
Potential impact. The difference between the Invoiced Value and the product of the median Unit price and the Quantity (Euro).
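As a minimal sketch, the measure compares the reported value with the value expected at the median unit price. Taking the absolute difference is an assumption (the slide only says "difference"), and the function name is hypothetical.

```python
def potential_impact(invoiced_value, quantity, median_unit_price):
    """Absolute gap (Euro) between the reported Invoiced Value and the
    value expected from the median Unit price of the reference group."""
    expected_value = median_unit_price * quantity
    return abs(invoiced_value - expected_value)


if __name__ == "__main__":
    # 100 kg at a median of 12 Euro/kg would be expected to cost 1,200 Euro;
    # a reported 12,000 Euro deviates from that by 10,800 Euro.
    print(potential_impact(12000.0, 100.0, 12.0))
```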
Hit rate=46% Impact=65%
Hit rate=30% Impact=80%
Potential impact. The 8-digit commodity codes can be aggregated to 6-, 4- and 2-digit commodity codes (CN6, CN4, CN2) and to other classifications, e.g. the SITC classification. Over 10,000 estimates are to be computed.
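Because the CN levels are nested, aggregation from CN8 to CN6, CN4 and CN2 can be sketched as truncating the code to its leading digits. The codes and values below are made up for illustration; a mapping to SITC would need a correspondence table rather than truncation and is not shown.

```python
from collections import defaultdict


def aggregate_by_prefix(values_by_cn8, digits):
    """Sum values over the first `digits` digits of each 8-digit CN code."""
    if digits not in (2, 4, 6, 8):
        raise ValueError("digits must be 2, 4, 6 or 8")
    totals = defaultdict(float)
    for code, value in values_by_cn8.items():
        if len(code) != 8 or not code.isdigit():
            raise ValueError(f"not an 8-digit code: {code!r}")
        totals[code[:digits]] += value
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical invoiced values (Euro) by CN8 code.
    values = {
        "01012100": 10.0,
        "01012910": 5.0,
        "01022110": 7.0,
        "02011000": 3.0,
    }
    for level in (6, 4, 2):
        print(level, aggregate_by_prefix(values, level))
```

A single erroneous observation therefore affects one estimate at every level, which is why the number of affected estimates grows so quickly.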
Potential impact. We have developed a formula that expresses the impact of an error on the statistics, across all aggregation levels and sizes of estimates, in one single variable.
Potential impact Excel demonstration
Strategy • SCB has saved raw and corrected data for all months since 2000, and we analyzed them • New system with parameters • Produce monthly process data for a continuous search for the best parameter values. Will we be misled when we analyze data that has been flagged by the old method?
Study • We need many months of historical data – current data is not enough • Homogeneous groups – modest demand on number of observations • Computation of median and quartiles weighted by Quantity • Suspicion versus probability of error – transformation of Suspicion
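The quantity-weighted median and quartiles mentioned in the list above can be sketched as follows. The slides do not say which weighted-quantile definition is used; this version returns the smallest unit price at which the cumulative Quantity reaches the requested share of the total, and the function name is hypothetical.

```python
def weighted_quantile(values, weights, p):
    """Smallest value whose cumulative weight reaches p * total weight.

    values:  unit prices in one homogeneous group
    weights: the corresponding quantities (non-negative)
    p:       quantile level in [0, 1], e.g. 0.25, 0.5, 0.75
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal in length")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = p * total
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= threshold:
            return value
    return max(values)  # only reached through floating-point rounding


if __name__ == "__main__":
    unit_prices = [1.0, 2.0, 3.0, 10.0]
    quantities = [40.0, 30.0, 20.0, 10.0]
    for level in (0.25, 0.5, 0.75):
        print(level, weighted_quantile(unit_prices, quantities, level))
```

Weighting by Quantity means that a few large consignments at a typical price outweigh many small consignments at unusual prices when the reference quartiles are set.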
Suspicion versus probability of error [figure; axis label: Suspicion]
Experiences from production Hit rate by variable:
Experiences from production Impact by variable:
Experiences from production - Impact on variable invoiced value: