This article discusses the development and implementation of selective data editing at SCB (Statistics Sweden). It explores the potential gains of this method and the use of a common tool. The results from case studies are presented, including the SUSPICION and POTENTIAL IMPACT measures, local and global scoring, and the process of implementing Selekt.
Selective data editing: Development & implementation • Q2010, Helsinki • Jörgen Svensson, Process Owner, Statistics Sweden (SCB)
Standardization at SCB • Decentralized production • Development of CBMs • Editing costly, 33% of budgets • Data collection departments, 2006 • Standardization – the Lotta project, in 2006
Nine case studies • Purpose of the project: • Try using selective data editing • What is the potential gain using the method? • Would it be possible to develop and use a common tool?
SUSPICION • SUSP(j, k) = Suspicion of variable j for unit k • SUSP(j, k) = 0 if variable value falls within acceptance interval • SUSP(j, k) → 1 as value deviates from acceptance limit • 0 ≤ SUSP(j,k) ≤ 1
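The slide fixes only the properties of SUSP (zero inside the acceptance interval, approaching 1 as the value moves away from the nearest limit, bounded by 0 and 1), not its functional form. A minimal Python sketch that satisfies those properties is given here; the form d / (d + scale), the function name `suspicion` and the `scale` parameter are illustrative assumptions, not the function used in SELEKT.

```python
def suspicion(value, lower, upper, scale):
    """Illustrative suspicion measure in [0, 1).

    Assumed form (not taken from the slides): d / (d + scale), where d is
    the distance from the value to the nearest acceptance limit.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    if lower <= value <= upper:
        return 0.0  # inside the acceptance interval: no suspicion
    # Distance outside the interval, measured from the nearest limit.
    distance = lower - value if value < lower else value - upper
    return distance / (distance + scale)  # tends to 1 as distance grows


if __name__ == "__main__":
    for v in (5, 12, 50, 1000):
        print(v, round(suspicion(v, lower=0, upper=10, scale=2), 4))
```

Under this assumed form, `scale` controls how quickly suspicion rises once a value leaves the interval.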
POTENTIAL IMPACT • POTIMP = Potential impact • POTIMP is the weighted absolute difference between the observed and the predicted value: • POTIMP(j,k,d) = w_k · |y(j,k) − ŷ(j,k)| · 1_k(d) for variable j and unit k in domain d, where y(j,k) is the observed value, ŷ(j,k) is the predicted value, w_k is the sampling weight and 1_k(d) is the domain indicator • SELEKT supports several ways to establish the predicted value: from time series data and from cross-sectional analysis within homogeneous groups of units
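As a rough illustration of the potential impact described on this slide, the Python sketch below multiplies the sampling weight by the absolute difference between observed and predicted value and by a 0/1 domain indicator. The function names are hypothetical, and using the median of a homogeneous group as the cross-sectional prediction is an assumption made only for this example; the slides do not say which predictor SELEKT uses.

```python
from statistics import median


def group_median_prediction(group_values):
    """Assumed cross-sectional predictor: median of a homogeneous group."""
    return median(group_values)


def potential_impact(weight, observed, predicted, in_domain):
    """Weighted absolute difference between observed and predicted value.

    in_domain plays the role of the domain indicator: the unit contributes
    to domain d only when it belongs to that domain.
    """
    indicator = 1.0 if in_domain else 0.0
    return weight * abs(observed - predicted) * indicator


if __name__ == "__main__":
    predicted = group_median_prediction([90.0, 100.0, 110.0])
    print(potential_impact(10.0, 120.0, predicted, in_domain=True))
    print(potential_impact(10.0, 120.0, predicted, in_domain=False))
```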
Flagging suspected errors [Figure: plot of log(Potential impact) against log(Suspicion), with the flagged region marked]
LOCAL SCORE • Local (item) score LScore(j,k,d): • LScore(j,k,d) = SUSP(j,k) * |POTIMP(j,k,d)| * Cello(j,d) • Cello(j,d) is inversely proportional to the standard error based on previous data
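The local score formula on this slide can be sketched in a few lines of Python. The slide states only that Cello(j,d) is inversely proportional to the standard error from previous data, so the proportionality constant (here the hypothetical parameter `constant`, defaulting to 1) is an assumption, as are the function names.

```python
def cello(standard_error, constant=1.0):
    """Cello(j,d): inversely proportional to the previous standard error.

    The proportionality constant is an assumption for this sketch.
    """
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    return constant / standard_error


def local_score(susp, potimp, cello_value):
    """LScore(j,k,d) = SUSP(j,k) * |POTIMP(j,k,d)| * Cello(j,d)."""
    return susp * abs(potimp) * cello_value


if __name__ == "__main__":
    # Suspicion 0.5, potential impact 200, previous standard error 4.
    print(local_score(0.5, 200.0, cello(4.0)))
```

Dividing by the standard error puts the impact on a scale relative to the precision of the domain estimate, so the same absolute deviation scores higher in a domain that is estimated precisely.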
GLOBAL SCORE • Global (unit) score GScore(k) is obtained by aggregation of local scores • LScore(j,k,d) → LScore(j,k) → GScore(k) • Each → denotes summation, Euclidean summation or maximum • Only units with a GScore larger than a predetermined threshold are followed up
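The two aggregation steps and the threshold rule on this slide might look as follows in Python. This is a sketch under stated assumptions: the nested dictionary layout (variable, then domain), the function names and the reading of "Euclidean summation" as the square root of the sum of squares are choices made for the example, not details confirmed by the slides.

```python
import math


def aggregate(scores, method="sum"):
    """Combine scores by summation, Euclidean summation or maximum."""
    scores = list(scores)
    if not scores:
        return 0.0
    if method == "sum":
        return float(sum(scores))
    if method == "euclidean":
        return math.sqrt(sum(s * s for s in scores))
    if method == "max":
        return float(max(scores))
    raise ValueError(f"unknown method: {method}")


def global_score(local_scores, domain_method="sum", variable_method="max"):
    """Aggregate one unit's local scores over domains, then over variables.

    local_scores maps variable j -> {domain d -> local score} for unit k.
    """
    per_variable = [
        aggregate(by_domain.values(), domain_method)
        for by_domain in local_scores.values()
    ]
    return aggregate(per_variable, variable_method)


def units_to_follow_up(global_scores, threshold):
    """Return the units whose global score exceeds the threshold."""
    return [unit for unit, score in global_scores.items() if score > threshold]


if __name__ == "__main__":
    unit = {"turnover": {"A": 3.0, "B": 4.0}, "wages": {"A": 1.0}}
    print(global_score(unit, "euclidean", "max"))
    print(units_to_follow_up({"u1": 5.0, "u2": 1.0}, threshold=2.0))
```

Maximum lets a single large local score trigger follow-up, while summation rewards many moderate scores; Euclidean summation sits between the two.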
Implementation of SELEKT • So far three surveys: • Business activity indicators • Wage & salary structures in the private sector • Commodity flow survey
Documentation • A General Methodology for Selective Data Editing • jorgen.svensson@scb.se • anders.norberg@scb.se