On Implementing CSPA Specifications for Editing and Imputation Services Donato Summa, Monica Scannapieco, Diego Zardetto, Istat, Italy Istituto Nazionale di Statistica – ISTAT
The CSPA concept
• National Statistical Institutes (NSIs) produce official statistics with very similar goals
• Common activities are nevertheless carried out independently, almost without relying on shared solutions
• Statistical organizations have tried many times to share their processes, methodologies and software solutions, at the cost of significant integration work
Donato Summa, Work Session on Statistical Data Editing, Paris, 28-30/04/2014
The CSPA concept
• As part of the modernization effort in official statistics, the High Level Group for the Modernization of Statistical Production and Services (HLG) has taken action to address these issues
• In particular, it promotes the development and implementation of the Common Statistical Production Architecture (CSPA)
The CSPA concept
CSPA provides a template architecture for official statistics, describing:
• What the official statistical industry wants to achieve
• How the industry can achieve this, i.e. the principles that guide how statistics are produced
• What the industry will have to do to comply with the CSPA
The CSPA concept
[Figure: diagram with the labels Tools (editrules, SCS, CANCEIS), Services (CSPA compliant) and Platforms]
The ErrorLocalization service
• In the proof-of-concept (POC) initiative of the 2013 CSPA project, Istat took responsibility for developing the CSPA Error Localization service, acting as designer, builder and assembler
• It was decided to wrap the “localizeErrors” function of the “editrules” R package developed at Statistics Netherlands
The ErrorLocalization service
• The data used for the test cases come from Istat’s Structure of Earnings Survey
• The input unit data sets involve 20 variables
• The rule set consists of 44 edits involving 17 of the numeric variables in the unit data sets
• 3 different test cases share the same rule set:
  • Data set 1: 1000 erroneous records
  • Data set 2: 2000 exact records
  • Data set 3: 3000 mixed records
The ErrorLocalization service
• Technically, the service was implemented as a standalone Java application (an executable jar file) that wraps the “localizeErrors” function of the “editrules” R package
• The jar can be called from a GUI or from the command line and is responsible for:
  • Taking the input parameters from the user (or from a calling application)
  • Invoking the R script in the R environment with the provided input parameters
  • Returning the output parameters (generation of the output file)
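The three responsibilities listed for the wrapper can be illustrated with a minimal sketch. It is written in Python rather than Java and is not the Istat implementation: the function names, the argument order and the assumption that the R script writes its result to the output path it is given are all hypothetical. Only the `Rscript` launcher and its `--vanilla` option come from R itself.

```python
import subprocess
from pathlib import Path


def build_rscript_command(script, data_file, rules_file, output_file, rscript="Rscript"):
    """Assemble the command line that hands the input parameters to R."""
    # Hypothetical argument order; the real service defines its own interface.
    return [rscript, "--vanilla", str(script), str(data_file), str(rules_file), str(output_file)]


def run_error_localization(script, data_file, rules_file, output_file, rscript="Rscript"):
    """Check the inputs, run the R script, and return the path of the output file."""
    # 1. Take the input parameters and fail early if an input file is missing.
    for path in (script, data_file, rules_file):
        if not Path(path).is_file():
            raise FileNotFoundError(f"input file not found: {path}")

    # 2. Invoke the R script in the R environment with those parameters.
    command = build_rscript_command(script, data_file, rules_file, output_file, rscript)
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        raise RuntimeError(f"R script failed: {completed.stderr.strip()}")

    # 3. Return the output: the script is assumed to have written this file.
    result = Path(output_file)
    if not result.is_file():
        raise RuntimeError(f"expected output file was not created: {output_file}")
    return result
```

A call such as `run_error_localization("localize.R", "units.csv", "edits.txt", "errors.csv")` (hypothetical file names) would then return the path of the generated output file. Keeping the command construction in its own function lets the same logic be reused from a GUI or from the command line.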
The ErrorLocalization service
• The Error Localization service wrapped by the Java program was then deployed on CORE, thus proving the full compatibility of CSPA services with a specific NSI’s internal platform
• CORE (COmmon Reference Environment) is Istat’s internal platform for executing statistical processes
[Figure: CSPA diagram with the labels Platform, Service and Tool]
Conclusion
• Istat is currently involved in the 2014 CSPA Implementation project, with the role of developing the Error Correction service
• The following activities are ongoing:
  • studying how to extend this service so that it performs a full editing and imputation process
  • designing a CSPA specification, to be shared and agreed among the participants in the CSPA Implementation project
  • implementing the specification as concrete CSPA services that wrap existing tools
Thank you for your attention!