Expression language (EL) in Eurostat
SDMX - TWG, Luxembourg, 5 Jun 2013
Adam Wroński
Policy context
The initiative is based on the Communication from the Commission to the European Parliament and the Council on the production method of EU statistics (Document ESSC 2010/05/6). It calls for:
• more harmonisation and standardisation of statistical methodologies for data validation within the ESS
• harmonising the IT infrastructure and sharing IT tools as a way to facilitate the use of agreed statistical methods, leading to better quality and higher productivity in the processing of statistical data
1.1 EDIT – standard data editing tool
• Available in a stand-alone version, a server version and a freely accessible Web version (protected by 1-way SSL and ECAS)
• The Web version lets any registered user edit statistical data without installing any software
• The tool relies on an EL capable of expressing complex rules
1.2 EDIT internal expression language
• Custom EL designed specifically for data editing
• Aims to be as simple as possible yet flexible enough to fit the requirements of any known / analyzed domain
• The programs describe the rules and are composed of a set of steps with inputs and outputs
• Programs are difficult for non-programmers to write
1.3 EDIT use and future
• The generic tool now offers more features than in the past. Some MSs are preparing, testing or using EDIT for their data
• In some data collections (incl. CVTS, SBS, BOP, FDI and ITS) the stakeholders, anticipating the opportunity, have started organising a common data checking procedure and are beginning to use an agreed set of rules
• ESS.VIP on Validation will give direction for the future
2.1 ITDG 2011 conclusions
• A formal, unambiguous EL was needed for encoding the rules so that they can be translated into the rule syntaxes of specific data editing systems
• Use of the generic software tools provided by Eurostat is optional
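The idea of encoding a rule once and translating it into system-specific syntaxes can be sketched as follows. This is only an illustration, not the Eurostat EL: the node classes, the two target syntaxes (an SQL-like condition and a Python expression) and the variable names `total`, `male` and `female` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """Reference to a variable (column) by name."""
    name: str


@dataclass(frozen=True)
class Const:
    """Numeric literal."""
    value: Union[int, float]


@dataclass(frozen=True)
class BinOp:
    """Binary operation, e.g. "+", "-", ">=", "==", "and", "or"."""
    op: str
    left: "Node"
    right: "Node"


Node = Union[Var, Const, BinOp]

# Operators whose spelling differs in the SQL-like target (hypothetical mapping).
SQL_OPS = {"==": "=", "and": "AND", "or": "OR"}


def to_sql(node: Node) -> str:
    """Render the neutral rule as an SQL-like condition."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Const):
        return str(node.value)
    op = SQL_OPS.get(node.op, node.op)
    return f"({to_sql(node.left)} {op} {to_sql(node.right)})"


def to_python(node: Node) -> str:
    """Render the same rule as a Python expression over a dict called `row`."""
    if isinstance(node, Var):
        return f"row[{node.name!r}]"
    if isinstance(node, Const):
        return repr(node.value)
    return f"({to_python(node.left)} {node.op} {to_python(node.right)})"


# One rule, encoded once: total must equal male + female.
rule = BinOp("==", Var("total"), BinOp("+", Var("male"), Var("female")))

print(to_sql(rule))     # (total = (male + female))
print(to_python(rule))  # (row['total'] == (row['male'] + row['female']))
```

Because the rule is stored as a neutral tree, adding another target system means adding one more rendering function; the rule itself does not change.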
3.1 ESS.VIP on Validation
• Ongoing project
• Main task of the project: organize and optimize data editing among MSs and Eurostat for ESS data collections
• Main deliverable: a set of standards incl. a common ESS EL
• Statistician-friendly user interface capable of generating rule sets in EL and standardised documentation of the rules that business users can understand
3.2 Presentation to ESSC – May 2012
• The ESSC generally supported the approach and the first steps of the planned project on broad principles and guidelines for a review of the data checking policy in the ESS
• Eurostat would present a more detailed proposal on the project parameters to the ESSC meeting in 2013
• The project would focus on a common language and optional software
3.3 EL in Eurostat and the ESS
• A formal, unambiguous EL was needed to allow rules to be encoded so that they can be translated into other ELs
• For a more efficient production chain with responsibilities clearly assigned to the different actors
• Friendly to statisticians: if possible, the rules are to be expressed in a human-understandable way
• To be able to treat both micro data and aggregate data
• To allow exchange of rules among ESS members
3.4 EL under development
Information model:
• The simpler the information model, the more flexible the language
• Data model = bi-dimensional data sets consisting of rows and columns
• To allow working with all types of incoming files
Operators/functions/calculations:
• Oriented towards statistical needs
• Act on data model objects = data sets
• Allow expression of logical operators and computations
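A minimal sketch of what "bi-dimensional data sets" with operators acting on whole data sets could look like. The `DataSet` class, its `filter` and `derive` operators, the column names and the figures are all hypothetical; the slides do not define the actual EL operators.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class DataSet:
    """A bi-dimensional data set: named columns and a sequence of rows."""
    columns: tuple[str, ...]
    rows: tuple[Row, ...]

    def filter(self, predicate: Callable[[Row], bool]) -> "DataSet":
        """Logical operator: keep the rows for which the predicate holds."""
        return DataSet(self.columns, tuple(r for r in self.rows if predicate(r)))

    def derive(self, name: str, compute: Callable[[Row], Any]) -> "DataSet":
        """Computation: add a column calculated from each row."""
        new_rows = tuple({**r, name: compute(r)} for r in self.rows)
        return DataSet(self.columns + (name,), new_rows)


# Illustrative figures only, not real statistics.
data = DataSet(
    columns=("country", "turnover", "costs"),
    rows=(
        {"country": "LU", "turnover": 120, "costs": 80},
        {"country": "PL", "turnover": 50, "costs": 70},
    ),
)

# Every operator takes a data set and returns a new data set, so they chain.
with_margin = data.derive("margin", lambda r: r["turnover"] - r["costs"])
negative = with_margin.filter(lambda r: r["margin"] < 0 and r["turnover"] > 0)

print([r["country"] for r in negative.rows])  # ['PL']
```

Keeping the model to rows and columns means the same two operators apply whether the rows are micro data records or aggregates.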
3.5 Beyond the EL
• Formats description (e.g., variables)
• Process description
• Data set quality measure
• Rule parameters: EL expressions used in "if", "then", "else", weight, text message; severity, code, name, message, descriptions
• Rule sets and execution, e.g. conditional
• Rules storage (registry)
• Calculations can use the same language
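The rule parameters listed above can be pictured as a small record wrapped around the EL expressions. The sketch below is an assumption about how such a rule might be held and executed, not the project's design: the field names follow the slide, while the `Rule` class, `run_rule_set`, the rule code `R001` and the sample rows are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Row = dict[str, Any]
Expr = Callable[[Row], bool]  # stands in for a compiled EL expression


@dataclass
class Rule:
    code: str
    name: str
    severity: str                      # e.g. "error" or "warning" (assumed values)
    message: str
    condition: Expr                    # the "if" expression
    then: Expr                         # the "then" expression
    otherwise: Optional[Expr] = None   # the "else" expression, if any
    weight: float = 1.0

    def passes(self, row: Row) -> bool:
        """Evaluate "then" when the condition holds, "else" otherwise."""
        if self.condition(row):
            return self.then(row)
        if self.otherwise is not None:
            return self.otherwise(row)
        return True  # the rule does not apply to this row


def run_rule_set(rules: list[Rule], rows: list[Row]) -> list[tuple]:
    """Apply every rule to every row and collect the failures."""
    findings = []
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.passes(row):
                findings.append((index, rule.code, rule.severity, rule.message))
    return findings


rules = [
    Rule(
        code="R001",
        name="wages vs employees",
        severity="error",
        message="Wages must be zero without employees and positive otherwise",
        condition=lambda r: r["employees"] == 0,
        then=lambda r: r["wages"] == 0,
        otherwise=lambda r: r["wages"] > 0,
    )
]

rows = [
    {"employees": 0, "wages": 0},
    {"employees": 5, "wages": 0},
    {"employees": 0, "wages": 10},
]

for finding in run_rule_set(rules, rows):
    print(finding)
```

Here the second and third rows are reported, each with the rule's code, severity and message, which is the kind of metadata a registry of rules would need to store alongside the expressions.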
Summary: Eurostat case for EL
• Savings
• Common understanding
• Easier exchange
• Standardisation
EL: Status
• Draft, work ongoing: in-depth analysis of the deliverable
• A lot has been done but there is still distance to cover
• Mapping to SDMX and, because of the simple underlying model, likely to other standards
• Related to EXL (starting point functionality) but augmented with ESS requirements
• Issue: we aim to have a first implementable version by Fall 2013