Open GSBPM compliant data processing system in Statistics Estonia (VAIS)
2011 MSIS Conference
Maia Ennok, Head of Data Warehouse Service, Data Processing Systems Department, Statistics Estonia
23 May 2011
Strategy of Statistics Estonia 2008–2011
“From data collector to information service provider”
Objective: High-quality information service
Standardise the process of data processing:
Indicator: Introduction of the unified data processing software
• Development and introduction of the universal data processing information system
Architecture of the information system
[Diagram. Labels: Metadata system; Economic entities; iMETA; KUNDE; Persons; Users; eSTAT; Data collection; Processing; Statistical analysis; Dissemination; VAIS; PX-Web; VVIS; eGeostat; Census-HUB; ADAM; Administrative registers; Data Warehouse; Statistical registers; SRS]
Data processing system (VAIS)
• VAIS is a collection of tools and technologies aimed at automating data processing (Phase 5 in GSBPM).
• In essence, the task of checking, cleaning, and transforming statistical activity data amounts to taking the raw data from one or more sources and transforming it into the source-data input database structures of the analytical system (the observation registry).
Framework for …
• Integrate data
• Classify & code
• Review, validate and edit
• Impute
• Derive new variables & statistical units
• Calculate weights
• Calculate aggregates
• Finalize data files
Metadata-driven, template-based tool
The template-driven approach provides a universal solution for the three main goals of the VAIS project:
• Create an easy-to-use statistical data processing tool that requires minimal programming skills for creating transformation packages.
• Create a metadata-driven, process-oriented and automated statistical data processing tool.
• Create an extendable data transformation tool.
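A minimal sketch of the template-driven idea, assuming a transformation step is stored as a text template and filled in from a metadata record. VAIS itself uses Velocity templates; Python's `string.Template` is only a stand-in here, and the template text, the table names and the `render_step` helper are all hypothetical.

```python
from string import Template

# Hypothetical template for one aggregation step. The placeholders are
# filled from metadata, so no programming is needed to define a new step.
AGGREGATE_TEMPLATE = Template(
    "INSERT INTO $target ($group_by, $measure)\n"
    "SELECT $group_by, $func($measure)\n"
    "FROM $source\n"
    "GROUP BY $group_by"
)


def render_step(template: Template, metadata: dict) -> str:
    """Fill a step template from a metadata record.

    substitute() raises KeyError if the metadata lacks a placeholder,
    so an incomplete step definition fails before anything is executed.
    """
    return template.substitute(metadata)


# Hypothetical metadata record describing one aggregation.
metadata = {
    "source": "obs_enterprise",
    "target": "agg_turnover_by_region",
    "group_by": "region",
    "measure": "turnover",
    "func": "SUM",
}

sql = render_step(AGGREGATE_TEMPLATE, metadata)
print(sql)
```

Changing the metadata record changes the generated statement without touching the template, which is the property the three goals above rely on.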
Design Phase
[Diagram. Data processing package (XDTL) for Statistical Activity N, drawing on Common XDTL Packages and the Common Metadata Repository. Inputs and the steps they feed:]
• Data Sources for Statistical Activity N: INTEGRATE DATA
• Validation Rules for Statistical Activity N: VALIDATE
• Imputation Method for Statistical Activity N: IMPUTE
• Aggregation Def for Statistical Activity N: AGGREGATE
• Target Dataset for Statistical Activity N: LOAD DATA
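The design-phase picture can be read as a package: an ordered list of steps, each configured by metadata for one statistical activity. The sketch below illustrates that reading in plain Python; it is not XDTL, and the step functions, parameter names and sample data are all hypothetical.

```python
from statistics import mean


def integrate(records, params, env):
    # Concatenate the records of every source listed in the metadata.
    return [dict(r) for name in params["sources"] for r in env["sources"][name]]


def validate(records, params, env):
    # Keep only records whose required field is present.
    return [r for r in records if r.get(params["required"]) is not None]


def impute(records, params, env):
    # Replace missing values of one variable with the mean of observed values.
    var = params["variable"]
    fill = mean(r[var] for r in records if r[var] is not None)
    return [{**r, var: fill if r[var] is None else r[var]} for r in records]


def aggregate(records, params, env):
    # Sum the measure within each group.
    totals = {}
    for r in records:
        key = r[params["group_by"]]
        totals[key] = totals.get(key, 0) + r[params["measure"]]
    return [{params["group_by"]: k, params["measure"]: v} for k, v in totals.items()]


def load(records, params, env):
    # Store the result under the target dataset name.
    env["targets"][params["target"]] = records
    return records


STEPS = {"INTEGRATE DATA": integrate, "VALIDATE": validate, "IMPUTE": impute,
         "AGGREGATE": aggregate, "LOAD DATA": load}


def run_package(package, env):
    """Run the steps in order, passing each step's output to the next."""
    records = []
    for step_name, params in package:
        records = STEPS[step_name](records, params, env)
    return records


# Hypothetical package definition for one statistical activity.
package = [
    ("INTEGRATE DATA", {"sources": ["survey", "register"]}),
    ("VALIDATE", {"required": "region"}),
    ("IMPUTE", {"variable": "turnover"}),
    ("AGGREGATE", {"group_by": "region", "measure": "turnover"}),
    ("LOAD DATA", {"target": "turnover_by_region"}),
]
env = {
    "sources": {
        "survey": [{"region": "North", "turnover": 10.0},
                   {"region": "South", "turnover": None}],
        "register": [{"region": "North", "turnover": 30.0},
                     {"region": None, "turnover": 5.0}],
    },
    "targets": {},
}
result = run_package(package, env)
print(result)
```

Because the package is data rather than code, the same step implementations can be reused across statistical activities, with only the parameter records differing.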
Data processing with VAIS
• Automating and speeding up data transformation
• Raw data, transformation metadata and source data audit trails
• Metadata-driven, template-based tool
• Balancing automation and manual intervention
Balancing automation and manual intervention
[Flow diagram. Labels: RAW data; Automated data processing; OK? (decision); Manual data processing; Data Warehouse; Metadata (validation and transformation rules)]
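One way to picture the "OK?" decision is a router that applies validation rules held as metadata and sends failing records to a manual task queue. This is only an illustrative sketch: the rule format, the field names and the `route` function are assumptions, not the VAIS rule model.

```python
import operator

# Comparison operators a rule may name (hypothetical rule vocabulary).
OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne}

# Hypothetical validation rules stored as metadata, not as code.
RULES = [
    {"id": "R1", "field": "employees", "op": ">=", "value": 0},
    {"id": "R2", "field": "turnover", "op": ">=", "value": 0},
]


def failed_rules(record, rules):
    """Return the ids of all rules the record violates."""
    return [
        rule["id"]
        for rule in rules
        if not OPS[rule["op"]](record[rule["field"]], rule["value"])
    ]


def route(records, rules):
    """Split records into automatically accepted ones and manual tasks."""
    accepted, tasks = [], []
    for record in records:
        failures = failed_rules(record, rules)
        if failures:
            # Needs a human decision: keep the record and the reasons.
            tasks.append({"record": record, "failed": failures})
        else:
            accepted.append(record)
    return accepted, tasks


raw = [
    {"unit": "A", "employees": 12, "turnover": 340.0},
    {"unit": "B", "employees": -3, "turnover": 120.0},
    {"unit": "C", "employees": 5, "turnover": -1.0},
]
accepted, tasks = route(raw, RULES)
print(len(accepted), "accepted,", len(tasks), "sent to manual processing")
```

Keeping the failed rule ids with each task gives the person resolving it the reason the record was stopped.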
VAIS applications and roles
2.04.2014
URMA
User rights management application
• Allows existing user accounts to be used for authorization
• Allows roles to be created and users to be linked with roles
• Allows rights to be set according to the domain of the statistical work
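The role model described for URMA can be illustrated with a small in-memory registry. This is a sketch under assumptions: the class, its method names and the (statistical work, action) form of a right are hypothetical and say nothing about how URMA is actually implemented.

```python
class RightsRegistry:
    """Toy registry: users are linked to roles, roles carry rights."""

    def __init__(self):
        self.role_rights = {}  # role -> set of (statistical_work, action)
        self.user_roles = {}   # user -> set of role names

    def create_role(self, role):
        self.role_rights.setdefault(role, set())

    def grant(self, role, work, action):
        # Raises KeyError if the role was never created.
        self.role_rights[role].add((work, action))

    def link(self, user, role):
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, work, action):
        """True if any of the user's roles carries the requested right."""
        return any(
            (work, action) in self.role_rights.get(role, set())
            for role in self.user_roles.get(user, set())
        )


registry = RightsRegistry()
registry.create_role("census_operator")
registry.grant("census_operator", "census_2011", "resolve_task")
registry.link("existing_user", "census_operator")

print(registry.is_allowed("existing_user", "census_2011", "resolve_task"))
print(registry.is_allowed("existing_user", "census_2011", "design_package"))
```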
VAIS Designer
Application for designing data processing
• User interfaces for designing each processing procedure
• Procedures are grouped into packages
• Package setup follows the policy of ETL
• Packages are designed for each version of a statistical work
VAIS Operator
• Allows users to intervene manually in data processing.
• Allows users to resolve tasks created by data validation.
• The data processing report gives an overview of the data in process.
• Gives users the information they need to decide how to resolve tasks.
Technical platform
VAIS is built on open-source, freely available technological components.
• XDTL (eXtensible Data Transformation Language, an XML-based descriptive language designed for specifying data transformations, see http://xdtl.org) run-time engine (XDTL RT).
• MMX Metadata Repository, part of Metadata Framework (a MOF-compliant metadata management environment designed with a wide variety of metadata-driven applications in mind, see http://mmframework.org).
• Apache Foundation's Velocity template engine (http://velocity.apache.org) is used as the template engine, combining excellent template rendering functionality with a very easy-to-use template language.
• The user applications are programmed in Java, based on the Wicket MVC framework (http://wicket.apache.org).
• The Quartz scheduling framework (http://www.quartz-scheduler.org) is used for execution scheduling.
Implementation
• VAIS development: 05.2010–10.2011
• Data processing of the Population and Housing Census 2011 (31.12.2011)
• Reuse of administrative data (2012)
• Data collection system for administrative data (ADAM) and eSTAT development for prefilling questionnaires in eSTAT with administrative data (annual bookkeeping report) (31.08.2011). VAIS is used for converting administrative data into the statistical data format (for the year 2012, i.e. for data collection for the reference year 2011).
• Data processing of other statistical activities (first pilots 2013)
• Data processing of the next register-based Population and Housing Census (pilot 2014)
Questions? Thank you!