This document gives an overview of the roles and deliverables of the Swiss Federal Statistical Office within the census project: requirements definition, software acceptance, system integration, development of data processing applications, and development of the data analysis system. It also covers the elements and technical architecture of data processing and gives a brief overview of data analysis.
Census Technology: Processing architecture and data analysis
Nicki Thomas Spöcker
Swiss Federal Statistical Office
Espace de l'Europe 10, 2010 Neuchâtel
Luzern, 01.02.2001
Table of Contents
• TOC
• SFSO: roles and deliverables
• Service center: overview
• Service center: data processing
• Data processing: elements
• Data processing: technical architecture
• Data analysis: brief overview
• Data analysis: elements
• Next presentation
Swiss Federal Statistical Office: roles and deliverables within the Census project
• requirements definition
• software acceptance
• system integration
• development of various data processing applications (in cooperation with interact)
• development of the data analysis system
• user support/consulting, progress and quality control
Service center: overview
[Diagram; the layout and connections are not recoverable. Components shown:]
• Mail Management System
• Data Capturing System: scanning, recognition, controlling and correction, manual keying, transfer into the database
• Data Processing System: plausibilities, coding, definitive household formation, linking persons to households
• e-census System
• Call Center System: hotline and resend, checkbacks, key from call
• Database
• Population; other statistical data
Service center: data processing
[Diagram components: Data Processing System (plausibilities, coding, definitive household formation, linking persons to households), database, other statistical data]
• Server, automatic procedures: batch processing performs plausibility tests, prepares and loads data for the interfacing subsystems, does automatic coding, and prepares unfilled orders for the manual (GUI) applications.
• Client, manual applications: use the form image to correct recognition errors, link census data (e.g. household formation), do manual coding, and more.
• Server logic is programmed in Oracle (stored procedures).
• Client applications are programmed in C++.
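The split between automatic coding on the server and unfilled orders for the manual applications can be sketched as follows. This is a minimal illustration in Python, not the Oracle stored procedures the slides refer to; the code list, the field names (`person_id`, `profession_text`, `profession_code`) and the codes themselves are made up for the example.

```python
# Hypothetical code list: the real classification is not given in the slides.
CODE_LIST = {"baker": "P01", "teacher": "P02", "nurse": "P03"}


def auto_code(records, code_list):
    """Code records by exact text lookup; collect the rest as open orders.

    Returns (coded_records, manual_queue), where manual_queue holds the
    person ids that a clerk would have to code in the GUI application.
    """
    coded, manual_queue = [], []
    for rec in records:
        key = rec["profession_text"].strip().lower()
        code = code_list.get(key)
        if code is None:
            # No automatic match: leave the order unfilled for manual coding.
            manual_queue.append(rec["person_id"])
        else:
            coded.append({**rec, "profession_code": code})
    return coded, manual_queue


records = [
    {"person_id": 1, "profession_text": "Baker"},
    {"person_id": 2, "profession_text": "bakr"},      # recognition error
    {"person_id": 3, "profession_text": " teacher "},
]
coded, manual_queue = auto_code(records, CODE_LIST)
```

Here records 1 and 3 are coded automatically, while record 2 ends up in the manual queue because its text matches no entry.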
Data processing: elements
• Plaus1: data and image transfer from the data capturing system to the data processing system
• Plaus2: data transfer from temporary tables to working tables
• Plaus3: plausibility tests and generation of records in the error table
• Plaus0: Ecensus plausibility process
• Plaus5/Plaus8: call center interface (checkbacks, key from call)
• DDS: interface with mail management, internet and call center
• DHH: definitive household formation
• V2: link households and residents
• BURV: link census data with the registry of enterprises
• PROCODE: coding of professions
• WSA: handling of people with multiple residences
• MK: correction of recognition errors using the form image
• data cleaning: imputation of missing values and data correction processes
• progress/quality control
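A Plaus3-style step, which runs plausibility tests and writes one record per failure to an error table, could look roughly like the sketch below. The two rules, the rule ids, the age thresholds and the record fields are assumptions chosen for illustration; the slides do not list the actual rules.

```python
def check_age_range(rec):
    """Hypothetical rule: age must lie in a believable range."""
    if not 0 <= rec["age"] <= 120:
        return "age out of range"
    return None


def check_marital_status(rec):
    """Hypothetical rule: young children are expected to be single."""
    if rec["age"] < 16 and rec["marital_status"] != "single":
        return "marital status implausible for age"
    return None


RULES = [("R001", check_age_range), ("R002", check_marital_status)]


def run_plausibility(records, rules):
    """Apply every rule to every record; return the error-table rows."""
    error_table = []
    for rec in records:
        for rule_id, rule in rules:
            message = rule(rec)
            if message is not None:
                error_table.append(
                    {"person_id": rec["person_id"],
                     "rule_id": rule_id,
                     "message": message}
                )
    return error_table


persons = [
    {"person_id": 1, "age": 34, "marital_status": "married"},
    {"person_id": 2, "age": 7, "marital_status": "married"},
    {"person_id": 3, "age": 140, "marital_status": "single"},
]
errors = run_plausibility(persons, RULES)
```

Keeping each rule as a small function with its own id makes it easy to route a given error to the matching manual application later.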
Data processing: technical architecture
• DB Server: database engine. Executes stored procedures and handles transactions. Automatic/batch processing.
• Application Server: code engine. Provides processing power for the various data processing applications. Load balancing distributes the workload across 10 PC workstations.
• Clients: PC workstations with locally installed GUI applications (C++ clients). Manual processing.
• Multi-tier architecture (Unix, NT, CORBA, Oracle, C++). Distributed computing.
[Diagram: clients, application server and DB server connected via the LAN]
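The slide states that load balancing spreads the workload over 10 PC workstations but does not say which strategy is used. As one possible illustration, the sketch below assigns each job to the workstation with the smallest accumulated load; the job costs and the function name `assign_jobs` are hypothetical.

```python
import heapq


def assign_jobs(job_costs, n_workers=10):
    """Greedy least-loaded assignment of jobs to workers.

    job_costs: estimated cost per job, in arrival order.
    Returns a dict mapping worker index to the list of job indices it got.
    """
    # Heap entries are (current_load, worker_index); ties go to the lower index.
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    for job_id, cost in enumerate(job_costs):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(job_id)
        heapq.heappush(heap, (load + cost, worker))
    return assignment


job_costs = [5, 3, 8, 2, 7, 1, 4, 6, 9, 2, 3, 1]
assignment = assign_jobs(job_costs)
```

With these costs the first ten jobs go to the ten idle workstations in order, and the two remaining jobs go to whichever workstation is least loaded at that moment.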
Data analysis: brief overview
• Online: query, cross tabulation, analysis tools
• Internet: query, ordering
• Offline: batch processing, preparation of statistical products (CD, Excel, maps, bar charts, diagrams, graphic representation of statistical results)
• Security: according to Swiss federal laws and rules
[Diagram components: DB data processing, Bridge, DB other, Data Mart]
Data analysis: elements
• DB data processing: census primary entities such as buildings, persons and households at the attribute level (microdata)
• Bridge: SFSO metadata and classification database
• DB other: external data sources
• Data mart: microdata and aggregated datasets (cubes, macrodata) prepared and optimized for data analysis. Contains only error-free, non-redundant data.
• Online: experts use sophisticated tools for complicated data analysis tasks.
• Offline: batch and automatic processes cover performance- and resource-intensive tasks.
• Internet: public services that provide access to statistical results and products.
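The step from microdata to an aggregated dataset (cube) in the data mart can be illustrated with a simple count over chosen dimensions. This is only a sketch: the attribute names (`canton`, `sex`, `age_group`) and the sample rows are invented, and the actual data mart tooling is not described in the slides.

```python
from collections import Counter


def build_cube(microdata, dimensions):
    """Count persons for every combination of the given dimensions."""
    cube = Counter()
    for row in microdata:
        cell = tuple(row[d] for d in dimensions)
        cube[cell] += 1
    return cube


# Invented microdata rows, one per person.
persons = [
    {"canton": "NE", "sex": "f", "age_group": "20-39"},
    {"canton": "NE", "sex": "f", "age_group": "40-64"},
    {"canton": "NE", "sex": "m", "age_group": "20-39"},
    {"canton": "LU", "sex": "m", "age_group": "20-39"},
]

# A two-dimensional cross tabulation: canton by sex.
cube = build_cube(persons, ("canton", "sex"))
```

A cross tabulation in the online tools then amounts to reading cells such as `cube[("NE", "f")]` instead of rescanning the microdata.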
Following presentation: Mr. Hans Peter Stamm, "Ecensus": concept and presentation of the implementation