
Conceptual Modelling of Administrative Register Information and XML - Taxation Metadata as an Example

This article discusses the background and challenges of compiling administrative data, with a focus on taxation. It explores the data semantics of register data and proposes a taxation metadata definition. The article also presents some practical steps for the future.



Presentation Transcript


  1. CONCEPTUAL MODELLING OF ADMINISTRATIVE REGISTER INFORMATION AND XML - TAXATION METADATA AS AN EXAMPLE Heikki Rouhuvirta, Statistical Methodology R&D heikki.rouhuvirta@stat.fi Ottawa, 16-18 May 2005

  2. Contents • Background • The Challenge • Primary Questions • Test Case – Finnish Taxation • Data Semantics of Register Data • Taxation Metadata Definition • Some Results • The Future • Some Practical Steps on the Way

  3. Background • Present state of compilation of administrative data (as the challenge) • CoSSI (as the methodological framework for data semantics of registers) • Codacmos (as the organizational base for concept testing)

  4. Present state of compilation of administrative data [Diagram. Labels in approximate flow order: Administrative Data Source: operational systems or data warehouses (e.g. RDB, SQL); data gathering by tailor-made programs or ETL products (e.g. Informatica, Oracle); transmission file (sequential/flat file). Data communication: physical media (CD-ROM, magnetic tape) or FTP (+ VPN) over a network (internet, WAN). Destination NSI: transmission file (sequential/flat file); data extraction/transformation/loading by tailor-made programs or ETL products (e.g. Informatica, Oracle); data store in a relational DB (e.g. SQL); combining of register data and survey data; statistical application. Other labels: Statistical Information, Handbook of Taxation etc., statistician.]

  5. CoSSI • Common Structure of Statistical Information (CoSSI) • covers different ways of organizing statistical data (statistical data matrix and statistical table) • includes a model for defining content information in statistics • includes a model for defining the methodology used in statistics (e.g. measuring and classification) • manages the complexity of statistical information (e.g. nested variable structures) • includes definitions for all types of statistical information: data, metadata for files, statistical metadata, quality declarations, charts • the main objective was to organise statistical data so that they also contain statistical metadata (describing both the structure and logic of statistical metadata at the same time) • Definition descriptions are available on the web at: http://www.stat.fi/org/tut/dthemes/drafts/cossi_definition_descriptions_v_09_2003.pdf • On statistical metadata, see also: http://www.stat.fi/org/tut/dthemes/papers/alternative_approach_to_metadata_codacmos_2004.pdf

  6. Codacmos • Cluster of Data Collection Integration & Metadata Systems for Official Statistics • EU project 2003-2004 (IST-2001-38636) • Consortium: Italian National Statistical Institute, Statistics Finland, University Of Edinburgh, National Statistical Service of Greece, DESAN Research Solutions, Statistical Division Of Municipality Of Milan, The Finnish Tax Administration, University Of Patras, Institute Of Informatics And Statistics, University Of Athens, National Social Security Institute, Tietokarhu Ltd, Statistics Norway • http://www.codacmos.eu.org • Taxation metadata partners: Statistics Finland, The Finnish Tax Administration and Tietokarhu Ltd

  7. The Challenge: at present, the description of administrative data can mostly be read only from the authorities' administrative handbooks. How can this process be transformed so that the content description of the data is usable and present both in the production process, for statistics producers, and in the distribution of statistical information, for users of statistics?

  8. Primary Questions • What are the metadata of administrative data? • How should the metadata that specify the interpretation and use of administrative data collection and register data be processed? • How can the original data description (e.g. concept definitions of register fields) be combined with the variable descriptions and measurement information of statistics? • Can accumulating interpretive metadata be “transported” through the processing of information and, if so, how?

  9. Test Case – Finnish Taxation (Finnish taxation on the web at: http://www.vero.fi)

  10. Taxation: Types and Sources of income

  11. Income tax deductions

  12. Data Semantics of Register Data • Modelling methodology: the starting point is to distinguish between the substance concept model and the information model whereby the concepts are described • Information organizing method: any that does not lose information • Technology: any, without restrictions • Result: Taxation metadata definition (taxmeta.dtd)

  13. Basic Substance Concept Tax type: e.g. personal taxation Type of income: e.g. earned income, capital income A) Income: e.g. salary, pension Type of tax deduction B) Deduction

  14. Description Information Income: e.g. salary, pension Deduction Law: reference to a section of law 1) Law case: reference to a law case Formula: how the tax is calculated 2) Internal instruction: instruction on a specific income and deduction area 3)
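
Slides 13 and 14 separate a substance concept model (tax type, type of income, income, deduction) from the description information attached to each concept (law, law case, formula, internal instruction). The following is a minimal Python sketch of that separation, not the author's definition: the element names, the placeholder texts, and the helper `build_income_concept` are illustrative assumptions and are not taken from taxmeta.dtd.

```python
import xml.etree.ElementTree as ET


def build_income_concept(tax_type, income_type, income, descriptions):
    """Nest a substance concept and attach description information to it.

    All element names are hypothetical; they are NOT those of taxmeta.dtd.
    """
    # Substance concept model: tax type > type of income > income.
    root = ET.Element("taxType", name=tax_type)
    type_el = ET.SubElement(root, "typeOfIncome", name=income_type)
    income_el = ET.SubElement(type_el, "income", name=income)
    # Information model: descriptions that explain the concept.
    desc_el = ET.SubElement(income_el, "description")
    for kind, text in descriptions:
        ET.SubElement(desc_el, kind).text = text
    return root


concept = build_income_concept(
    "personal taxation",
    "earned income",
    "salary",
    [
        ("law", "placeholder reference to a section of law"),
        ("formula", "placeholder description of how the tax is calculated"),
    ],
)
xml_text = ET.tostring(concept, encoding="unicode")
print(xml_text)
```

Keeping the descriptions in their own child element means the concept hierarchy can be read on its own, while each concept still carries its legal and computational context.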

  15. Taxation Metadata Definition (taxmeta.dtd) Available on the web at: http://www.stat.fi/org/tut/dthemes/drafts/taxmeta_dtd_v_01.txt

  16. Taxation Metadata - Logical Concept Model (I)

  17. Taxation Metadata - Logical Concept Model (II)

  18. … result from register standpoint • Demonstration report is available on the web at: http://www.stat.fi/org/tut/dthemes/papers/demoreport_on_taxation_metadata_codacmos_2004.pdf

  19. Taxation register view [Figure labels: taxpayer’s tax register record; tax type code used in the register; plain-language code (derived or column name); value in euro; metadata; structure view; metadata view]
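
The register view on slide 19 pairs each tax type code and euro value in a taxpayer's record with a plain-language name and its metadata. A minimal Python sketch of such a lookup follows. The codes, labels, definitions, and the function `metadata_view` are invented for illustration; they are not actual Finnish tax register codes or part of the demonstration system.

```python
from dataclasses import dataclass


@dataclass
class CodeMetadata:
    label: str        # plain-language name for the register code
    definition: str   # concept definition carried with the code


# Hypothetical codes and texts, invented for illustration only.
REGISTER_METADATA = {
    "T001": CodeMetadata("Salary", "placeholder definition of salary income"),
    "T002": CodeMetadata("Pension", "placeholder definition of pension income"),
}


def metadata_view(record):
    """Pair each (code, euro value) in a record with its metadata.

    Codes without metadata are kept and flagged rather than dropped.
    """
    view = []
    for code, value_eur in record:
        meta = REGISTER_METADATA.get(code)
        label = meta.label if meta else "(no metadata)"
        view.append({"code": code, "label": label, "value_eur": value_eur})
    return view


record = [("T001", 32000.0), ("T999", 150.0)]
rows = metadata_view(record)
for row in rows:
    print(row)
```

Flagging unknown codes instead of discarding them keeps the structure view complete even where the metadata view has gaps.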

  20. … and result from statistics standpoint

  21. Income distribution statistics – statistical metadata

  22. Income distribution statistics – taxation register metadata (I) [Figure panels: statistical metadata; register metadata]

  23. Income distribution statistics – taxation register metadata (II) [Figure panels: statistical metadata; register metadata]

  24. The Future • Could it be … • integrated register metadata • a genuinely metadata-driven statistical production process • rich metadata present and available in all production stages, including editing and the transformation of register concepts into statistical concepts • metadata that accumulates as the process advances, without losing old metadata • rich metadata also available to users during the dissemination of statistical information
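
One way to read "metadata accumulates as the process advances without losing old metadata" is as an append-only history attached to each variable. The Python sketch below illustrates that reading under stated assumptions: the class names, stage names, and notes are hypothetical and are not drawn from CoSSI or taxmeta.dtd.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataEntry:
    stage: str   # production stage that added the entry
    note: str    # what the stage recorded


@dataclass
class Variable:
    name: str
    values: list
    metadata: list = field(default_factory=list)

    def with_stage(self, stage, note, values=None):
        """Return a new Variable whose metadata extends the old history."""
        return Variable(
            name=self.name,
            values=self.values if values is None else values,
            # Build a new list so earlier stages keep their own history.
            metadata=self.metadata + [MetadataEntry(stage, note)],
        )


raw = Variable("salary", [32000.0, -1.0]).with_stage(
    "register", "register concept definition (placeholder)"
)
edited = raw.with_stage(
    "editing",
    "negative values set to missing",
    values=[v if v >= 0 else None for v in raw.values],
)
final = edited.with_stage(
    "conceptual formation", "mapped to a statistical concept (placeholder)"
)
stages = [m.stage for m in final.metadata]
print(stages)
```

Each stage returns a new object rather than mutating the old one, so the register-level description remains readable after editing and conceptual formation.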

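  25. XML based metadata-driven statistical production [Diagram. The data matrix has statistical units $a_1, \dots, a_n$ as rows and variables $x_1, \dots, x_p$ as columns:

$$
\begin{array}{c|cccccc}
 & x_1 & x_2 & \cdots & x_j & \cdots & x_p \\ \hline
a_1 & x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1p} \\
a_2 & x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2p} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
a_i & x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{ip} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
a_n & x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & x_{np}
\end{array}
$$

Remaining labels: collection routines; transaction based data storage (RDB); Handbook of Register; XMLDB; units based data report with meta; Register Metadata (xml); 1° aggregation; Questionnaires (xml); data gathering; data transmission; xml based production system; data combining; statistical metadata based on CoSSI; units and variable based data organisation; combined data; collected data; matrix based on CoSSI; checked values; new metadata; data editing; new variables; conceptual formation.]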

  26. Some Practical Steps on the Way • Plan to apply this approach to the metadata of other registers (e.g. population register) • Integrate the structured statistical metadata system with statistical software packages (e.g. SAS, SuperStar) for simultaneous use
