This overview explores the variables subsystem in statistical metadata, its relationship with other systems, search and management applications, statistical indicators, normalization and harmonization, and the benefits it provides.
Variables Subsystem
Monica Isfan (monica.isfan@ine.pt), Statistics Portugal / Metadata Unit
Joint UNECE/EUROSTAT/OECD Work Session on Statistical Metadata (METIS), 11-13 March 2009
Overview • Variables subsystem • Relationship with other systems • Search and management applications • Statistical indicators • Normalization and harmonization • Benefits
Variables Subsystem
ISO/IEC 11179 + IDMB Statistics Canada
• Statistical Survey Design
• Automatic Questionnaire Generation
• Statistical Dissemination
• Facilitate Standardization
• Identify Duplicates
• Facilitate Data Sharing
Variables Subsystem (diagram: Production System, Variables Subsystem, Dissemination System)
Variables Subsystem (diagram of entities): Family/Theme, Property, Object Class, Conceptual Variable, Representation Class, Value Domain, Variable, Unit of Measure, Statistical Indicator
Variables Subsystem
The Variables Subsystem holds variables and statistical indicators. A variable is defined by:
• Property
• Object class (population or statistical unit)
• Representation class
• Value domain
Variables Subsystem
Example: Labour force questionnaire / personal data (persons - members of the household), annotated with object class, property, representation class and value domain.
Variables Subsystem
Example: the variable "Marital status"
• Concept "Marital status" - 174: A person's legal situation consisting of the qualities defining his or her personal status in terms of family relations figuring in the register. It comprises the following situations: a) single, b) married, c) widow(er), d) divorced.
• Property: Marital status
• Object class: Person (no concept)
• Representation class: Code
• Value domain: Enumerated (classification + level of classification)
• Formal name: Marital status_Person_Code_Table of marital status/ level 1
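The formal name in this example appears to be the four defining components joined by underscores. A minimal sketch of that composition, assuming the underscore separator and the component order shown on the slide; the `Variable` class and its field names are hypothetical and are not part of the Statistics Portugal system:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical record holding the four components that define a variable."""
    property_name: str
    object_class: str
    representation_class: str
    value_domain: str

    def formal_name(self) -> str:
        # Assumption: components are joined with "_" in this fixed order,
        # as in the "Marital status" example from the slide.
        return "_".join([
            self.property_name,
            self.object_class,
            self.representation_class,
            self.value_domain,
        ])


marital_status = Variable(
    property_name="Marital status",
    object_class="Person",
    representation_class="Code",
    value_domain="Table of marital status/ level 1",
)
print(marital_status.formal_name())
```

The length of the resulting string illustrates why a formal name alone is hard to use in search screens and disseminated tables.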
Variables Subsystem • Formal name not user friendly; • Formal name very long; • Variables must supply both production systems and dissemination systems; • Variables effectively searchable;
Variables Subsystem
• External name
  General rule: Property + (Qualifier term) + Object Class
  Examples: Marital status of person; Legal reserves (€) of enterprise
• Abbreviated name
  General rule: Property + (Qualifier term)
  Examples: Marital status; Legal reserves (€)
Qualifier term: a word or words that help define and differentiate a name within the database.
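The two naming rules can be sketched as small helper functions. This is an illustration only: the function names are hypothetical, the joining word "of" and the lower-cased object class are inferred from the two examples, and treating "€" as the qualifier term is an assumption made to reproduce the "Legal reserves" example.

```python
from typing import Optional


def abbreviated_name(prop: str, qualifier: Optional[str] = None) -> str:
    """Property + (Qualifier term); the qualifier is optional."""
    return f"{prop} ({qualifier})" if qualifier else prop


def external_name(prop: str, object_class: str, qualifier: Optional[str] = None) -> str:
    """Property + (Qualifier term) + Object Class.

    Assumption: the object class is attached with "of" and lower-cased,
    as in "Marital status of person".
    """
    return f"{abbreviated_name(prop, qualifier)} of {object_class.lower()}"


print(external_name("Marital status", "Person"))
print(external_name("Legal reserves", "Enterprise", qualifier="€"))
print(abbreviated_name("Legal reserves", qualifier="€"))
```

Deriving the external name from the abbreviated name keeps the two consistent for the same property and qualifier.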
Relationship with other systems: Concepts Subsystem and Variables Subsystem (bidirectional view). Concepts are linked to conceptual variables.
Relationship with other systems: Classification Subsystem and Variables Subsystem (bidirectional views). A classification version (level) is linked to the value domain of a variable.
Relationship with other systems: Methodological Documents Subsystem and Variables Subsystem. Versions are linked to variables.
Relationship with other systems: Data Collection Instruments Subsystem and Variables Subsystem. Questionnaires are linked to variables.
Relationship with other systems: Questionnaire Database and Variables Subsystem. Questions are linked to variables. Observation: not yet developed.
Relationship with other systems: Dissemination Database and Variables Subsystem. Statistical indicators are linked through the statistical indicators view.
Search and management applications (screenshots: search application, management application)
Statistical Indicators Statistical Indicator Data element that represents statistical data for a specified time, place, and other characteristics. (“Terminology on Statistical Metadata, Conference of European Statisticians – Statistical Standards and Studies – Nº 53”).
Statistical Indicators
In the Variables Subsystem, a statistical indicator is defined from variables + aggregate variables.
Dimensions:
• D1 = Time
• D2 = Geography
• …
• Dn = Other characteristics
Statistical Indicators
Name definition: Aggregate variable by D2, …, Dn-1 and Dn
where D2 = Dimension (geography) and the remaining dimensions up to Dn = Other dimensions.
Statistical Indicators
Example: Resident population by Place of residence, Sex and Age group
• Aggregate variable: Resident population
• Dimension (geography): Place of residence
• Other dimensions: Sex, Age group
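The name definition above can be sketched as a short function. It is a hedged illustration: `indicator_name` is a hypothetical helper, and it assumes that the time dimension (D1) is left out of the name, since the pattern on the slide starts at D2.

```python
from typing import Sequence


def indicator_name(aggregate_variable: str, dimensions: Sequence[str]) -> str:
    """Build "<aggregate variable> by D2, ..., Dn-1 and Dn".

    `dimensions` lists D2..Dn in order (geography first); D1 (time) is
    assumed not to appear in the name.
    """
    dims = list(dimensions)
    if not dims:
        return aggregate_variable
    if len(dims) == 1:
        return f"{aggregate_variable} by {dims[0]}"
    # All but the last dimension are comma-separated; the last is joined with "and".
    return f"{aggregate_variable} by {', '.join(dims[:-1])} and {dims[-1]}"


print(indicator_name("Resident population", ["Place of residence", "Sex", "Age group"]))
```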
Statistical Indicators
• Step 1. Analysis of data and metadata
• Step 2. Proposal of variables and statistical indicators
• Step 3. Registration and approval of variables
• Step 4. Registration and approval of statistical indicators
• Step 5. Transmission of metadata and data
Statistical Indicators (diagram labels: Data Warehouse, Variables Subsystem, Statistical Indicators view, Statistical Indicators Database, data, metadata, Internet)
Normalization and harmonization
Why? The same characteristic appears under different names and labels:
1. Sex: Masculine = 1, Feminine = 2
2. Gender: …
3. Sex of person: Male = 1, Female = 2
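The problem shown here, one characteristic collected under several names and labels, can be illustrated by mapping each source coding onto a single value domain. The sketch below is an assumption-laden example: the dictionary names, the choice of "Male"/"Female" as the normalized labels, and the `to_normalized` helper are all hypothetical. The "Gender" variant is omitted because its categories are elided on the slide.

```python
# Hypothetical normalized value domain (code -> preferred label).
NORMALIZED_SEX = {"1": "Male", "2": "Female"}

# Labels as they appear in two source questionnaires, mapped to the shared codes.
SOURCE_LABELS = {
    "Sex": {"Masculine": "1", "Feminine": "2"},
    "Sex of person": {"Male": "1", "Female": "2"},
}


def to_normalized(source_variable: str, label: str) -> tuple[str, str]:
    """Return (code, normalized label) for a label used by a source variable."""
    code = SOURCE_LABELS[source_variable][label]
    return code, NORMALIZED_SEX[code]


print(to_normalized("Sex", "Masculine"))
print(to_normalized("Sex of person", "Female"))
```

Once both sources resolve to the same codes and labels, data collected under either name can be compared or combined.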
Normalization and harmonization “A theory is more impressive the greater is the simplicity of its premises, the more different are the things it relates, and the more extended its range of applicability…” Albert Einstein • Basic steps: • Conceptual analysis; • Normalization; • Harmonization.
Normalization and harmonization
Conceptual analysis
• Selection of variables;
• Identification and documentation of potential incompatibilities;
• Compilation of the existing documentation, determining the availability and use of variables;
• Classification in chapters by main concept;
• Preparation of the proposed variable;
• Documentation for the future normalization scheme, etc.
Normalization and harmonization
The normalization process:
• If the variable is already registered in the Variables System, it is considered normalized and ready for harmonization (where applicable).
• If the variable is not in the Variables System, the following steps apply:
  • Comparison of the proposed variable with the normalized variables
  • Definition of all basic attributes of the variable
  • Definition of formal, external and short names for the variable
  • Registration, verification and approval
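The decision logic of the normalization process can be sketched as follows. This is a simplified illustration, not the actual workflow: the `Proposal` and `Registry` classes, the status strings, and the similarity rule (same property and object class) are assumptions, and the human verification and approval steps are reduced to a single `register` call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """Hypothetical proposed variable with its basic attributes."""
    prop: str
    object_class: str
    representation_class: str
    value_domain: str


class Registry:
    """Hypothetical in-memory stand-in for the Variables System."""

    def __init__(self) -> None:
        self._approved: set[Proposal] = set()

    def is_registered(self, p: Proposal) -> bool:
        return p in self._approved

    def similar(self, p: Proposal) -> list[Proposal]:
        # Assumption: "comparison with normalized variables" means looking for
        # registered variables with the same property and object class.
        return [v for v in self._approved
                if v.prop == p.prop and v.object_class == p.object_class]

    def register(self, p: Proposal) -> None:
        self._approved.add(p)


def submit(registry: Registry, p: Proposal) -> str:
    if registry.is_registered(p):
        return "already normalized"          # ready for harmonization
    if registry.similar(p):
        return "needs review"                # compare with existing variables first
    registry.register(p)                     # stands in for verification and approval
    return "registered"


registry = Registry()
first = Proposal("Marital status", "Person", "Code", "Table of marital status/ level 1")
print(submit(registry, first))   # not yet in the registry
print(submit(registry, first))   # now already registered
```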
Normalization and harmonization Harmonization Reinforce the contextual study of variables • Production System(Methodological Documentation, Questionnaires, Administrative Sources, etc); • Dissemination System; • Data Warehouse. Use/ reuse of the same variable in different contexts
Normalization and harmonization Harmonization Proposal Consulting Group(Production Division, Dissemination Unit and Methodological Unit). Preferred variable for use in data interchange and in new or updated applications.
Normalization and harmonization • Chapter • Statistical area of use: • Main Concept or Main Definition: • Observations: • Filter: • Statistical Unit: • Classification: • Normalized variables registered in Variables System proposed for harmonization • Coding process: • Questionnaire: • Example of a questionnaire module which meets the requirements documented in this proposal. • Operational issue: • Dissemination requirements: • Good practices:
Benefits • Increased chances of sharing data and metadata with other agencies; • Single point of reference for data harmonization; • Reduce redundancies and anomalies; • Central reference for survey re-engineering and re-design; • Reduce ongoing production costs; • Reduce statistical burdens; • Improvement of quality and understandability of disseminated data
Variables Subsystem Thank you for your attention