ESCWA SDMX Workshop. Session: SDMX and Data. Session Objectives. At the end of this session you will: know the SDMX model of a data structure definition; understand the techniques to identify the structure of data; identify the concepts in a simple data set.
ESCWA SDMX Workshop Session: SDMX and Data
Session Objectives • At the end of this session you will: • Know the SDMX model of a data structure definition • Understand the techniques to identify the structure of data • Identify the concepts in a simple data set • Be able to develop simple data structure definitions using SDMX tools
Data Set Structure • Computers need to know the structure of data in terms of: • Concepts • Code Lists • Dimensionality • Additional metadata
First: Identify the Concepts • A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)
Data Set Structure: Concepts • Topic • Country • Stock/Flow • Time/Frequency • Unit • Unit Multiplier
Data Set Structure: Code Lists • Concepts: Topic, Country, Flow • Code Lists • TOPIC: A Brady Bonds, B Bank Loans, C Debt Securities • COUNTRY: AR Argentina, MX Mexico, ZA South Africa • STOCK/FLOW: 1 Stock, 2 Flow
Data Makes Sense • 16457 • Q,ZA,B,1,1999-06-30=16457
Data Set Structure: Defining Multi-dimensional Structures • Comprises • Dimensions: concepts that identify the observation value • Attributes: concepts that add additional metadata about the observation value • Measure: concept that is the observation value • Representation: any of these may be • coded • text • date/time • number • etc.
Data Set Structure: Concept Usage • Topic (Dimension) • Country (Dimension) • Stock/Flow (Dimension) • Time/Frequency (Dimension) (Dimension) • Unit (Attribute) • Unit Multiplier (Attribute) • Observation (Measure)
Data Structure Definition • Key: concepts that identify the observation (Dimensions) • Group Key: concepts that identify groups of keys • Measures: concepts that are observed phenomenon • Attributes: concepts that add metadata • Dimensions, Measures and Attributes take semantic from a Concept and have a format given by a Representation • Representation: Coded (has code list) or Non-coded (has format) • CONCEPTS: Topic, Country, Flow • Code List TOPIC: A Brady Bonds, B Bank Loans, C Debt Securities
Data Makes Sense • 16457 • Frequency,Country,Topic,Stock/Flow,Time=Observation • Q,ZA,B,1,1999-06-30=16457 • Quarterly, South Africa, Bank Loans, Stocks, 2nd quarter 1999
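The decoding shown on this slide can be sketched in a few lines of Python. This is only an illustration, not an SDMX tool: the dictionary layout and the function name `decode` are assumptions, the code lists are copied from the earlier slide, and the Frequency list contains only the single code (Q) that appears in the example.

```python
# Code lists copied from the slides. The Frequency list is an assumption:
# only the code "Q" (Quarterly) is shown in the example.
CODE_LISTS = {
    "Frequency": {"Q": "Quarterly"},
    "Country": {"AR": "Argentina", "MX": "Mexico", "ZA": "South Africa"},
    "Topic": {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"},
    "Stock/Flow": {"1": "Stock", "2": "Flow"},
}

# Dimension order as given on the slide:
# Frequency,Country,Topic,Stock/Flow,Time=Observation
DIMENSIONS = ["Frequency", "Country", "Topic", "Stock/Flow", "Time"]


def decode(line: str) -> dict:
    """Split 'key=value' and translate each coded dimension into its label."""
    key_part, value_part = line.split("=")
    codes = key_part.split(",")
    if len(codes) != len(DIMENSIONS):
        raise ValueError(f"expected {len(DIMENSIONS)} key values, got {len(codes)}")
    decoded = {}
    for concept, code in zip(DIMENSIONS, codes):
        code_list = CODE_LISTS.get(concept)
        # Time has no code list in this sketch, so its value is kept as written.
        decoded[concept] = code_list[code] if code_list else code
    decoded["Observation"] = int(value_part)
    return decoded


result = decode("Q,ZA,B,1,1999-06-30=16457")
print(result)
```

Without the dimension order and the code lists, the string `Q,ZA,B,1,1999-06-30=16457` cannot be interpreted; that shared structure is what a data structure definition supplies.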
Identifying Concepts - Sources • Existing data set tables • From website • From applications • Data Collection Instruments • Questionnaires • Excel spreadsheets • Regulations, Handbooks, User Guides • Labour Statistics Convention, 1985 (No. 160), Recommendation, 1985 (No. 170) • Council Regulation No: 311/76/EEC of 09/02/1976; OJ: L039 of 14/02/1976; Compilation of statistics on foreign workers • Database Tables • Existing Data Structure Definitions • From other organisations
Identify Concepts – from website Measurement = 1,000 Kg Source: FAO proof of concept project
Concepts • Measure Type • Observation Value • Frequency and Time • Commodity • Reference Region • Unit and Unit Multiplier (Measurement = 1,000 Kg)
Concept Role: Reminder • Dimensions • Are the concepts that identify the observation value • Attributes • Are the concepts that add additional metadata about the observation value • Measure • Is the concept that is the observation value
Exercise: Concept Role • Measure Type (Dimension) • Observation Value (Measure) • Frequency and Time (Dimensions) • Commodity (Dimension) • Reference Region (Dimension) • Unit and Unit Multiplier (Attributes): Measurement = 1,000 Kg
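One way to see what the roles mean in practice is to split a flat record into its key, its observation value and its attributes. The sketch below is illustrative only: the concept identifiers, the function name `split_by_role` and every value in the sample record are placeholders chosen for this example, not FAO data. The unit multiplier is written as a power of ten (3 for thousands), which is an assumption about how "1,000 Kg" would be coded.

```python
# Roles from the exercise. The identifiers are illustrative names.
ROLES = {
    "MEASURE_TYPE": "dimension",
    "FREQ": "dimension",
    "TIME": "dimension",
    "COMMODITY": "dimension",
    "REF_AREA": "dimension",
    "OBS_VALUE": "measure",
    "UNIT": "attribute",
    "UNIT_MULT": "attribute",
}


def split_by_role(record: dict) -> dict:
    """Group the values of one record by the role of their concept."""
    parts = {"dimension": {}, "attribute": {}, "measure": {}}
    for concept, value in record.items():
        role = ROLES[concept]  # a KeyError signals a concept with no assigned role
        parts[role][concept] = value
    return parts


# Placeholder record: codes and value are made up for illustration.
record = {
    "MEASURE_TYPE": "PRODUCTION",
    "FREQ": "A",
    "TIME": "2005",
    "COMMODITY": "WHEAT",
    "REF_AREA": "XX",
    "OBS_VALUE": 100.0,
    "UNIT": "KG",
    "UNIT_MULT": "3",  # assumption: exponent of ten, so 3 means thousands
}

parts = split_by_role(record)
print(parts["dimension"])  # the concepts that identify the observation
print(parts["measure"])    # the observation value itself
print(parts["attribute"])  # additional metadata about the value
```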
Identify/Define Code Lists • Purpose of a Code List • Constrains the value domain of concepts when used in a structure such as a data structure definition • Defines a shortened, language-independent representation of the values • Gives semantic meaning to the values, possibly in multiple languages • Agreeing on harmonised code lists is the most difficult aspect of defining a data structure definition
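The three purposes listed above can be illustrated with a small Python sketch. The class `CodeList`, the identifier `CL_COUNTRY` and the French labels are assumptions made for this example; the codes and English labels come from the country code list shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class CodeList:
    """A minimal code list: short codes mapped to labels per language."""
    id: str
    codes: dict = field(default_factory=dict)  # code -> {language -> label}

    def is_valid(self, code: str) -> bool:
        # Constrains the value domain: only listed codes are allowed.
        return code in self.codes

    def label(self, code: str, lang: str = "en") -> str:
        # Gives semantic meaning to a code in the requested language.
        return self.codes[code][lang]


# Hypothetical identifier; the French labels are illustrative translations.
cl_country = CodeList(
    id="CL_COUNTRY",
    codes={
        "AR": {"en": "Argentina", "fr": "Argentine"},
        "MX": {"en": "Mexico", "fr": "Mexique"},
        "ZA": {"en": "South Africa", "fr": "Afrique du Sud"},
    },
)

print(cl_country.is_valid("ZA"))     # code is in the list
print(cl_country.is_valid("XX"))     # code is not in the list
print(cl_country.label("ZA", "fr"))  # same code, different language
```

The code itself (for example `ZA`) stays the same whatever the language of the label, which is what makes it usable for exchange between organisations.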
Code Lists Required • Measure Type • Frequency • Commodity • Reference Region • Unit and Unit Multiplier (Measurement = 1,000 Kg) • Source: FAO proof of concept project
Code Lists (CL_) • For time series, the SDMX Cross-Domain Concepts recommend that all observations have a status code (Concept = OBS_STATUS) and a confidentiality code (Concept = OBS_CONF)
Data Structure Definition - Reminder • Key: concepts that identify the observation (Dimensions) • Group Key: concepts that identify groups of keys • Measures: concepts that are observed phenomenon • Attributes: concepts that add metadata • Dimensions, Measures and Attributes take semantic from a Concept and have a format given by a Representation • Representation: Coded (has code list) or Non-coded (has format)
Data Structure Definition - Agriculture • Data Structure Definition: AGRICULTURE_COMMODITY • Dimensions (Key): FREQ, REF_AREA_REG, COMMODITY, MEASURE_TYPE, TIME • Measure: OBS_VALUE • Attributes: OBS_STATUS, OBS_CONF, UNIT, UNIT_MULT • Code Lists for the dimensions: CL_FREQ, CL_AREA_CTY, CL_COMMODITY, CL_MEASURE_ELEMENT • Code Lists for the attributes: CL_OBS_STATUS, CL_OBS_CONF, CL_UNIT, CL_UNIT_MULT
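As a rough illustration, the agriculture structure can be written down as plain Python data. This is a sketch, not the output of any SDMX tool: the class names are invented here, and the pairing of each component with a code list is an assumption read off the order of the names on the slide (TIME is assumed to be non-coded because only four code lists are listed for five dimensions). The sample record uses placeholder codes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    concept: str
    code_list: Optional[str] = None  # None means a non-coded representation


@dataclass(frozen=True)
class DataStructureDefinition:
    id: str
    dimensions: tuple
    measure: Component
    attributes: tuple

    def key_of(self, record: dict) -> tuple:
        """Return the dimension values of a record in key order."""
        missing = [d.concept for d in self.dimensions if d.concept not in record]
        if missing:
            raise ValueError(f"missing dimension values: {missing}")
        return tuple(record[d.concept] for d in self.dimensions)


# Component-to-code-list pairing is an assumption based on slide order.
agri = DataStructureDefinition(
    id="AGRICULTURE_COMMODITY",
    dimensions=(
        Component("FREQ", "CL_FREQ"),
        Component("REF_AREA_REG", "CL_AREA_CTY"),
        Component("COMMODITY", "CL_COMMODITY"),
        Component("MEASURE_TYPE", "CL_MEASURE_ELEMENT"),
        Component("TIME"),  # assumed non-coded
    ),
    measure=Component("OBS_VALUE"),
    attributes=(
        Component("OBS_STATUS", "CL_OBS_STATUS"),
        Component("OBS_CONF", "CL_OBS_CONF"),
        Component("UNIT", "CL_UNIT"),
        Component("UNIT_MULT", "CL_UNIT_MULT"),
    ),
)

# Placeholder record: the codes are made up for illustration.
record = {"FREQ": "A", "REF_AREA_REG": "XX", "COMMODITY": "WHEAT",
          "MEASURE_TYPE": "PROD", "TIME": "2005", "OBS_VALUE": 100.0}
key = agri.key_of(record)
print(key)
```

Keeping the dimensions in a fixed order is what allows a key such as the one printed here to identify an observation without repeating the concept names.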
SDMX and Data Formats Exercise: Identify Concepts
Exercise: Identify Concepts – from collection instrument Source: UNESCO Institute for Statistics
Data Entry - Table 2.1 Source: UNESCO Institute for Statistics
Data Entry - Table 2.2 Source: UNESCO Institute for Statistics
Exercise: Identify Dimension Concepts – from website Source: International Labor Organisation
Identify Concepts: Table 2A Source: International Labor Organisation
Identify Concepts: Table 2B Source: International Labor Organisation
Identify Concepts: Table 2C Source: International Labor Organisation
Identify Concepts: Table 2D Source: International Labor Organisation
Identify Concepts: Table 2E Source: International Labor Organisation
Identify Concepts: Table 2A • Measure Type • Reference Area • Time Period • Frequency • Sex
Identify Concepts: Table 2B • Measure Type • Economic Activity
Identify Concepts: Table 2C • Measure Type • Occupation
Identify Concepts: Table 2D • Measure Type • Status in Employment
Identify Concepts: Table 2E • Measure Type
Exercise: Identify Concepts – from collection instrument • Reference Area • Time • Source: UNESCO Institute for Statistics
Dimension Concepts - Tables 2.1/2.2 • Education Level • Institution Type • Measure Type • Sex • Programme Orientation • Work Mode • Source: UNESCO Institute for Statistics