SDMX Advanced Topics on Technical Standards. Arofan Gregory and Chris Nelson. SDMX Capacity Building Workshop, Washington, January 11, 2007.
Advanced Topics (1) • Many of these will be presented in the context of a live prototype system • Data structures • Provisioning metadata • Registry interfaces • Submit structure and provisioning metadata • Query for structure • Register data and metadata set • Query for registered data and metadata sets • Alignment with other standards
Advanced Topics (2) • Others will be presented by explanation and example • Hierarchical Code Set • Structure Set • Reporting Taxonomy • Services based architecture • Notification • RSS feed
SDMX Technical Standards Information Model and Technical Specifications: High Level Overview (Reminder)
Information Model: High Level Schematic [diagram] • Entities: Category Scheme, Category, Structure Maps, Structure Definition, Data or Metadata Flow, Provision Agreement, Data Provider, Data Set or Metadata Set • A Category Scheme comprises subject or reporting categories • A Category can have child categories • A Data or Metadata Flow can be linked to categories in multiple category schemes • A Data or Metadata Flow uses a specific data/metadata structure (Structure Definition) • Structure Maps hold structure and code list maps • A Data or Metadata Flow can get data/metadata from multiple data/metadata providers • A Data Provider can provide data/metadata for many data/metadata flows using the agreed data/metadata structure • A Data Provider publishes/reports data sets or metadata sets • A Data Set or Metadata Set conforms to the business rules of the data/metadata flow • The existence of data and metadata sets is registered (URL, registration date etc.)
SDMX Technical Standards SDMX Registry
SDMX Registry/Repository: SDMX Registry Interfaces [diagram] • Registry (Data Set/Metadata Set): indexes data and metadata; interfaces: Register, Query • Repository (Provisioning Metadata): describes data and metadata sources and reporting processes; interfaces: Submit, Query • Repository (Structural Metadata): describes data and metadata structures; interfaces: Submit, Query • Subscription/Notification: applications can subscribe to notification of new or changed objects
SDMX Artefacts: Registry Contents [diagram] • Structural Metadata: Category Scheme, Category, Structure Maps (structure and code list maps), Structure Definition, Data or Metadata Flow • Provisioning Metadata: Provision Agreement, Data Provider • Registered Data and Metadata: Data Set (URL, registration date etc.) • A Category Scheme comprises subject or reporting categories; a Category can have child categories • A Data or Metadata Flow uses a specific data/metadata structure and can be linked to categories in multiple category schemes • A Data or Metadata Flow can get data/metadata from multiple data/metadata providers • A Data Provider can provide data/metadata for many data/metadata flows using the agreed data/metadata structure • The existence of data and metadata sets is registered
SDMX Technical Standards Practical Examples
FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS SDMX in Action: Prototype System [diagram: Flow of FAO CountrySTAT-RegionSTAT Implementation, showing the CountrySTAT National Publication Server(s), the FAO SDMX Registry and the RegionSTAT Regional Publication Server, with flow steps labelled 1, 2, 3a, 3b and 4] Slide courtesy of the FAO
FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS Prototype System: Explanation • CountryStat National Publication Server • The web site is published from the files in CountryStat • SDMX Publication • The new CountryStat files are converted to SDMX-ML data sets and made web accessible on the CountryStat web site • These files are registered in the FAO SDMX Registry • RegionStat Regional Publication Server • Queries the registry for new registrations; the registry responds with registration details, including the URLs of the new data sets • Retrieves the new data sets from the CountryStat web site • Converts the SDMX-ML files to an internal format and integrates the new data sets with existing RegionStat data sets • Re-publishes the RegionStat web site (step labels on the slide: 1, 2, 3a, 3b, 4) Slide courtesy of the FAO
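The "query the registry for new registrations" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the example.org URLs and the dates are hypothetical, and a real client would send an SDMX-ML query to the registry rather than filter a local list.

```python
from datetime import datetime, timezone

# Hypothetical registration details, as a registry might return them.
REGISTRATIONS = [
    {"url": "https://example.org/countrystat/benin/production.xml",
     "registered": datetime(2007, 1, 8, 9, 0, tzinfo=timezone.utc)},
    {"url": "https://example.org/countrystat/senegal/production.xml",
     "registered": datetime(2007, 1, 10, 15, 30, tzinfo=timezone.utc)},
]


def new_registrations(registrations, last_checked):
    """Return the URLs of data sets registered after last_checked."""
    return [r["url"] for r in registrations if r["registered"] > last_checked]


# The regional server remembers when it last polled the registry.
last_poll = datetime(2007, 1, 9, tzinfo=timezone.utc)
to_fetch = new_registrations(REGISTRATIONS, last_poll)
print(to_fetch)
```

The regional server would then retrieve each URL in `to_fetch`, convert the file and re-publish.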
SDMX Technical Standards Data Structure Definitions: Registration and Query
Data Set and Structure [annotated data table] • Labels: Measure Type; Frequency and Time; Observation Value; Commodity; Reference Region; Unit and Unit Multiplier (Measurement = 1,000 Kg)
Data Set: Structure • Comprises • Concepts that identify the observation value (Dimensions) • Concepts that add additional metadata about the observation value (Attributes) • Concept that is the observation value (Measure) • Any of these may be (Representation) • coded • text • date/time • number • etc.
Data Set and Structure [annotated data table] • Observation Value (Measure) • Frequency and Time (Dimensions) • Measure Type (Dimension) • Commodity (Dimension) • Reference Region (Dimension) • Unit and Unit Multiplier, Measurement = 1,000 Kg (Attributes)
Data Structure Definition [diagram] • Key: the Dimensions, concepts that identify the observation • Group Key: concepts that identify groups of keys • Measures: concepts that are the observed phenomenon • Attributes: concepts that add metadata • Each Dimension, Attribute and Measure takes its semantic from a Concept and has its format given by a Representation
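The split into dimensions, attributes and a measure can be made concrete with a small Python sketch. It is not an SDMX library API: the class names, the concept ids (FREQ, TIME, REF_REGION and so on) and the sample values are all hypothetical, chosen to mirror the agriculture example on the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Component:
    concept: str         # concept the component takes its semantic from
    representation: str  # format: "coded", "text", "date/time", "number"


@dataclass
class DataStructureDefinition:
    dimensions: List[Component]  # identify the observation
    attributes: List[Component]  # add metadata about the observation
    measure: Component           # the observed value itself

    def key_of(self, observation: Dict[str, object]) -> Tuple[object, ...]:
        """Return the dimension values that identify one observation."""
        return tuple(observation[d.concept] for d in self.dimensions)


# Hypothetical concept ids for the commodity example.
dsd = DataStructureDefinition(
    dimensions=[
        Component("FREQ", "coded"),
        Component("TIME", "date/time"),
        Component("MEASURE_TYPE", "coded"),
        Component("COMMODITY", "coded"),
        Component("REF_REGION", "coded"),
    ],
    attributes=[Component("UNIT", "coded"), Component("UNIT_MULT", "coded")],
    measure=Component("OBS_VALUE", "number"),
)

observation = {
    "FREQ": "A", "TIME": "2005", "MEASURE_TYPE": "PRODUCTION",
    "COMMODITY": "RICE", "REF_REGION": "BEN",
    "UNIT": "KG", "UNIT_MULT": "1000", "OBS_VALUE": 73.0,
}
key = dsd.key_of(observation)
print(key)
```

Only the dimension values form the key; the attributes and the observation value are carried alongside it.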
Registry Interfaces: Submit Structure Data Structure Definition Artefacts
Registry Interfaces: Query Structure Query for KeyFamily with resolveReferences set to “true” will return all related Concepts and Code Lists
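The effect of resolveReferences can be illustrated with a toy in-memory lookup. This sketch assumes a hypothetical store in which each artefact lists the ids it references; the concept and code list ids are invented, and the real registry answers an SDMX-ML query message rather than a Python call.

```python
# Hypothetical artefact store: id -> (kind, ids of referenced artefacts).
ARTEFACTS = {
    "FAOSTAT:AGRICULTURE_COMMODITY": (
        "KeyFamily", ["CONCEPT:COMMODITY", "CODELIST:CL_COMMODITY"]),
    "CONCEPT:COMMODITY": ("Concept", []),
    "CODELIST:CL_COMMODITY": ("CodeList", []),
}


def query_structure(artefact_id, resolve_references=False):
    """Return the artefact id, plus everything it references if asked."""
    found = [artefact_id]
    if resolve_references:
        pending = list(ARTEFACTS[artefact_id][1])
        while pending:
            ref = pending.pop(0)
            if ref not in found:
                found.append(ref)
                pending.extend(ARTEFACTS[ref][1])
    return found


alone = query_structure("FAOSTAT:AGRICULTURE_COMMODITY")
resolved = query_structure("FAOSTAT:AGRICULTURE_COMMODITY", resolve_references=True)
print(alone)
print(resolved)
```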
Registry Interfaces: Query Structure The registry will respond with all DSDs maintained by the FAOSTAT agency
SDMX Technical Standards Dataflows, Data Providers, Category Scheme
Registry Contents – Other Structures [diagram] • Structure Definition: FAOSTAT:AGRICULTURE_COMMODITY • Data Flows: FAOSTAT:AGRICULTURE_AREA, FAOSTAT:AGRICULTURE_PRODUCTION • Category Scheme: SDMX:SDMXStatSubMatDomainsWD1 (adoption of UNECE Classification of International Statistical Activities) • Category: SDMX:SDMXStatSubMatDomainsWD1.Domain_2.C4.C1 (Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries) • Data Providers: FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin), FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso), FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire), FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénégal) • The data flows are connected to the relevant Category in the Category Scheme
Registry Interface: Submit Structure Artefacts
Registry Interface: Submit Structure Category Scheme
Registry Interface: Submit Structure Data Providers Dataflow Links the Dataflow to the (Subject Matter Domain) Category
SDMX Technical Standards Submit Provision Agreements
Registry Contents – Structure and Provisioning [diagram] • Structure Definition: FAOSTAT:AGRICULTURE_COMMODITY • Data Flows: FAOSTAT:AGRICULTURE_AREA, FAOSTAT:AGRICULTURE_PRODUCTION • Category Scheme: SDMX:SDMXStatSubMatDomainsWD1 (adoption of UNECE Classification of International Statistical Activities) • Category: SDMX:SDMXStatSubMatDomainsWD1.Domain_2.C4.C1 (Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries) • Data Providers: FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin), FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso), FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire), FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénégal) • Provision Agreements: there are eight, one for each combination of Data Provider and Data Flow
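The count of eight follows from pairing each of the four data providers with each of the two data flows. A short Python sketch makes the arithmetic explicit; the identifiers are the ones on the slide, while representing a provision agreement as a plain (provider, dataflow) tuple is an assumption made for illustration.

```python
from itertools import product

data_providers = [
    "FAOSTAT:OS_FAO_DATA_PROVIDER.29",   # Bénin
    "FAOSTAT:OS_FAO_DATA_PROVIDER.42",   # Burkina Faso
    "FAOSTAT:OS_FAO_DATA_PROVIDER.66",   # Côte d'Ivoire
    "FAOSTAT:OS_FAO_DATA_PROVIDER.217",  # Sénégal
]
data_flows = ["FAOSTAT:AGRICULTURE_AREA", "FAOSTAT:AGRICULTURE_PRODUCTION"]

# One provision agreement per (data provider, data flow) combination.
provision_agreements = list(product(data_providers, data_flows))

print(len(provision_agreements))
for provider, flow in provision_agreements:
    print(provider, "->", flow)
```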
Registry Interface: Submit Provision Agreement Unique Id. of the Dataflow Unique Id. of the Data Provider
Registry Interface: Submit Provision Agreement Response The status indicates success or failure
Registry Interface: Submit Provision Agreement Response The response returns the URN as well as confirmation of the provisioning details submitted
SDMX Structured URNs • The URNs in SDMX are compound identifiers which reflect the relationships described in the information model • They are unique and predictable • They can be easily validated • They function exactly like URLs for the registry • Each identifier tells you which organization maintains the identified object • Each identifier tells you which agency maintains the scheme from which the identifier comes
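URN Structure • urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement=FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION • ProvisionAgreement: the class of the identified object • FAOSTAT (first occurrence): Maintenance Agency of the Data Provider Scheme • OS_FAO_DATA_PROVIDER: Data Provider Scheme • 29: Data Provider • FAOSTAT (second occurrence): Maintenance Agency of the Dataflow • AGRICULTURE_PRODUCTION: Data Flow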
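Because the URN is compound and predictable, it can be built and validated mechanically. The Python sketch below covers only the provision agreement URN shown on the slide. It assumes the class name is written without a space (ProvisionAgreement) and that agency, scheme, provider and dataflow ids contain no "." or ":" characters; the function names are hypothetical and not part of any SDMX toolkit.

```python
import re

URN_PREFIX = "urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement="

# Assumption: no id contains "." or ":", so both act as unambiguous separators.
_IDENTIFIER = re.compile(
    r"^(?P<provider_agency>[^:.]+):(?P<provider_scheme>[^:.]+)"
    r"\.(?P<provider>[^:.]+)"
    r"\.(?P<dataflow_agency>[^:.]+):(?P<dataflow>[^:.]+)$"
)


def build_urn(provider_agency, provider_scheme, provider, dataflow_agency, dataflow):
    """Compose the URN from its five parts."""
    return (f"{URN_PREFIX}{provider_agency}:{provider_scheme}.{provider}"
            f".{dataflow_agency}:{dataflow}")


def parse_urn(urn):
    """Split a provision agreement URN into its parts, or raise ValueError."""
    if not urn.startswith(URN_PREFIX):
        raise ValueError("not a provision agreement URN")
    match = _IDENTIFIER.match(urn[len(URN_PREFIX):])
    if match is None:
        raise ValueError("malformed provision agreement identifier")
    return match.groupdict()


urn = build_urn("FAOSTAT", "OS_FAO_DATA_PROVIDER", "29",
                "FAOSTAT", "AGRICULTURE_PRODUCTION")
parts = parse_urn(urn)
print(urn)
print(parts)
```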
SDMX Technical Standards Register a Data Set
Data Set Registration [diagram: Structure Definition, Data Flow, Provision Agreement, Data Provider, Data Set (URL, registration date etc.); registration records the existence of the data set] • The data set is “registered” against the provision agreement • The registry stores metadata (e.g. URL) about the data set: it does not store the data set
Registry Contents – Data Set Registrations [diagram] • Structure Definition: FAOSTAT:AGRICULTURE_COMMODITY • Data Flows: FAOSTAT:AGRICULTURE_AREA, FAOSTAT:AGRICULTURE_PRODUCTION • Data Providers: FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin), FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso), FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire), FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénégal) • Provision Agreements: there can be eight data sets registered, one for each Provision Agreement • Data Set Metadata: URL, registration date etc.
Registry Interface: Data Set Registration • Action is “replace”, “append” etc. • An SDMX-ML file is a simple datasource • Identifies the Provision Agreement either by URN or by Dataflow and Data Provider
Registry Interface: Data Set Registration URL of the SDMX-ML file URN of the Provision Agreement
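The key idea of registration, that the registry records where a data set lives rather than the data itself, can be sketched as a toy in-memory registry in Python. Everything here is an assumption for illustration: the class and field names are hypothetical, the example.org URL is invented, and the action value is simply stored without modelling what “replace” or “append” does.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Registration:
    provision_agreement_urn: str  # which agreement the data set is reported under
    datasource_url: str           # where the SDMX-ML file can be retrieved
    action: str                   # e.g. "replace" or "append" (stored, not interpreted)
    registered_at: datetime       # set by the registry at registration time


class ToyRegistry:
    """Keeps metadata about data sets; never holds the data sets themselves."""

    def __init__(self) -> None:
        self.registrations: List[Registration] = []

    def register(self, provision_agreement_urn: str, datasource_url: str,
                 action: str = "replace") -> Registration:
        registration = Registration(provision_agreement_urn, datasource_url,
                                    action, datetime.now(timezone.utc))
        self.registrations.append(registration)
        return registration


registry = ToyRegistry()
receipt = registry.register(
    "urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement="
    "FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION",
    "https://example.org/countrystat/benin/production.xml",  # hypothetical URL
)
print(receipt.datasource_url, receipt.registered_at.isoformat())
```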
SDMX Technical Standards Query for a Data Set
Query for Data Sets [diagram: Structure Definition; Data Flows AGRICULTURE_AREA and AGRICULTURE_PRODUCTION; Data Providers 29 - Bénin, 42 - Burkina Faso, 66 - Côte d’Ivoire, 217 - Sénégal; Provision Agreements; Data Set Metadata] • Query for Data Sets • for all Provision Agreements linked to a Data Flow, or • linked to a specific Provision Agreement
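The two query scopes can be sketched as a single Python filter: give only a data flow to cover every provision agreement linked to it, or add a data provider to narrow the query to one specific agreement. The record layout and the example.org URLs are hypothetical; the provider and data flow ids come from the slides.

```python
# Hypothetical registered data sets: one record per registration.
REGISTERED = [
    {"provider": "29", "dataflow": "AGRICULTURE_PRODUCTION",
     "url": "https://example.org/benin/production.xml"},
    {"provider": "29", "dataflow": "AGRICULTURE_AREA",
     "url": "https://example.org/benin/area.xml"},
    {"provider": "217", "dataflow": "AGRICULTURE_PRODUCTION",
     "url": "https://example.org/senegal/production.xml"},
]


def query_data_sets(registered, dataflow, provider=None):
    """Return URLs for a data flow, optionally limited to one data provider."""
    return [
        r["url"] for r in registered
        if r["dataflow"] == dataflow
        and (provider is None or r["provider"] == provider)
    ]


# All provision agreements linked to the data flow.
by_flow = query_data_sets(REGISTERED, "AGRICULTURE_PRODUCTION")
# One specific provision agreement (data flow plus data provider).
by_agreement = query_data_sets(REGISTERED, "AGRICULTURE_PRODUCTION", provider="217")
print(by_flow)
print(by_agreement)
```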
Registry Interface: Query for Data Sets • QueryType is “DataSets”, “MetadataSets”, etc.
Registry Interface: Query for Data Sets Could be done with URN or as shown here with explicit fields
Registry Interface: Data Set Query Response URL of the SDMX-ML file Identification of the Provision Agreement
Registry Interface: Data Set Query Response • Note that the URN of the registered data set includes the date and time of registration
SDMX Technical Standards Metadata Structure Definition
Metadata – Reported according to a Quality Framework [screenshot labels: Metadata Attribute; Metadata Content] • Metadata pertaining to a Quality Framework are reported in a Metadata Set, whose structure is defined by a Metadata Structure Definition
Metadata Reporting • “Quality” metadata about published or reported data sets are linked to the Provision Agreement, or the Data Flow, or the Data Provider [diagram: Data Flows AGRICULTURE_AREA and AGRICULTURE_PRODUCTION; Data Providers 29 - Bénin, 42 - Burkina Faso, 66 - Côte d’Ivoire, 217 - Sénégal; Provision Agreement; Metadata Report]
Identify Structure Metadata Structure Definition (MSD) Provision Agreement • Concepts • Hierarchies • Representation (e.g. code list)