SDMX Information Model

SDMX Information Model. Pedagogical Explanation Arofan Gregory and Chris Nelson OECD SDMX Expert Group Meeting Geneva April 6-7 2006. Data Set. We have a dataset, what do we need to know?. Its structure Who reports/disseminates it


Presentation Transcript


  1. SDMX Information Model: Pedagogical Explanation. Arofan Gregory and Chris Nelson. OECD SDMX Expert Group Meeting, Geneva, April 6-7 2006

  2. Data Set

  3. We have a dataset, what do we need to know? • Its structure • Who reports/disseminates it • How a specific data set fits into the overall collection framework and which organisation is responsible for reporting which parts • The reporting/publication schedule • That it has been reported/published

  4. Data Set: Structure

  5. Data Set: Structure • Computers need structure of data • Concepts • Code lists • Data values • How these fit together [Diagram labels: Stock/Flow, Country, Unit Multiplier, Unit, Time/Frequency, Topic]

  6. Structural Definitions • Concepts: TOPIC, COUNTRY, FLOW • Code Lists • Topic: A Brady Bonds, B Bank Loans, C Debt Securities • Country: AR Argentina, MX Mexico, ZA South Africa • Stock/Flow: 1 Stock, 2 Flow

  7. 16457 Data Makes Sense ZA,B,1,1999-06-30=16547
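
The record on this slide only "makes sense" once each code is looked up in its code list. The following is a minimal sketch of that lookup, not part of the original presentation. The code lists are copied from slide 6; the key order (country, topic, stock/flow, time) is an assumption inferred from the codes in the record, and the names `CODE_LISTS`, `KEY_ORDER` and `decode_observation` are hypothetical.

```python
from datetime import date

# Code lists as shown on slide 6.
CODE_LISTS = {
    "COUNTRY": {"AR": "Argentina", "MX": "Mexico", "ZA": "South Africa"},
    "TOPIC": {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"},
    "STOCK_FLOW": {"1": "Stock", "2": "Flow"},
}

# Assumed order of the coded concepts in the key (inferred, not stated on the slide).
KEY_ORDER = ("COUNTRY", "TOPIC", "STOCK_FLOW")


def decode_observation(record: str) -> dict:
    """Split 'key=value' and replace each code with its code list label."""
    key_part, value_part = record.split("=")
    *codes, period = key_part.split(",")
    if len(codes) != len(KEY_ORDER):
        raise ValueError(f"expected {len(KEY_ORDER)} codes, got {len(codes)}")
    decoded = {}
    for concept, code in zip(KEY_ORDER, codes):
        # A KeyError here means the code is not in the concept's code list.
        decoded[concept] = CODE_LISTS[concept][code]
    decoded["TIME"] = date.fromisoformat(period)
    decoded["OBS_VALUE"] = int(value_part)
    return decoded


obs = decode_observation("ZA,B,1,1999-06-30=16547")
```

With the structural definitions available, `obs` reads as: South Africa, Bank Loans, Stock, 30 June 1999, value 16547.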

  8. Data Set: Structure • Comprises • Dimensions: concepts that identify the observation value • Attributes: concepts that add additional metadata about the observation value • Measure: the concept that is the observation value • Representation: any of these may be • coded • text • date/time • number • etc.

  9. Data Set: Structure • Stock/Flow [dimension] • Country [dimension] • Unit Multiplier [attribute] • Unit [attribute] • Time [dimension] • Frequency [dimension] • Topic [dimension] • Observation [measure]

  10. Topic A Brady Bonds B Bank Loans C Debt Securities Concepts TOPIC COUNTRY FLOW Data Structure Definition Key Group Key Dimensions Attributes Measures Representation Concept
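
As an illustration of how a data structure definition ties concepts, roles and representations together, here is a small sketch. It is not the SDMX schema or any library's API: the class names, the identifier `EXAMPLE_DSD`, the concept identifiers and the frequency code `Q` are all hypothetical, and only the code lists and the dimension/attribute/measure roles come from the slides.

```python
from dataclasses import dataclass
from typing import Optional

TOPIC = {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"}
COUNTRY = {"AR": "Argentina", "MX": "Mexico", "ZA": "South Africa"}
STOCK_FLOW = {"1": "Stock", "2": "Flow"}


@dataclass
class Component:
    concept: str                       # the concept this component uses
    role: str                          # "dimension", "attribute" or "measure"
    code_list: Optional[dict] = None   # coded representation, if any


@dataclass
class DataStructureDefinition:
    id: str
    components: list

    def by_role(self, role: str) -> list:
        return [c.concept for c in self.components if c.role == role]

    def validate_key(self, key: dict) -> list:
        """Return a list of problems; an empty list means the key is valid."""
        problems = []
        for c in self.components:
            if c.role != "dimension":
                continue
            if c.concept not in key:
                problems.append(f"missing dimension {c.concept}")
            elif c.code_list is not None and key[c.concept] not in c.code_list:
                problems.append(
                    f"{key[c.concept]!r} is not in the code list for {c.concept}"
                )
        return problems


dsd = DataStructureDefinition(
    id="EXAMPLE_DSD",  # hypothetical identifier
    components=[
        Component("STOCK_FLOW", "dimension", STOCK_FLOW),
        Component("COUNTRY", "dimension", COUNTRY),
        Component("UNIT_MULT", "attribute"),
        Component("UNIT", "attribute"),
        Component("TIME", "dimension"),
        Component("FREQ", "dimension"),
        Component("TOPIC", "dimension", TOPIC),
        Component("OBS_VALUE", "measure"),
    ],
)

good_key = {"STOCK_FLOW": "1", "COUNTRY": "ZA", "TIME": "1999-06-30",
            "FREQ": "Q", "TOPIC": "B"}
bad_key = dict(good_key, COUNTRY="XX")
```

Calling `dsd.validate_key(bad_key)` reports the unknown country code, while `good_key` passes without problems.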

  11. Data Set: Publishing/Reporting • Publishing data sets and collecting data sets is a process • As a process it must have metadata that enables organisations to control it • what data is it • who publishes it • who collects it • when is it published/reported

  12. [Diagram: Structure Definition, Data Flow, Data Set, Provision Agreement, Data Provider. A Data Flow uses a specific data/metadata structure and can get data from multiple data providers; a Data Set conforms to the business rules of the data flow; a Data Provider publishes/reports data sets and can provide data for many data flows using an agreed data structure.] • The data flow is the artefact that contains metadata about the provision of data • In a data reporting scenario the data flow is defined by the data collector, and there can be many data providers reporting data for the data flow • A data provider may report data for many data flows (perhaps for many organisations)
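
The many-to-many relationship between data flows and data providers can be sketched as follows, with the provision agreement as the link between the two. This is an illustrative model only: the class and function names are hypothetical, the flow and provider identifiers are borrowed from a later slide, and the structure identifiers and the particular pairings are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    id: str
    structure_id: str  # the structure definition the flow uses


@dataclass(frozen=True)
class DataProvider:
    id: str


@dataclass(frozen=True)
class ProvisionAgreement:
    data_flow: DataFlow
    data_provider: DataProvider


def providers_for(flow: DataFlow, agreements: list) -> list:
    """All providers that report data for the given flow."""
    return sorted({a.data_provider.id for a in agreements if a.data_flow == flow})


def flows_for(provider: DataProvider, agreements: list) -> list:
    """All flows the given provider reports data for."""
    return sorted({a.data_flow.id for a in agreements if a.data_provider == provider})


bop = DataFlow("BOP", structure_id="DSD_BOP")  # structure ids are hypothetical
nac = DataFlow("NAC", structure_id="DSD_NAC")
abs_ = DataProvider("1A")
snz = DataProvider("2A")

# Illustrative pairings, not taken from the presentation.
agreements = [
    ProvisionAgreement(bop, abs_),
    ProvisionAgreement(bop, snz),
    ProvisionAgreement(nac, abs_),
]
```

Here `providers_for(bop, agreements)` yields both providers, and `flows_for(abs_, agreements)` yields both flows, which is the "many providers per flow, many flows per provider" pattern described on the slide.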

  13. Organising Data Flows • Organisations may wish to categorise the data flows • For convenience • To facilitate control • who reports what/when (release calendar) • who has reported • more about these later • To facilitate search for data (more about this later)

  14. Release Calendar Data Reporting Data Structure Definition CategoryScheme comprises subject or reporting categories uses specific data/metadata structure can be linked to categories in multiple category schemes Data Flow Category Data Set conforms to business rules of the data/metadata flow can have child categories publishes/reports data sets can get data from multiple data providers can provide data for many data flows using agreed data structure Provision Agreement Data Provider Metadata

  15. We have metadata, what do we need to know? • What is the metadata for (what does it describe) • Who reports it • How a specific metadata set fits into the overall collection framework and which organisation is responsible for reporting which parts • The reporting schedule • That it has been reported

  16. Metadata: Controlling It • What can be done for data can also be done for metadata • Metadata has a structure • Metadata is reported/published • Metadata needs to be controlled • Metadata needs to be found • Metadata may need to be linked to data

  17. What Sort of Metadata? • Data values are limited in where they belong • Series key (usually qualified by time) • Data attribute values are limited in where they belong • Observation value • Series key • Group key • Data set • Metadata is not limited in this way • Metadata is everywhere • Can we learn from the data side how to describe metadata structure definitions?

  18. Identify Structure Release Calendar Metadata Structure Definition • Concepts • Hierarchies • Representation (e.g. code list) Provision Agreement

  19. core definition of format and permitted values Format and Permitted Value List overrides core definition Metadata Attributes Item Scheme defines “keys” of object types to which metadata can be “attached” Full Target Identifier Identifier Components Metadata Structure Definition uses defined concepts concept defined in Metadata Report Concept Scheme Concept takes semantic and context from can have hierarchy specifies to which object types the concept can be “attached” Partial Target Identifier identifies the code list from which the value of the (key) component must be taken when metadata is reported specifies the identifier components (“key”) of the target object identifies target object type of the component Target Object Type

  20. Release Calendar Metadata Target Data Flow Provision Agreement Data Provider

  21. Metadata Attributes Item Scheme CL_Status Date/Time F Final P Provisional Full Target Identifier CL_DATA_FLOW CL_DATA_PROVIDER 1A ABS 2A SNZ BOP Balance.. NAC National.. Identifier Components Metadata_Concepts ARC Metadata Structure Definition MetadataReport Concept Scheme Concept Release Date Release Status Format and Permitted Value List Id = Provision_Agreement Can be used to identify just the Data Provider or just the Data Flow Partial Target Identifier Data Flow Data Provider Target Object Type

  22. Metadata Structure Definition: Identifiers • Metadata Structure Definition = ARC_DATA • Full Target Identifier = Provision_Agreement • Identifier Component: Target Object Type = Data Flow, Item Scheme = CL_DATA_FLOW (BOP Balance.., NAC National..) • Identifier Component: Target Object Type = Data Provider, Item Scheme = CL_DATA_PROVIDER (1A ABS, 2A SNZ)

  23. Metadata Structure Definition: Metadata Report • Metadata Report = ARC • Attachment = Provision_Agreement • Metadata Attribute: Concept = Release Date, Representation = Date/Time • Metadata Attribute: Concept = Release Status, Representation = CL_Status (F Final, P Provisional)
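
The release calendar example on the last three slides can be read as a small validation problem: a report must identify its target (a data flow plus a data provider, i.e. a provision agreement) using the listed item schemes, and its attributes must match their declared representations. The sketch below assumes a plain dictionary layout for a report; that layout, the key names and the function `validate_report` are hypothetical, while the code values come from the slides.

```python
from datetime import datetime

# Item schemes for the target identifier components (ids only, as on the slides).
CL_DATA_FLOW = {"BOP", "NAC"}
CL_DATA_PROVIDER = {"1A", "2A"}
# Representation of the Release Status attribute.
CL_STATUS = {"F": "Final", "P": "Provisional"}


def validate_report(report: dict) -> list:
    """Return a list of problems; an empty list means the report is valid."""
    problems = []
    target = report.get("target", {})
    if target.get("DATA_FLOW") not in CL_DATA_FLOW:
        problems.append("target DATA_FLOW is not in CL_DATA_FLOW")
    if target.get("DATA_PROVIDER") not in CL_DATA_PROVIDER:
        problems.append("target DATA_PROVIDER is not in CL_DATA_PROVIDER")

    attributes = report.get("attributes", {})
    if attributes.get("RELEASE_STATUS") not in CL_STATUS:
        problems.append("RELEASE_STATUS is not in CL_Status")
    release_date = attributes.get("RELEASE_DATE")
    if not isinstance(release_date, str):
        problems.append("RELEASE_DATE is missing")
    else:
        try:
            datetime.fromisoformat(release_date)
        except ValueError:
            problems.append("RELEASE_DATE is not a valid date/time")
    return problems


# Hypothetical report contents for illustration.
valid_report = {
    "target": {"DATA_FLOW": "BOP", "DATA_PROVIDER": "1A"},
    "attributes": {"RELEASE_DATE": "2006-04-06T09:00:00", "RELEASE_STATUS": "P"},
}
invalid_report = {
    "target": {"DATA_FLOW": "BOP", "DATA_PROVIDER": "1A"},
    "attributes": {"RELEASE_DATE": "2006-04-06T09:00:00", "RELEASE_STATUS": "X"},
}
```

The point of the design is that nothing in the validator is specific to release calendars: the same pattern applies to any metadata structure definition that names its target identifier components and attribute representations.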

  24. Metadata Reporting Metadata Structure Definition CategoryScheme comprises subject or reporting categories uses specific metadata structure can be linked to categories in multiple category schemes Metadata Flow Category Metadata Set conforms to business rules of the metadata flow can have child categories can get metadata from multiple metadata providers publishes/reports metadata sets Constraint can have constraints – sub set of possibilities defined in the Structure Definition Provision Agreement can provide metadata for many metadata flows using agreed metadata structure Data Provider

  25. Information Model: Summary So Far • Supports data and metadata reporting and exchange • Data and metadata structure definitions • Data and metadata sets • Supports the process of reporting and exchange • Data/metadata providers • Data/metadata flows • Provision agreements

  26. Data/Metadata Reporting/Exchange CategoryScheme Structure Definition comprises subject or reporting categories uses specific data/metadata structure can be linked to categories in multiple category schemes Data Set or Metadata Set Data or Metadata Flow Category conforms to business rules of the data/metadata flow publishes/reports data sets or metadata sets can have child categories can get data/metadata from multiple data/metadata providers Constraint can have constraints – sub set of possibilities defined in the Structure Definition can provide data/metadata for many data/metadata flows using agreed data/metadata structure Provision Agreement Data Provider

  27. Controlling Data and Metadata • How do we control data and metadata reporting? • How do we find data and metadata? • How do we share data and metadata?

  28. SDMX Registry • The Registry supports many of the artefacts in the Information Model • Holds indexes for data and metadata and where these can be found on the web • Data and metadata set indexes • Stores structure definitions • Data and metadata structures • Code lists • Category schemes • Data flows • Stores provisioning metadata • Data providers • Provision agreements • The Registry is used to store structural and provisioning definitions, to register data sets and metadata sets, and to hold links between them • The Registry is a resource that can be queried by applications to find data, metadata, and the structural definitions supporting these • The Registry specification defines the behaviour of an SDMX Registry and the Registry interfaces, which are specified as an XML schema • The Registry “functions” are modelled in the Information Model, but its functionality is best explained in the context of the schematic already used for data and metadata (Data/Metadata Reporting and Exchange)

  29. SDMX Registry/Repository SDMX Registry Interfaces Register Indexes data and metadata REGISTRY Data Set/Metadata Set Query Describes data and metadata sources and reporting processes Submit Subscription/Notification REPOSITORY Provisioning Metadata Query Submit REPOSITORY Structural Metadata Describes data and metadata structures Query

  30. Data Set Registration • The data is “registered” against the provision agreement • The Constraint holds the indexes – such as the series keys, or the list of dimension values [Diagram labels: Structure Definition, Data Flow, Constraint, Keys, Data Set, Provision Agreement, Data Provider, URL, registration date etc.]

  31. Data Query • The query can start anywhere and navigate to the data • In the registry all navigation is bi-directional • Category drill-down searches will start at the Category and go via Data Flows • Fine-grained queries can be built using structural metadata (e.g. dimension names and possible values) • Fine-grained searches are possible on the Constraints [Diagram labels: CategoryScheme with example categories Labor force statistics, Labor force earnings, Labor force employment; Structure Definition, Data Flow, Category, Constraint, Data Set, Provision Agreement, Data Provider]
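
Registration and the two query styles (category drill-down and fine-grained search on the Constraint) can be sketched with an in-memory toy registry. This is not the SDMX Registry interface: the class and method names, the flow and provider identifiers, the URLs and the series keys are all hypothetical, and only the category names come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Registration:
    provision_agreement: tuple  # (data flow id, data provider id)
    url: str                    # where the data set can be found
    series_keys: set            # the Constraint: keys present in the data set


class Registry:
    def __init__(self):
        self.category_to_flows = {}  # category name -> set of data flow ids
        self.registrations = []

    def categorise(self, category: str, flow_id: str) -> None:
        self.category_to_flows.setdefault(category, set()).add(flow_id)

    def register(self, registration: Registration) -> None:
        self.registrations.append(registration)

    def find_by_category(self, category: str) -> list:
        """Drill down: category -> data flows -> registered data sets."""
        flows = self.category_to_flows.get(category, set())
        return [r.url for r in self.registrations
                if r.provision_agreement[0] in flows]

    def find_by_key(self, series_key: tuple) -> list:
        """Fine-grained search on the keys held by each Constraint."""
        return [r.url for r in self.registrations if series_key in r.series_keys]


registry = Registry()
registry.categorise("Labor force earnings", "LF_EARN")
registry.categorise("Labor force employment", "LF_EMP")
registry.register(Registration(("LF_EARN", "PROVIDER_A"),
                               "https://example.org/lf_earn.xml",
                               {("ZA", "Q"), ("MX", "Q")}))
registry.register(Registration(("LF_EMP", "PROVIDER_A"),
                               "https://example.org/lf_emp.xml",
                               {("ZA", "Q")}))
```

A category search such as `registry.find_by_category("Labor force earnings")` returns only the earnings data set, while `registry.find_by_key(("ZA", "Q"))` returns both, because both Constraints index that key.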

  32. Metadata Set Registration • Metadata that is reported regularly is registered against the (Metadata) Provision Agreement • The metadata content (the metadata set) is linked to the object to which it relates • This link can be stored in the registry • e.g. a link to data set to which it relates • a link to the data provider to which it relates • Registry/Repository operators could use the repository to store the metadata itself • This is not a part of the Information Model nor of the SDMX standards

  33. Metadata Query • The indexed metadata set itself can be searched • Links to data can be discovered and followed • e.g. is there any metadata for a specific data set, or part of the data set? • If so what sort of metadata? • Where is the metadata (URL)? • More on this later

  34. Information Model: Summary So Far • Supports data and metadata reporting and exchange • Data and metadata structure definitions • Data and metadata sets • Supports the process of reporting and exchange • Data/metadata providers • Data/metadata flows • Provision agreements • Supports registration • Data and metadata sets • Supports query • Categories linked to data and metadata • Constraints for finer grained queries

  35. Summary: Data/Metadata Reporting, Query CategoryScheme Structure Definition comprises subject or reporting categories uses specific data/metadata structure can be linked to categories in multiple category schemes Data Set or Metadata Set Data or Metadata Flow Category conforms to business rules of the data/metadata flow publishes/reports data sets or metadata sets can have child categories can get data/metadata from multiple data/metadata providers Constraint can have constraints – sub set of possibilities defined in the Structure Definition can provide data/metadata for many data/metadata flows using agreed data/metadata structure Provision Agreement Data Provider

  36. Registry – what else? • Link metadata to parts of a data set or data base contents • Query for metadata linked to data

  37. Registry – link metadata to data • Parts of a data set can be described in terms of key sets, combined into an Attachment Constraint, and linked to a specific data set and a metadata set

  38. Constraints – Structure • Supports the specification of sub sets of data or metadata structure definitions or data and metadata sets • In terms of allowable key values • In terms of allowable dimension, attribute, or measure values • Constraints can apply to: • Data sets – so called “cubes” or “cube regions” • Entire databases • Data flows • Metadata sets • Entire metadata repositories • Metadata flows • Data providers • Provision agreements • Two kinds of Constraint • Content – this is used to define the actual or allowable content • Attachment – this is used to define a sub set of data or metadata set for the purpose of attaching metadata to it

  39. Constraints – Structure Schematic Sets of keys to be included in or excluded from the scope Constraint AttachmentConstraint ContentConstraint Key Set Sets of values to be included in or excluded from the scope Specification of a key Cube Region Key Set of values for a concept Identity of the Concept (e.g. Country) Specification of a key value Concept Values Key Value Concept List of values Values

  40. Constraints – usage • Data source registration • Data source can be a data set or a database • Content Constraint is used to define the content of a data set or database • This supports fine grained queries • Attaching metadata to parts of a data set or other data source • Target object of a metadata set is an Attachment Constraint linked to a registered data set or database content

  41. Stock/Flow Country Topic A Brady Bonds B Bank Loans C Debt Securities AR Argentina MX Mexico ZA South Africa 1 Stock 2 Flow Attachment Constraint Metadata is linked to the Constraint Constraint is linked to the Data Set Registered Metadata Set Registered Data Set Attachment Constraint Key Sets define the sub set of the Data Set Key Set ZA,B,1,1999-03-31 ZA,B,1,1999-06-30 ZA,B,1,1999-09-30 ZA,B,2,1999-03-31 etc. Key(s)
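
The two ways of scoping a constraint described on the preceding slides, a key set (explicit keys) and a cube region (sets of values per concept), can be sketched as filters over a data set. The function names and the key order are assumptions, and every observation value below is invented for illustration except 16547, which is the value shown earlier for ZA,B,1,1999-06-30.

```python
# Assumed order of the concepts in each key.
CONCEPTS = ("COUNTRY", "TOPIC", "STOCK_FLOW", "TIME")

data_set = {
    ("ZA", "B", "1", "1999-03-31"): 16000,  # hypothetical value
    ("ZA", "B", "1", "1999-06-30"): 16547,
    ("ZA", "B", "1", "1999-09-30"): 17000,  # hypothetical value
    ("ZA", "B", "2", "1999-03-31"): 300,    # hypothetical value
    ("MX", "B", "1", "1999-03-31"): 9000,   # hypothetical value
}


def select_by_key_set(data: dict, key_set: set) -> dict:
    """Keep only observations whose full key is listed in the key set."""
    return {key: value for key, value in data.items() if key in key_set}


def select_by_cube_region(data: dict, region: dict) -> dict:
    """Keep observations whose codes fall inside the allowed values per concept.

    Concepts missing from `region` are unconstrained.
    """
    def matches(key: tuple) -> bool:
        return all(code in region.get(concept, {code})
                   for concept, code in zip(CONCEPTS, key))
    return {key: value for key, value in data.items() if matches(key)}


# Attachment constraint expressed as a key set: the ZA stock series for 1999.
key_set = {
    ("ZA", "B", "1", "1999-03-31"),
    ("ZA", "B", "1", "1999-06-30"),
    ("ZA", "B", "1", "1999-09-30"),
}
stock_subset = select_by_key_set(data_set, key_set)

# The same idea expressed as a cube region: all South African flow data.
flow_subset = select_by_cube_region(
    data_set, {"COUNTRY": {"ZA"}, "STOCK_FLOW": {"2"}}
)
```

A metadata set would then be linked to the constraint rather than to the whole data set, so it applies only to the selected observations.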

  42. Information Model: Support for Data Analysis • Viewing, comparing and analysing data in different groupings • Hierarchical Code Lists • Converting data and metadata from one coding and structure scheme to another scheme • Structure and Code Mapping

  43. Hierarchical Code Lists - Example • France is a country • France is part of the continent of Europe • France is a member of NATO • France is a member of the EU • France is a member of the G10 • When I analyse statistics I might want to see totals by • continent • trading block • military alliance • financial grouping • France will be grouped with different sets of countries depending on the “view” required • How do we express these groupings?

  44. Code List: Reference Area • Codes: 6B NATO, B0 EU, B1 NAFTA, BE Belgium, BG Bulgaria, CA Canada, CH Switzerland, CZ Czech Republic, DE Germany, DK Denmark, E1 Europe, E8 North America, EE Estonia, ES Spain, FI Finland, FR France, GB United Kingdom, GR Greece, HU Hungary, JP Japan, I2 Euro 12, IT Italy, NL Netherlands, US United States • Code Compositions: G10 countries, Europe, EU countries, NATO countries, NAFTA countries, North America [Diagram callouts: Code List, Code, Code Composition, Code Association]
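
One way to express these groupings is as a list of parent/child code associations, from which totals can be computed for whichever "view" is required. The sketch below is illustrative: the function names are hypothetical, the membership list is deliberately partial (only a few countries whose memberships are uncontroversial), and the observation values are invented.

```python
# (parent code, child code) pairs; a partial, illustrative set of associations.
CODE_ASSOCIATIONS = [
    ("E1", "FR"), ("E1", "DE"),                              # Europe
    ("B0", "FR"), ("B0", "DE"),                              # EU
    ("6B", "FR"), ("6B", "DE"), ("6B", "US"), ("6B", "CA"),  # NATO
    ("E8", "US"), ("E8", "CA"),                              # North America
    ("B1", "US"), ("B1", "CA"),                              # NAFTA
]


def children_of(parent: str) -> list:
    return sorted(child for p, child in CODE_ASSOCIATIONS if p == parent)


def parents_of(child: str) -> list:
    return sorted(parent for parent, c in CODE_ASSOCIATIONS if c == child)


def total_for(parent: str, values: dict) -> int:
    """Sum the values of all codes grouped under the given parent code."""
    return sum(values[child] for child in children_of(parent))


# Hypothetical observation values per country.
values = {"FR": 10, "DE": 20, "US": 30, "CA": 5}

europe_total = total_for("E1", values)
nato_total = total_for("6B", values)
nafta_total = total_for("B1", values)
```

The same country code appears under several parents, so France contributes to the Europe, EU and NATO totals without being duplicated in the underlying code list.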

  45. comprises hierarchies Hierarchical Code Scheme comprises code groups Code List belongs to relates a code to a parent code code Code Association Code parent code Properties of the association groups codes with the same parent Property Code Composition value based hierarchy has code groups comprises code groups Hierarchy level based hierarchy has formal levels Level

  46. Item Scheme Maps • Many types of “item scheme” use the same fundamental structure • Code list • Category scheme • Concept scheme • Two Item Schemes can be mapped

  47. Association Role Concept Scheme Category Scheme Concept Scheme Category Scheme Code List Code List Concept Category Code Concept Category Code target item scheme Item Scheme Association source item scheme Category Scheme Map Concept Scheme Map Code List Map Item Scheme Item Scheme has item associations Item Association target item source item Item Item Additional metadata Property
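
A code list map is the simplest case of an item scheme map: each item association pairs a source code with a target code. The sketch below recodes observations from the two-letter reference area codes used earlier to three-letter ISO 3166-1 codes. The target scheme is chosen purely for illustration and is not part of the presentation; the scheme identifiers, the function name and the observation values are hypothetical.

```python
# An item scheme map: source scheme, target scheme and the item associations.
CODE_LIST_MAP = {
    "source": "CL_AREA_2",   # hypothetical id for the two-letter code list
    "target": "CL_AREA_3",   # hypothetical id for the three-letter code list
    "associations": {"FR": "FRA", "DE": "DEU", "GB": "GBR"},
}


def recode(observations: dict, code_map: dict) -> dict:
    """Rewrite observation keys from the source code list to the target one."""
    associations = code_map["associations"]
    unmapped = sorted(code for code in observations if code not in associations)
    if unmapped:
        raise KeyError(f"no item association for codes: {unmapped}")
    return {associations[code]: value for code, value in observations.items()}


# Hypothetical observation values keyed by source code.
source_observations = {"FR": 10, "DE": 20}
target_observations = recode(source_observations, CODE_LIST_MAP)
```

Raising on unmapped codes rather than silently dropping them is a design choice for this sketch; a real mapping might instead carry additional metadata on each association to say how such cases are handled.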

  48. Structure Maps • Structures can also be mapped • Data structures • Metadata structures

  49. Information Model: Summary • Supports data and metadata reporting and exchange • Data and metadata structure definitions • Data and metadata sets • Supports the process of reporting and exchange • Data/metadata providers • Data/metadata flows • Provision agreements • Supports registration • Data and metadata sets • Data and metadata can be linked • Supports query • Categories linked to data and metadata • Constraints for finer grained queries • Retrieval of metadata linked to data • Supports data analysis, comparison and conversion • Hierarchical code schemes • Structure, Concept, Code, Category maps

  50. Data/Metadata Reporting, Query, Analysis, Mapping CategoryScheme Structure and Item Scheme Maps Structure Definition Data Set or Metadata Set Data or Metadata Flow Category Attachment Constraint Content Constraint Provision Agreement Data Provider Registered Data Set or Metadata Set
